In this video, we're building a regex engine from scratch - one that can handle the core features like character classes, quantifiers, alternation,, capturing & non capturing groups and more. By the end, you'll understand both how to read regex patterns and how the matching engine actually works under the hood. Let's get started.
Please note that this is one implementation of regex engine (similar to Python's re module). There are other implementations using other techniques like Thompson's NFA but since it doesn't support features like back referencing, I chose to go this route.
Source Code: https://github.com/RivaanRanawat/rege...
Discord Server - / discord
Resources:
How does the lexer work - https://tinyurl.com/2uc5ubd5
How does the parser work - https://tinyurl.com/mtzfkhn2
How does the matcher engine work- https://tinyurl.com/445r6k5p
AST Nodes - https://tinyurl.com/mvb67ubt
Example of how regex works - https://tinyurl.com/53724nn5
Timestamps:
(00:00:00) Introduction
(00:00:32) How does the regex engine work - Brief overview
(00:05:02) Project setup - Python Virtual Environment
(00:08:14) The Need For Lexer & How does the lexer work?
(00:23:09) Understanding Regex & Writing down all Token types
(01:16:13) Coding up Lexer/Tokenizer
(01:58:17) Testing the Lexer & Examining the Output
(02:03:00) The Need for Parser & How does a Parser work?
(02:24:25) Useless Theory - Recursive Descent Parser, Top down parsing, LL(k) grammar
(02:29:35) AST Nodes
(03:00:37) Coding up Parser, Creating Abstract Syntax Tree (AST)
(04:28:50) Summarizing Parser
(04:31:39) Testing the Parser
(04:35:13) Coding up the Matcher Engine
(06:21:08) Testing the match Function
(06:24:26) search function
(06:26:43) findall function
(06:31:40) Conclusion
Connect With Me Here:
GitHub: https://github.com/rivaanranawat
Linkedin: / rivaan-ranawat
Medium: / namanrivaan
X: https://x.com/RanawatRivaan
Instagram: / optimalcoding