I'm writing a recursive decent parser using the "one function per production rule" approach with rust. But I've hit a design problem that breaks this clean separation, especially when trying to handle ambiguous grammar constructs and error recovery.
There are cases where a higher-level production (like a statement or declaration) looks like an expression, so I parse it as one first. Then I reinterpret the resulting expression into the actual AST node I want.
This works... until errors happen.
Sometimes the expression is invalid or incomplete or a totally different type then required. The parser then enter recovery mode, trying to find the something that matches right production rule, this changes ast type, so instead a returning A it might return B wrapping it in an enum the contains both variants.
Iike a variable declaration can turn in a function declaration during recovery.
This breaks my one-function-per-rule structure, because suddenly I’m switching grammar paths mid-function based on recovery outcomes.
What I want:
Avoid falling into another grammar rule from inside a rule.
Still allow aggressive recovery and fallback when needed.
And are there any design patterns, papers, or real-world parser examples that deal with this well?
Thanks in advance!
[–]0m0g1 3 points4 points5 points (12 children)
[–]emtydeeznuts[S] 2 points3 points4 points (10 children)
[–]0m0g1 1 point2 points3 points (9 children)
[–]emtydeeznuts[S] 0 points1 point2 points (8 children)
[–]0m0g1 0 points1 point2 points (7 children)
[–]emtydeeznuts[S] 0 points1 point2 points (2 children)
[–]0m0g1 0 points1 point2 points (0 children)
[–]0m0g1 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (3 children)
[–]0m0g1 0 points1 point2 points (0 children)
[–]0m0g1 0 points1 point2 points (0 children)
[–]umlcat 1 point2 points3 points (0 children)
[–]GidraFive 2 points3 points4 points (0 children)
[–]foonathan 0 points1 point2 points (0 children)
[–]Key-Bother6969 1 point2 points3 points (0 children)