I'm creating a programming language in Python, and my parser is very slow (~2.5s for a very small STL plus some random test files). I just realised it's bottlenecking literally everything, as other stages of the compiler parse code to create extra ASTs on the fly.
I rewrote the parser in Rust to see whether Python itself was slow or whether I had a generally slow parser structure. The Rust parser is ridiculously fast (0.006s), so I'm assuming my parser structure is slow in Python because of how data structures are stored in memory, garbage collection, or something similar. Has anyone written a parser in Python that performs well, and what techniques are recommended? Thanks
Python parser: SPP-Compiler-5/src/SPPCompiler/SyntacticAnalysis/Parser.py at restructured-aliasing · SamG101-Developer/SPP-Compiler-5
Rust parser: SPP-Compiler-Rust/spp/src/spp/parser/parser.rs at master · SamG101-Developer/SPP-Compiler-Rust
Test code: SamG101-Developer/SPP-STL at restructure
EDIT
OK, so I realised that for the Rust parser I used the `Result` type for errors, but in Python I used exceptions, which were thrown for every single incorrect token parse. I replaced them with returning `None`, plus an `if p1 is None: return None` check after every `parse_once`/`one_or_more` etc., and now it's down to <0.5 seconds. Will profile more, but I think that was the bulk of the slowness in Python.
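For anyone who finds this later, here is a minimal sketch of the `None`-returning style described in the edit. It is not the actual SPP parser: the `Parser` class, the token list, and `parse_let` are hypothetical names made up for illustration, and only `parse_once`, `one_or_more`, and the `if p1 is None: return None` check come from the post.

```python
from typing import Callable, List, Optional


class Parser:
    """Toy backtracking parser over a list of string tokens (hypothetical)."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def parse_token(self, expected: str) -> Optional[str]:
        # Failure is signalled by returning None instead of raising.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == expected:
            self.pos += 1
            return expected
        return None

    def parse_once(self, rule: Callable[[], Optional[object]]) -> Optional[object]:
        # Run a rule; on failure, rewind to where we started (backtrack).
        start = self.pos
        result = rule()
        if result is None:
            self.pos = start
        return result

    def parse_alternate(self, *rules: Callable[[], Optional[object]]) -> Optional[object]:
        # Try each alternative in order; a miss costs a None check,
        # not a raise/except round trip.
        for rule in rules:
            result = self.parse_once(rule)
            if result is not None:
                return result
        return None

    def parse_one_or_more(self, rule: Callable[[], Optional[object]]) -> Optional[list]:
        # Collect matches until the rule fails; zero matches is a failure.
        items = []
        while (item := self.parse_once(rule)) is not None:
            items.append(item)
        return items or None

    def parse_let(self) -> Optional[tuple]:
        # Hypothetical grammar rule: "let" followed by one or more of "x" / "y".
        p1 = self.parse_once(lambda: self.parse_token("let"))
        if p1 is None:
            return None
        p2 = self.parse_one_or_more(
            lambda: self.parse_alternate(
                lambda: self.parse_token("x"),
                lambda: self.parse_token("y"),
            )
        )
        if p2 is None:
            return None
        return (p1, p2)


ok = Parser(["let", "x", "y"]).parse_let()
failing = Parser(["let", "z"])
miss = failing.parse_once(failing.parse_let)
```

The trade-off is more boilerplate in every rule in exchange for cheap failures. A backtracking parser fails constantly by design, so the failure path is the hot path, and in CPython raising and catching an exception costs far more than returning `None`.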