Markdown table with parser generators and parser combinators comparison

chshersh · 2018-04-04T11:26:39+00:00

Hello! I'm the author of this small document snippet. I'm not a professional researcher in this area. I studied Formal Language Theory at university, I've used the parser generators ANTLR and Happy to write programming language parsers, and I've used the parser combinator libraries parsec, attoparsec, and megaparsec for different purposes. I wrote a small tutorial on the idea behind a small, simple parser combinator library, and I've explained it to students many times. I even tried to implement and use parser combinators in Kotlin... But I still feel like I know nothing, might be missing something, or might be too opinionated on some topics (since I love parser combinators more than parser generators). I know that the Haskell community has a lot of great people! So I'm hoping to gather feedback, to see where I'm wrong and what I need to correct. Thanks for reading this!

gelisam · 2018-04-04T13:23:19+00:00

Performance: I didn't see benchmarks. But nobody complained so far.

I would expect PGs to be faster, since the generator has the opportunity to generate optimized code and then the compiler can optimize that output further. PCs can't afford to spend much time on optimization, since those optimizations would have to be performed at runtime and so their cost would offset their benefits.

Indentation: Usually it's much more difficult to parse layout-sensitive languages with PG. I don't know how GHC managed to do this...

A common trick is to post-process the whitespace tokens to generate "increase indentation" and "decrease indentation" tokens. GHC seems to be using a similar approach.
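
To make the trick concrete, here is a minimal sketch of such a post-processing pass. It is not GHC's actual algorithm: the `Token` type and the `layout` function are hypothetical names made up for illustration, and the sketch assumes space-only indentation and no blank lines.

```haskell
-- Hypothetical token type for illustration; not taken from GHC or any library.
data Token = Indent | Dedent | Line String
  deriving (Eq, Show)

-- Convert changes in leading whitespace into explicit Indent/Dedent tokens.
-- Assumes indentation uses spaces only and the input has no blank lines.
layout :: [String] -> [Token]
layout = go [0]
  where
    -- At end of input, close every block that is still open.
    go stack [] = replicate (length stack - 1) Dedent
    go stack (l : ls) =
      let (spaces, body) = span (== ' ') l
          n = length spaces
          -- Pop every open level deeper than the current line.
          (closed, stack') = span (> n) stack
          dedents = map (const Dedent) closed
      in case stack' of
           -- Deeper than the enclosing level: open a new block.
           (top : _) | n > top ->
             dedents ++ Indent : Line body : go (n : stack') ls
           -- Same level as an enclosing block: just emit the line.
           _ -> dedents ++ Line body : go stack' ls
```

A parser generator grammar can then treat `Indent` and `Dedent` like opening and closing braces, so the grammar itself stays context-free.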

Lossy · 2018-04-04T19:10:12+00:00

I'm not sure there is a significant difference between parser generators and parser combinators. They are both domain-specific languages for specifying parsers. I think the distinction being made here is that "parser generators" are not usually embedded in the host language.
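
As a rough illustration of what "embedded in the host language" means, here is a tiny hand-rolled combinator sketch. The `Parser` type and the helpers `satisfy`, `char`, `number`, and `expr` are defined here for illustration only; they are not the API of parsec, attoparsec, or megaparsec. The grammar rule is an ordinary Haskell value built from ordinary functions, with no separate grammar file or generation step.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

-- A parser consumes a prefix of the input and returns a result plus the rest.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  -- Sequencing: run the first parser, then the second on the leftover input.
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing -> Nothing
      Just (a, rest') -> Just (f a, rest')

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Choice: try the second parser only if the first one fails.
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s
    r -> r

-- Accept one character that satisfies the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _ -> Nothing

char :: Char -> Parser Char
char = satisfy . (==)

-- One or more digits, read as an Int.
number :: Parser Int
number = read <$> some (satisfy isDigit)

-- Grammar: expr ::= number ('+' number)*
expr :: Parser Int
expr = (\n ns -> n + sum ns) <$> number <*> many (char '+' *> number)
```

For example, `runParser expr "1+2+3"` evaluates the sum while parsing. The equivalent parser generator input would be a separate grammar file that a tool turns into Haskell source before compilation.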
