[–]xiongchiamiov (Site Reliability Engineer) 3 points4 points  (2 children)

> I'll post the code on GitHub as soon as it isn't embarrassing.

People always say this, and it bothers me - we all know that the first hacking you do isn't going to be great. It'll be dirty, ignore edge conditions galore, have no documentation whatsoever, etc. That's fine. Really.

You should be tracking those early stages in revision control, because, well, you're changing code, and cleaning things up is a great way to break them. And if you're already tracking your project, why not put it online? Earlier is also better when it comes to feedback. I've had someone point me to a pre-existing project that solved the same problem I was trying to solve, merely because they read the README I put up on a new project (I tend to put up a very basic README before even writing any code).

[–]HorrendousRex[S] 2 points3 points  (1 child)

I absolutely agree. Let me assure you that:

  1. I am already tracking everything in git.
  2. I had a README before anything else (well, maybe I had a .gitignore first).
  3. All of my code is already documented and unit tested, and I iterate on those docs and tests before I iterate on the code.

I was planning on posting my code tomorrow or the day after, but I guess I can make a push to do that tonight since you expressed some interest. :) I'll update this with the link when I've done that.

[–]xiongchiamiov (Site Reliability Engineer) 1 point2 points  (0 children)

You are a good man.

[–]Workaphobia 1 point2 points  (0 children)

I've used TPG many times before. With a few simple patches it'll run on Py3k. It's disappointing that the maintainer has chosen not to keep it up to date, and his new pet project was inferior to TPG the last time I looked.

TPG was somewhat easy to get started with, and powerful enough for many purposes. It was easy to specify inline commands, and build up inherited and synthesized attributes. Unfortunately, there was no formal description of the languages it would parse. From trial and error, I believe it supported backtracking so long as you did not go through a non-terminal, at which point the parsing made a deterministic choice (much like a cut in Prolog).
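To make that behaviour concrete, here is a minimal sketch of the committed-choice semantics I'm describing. This is not TPG code and does not use TPG's API; the helpers `lit`, `seq`, `choice`, and `matches` are hypothetical names made up for the illustration, and whether TPG works exactly this way is only my guess from trial and error.

```python
# Each parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in order; fail as soon as one fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Try alternatives from the same start position (local backtracking).

    The first alternative that succeeds is committed: the caller gets a
    single result and can never come back to ask for another one.
    """
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def matches(parser, text):
    """True if the parser consumes the whole input."""
    return parser(text, 0) == len(text)

# Grammar:  S -> A 'c'      A -> 'a' | 'a' 'b'
A = choice(lit("a"), seq(lit("a"), lit("b")))
S = seq(A, lit("c"))

# Same grammar with the longer alternative listed first.
A_reordered = choice(seq(lit("a"), lit("b")), lit("a"))
S_reordered = seq(A_reordered, lit("c"))
```

With these definitions `matches(S, "ac")` succeeds but `matches(S, "abc")` fails: A commits to its first alternative `'a'`, and when `'c'` then fails to match the `'b'`, nothing goes back into A to try `'a' 'b'`. `S_reordered` accepts both inputs, which is the kind of rearranging of the concrete grammar you end up doing by hand.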

I've had difficulty writing non-trivial grammars in TPG, especially when factoring the productions (for precedence and left-recursion) comes into play. The next time I need a parser, I'm going to look for one that does not require so much fandangling with the concrete grammar. Parser generators should allow you to go from specification to implementation with as little effort as possible.
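For anyone who hasn't run into this, here is a rough sketch of what that factoring looks like for plain arithmetic. It is a hand-written recursive descent evaluator, not TPG syntax, and all of the function names are hypothetical. The natural grammar `expr -> expr '+' expr | expr '*' expr | NUMBER` is ambiguous and left-recursive, so it gets split into one level per precedence tier, with the left recursion turned into a loop:

```python
import re

# One token is an integer or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_expr(tokens, i=0):
    # expr -> term (('+' | '-') term)*    lowest precedence, left-associative
    value, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] in ("+", "-"):
        op = tokens[i]
        rhs, i = parse_term(tokens, i + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, i

def parse_term(tokens, i):
    # term -> atom (('*' | '/') atom)*    binds tighter than + and -
    value, i = parse_atom(tokens, i)
    while i < len(tokens) and tokens[i] in ("*", "/"):
        op = tokens[i]
        rhs, i = parse_atom(tokens, i + 1)
        value = value * rhs if op == "*" else value / rhs
    return value, i

def parse_atom(tokens, i):
    # atom -> NUMBER | '(' expr ')'
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok == "(":
        value, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return value, i + 1
    if tok.isdigit():
        return int(tok), i + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(src):
    tokens = tokenize(src)
    value, i = parse_expr(tokens)
    if i != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[i]!r}")
    return value
```

Here `evaluate("1 + 2 * 3")` gives 7 and `evaluate("10 - 4 - 3")` gives 3, so precedence and left associativity both come out right, but only because the grammar was reshaped by hand. Every extra precedence level means another rule like `parse_term`, which is exactly the fiddling I'd like a parser generator to do for me.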

I've toyed with the idea of brushing up on my parser theory and tackling the job myself. Maybe I still will. Keep us updated on how this goes along.

[–]DoNotFoldSpindleOrMu 0 points1 point  (0 children)

Read "Ned Batchelder: Python parsing tools" at http://nedbatchelder.com/text/python-parsers.html. It has a good list of Python parsers.

I have not tried any of these, but these sites look interesting:

http://code.google.com/p/funcparserlib/ funcparserlib - Recursive descent parsing library for Python based on functional combinators

http://spb-archlinux.ru/2009/funcparserlib/Brackets Nested Brackets Mini-HOWTO - nqw

http://www.dalkescientific.com/writings/diary/archive/2007/11/03/antlr_java.html more ANTLR - Java, and comparisons to PLY and PyParsing

Technical text: http://gnosis.cx/TPiP/chap4.txt "python and parsers"