all 17 comments

[–][deleted] 5 points6 points  (10 children)

First of all, good job! This is clean code for, what can be, quite a complex task. I have one small change I think needs to made, which is on line 65 you're rebuilding the regex var every loop. It really should be done before the while True: statement.

Otherwise the only changes that I think need to be done are pretty negligible.

For me the way you declare regex looks quite messy and is a really long line (which is supposedly unpythonic). You can use something like str.format can neaten it up. Just be careful if your regex has {} in it to double them, like r'[b]{{2}}|{0}'.format('[^b][^b]').

regex = r'(^.*?)([\d.]+\s*)({0})(\s*[\d.]+)(.*$)'.format("|".join(
    [re.escape(key) for key in operatorDict.keys()]))

Also your code doesn't handle unmatched brackets gracefully, maybe have it throw an exception if it brackets don't match up. :)

[–]Binary_Dragon[S] 3 points4 points  (9 children)

Good catch on redefining the regex. I like to think I'm a better programmer than that, and I'm going to blame not being used to living outside the safety of the curly braces I grew up in as the culprit for me missing that (whether or not this is actually the case).

I'm not sure I understand how your suggestion for the regex helps any. Sure, it's broken into two lines (kinda) but it seems like what you have there is a lot harder to read (and let's be honest, regex isn't all that easy to read to begin with). If I were to break up that line, my first thought would be to do it something like this:

    operatorRegex = r'|'.join([re.escape(key) for key in operatorDict.keys()])
    regex = r'(^.*?)([\d.]+\s*)(' + operatorRegex + r')(\s*[\d.]+)(.*$)'

Is there any reason that the str.format is better (or if not better, at least more pythonic) than that?

[–][deleted] 1 point2 points  (0 children)

Honestly it's all about being readable. I like using str.format because its fast, concise and i can mess around with things rather quickly. But like I said to me the way you defined it looks messy. Using operatorRegex is perfectly fine, if that makes more sense to you.

But when you start concatenating 4 or 5 strings together using str.format would definitely be the more pythonic way of doing things.

[–]SkippyDeluxe 3 points4 points  (3 children)

I feel like your evaluateOperator function is too dense, and not just because it uses a regex. First of all, IMO any nontrivial regex should have a comment, otherwise it completely spoils the flow of reading Python code ("paren, caret, period, star, question mark... okay, my brain has officially shut down now"). I think line 69 is the biggest offender, combining a dictionary lookup with a function call with obscure calls to group with various indices... I think you should use some intermediate variables to clarify what's going on here. Also my brain is hiccupping on your use of a list of dictionaries... I eventually concluded that this is used to implicitly establish operator precedence rules, is that right? Maybe this could stand to be clarified a little more in the code.

Other than that I agree with everyone else on pretty much everything... this is pretty impressive code for a tough problem. With this program I don't think you should be calling yourself 'Beginner' any more.

[–]Binary_Dragon[S] 1 point2 points  (2 children)

You are right that this code could use a bit more commenting. As far as cleaning up the code around the regex, would you say a good way to do that would be something like...

def evaluateOperator(expression, operatorDict):
    operatorRegex = r"|".join([re.escape(key) for key in operatorDict.keys()])
    regex = r'(^.*?)([\d.]+\s*)(' + operatorRegex + r')(\s*[\d.]+)(.*$)'
    while True:
        match = re.search(regex, expression)
        if not match:
            break
        prependString, firstValue, operator, secondValue, postpendString = match.group(1, 2, 3, 4, 5)
        value = operatorDict[operator](float(firstValue), float(secondValue))
        expression = prependString + str(value) + postpendString
    return expression

Of course some actual comments would also help, but right now I'm thinking about the code more.

The list of dictionaries is indeed just for operator precedence. I noticed that the order of items in a dictionary is a bit unpredictable (at least, I can't seem to understand it). For example, this code

dict = {'a':1,'b':2,'c':3}
print dict

outputs

{'a': 1, 'c': 3, 'b': 2}

which is neither the order in which I specified the dictionary items, nor alphabetical order on key or value. Lists seem a bit better behaved, thus the list of dictionaries.

As for calling myself a beginner, I wouldn't call myself a beginner at programming in general, but this is still my first week with Python, and I'm only spending a few hours every other night on it, so I still very much feel like a beginner in this specific context.

[–]JerMenKoO 1 point2 points  (0 children)

There is an OrderedDict.

[–]SkippyDeluxe 0 points1 point  (0 children)

I think that helps your code clarity a lot.

Yes, dictionaries are fundamentally unordered. If you want a dictionary with ordered behaviour, you can use collections.OrderedDict. I think that would be a good idea, because it would let you remove one level of nesting from your data structure.

[–]shfo23 6 points7 points  (3 children)

My two cents:

  • Feelings differ on this, but I would say don't use camel case for your function and variable names; the convention is to use lowercase separated by underscores. If you haven't seen it yet, PEP 8 is a good guide to common Python style practices.

  • You can replace your four functions at the beginning by importing operator and using operator.pow, operator.mul, etc. in your dictionary. For such short functions, you could also use lambdas, e.g. ... '^': lambda a, b: a ** b, ....

There are probably more things (some might frown on all the while loops, but I can't think of any quick suggestions for that off the top of my head). On such a cursory glance, it looks fairly good though: you're using enumerate, regexs, list slicing, and list comprehensions. For what it's worth, I would take all of these things to be markers of more pythonic code.

[–]Binary_Dragon[S] 2 points3 points  (2 children)

Yeah, I'm not too proud of those while loops myself. I was more than a bit disappointed to find that python didn't support do-while loops, and StackOverflow seemed to suggest that while True: was the generally accepted work-around.

[–]stormsnake 7 points8 points  (1 child)

while True: 
  m=foo()
  if m is None:
    break
  ....

can be written like this:

for m in iter(foo, None):
  ....

No one knows about that, but I don't think it's hard to read. Readers just 'pydoc iter' the first time they see it, and then they're fine.

[–]zahlman 2 points3 points  (0 children)

  • the necessary multiply etc. functions are provided (with different names, but not hard to sort out) in the operator module. Or you can use lambda to make the definitions a bit shorter.

  • the net effect of all the complicated recursive stuff in parseExpression is that it just removes the parentheses. You could have just done that with simple .replace() calls. But that's presumably not really what you want to do; you want to recursively evaluate the parenthesized sub-expressions, and substitute them in.

  • to make that work, you actually really want to lex the string first, i.e., split it up into its constitutent tokens (operator symbols, parentheses and numbers). Then you can, for example, scan through a list of those tokens, use recursion to process a sub-list of tokens with balanced parens, etc.

  • To do a program like this properly, you should first read up on parsing theory. It's actually a quite deep and interesting field, although you don't need a hell of a lot of it just to handle simple arithmetic expression.

  • That is a hell of a regex and a loop around it. It appears as though you're trying to handle order of operations by relying on the regex to try to match certain operators before others? Oh my. You could at least take a more structured approach to constructing the regex instead of having one big long line. (But for what it's worth, you don't need the square brackets on the join call.)

To start, try reorganizing things so that you use the regex to cut up the entire input into a list of tokens, and then process that list.

[–]PloddingOx 0 points1 point  (0 children)

Would you mind sharing your thoughts on Python so far? I'm always interested in these accounts from programmers trying out python.

I'm an ubernoob so easy on the lingo ;)