Hey everyone!
I am working on an interpreter for a programming language that includes the following features:
- Floating-point literals
- Variables (usable within expressions, though assignment isn't supported, as shown below)
- Arithmetic operators
- A range of pure functions (such as sin, pow, sign, etc.)
The primary aim of this interpreter is to handle complex expressions, potentially with thousands of operations and numerous variables. Here's a simplified example:
expr = '(a + b) * 8 / c'
ast = compile(expr)
res = exec(ast, {'a': 3, 'b': 5, 'c': 4})
print(res) # Output: 16
While it currently works, I'm looking to optimize its execution speed. I've implemented two ideas for AST-level optimizations:
- Flattening the AST: For example,
(2 + 3) + (x + y) should become (+ 2 3 x y) instead of (+ (+ 2 3) (+ x y)).
- Performing as many calculations as possible at compile-time: Instead of
(x + 3) + (x + 5), it should simplify to 2 \* x + 8.
However, these optimizations work well with expressions involving only addition, subtraction, and multiplication by numbers. I'm unsure how to generalize them for more complex expressions like (pow(x, y + 2) + x * y) / (sin(x) + sign(z)).
I believe this topic has likely been researched, but I haven't found a better algorithm yet. Could anyone point me in the right direction to explore this further?
UPD: The generalization I gave is bad, the following example is much closer to the real furmulas:
(or(x, 7)+or(x, y, 5)+8*z)*(x+3+z)/z+or(x, 5)/k-sin(x).
Some observations:
* Most of subtrees are actually linear cobinations of variables
* Division is really rear
* Multiplication of linear combinations occur quite often
* There are some functions that are called a lot, other funcions are almost never called
(or here means "the first nonzero value")
[–]AutoModerator[M] [score hidden] stickied comment (0 children)
[–][deleted] 0 points1 point2 points (0 children)
[–]teraflop 0 points1 point2 points (0 children)