Optimizing expression evaluation

Ivaylo Fiziev
Mar 24
3 min read

Expressions can be found everywhere in your code. They are used to express conditions, formulas, constants etc. Expression evaluation is one of the most important things when it comes to implementing a programming language.

Effective expression evaluation improves performance while it guarantees the correctness and the consistency of the results.

Now how do you achieve that?

It depends on the type of the interpreter you are building. With an AST interpreter, like the one we have for SCL, there are two ways to optimize expressions.

On the level of the interpreter - The idea is to use optimize data access and avoid doing redundant work. Here is a list of some typical optimizations:
Variable slots - Access variables using an index instead of using names. This saves the hash table lookup overhead.
Store function pointers in the AST node - Access functions using a pointer stored in the AST node that denotes a function call. This also saves hash table lookups.
Store argument / field metadata in the AST node - Access metadata using a pointer stored on the AST node that denotes it.
Store constant values in the AST node - This is related to literal values and type ids. Once resolved, constant values can be stored in the AST node that denotes them. This saves parsing time.
Store type information in the AST node - Once resolved, type information can be stored in the AST node where needed. This saves hash table lookups and API calls.
Constant folding - Expressions that do not change their value are quite common. Evaluate them once and reuse the result.

#a := 10 + 1;  	// constant expressions
#b := 3.14 * 2;	//
#c := 16#FF;		//
#d := 'foo';		//

Short circuit evaluation for logical AND/OR expressions - avoid evaluating one of the operands if the other one guarantees the result.

true or false		// evaluate the first operand only
false and true	//

All these are implemented in official v2606 release.

On the code level - The idea is to structure your code in a way that makes it optimal for interpretation.
1. Cache repeated calculations - use a temp variable to store the value of repeated expressions:

IF (#A * #B > 100) AND (#A * #B < 200) THEN

becomes:

#temp := #A * #B;
IF (#temp > 100) AND (#temp < 200) THEN

Simplify expressions - complex expressions result in a deeper abstract syntax tree. Keep them simple.

IF (#x <> 0) AND (#y / #x > 2) THEN

becomes:

IF #x <> 0 THEN
    IF #y / #x > 2 THEN

Avoid expression evaluation in loops - this is probably the biggest performance killer. Especially with nested loops and array indexing.

FOR #i := 1 TO 100 DO
    #result[#i] := #input[#i] * (#A * #B + #C / #D);
END_FOR;

becomes:

#temp := #A * #B + #C / #D;
FOR #i := 1 TO 100 DO
    #result[#i] := #input[#i] * #temp;
END_FOR;

FOR #i := 1 TO 100 DO
    IF #enable THEN
        #result[#i] := #data[#i] * #factor;
    END_IF;
END_FOR;

becomes:

IF #enable THEN
    FOR #i := 1 TO 100 DO
        #result[#i] := #data[#i] * #factor;
    END_FOR;
END_IF;

As a rule of thumb: The more complex the expression, the more time it takes to evaluate. The more you evaluate expressions, the slower the script will be. I believe this is not a surprise to anyone.

AST interpreters are limited by the size of the tree. Byte code interpreters offer much better performance since they work on a linear list of instructions. A byte code interpreter however takes much more time and effort to develop. You need a compiler and a virtual machine that runs the compiled code. Java, Python and .Net Framework are examples of such interpreters.

Optimizing expression evaluation

Recent Posts

Comments

Subscribe Form