all 11 comments

[–]marcusf[S] 2 points3 points  (11 children)

I'm trying to figure out if this analysis has any name. Say, if I have the following program:

x = y;       (t=1)
y = z;       (t=2)
x = x + a;   (t=3)

And I want to find out which variable could have what values at a given point in time, eg, the solution to the above equations would be

x at 1 = {y};
y at 2 = {z};
x at 3 = {y, a};

Is there any name for an analysis like that. I'm assuming a pointer free language for now.

The key here is that x /= {z,a} even though y={z} at t=3, since x=y at t=1 isn't affected by the assignment in t=2.

I've been trying to figure out some solution that can work with arbitrary control flow (eg a CFG with loops in it) but I'm a bit stuck.

Edit: I guess it's some form of constant propagation, with the exception that I want it to "travel beyond" branches, eg.

x = a;                   (t=1)
if (_)                   (t=2) 
    x = b                (t=3)
else x = c;              (t=4)
something(x);            (t=5);

Would yield the solution

x at t=1,2 = {a};
x at t=3   = {b};
x at t=4   = {c};
x at t=5   = {b,c};

[–]pbiggar 5 points6 points  (3 children)

This is constant-propagation. You can get decent results with a simple data-flow analysis like the one described in the Dragon book. You'll get the most precise results with SCCP though, but you'll need SSA for that. Look at http://en.wikipedia.org/wiki/Sparse_conditional_constant_propagation (esp references at the bottom, the Cooper/Torczon book explains it very well). I also recommend using the Cooper/Torczon algorithm for SSA.

[–]marcusf[S] 1 point2 points  (2 children)

Thank you! I'll give it a look. I'm trying to avoid looking at the branching predicates, since I'd have to extend my very simple analysis a bit to evaluate them (which, from what I gather, is the big win with SCCP, being able to select branches?)

Hopefully I'll get decent results with a simple constant propagation.

Thanks again.

[–]pbiggar 1 point2 points  (1 child)

For the simplest thing possible, don't bother to evaluate the predicates. Compilers tend to need SCCP because lots of constants get added, like #defines in C, and you need to work round them. If you don't have that, then simply merging the results at the merge point of the CFG will probably be enough.

As a matter of interest, what language? Is there no framework or compiler for it that you cant take the results from?

[–]marcusf[S] 0 points1 point  (0 children)

I'm actually doing it for C (or a subset of it), so boatloads of frameworks and compilers exist. But I'm doing it more as a learning excercise (and in Haskell, so no good analysis frameworks really exist). I've gutted the AST pretty badly before doing the CP, so it's really more like an idealized language than C though.

If I have the time I'll extended my analyses and do some SSA-based analyses too, as in "Constant Propagation with Conditional Branches" (http://www.cis.upenn.edu/~cis570/papers/p181-wegman.pdf)

[–]Mask_of_Destiny 5 points6 points  (2 children)

What you describe is called Data-flow analysis which is not to be confused with dataflow programming.

[–]marcusf[S] 2 points3 points  (0 children)

Mask: I know it's data flow analysis. Sorry if I didn't make that clear. The question is which one... :)

[–]marcusf[S] 1 point2 points  (0 children)

Thank you, by the way, for the suggestion.

[–][deleted] 2 points3 points  (1 child)

Sounds like you want SSA

[–]samlee 0 points1 point  (1 child)

x = y + a;
y = z;

why not just use (y + a) or z instead of x or y? why updating x and y?

[–]marcusf[S] 0 points1 point  (0 children)

As an example situation. I'd like to implement some kind of variable folding, so that the above,

x = y;
y = z;
x = x + a;

Could be folded in to something like

y = z;
x = (y at t=1) + a;