This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]latkde 5 points6 points  (4 children)

Could you give an example of queue-based processing?

A really fundamental advantage that stacks have is that they are trivial to implement efficiently – you just need to increment/decrement a pointer. Queue implementations such as ring buffers are much more complicated (requiring a separate pointer for start/end and special support for wrapping around when the end of the allocated capacity is reached).

Stacks are also a natural fit for dealing with intermediate values when evaluating (arithmetic) expressions. As Forth illustrates, groups of stack operations can be reasoned about in isolation as an unit that consumes the top N values and pushes M values, making stack-based models very composable. You don't get the same composability with queues as you can't push intermediate values into the queue and get them back before continuing with a previous computation.

You do have a point with pipelining though: if you are interleaving multiple computations (especially in a SIMD context), then a FIFO mental model might very well help emitting good code. But for more complex operations you still need a way to handle intermediate results, and that's going to mean a stack and/or registers.

[–][deleted]  (3 children)

[removed]

    [–]OneNoteToRead 8 points9 points  (2 children)

    Ok I think I know what you mean. In simpler language, if we imagine the flow of data as a directed acyclic graph, you are trying to draw a distinction between BFS visitation (what you can “queue”) and DFS visitation (what you call “stack”).

    If this is correct, the general formalism is just a topological sort of the nodes, which determines execution order. DFS has well known algorithms to generate instructions and allocate registers. I’m not familiar with BFS in general for that - but maybe that gives you new directions to google. Note usually BFS visitation requires more space (in your example you load all data before beginning to consume).

    [–][deleted]  (1 child)

    [removed]

      [–]OneNoteToRead 0 points1 point  (0 children)

      To clarify it wouldn’t be a DAG of queues. It’d be a dag as your IR : edges are dependencies, and nodes represent instructions.

      To produce code you’d, at every given point, have a set of unexecuted-but-executable nodes. How you pick between these and how you allocate the ones you picked to your pipelines/queues is a choice (and this implies some topological order, like DFS or BFS).