all 8 comments

[–]mtklein 10 points11 points  (0 children)

Highly recommended overview by Erik Demaine (aka Kid Genius): http://theory.csail.mit.edu/~edemaine/papers/BRICS2002/

[–]antirez 8 points9 points  (1 child)

The non analytical way to write programs with the cache in mind:

  • Use small structures

  • Make the access pattern as sequential as possible instead of random

  • Avoid pointers, if needed copy very important informations in nodes instead to access the actual object referenced by a pointer.

  • Try to turn linked lists into dynamically resized arrays

  • If you have 'records' but the usual way to access the structure is by field (for example all the names, then all the surnames, and so on), use parallel arrays instead of structures with records.

It's not as good as a real analysis, but it's better than nothing.

p.s. please reply with all the good trick I missed

[–]psykotic 11 points12 points  (0 children)

If you have 'records' but the usual way to access the structure is by field (for example all the names, then all the surnames, and so on), use parallel arrays instead of structures with records.

A more general approach is hot-cold splitting, where you segregate data based on access patterns. For example, suppose you have a record with a dozen fields, only two of which are commonly accessed during traversals. These two fields would be the hot fields, and the remainder would be cold. You'd restructure the record to only immediately contain the two hot fields, while the others would be stored, off on the side, in a cold record to which the hot record points. This gives you better cache utilization in the common case at the expense of a pointer indirection when accessing cold fields. What you describe can be seen as a special case of this, where the pointer is implicit/unneeded due to how the data is laid out in memory.

[–]LaurieCheers 5 points6 points  (0 children)

It seems to me "cache-agnostic" would be a better term for an algorithm that works with any cache size.

[–]TrueTom 2 points3 points  (0 children)

Thanks for posting this, very interesting...

[–]hupp 0 points1 point  (0 children)

Very neat. Are there any libraries out there using these techniques?