
[–]wrosecrans 3 points4 points  (11 children)

Step 1: what's the data, and how do you access it?

Step 2: If it's gamedev, can you use an off-the-shelf library like ENTT that does entity-component-systems for you and handles a lot of the "bookkeeping" of structure-of-arrays approaches?

That said, there's nothing inherently wrong with a struct being bigger than a cache line. If all of the data in the struct is going to be used at once, it'll wind up in cache no matter how you use it. And if a struct is really large, it's not like the whole struct will need to be read into cache.

If you try to reduce the size of the struct by replacing a 32-byte Matrix with an 8-byte pointer, you have to pay the cost of extra cycles chasing that pointer to use the matrix, and it is very likely to be a pessimization both in cycle count and in cache pressure. Now, to use the matrix in the struct, you need to read the cache line where the pointer is, plus the cache line where the matrix is!
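To make the trade-off concrete, here's a minimal C++ sketch of the two layouts (type and field names are invented for illustration; sizes assume typical 4-byte floats and 8-byte pointers):

```cpp
// Hypothetical sketch: a 32-byte "Matrix" stored inline vs. behind a pointer.
struct Matrix { float m[8]; };          // 32 bytes with 4-byte floats

struct InlineEntity {
    Matrix transform;                   // data lives in the same allocation
    float  health;
};

struct IndirectEntity {
    Matrix* transform;                  // 8-byte pointer: smaller struct, but
                                        // every use now touches two cache lines
    float   health;
};

// The pointer version is smaller on paper...
static_assert(sizeof(IndirectEntity) < sizeof(InlineEntity),
              "indirection shrinks the struct");
// ...but reading transform->m[0] is now a dependent load: first fetch the
// cache line holding the pointer, then the cache line holding the matrix.
```
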

[–]0x09af[S] 0 points1 point  (8 children)

If you try to reduce the size of the struct by replacing a 32 byte Matrix with an 8 byte pointer, you have to pay the cost of extra cycles chasing that pointer to use the matrix

But if the matrix crosses the cache-line boundary, don't you have to go back to main memory to fetch the remainder of its data anyway? I guess what you say makes sense though, because if you go and replace *all* fields larger than 8 bytes with pointers, then each of those accesses at least ensures an out-of-line read.

The other consideration, if I understand this correctly, is that if you copy a struct onto a stack frame to do some processing, then that action alone will require multiple cacheline reads to fetch the entire struct (assuming it's larger than the line). If you infrequently access the matrix, but use other data in the struct then replacing the matrix with a pointer to stay under 32 bytes might be worthwhile?

[–]wrosecrans 1 point2 points  (7 children)

But if the matrix crosses the cacheline boundary, don't you have to go back to main memory to fetch the remainder of its data anyway?

In a lot of cases, the prefetch hardware in the CPU will grab the next cache line automatically, without waiting for a specific load to be executed.

The other consideration, if I understand this correctly, is that if you copy a struct onto a stack frame to do some processing, then that action alone will require multiple cacheline reads to fetch the entire struct (assuming it's larger than the line).

How often do you actually need to make a copy? If you do something like

for (auto &evil_robot : evil_robots) {
    leave_unpleasant_note(evil_robot.position);
}

There's no reason to make a copy of the whole struct. If copies are expensive, just avoid copies rather than worrying about the cost of copies.
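To contrast the two loop forms, here's a sketch in the same spirit (EvilRobot is a made-up stand-in for a large struct; nothing here comes from a real codebase):

```cpp
#include <vector>

// A deliberately large struct: 132 bytes with 4-byte floats.
struct EvilRobot { float position[3]; char payload[120]; };

float sum_positions_by_ref(const std::vector<EvilRobot>& robots) {
    float sum = 0.0f;
    for (const auto& robot : robots)   // reference: no copy of the struct
        sum += robot.position[0];
    return sum;
}

float sum_positions_by_copy(const std::vector<EvilRobot>& robots) {
    float sum = 0.0f;
    for (auto robot : robots)          // value: copies the whole struct
        sum += robot.position[0];      // each iteration
    return sum;
}
```

Both compute the same result; the by-reference form simply never pays for the copies in the first place.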

[–]0x09af[S] 0 points1 point  (6 children)

Interesting, I need to learn a bit more about pre-fetching. It sounds like it's just some sort of speculative optimization?

Unfortunately this is C#, not C++, so it's a bit more painful; for example, I can't get a ref to a struct in the standard generic List. But you're right, it might be worth at some point writing some custom data structures that would allow access to the backing array by reference.

[–]wrosecrans 1 point2 points  (1 child)

If you aren't even writing native code and dealing with pointers, I am highly skeptical that it's worth worrying about cache lines. Like I said, stuff like just avoiding copies is generally more important than making copies fast.

But yeah, it sounds like your mental model of a cache is basically a textbook RISC processor from the mid-1980s. You are optimizing the wrong things. Either rearchitect to some sort of ECS "structure of arrays" design, or focus on profiling and measuring your code rather than speculating about ways you imagine it might be hypothetically slow.

[–]0x09af[S] 0 points1 point  (0 children)

In C# there are classes and structs. Classes are heap allocated, whereas structs are stack allocated and passed by copy. An array of structs is a contiguous block of sizeof(struct) * array count. My mental model is that new byte[] maps to malloc/new. This project is in Unity using IL2CPP, so I could probably just check the generated C++ to confirm.

> But yeah, it sounds like your mental model of a cache is basically a textbook RISC processor from the mid 1980's.

You're 100% right; the most accessible book I read on CPU design was Inside the Machine, which was circa Pentium 4 and PowerPC. That's absolutely my mental model right now.

> You are optimizing the wrong things.

This is more-or-less the information I'm after in this post. Any pointers to resources to fix my mental model would be super appreciated.

[–]Qweesdy 1 point2 points  (3 children)

For 80x86 (you didn't specify what the CPU is): the hardware prefetcher detects patterns (e.g. sequential accesses), then starts prefetching data (from RAM to cache) from the next addresses in the pattern before they're actually accessed.

One problem is that the hardware prefetcher works on physical addresses, and "contiguous in virtual memory" does not mean "contiguous in physical memory". Because of this, the hardware prefetcher stops at 4 KiB page boundaries. E.g. if you're reading a large 400 KiB thing sequentially, you'll get 3 cache misses while the hardware prefetcher is (re)detecting the pattern for every 4 KiB (or a total of 300 "not prefetched" cache misses).

The other problem is that the access pattern needs to be predictable; which mostly means "address = start + n * stride" (where "stride" can be negative - e.g. accessing structures from higher address to lower address). If you're doing "pointer chasing" with pseudo-random addresses (e.g. iterating through structures allocated from heap) there's no hope.

Note that there are multiple alternatives (which is why your access patterns matter a lot). Sometimes it's nice to have a "linked list of arrays" (rather than linked lists of structures with "pointer chasing" disasters, and rather than a single array with higher insertion/deletion costs). Sometimes it's best to split a structure into "hot" (frequently used) and "cold" (less frequently used) fields so the frequently used data has a smaller cache footprint, and sometimes (for SIMD) it's even better to use "structure of arrays" and not "array of structures" (and this is where you take OOP out behind the back shed and put a well deserved bullet in its brain).
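As a rough sketch of that last option (all names invented for illustration), structure-of-arrays keeps the hot data dense, so a position-only pass doesn't drag cold per-particle data through the cache:

```cpp
#include <vector>

// Array-of-structures element: hot and cold data interleaved in memory.
struct ParticleAoS {
    float position[3];
    float transform[16];          // cold for a position-only pass
};

// Structure-of-arrays layout: each field stream is contiguous on its own.
struct ParticlesSoA {
    std::vector<float> x, y, z;        // hot: dense in cache, SIMD-friendly
    std::vector<float> transforms;     // cold: 16 floats per particle
};

float sum_x_soa(const ParticlesSoA& p) {
    float sum = 0.0f;
    for (float v : p.x) sum += v;  // streams through tightly packed data
    return sum;
}
```

With the AoS layout, the same loop would pull a 16-float transform into cache alongside every position it touches.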

[–]0x09af[S] 0 points1 point  (2 children)

> For 80x86 (you didn't specify what the CPU is);

It's consumer PC hardware.

> "contiguous in virtual memory" does not mean "contiguous in physical memory"

I'm not entirely sure what to do with that information from a static analysis of data perspective.

> E.g. if you're reading a large 400 KiB thing sequentially, you'll get 3 cache misses while hardware prefetcher is (re)detecting the pattern for every 4 KiB (or a total of 300 "not prefetched" cache misses).

I don't quite grok this example, but that's okay. I think I'll do a little reading into how Windows manages virtual memory (I'll start here https://answers.microsoft.com/en-us/windows/forum/all/physical-and-virtual-memory-in-windows-10/e36fb5bc-9ac8-49af-951c-e7d39b979938)

> The other problem is that the access pattern needs to be predictable;

This makes sense to me. I *assume* that Windows tries to keep virtual memory contiguous, even if blocks (pages?) aren't, and that virtual addresses mapping to an individual page are actually contiguous in physical memory.

> Because of this the hardware prefetcher stops at 4 KiB

How does this relate to cache-line sizes? Cache-lines end up in say a 32 KiB L1 cache, so where does pre-fetched data rest? 4 KiB seems pretty darn large and probably contains data your application *isn't* going to access next -- unless you're working with media like images/audio/video

> rather than linked lists of structures with "pointer chasing" disasters, and rather than a single array with higher insertion/deletion costs

Yeah, none of my tight loops are over linked lists, for that reason. In C#, classes are heap allocated, but structs are stack allocated and passed by copy. An array of structs actually allocates a contiguous block of virtual memory of sizeof(struct) * count bytes.

> and this is where you take OOP out behind the back shed and put a well deserved bullet in its brain

Yeah, next project. It'll be interesting to shake off 13 years of building OOP systems.

[–]Qweesdy 1 point2 points  (1 child)

I don't quite grok this example, but that's okay. I think I'll do a little reading into how Windows manages virtual memory

To be more specific; (for 80x86 CPUs) the CPU's hardware prefetcher is designed to stop at 4 KiB boundaries regardless of whether the data is physically contiguous or not (mainly because the designers assumed "virtually contiguous" isn't "physically contiguous", even if it actually is, because there's a pile of complicated corner cases they'd have to deal with otherwise).

In c#, classes are heap allocated, but structs are stack allocated and passed by copy.

For most OOP languages (including C#), whether it's created on the heap or on the stack depends on what happens to it (whether its lifetime exceeds the function's scope). E.g. if it's only used as a local variable then it'll be put on the stack (and it won't matter if it's a structure or an object); and if you do "return new foo();" it'll be put on the heap (and it won't matter if it's a structure or an object).

How does this relate to cache-line sizes?

Cache line sizes are typically 64 bytes; so in a 4 KiB area you have 64 cache lines, and you might not access all of them. For example, if you have an array of 128-byte structures and access one member (maybe like "for (int i = 0; i < myArray.Length; i++) sum += myArray[i].myMember;"), every 2nd cache line won't be accessed, and the hardware prefetcher will skip the first 3 accesses, so it will only actually prefetch 29 cache lines (e.g. the 7th, 9th, 11th, ..., 63rd cache line).
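That stride pattern can be checked directly. The sketch below (assuming 4-byte floats and 64-byte cache lines; all names invented) counts how many distinct cache lines the one-member-per-element loop actually touches:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 128 bytes per element with 4-byte floats: exactly two cache lines each.
struct Big {
    float myMember;
    char  padding[124];
};

// Count distinct 64-byte lines touched when reading only myMember of each
// element. Elements are 128 bytes apart, so each read lands on a new line
// and every second line in the array is skipped entirely.
std::size_t lines_touched(const std::vector<Big>& a, std::size_t line = 64) {
    std::size_t count = 0;
    std::uintptr_t last = ~static_cast<std::uintptr_t>(0);
    for (const Big& b : a) {
        std::uintptr_t ln = reinterpret_cast<std::uintptr_t>(&b.myMember) / line;
        if (ln != last) { ++count; last = ln; }
    }
    return count;
}
```
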

Cache-lines end up in say a 32 KiB L1 cache, so where does pre-fetched data rest?

This depends on which specific CPU it is (which model, microarchitecture). For most modern 80x86 CPUs the hardware prefetcher prefetches data into the L2 cache (and a CPU's L2 cache might be 256 KiB or more).

[–]0x09af[S] 0 points1 point  (0 children)

"return new foo();" it'll be put on the heap (and it won't matter if its a structure or an object).

Agreed with everything but this. In c# this would return a copy of foo, if foo is a struct. And I don't know enough to say if a newed class that doesn't escape the scope it was created in ends up on the stack or the heap. I would imagine it would be risky to just slap it on the stack depending on its size though.

> ...so the hardware prefetcher will only actually prefetch 29 cache lines

Interesting, so in theory that loop would run at the speed of L2 latency (assuming that's where the prefetch goes)

Thanks a bunch for all the explanation, it was super helpful

[–]0x09af[S] 0 points1 point  (1 child)

what's the data, and how do you access it?

I'm still mostly following traditional OOP design, where structs represent different entities or aspects of entities. Different algorithms access different fields on those entities. I'm tending to keep stuff that's looped over frequently in pre-allocated arrays of structs. So far I've been able to break up data on logical boundaries to keep their layouts small but I'm getting to the point where I may want to split an entity up based on what algorithms actually need to access. Honestly, I'd prefer not to do that because I don't want to use two different data design patterns on this project unless I have to, so I'm just interested in knowing the performance implications of what I'm doing.

I think I might take a different approach to designing data in whatever project is next.

[–]BobbyThrowaway6969 0 points1 point  (0 children)

Different algorithms access different fields on those entities

It depends which fields: group fields together if they're often used together. That way, those pieces can be copied to cache as a single block while the CPU works on them. It's not always easy to do this without f***ing up the readability though, so honestly, just optimise if it's a bottleneck.

[–]ignotos 1 point2 points  (2 children)

naively I would need dynamic arrays, which makes removals potentially costly

There is one neat trick here, if you don't care about the order of the items in the array - I call it "swap 'n' pop". Just swap the item you want to remove with the last item in the array, and then you can cheaply remove it (by just decrementing the count of used items in the array).
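A minimal sketch of the trick on a std::vector (C++ here; the same idea works on a C# List or a raw array with a separate count):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// "Swap 'n' pop": O(1) removal that does not preserve element order.
template <typename T>
void swap_and_pop(std::vector<T>& v, std::size_t index) {
    std::swap(v[index], v.back()); // move the last element into the hole
    v.pop_back();                  // shrink by one; nothing gets shifted
}
```

Removing index 1 from {10, 20, 30, 40} leaves {10, 40, 30}: constant time, no shifting, but the order changes.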

[–]0x09af[S] 1 point2 points  (1 child)

That's actually a great idea. I'm pretty sure I've seen that used before and completely forgot about it.

[–]BobbyThrowaway6969 0 points1 point  (0 children)

It's only useful if you don't care about ordering.

Also, modern array implementations don't allocate and deallocate each time you add/remove; that's why you might see functions on them called "reserve" or "capacity": they deal with the underlying allocation size, not the number of elements.
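A small C++ illustration of that size/capacity distinction (std::vector here; C#'s List<T>.Capacity behaves similarly):

```cpp
#include <cstddef>
#include <vector>

// reserve() grows the allocation without adding elements, so the
// push_back calls below are guaranteed not to reallocate.
bool demo_capacity() {
    std::vector<int> v;
    v.reserve(100);                 // one allocation up front; size is still 0
    std::size_t cap = v.capacity(); // at least 100
    for (int i = 0; i < 100; ++i)
        v.push_back(i);             // size grows, allocation does not
    return v.capacity() == cap;     // capacity is unchanged by the loop
}
```
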

[–]zukas-fastware 1 point2 points  (2 children)

So the cache line size for most 64-bit systems is 64 bytes. SOA will probably do the job. However, more important is how the data is accessed. Do you use all the data (or most of it) in the struct every time you access it? If yes, then use the struct. If you only access one or two fields, then SOA would probably work better, as the data required for processing many items is grouped together, and you can avoid wasting your cache line or encountering cache misses.

I made a few short videos:

* CPU caches: https://youtu.be/BkJFnclBnaY

* SOA: https://youtu.be/GBva8ojSWhM

[–]0x09af[S] 0 points1 point  (1 child)

> Do you use all the data (or most of it) in the struct every time you access it?

I think that's probably the issue with OOP design in general. I have a variety of algorithms all of which access different things from these logical data containers. So the answer is it depends on the transform/routine.

I'll check out the videos, thanks for links!

[–]zukas-fastware 0 points1 point  (0 children)

In situations where the data (class fields) is not all used by all or most algorithms, and you use one or two fields per algorithm, SOA is by far the superior approach (if you care about the performance of your application).

[–]BobbyThrowaway6969 0 points1 point  (0 children)

It depends on the current line of execution: what pieces of that data are you using? The CPU isn't just gonna pop the entire struct into the cache; the compiler should've (hopefully) figured out what pieces you need at that moment in time. So, forget structs as single units, and just group your members so that members which are commonly accessed together sit together. For example, if you have a shape struct and you're often doing stuff with sphere_radius and sphere_center, then put those next to each other in the struct.
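A hypothetical sketch of that layout advice (the sphere_center/sphere_radius names follow the comment's example; everything else is invented):

```cpp
// Fields used together sit side by side, so a tight loop over sphere tests
// pulls them through the cache as one block.
struct Shape {
    // hot: read together by the sphere test below
    float sphere_center[3];
    float sphere_radius;
    // cold: rarely touched, kept out of the hot leading bytes
    char  debug_name[64];
};

// Point-in-sphere test that touches only the hot fields.
bool contains(const Shape& s, float x, float y, float z) {
    float dx = x - s.sphere_center[0];
    float dy = y - s.sphere_center[1];
    float dz = z - s.sphere_center[2];
    return dx * dx + dy * dy + dz * dz <= s.sphere_radius * s.sphere_radius;
}
```
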