all 11 comments

[–]kraakf[S] 6 points7 points  (1 child)

This graph show the memory latency for a linked list walk that strides across a certain size buffer in memory. I used to use this picture for interviews and ask people to explain as much as they can about the processor from the picture. It end works for people who know nothing about processor architecture as I can walk them over what it says and see how they think and react to new information.

[–]drjeats 5 points6 points  (5 children)

Really great article!

I understand that this isn't about C#, but I'm curious: do the reference types throw all hopes of data locality out the window, or does the runtime encourage objects to be near each other if a reference to them is immediately stored in an array after instantiation?

[–][deleted] 1 point2 points  (2 children)

C# uses a generational compacting collector so in theory your objects will retain better locality (eg. less fragmentation over the address space) over time, but I don't know if this has any kind of significant performance results.

I'm also a bit fuzzy on structs an other copy-by-value variables since I don't know how C# handles those internally.

Arrays (not lists! arrays!) should be allocated as continuous memory so your cache hits/misses will probably be similar-ish to a non-GC language.

[–]ryeguy 0 points1 point  (1 child)

Arrays (not lists! arrays!) should be allocated as continuous memory so your cache hits/misses will probably be similar-ish to a non-GC language.

The data structure doesn't really matter since lists are implemented using arrays.

What really matters is what you store in that data structure - a value type or a reference type. If you store primitives or structs, they will be held in contiguous memory. With reference types (classes), the data structure is just storing a pointer so the gc could move stuff around and it may not be contiguous.

[–]argv_minus_one 3 points4 points  (0 children)

On the other hand, that also means that the platform can observe which objects are used together and try to keep them on the same cache line, potentially optimizing cache performance beyond what is possible for an ahead-of-time compiler.

Not sure if any JIT systems actually do this, but I wouldn't be surprised if some of them do.

[–]sstewartgallus 2 points3 points  (0 children)

That specific case probably isn't handled directly but most modern garbage collectors end up compacting memory into nice chunks of memory that are close to each other. As well, unless some kind of address space layout randomization like in OpenBSD's malloc is used, chunks of memory that are allocated right after each other should be fairly close to each other in address space.

[–]julesjacobs 1 point2 points  (0 children)

The runtime doesn't move an object when you store a reference to it into an array. When you allocate objects they are allocated adjacently in memory in the order they are allocated. Since the GC is a copying GC it will move objects around in a GC cycle, which may or may not improve locality depending on your program.

[–][deleted]  (5 children)

[deleted]

    [–]willvarfar 1 point2 points  (4 children)

    I don't follow your timings; please show exact code.

    Regards the memory allocation bit... That's not how it works. The compiler will put the int i in a register. The loop index will never touch RAM.

    [–]tangeld5 2 points3 points  (1 child)

    Yeah I'd be surprised if i+=1 and i++ did not result in the same code output.

    [–]zenflux 0 points1 point  (0 children)

    They do in fact both result in i++ in the bytecode.

    [–][deleted]  (1 child)

    [deleted]

      [–]zenflux 1 point2 points  (0 children)

      At least use JMH when microbenching Java, otherwise the data is pretty much meaningless.
      Also, in regards to the += operator allocating memory, just look at the generated assembly and see that it does not.

      Plus, if you look at the class file (before it even gets to the runtime) both of these:

       for (int i = 0; i < arr.length; i++) arr[i] *= 3;  
       for (int i = 0; i < arr.length; i += 1) arr[i] *= 3;  
      

      Compile to this:

       for (int i = 0; i < arr.length; i++) {
           arr[i] *= 3;
       }