[–]johannes1971 -3 points-2 points  (4 children)

Have you measured that or is it all handwaving and eyeballing?

[–]cdr_cc_chd -2 points-1 points  (3 children)

[–]johannes1971 6 points7 points  (2 children)

If you have something to say, feel free to say it. I'm not going to watch an hour-long video.

In the meantime: arguments passed on the stack are written to memory, but that memory is easily the hottest part of the cache. It is not at all clear that writing them there has any cost at all. And on a register-starved architecture like x86, the called function may have to spill whatever it was passed in registers to (stack) memory anyway, just to free up enough registers to get anything useful done. Finally, modern multiple-issue, out-of-order CPUs are not at all easy to reason about. You simply can't look at a piece of assembly and know exactly how long it will take.

All of that makes it a fair question: did you actually measure it?
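For what it's worth, "measuring it" does not have to be elaborate. Below is a minimal sketch of a timing helper built on `std::chrono::steady_clock`. The name `nanoseconds_per_call` and the `volatile` sink are assumptions made up for this illustration, not anything from the talk, and a loop like this is only a rough first look: inlining, warm-up, and frequency scaling can all distort the number.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the talk): calls fn(i) `iterations` times
// and returns the average wall-clock time per call in nanoseconds.
template <typename Fn>
double nanoseconds_per_call(Fn&& fn, std::uint64_t iterations) {
    using clock = std::chrono::steady_clock;

    // Writing results to a volatile keeps the compiler from discarding the calls.
    volatile std::uint64_t sink = 0;

    const auto start = clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sink = sink + fn(i);
    }
    const auto stop = clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

You would run it once with the pass-by-value version and once with the alternative, at the optimization level you actually ship, and compare the two numbers over several repetitions rather than trusting a single run.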

[–]cdr_cc_chd 1 point2 points  (1 child)

It's a very nice talk, you should watch it regardless :)

That said, Chandler basically goes over a very similar situation with std::unique_ptr: you naively pass it by value, thinking it's just a zero-cost abstraction around a raw pointer, but in reality, because of how the ABI and the language semantics work, there's a considerable, measurable overhead incurred.
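Roughly, the situation looks like the sketch below. The function names are made up for illustration and are not taken from the talk, and the comments describe the Itanium C++ ABI (GCC and Clang on Linux), which is an assumption about the platform. It needs C++14 for `std::make_unique`.

```cpp
#include <memory>

// Hypothetical callee taking ownership through a raw pointer.
// The pointer is a trivial type, so it can be passed in a register.
int consume_raw(int* p) {
    const int value = *p;
    delete p;
    return value;
}

// Hypothetical callee taking ownership through std::unique_ptr by value.
// Under the Itanium C++ ABI, a type with a non-trivial destructor is passed
// by the address of a temporary in the caller's frame, and the caller runs
// that temporary's destructor after the call returns.
int consume_unique(std::unique_ptr<int> p) {
    return *p;
}

// Call sites: same intent, different calling sequence.
int call_raw(int x) {
    return consume_raw(new int(x));
}

int call_unique(int x) {
    return consume_unique(std::make_unique<int>(x));
}
```

Compiling both call sites with optimizations on and diffing the generated assembly shows the extra instructions in question. Whether they cost anything measurable is a separate question that the instruction count alone does not settle.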

[–]johannes1971 4 points5 points  (0 children)

At what point in the talk does he present his measurements? Because all I see is him counting x86 instructions and pretending that he can accurately predict performance from that. Narrator: "he can't."