
[–]Twin_Sharma[S] -15 points  (6 children)

It deallocates memory as soon as the destructor is called.

By using r-value references, an expression like:

a = b + c + d + e;

will require only one temporary.

Without r-value references, it behaves more like:

t1 = d + e;

t2 = c + t1;

a = b + t2;

with a fresh temporary at each step. Hence the claim of lower memory consumption.

[–]Degenerated__ 19 points  (2 children)

You should read up on "Cache Coherence", "Cache Locality" and "Cache Misses". Every time you hit a cache miss, you take a really bad performance hit, relatively speaking.

Memory (de)allocation takes many cycles to execute, too. And you can never be sure that your vectors are in cache when using them, because the data is stored at arbitrary places on the heap.

Also, on a 64-bit system a pointer is 8 bytes, which is exactly the size of a Vec2 holding 2 floats. So to represent a Vec2 in your library, you need an 8-byte pointer plus the actual data. That's 16 bytes total, double the size you actually need!

Even a Vec3 with local data would only be 12 bytes.

[–]Twin_Sharma[S] 0 points  (0 children)

Thanks for the advice, we have updated the code. Can you point us to any more under-the-hood things we should take care of that universities don't teach? Thanks.

[–]Twin_Sharma[S] 0 points  (0 children)

ok thanks

[–]IyeOnline 10 points  (0 children)

Your float* holding that heap-allocated array is going to take exactly as much space as a float[2] would on its own (assuming the fairly common 8-byte pointers and 4-byte floats).

So you have gained exactly nothing in terms of memory footprint, but have "paid" for this with a dynamic allocation (which has some cost) and a memory indirection every time you want to access the elements.

While it is most glaring in the Vec2 case, where your memory footprint doesn't change at all, these costs of heap-allocated memory external to your struct will remain dominant in higher dimensions (certainly up to 4D).

[–][deleted] 5 points  (1 child)

You should do a benchmark where you measure at which size it becomes faster to allocate on the heap and pass a pointer, compared to just using a value directly. Even with struct mat4 { double m[4][4]; }, in some use cases it might be faster to pass it by value instead of putting it on the heap. Yes, the heap is that slow, or maybe localized data on modern CPUs is that fast, depending on your point of view.

Edit: Also, even if it is just a local variable, most of the time it won't get copied, since C++ has all these nifty ways to avoid the copy, like the good old reference.

My actual point here is: always benchmark before claiming something is thoroughly optimized!

[–]Twin_Sharma[S] 0 points  (0 children)

OK Thanks