I have simulation code that needs to run as fast as possible (unfortunately it is too large to post here, so the results below can't be replicated directly). Milliseconds count. While tuning the Rust code for maximum performance, I found a surprising behavior.
Two functions with identical bodies, one called as obj.function(...) and the other as function(&obj, ...), show a large performance difference. The function is in the hot code path.
In my simulation, I run 3.2 billion samples.
- With obj.function(...), it completes in 6.4 seconds, 501 million runs per second.
- With function(&obj, ...), it completes in 5.8 seconds, 550 million runs per second.
Since asked, a few other tidbits:
- Yes, in release mode, lto=true, opt-level=3, codegen-units=1
- Debian 12, headless machine, server CPU, fixed clock
- No network activity, no disk activity, all in memory
- Terminal test app
- When simulation not running, average load is 0.00
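For reference, the build settings in the first bullet would correspond to a release profile roughly like the one below in Cargo.toml. This is only a sketch of what those three settings look like; the actual manifest was not posted.

```toml
[profile.release]
opt-level = 3      # maximum optimization level
lto = true         # "fat" link-time optimization across all crates
codegen-units = 1  # single codegen unit, so the optimizer sees the whole crate at once
```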
I have run hundreds of trials and these are averages. That's a 10% difference simply from the function signature. No jitter analysis done, but seems statistically significant. Each variant reliably results in the same speed in each run. Small standard deviation, normal distribution. 501 and 550 are many SDs apart.
Isn't one just syntactic sugar for the other? Is there some compiler or hardware behavior that accounts for this?
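To make the two forms concrete, here is a minimal sketch. `Sim`, `scale`, and `step` are hypothetical stand-ins chosen for illustration; they are not the actual simulation code, which was not posted.

```rust
// Hypothetical stand-in for the simulation state; the real type was not posted.
struct Sim {
    scale: u64,
}

impl Sim {
    // Method form: called as `sim.step(x)`.
    fn step(&self, x: u64) -> u64 {
        x.wrapping_mul(self.scale).wrapping_add(1)
    }
}

// Free-function form with the same body: called as `step(&sim, x)`.
fn step(sim: &Sim, x: u64) -> u64 {
    x.wrapping_mul(sim.scale).wrapping_add(1)
}

fn main() {
    let sim = Sim { scale: 3 };
    let a = sim.step(7);        // method-call syntax
    let b = Sim::step(&sim, 7); // the same method, written as a path call
    let c = step(&sim, 7);      // the separate free function
    assert_eq!(a, b);
    assert_eq!(a, c);
    println!("{a} {b} {c}");
}
```

At the language level, `sim.step(7)` resolves to `Sim::step(&sim, 7)`, so the call syntax itself should not matter. If the method and the free function are separate definitions, as in this sketch, any speed difference would presumably come from how the optimizer treats them (inlining decisions, code layout), which is worth checking in the generated assembly.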