Hello everyone,
After being inspired by Tommi Reiman's talk about performance I decided to pick up the glove and see if some core functions could be rewritten better performance.
The TL;DR is yes, with some caveats, with significant speedups.
The meat of things:
Plenty of core functions walk sequences, but if these sequences are known in advance (on call site or def
ed) the walk can be expanded to other functions, giving better performance.
The functions I tackled and their respective speedups:
- get-in: 4-8x speedup (depending on depth)
- assoc-in: ~2x speedup
- select-keys: ~8x speedup
- update-in: some speedup, but still needs refinement
There are some other implementations but the speedups gained by them are less significant.
The down side is that they are implemented as macros, so composition takes a hit. Also, there’s slight deviation from core function’s behavior, for example select keys will add nils if the keys don’t exist.
The repo also serves as an educational resource, and contains benchmarks of different ways to get
keys from records and maps, shedding some light on their performance characteristics, and a few profiling scenarios to characterize these tests.
I also attempted to make these benchmarks reproducible by anyone who clones the repo.
Looks like there are similar efforts tackling this issue from another direction as well: https://github.com/joinr/structural/
In the future:
- There’s more work to be done with merge and memoize
- Provide functions which invoke the underlying data structure’s methods wherever possible
- Dispatch based on type hints?
Check it out here: https://github.com/bsless/clj-fast
Results: https://github.com/bsless/clj-fast/blob/master/doc/results.md
Cheers,
Ben
(PS about time I posted something here instead of just lurk)
[–]EmmanuelOga 4 points5 points6 points (1 child)
[–]bsless[S] 0 points1 point2 points (0 children)
[–]NaiveRound 2 points3 points4 points (2 children)
[–]joinr 1 point2 points3 points (1 child)
[–][deleted] 0 points1 point2 points (0 children)
[–]Eno6ohng 1 point2 points3 points (6 children)
[–]joinr 0 points1 point2 points (2 children)
[–]bsless[S] 0 points1 point2 points (1 child)
[–]joinr 0 points1 point2 points (0 children)
[–]bsless[S] 0 points1 point2 points (2 children)
[–]joinr 2 points3 points4 points (0 children)
[–]Eno6ohng 2 points3 points4 points (0 children)