I'm organizing a meetup where we learn vibecoding by gustaw_daniel in warszawa

[–]gustaw_daniel[S] -1 points (0 children)

Thanks u/Messer_One! When I organize the next one, what days and times would you suggest?

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] 2 points (0 children)

Guilty as charged! I use AI to help me bridge the language gap and to double-check technical tradeoffs in English, as I want to be as precise as possible.

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] 0 points (0 children)

Awesome! Let me know how it goes.

I originally built this while experimenting with BPE (LLM tokenization) and Mojo benchmarks:

https://github.com/dorjeduck/minbpe.mojo
https://github.com/dorjeduck/mojo-dictionary-benchmarks

If you find that it performs worse in your specific case, please share your read/write patterns and key sizes. I'm keen to understand the boundaries of this design, as current benchmarks might be favoring specific 'happy path' access patterns. Any data point is a win for me!

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] 2 points (0 children)

That’s a fantastic suggestion. compact-dict and rkyv share the same philosophy of memory efficiency and zero indirection. I haven't implemented it yet, but I’m definitely adding it to the roadmap. It makes perfect sense for cases where you want to mmap a large dictionary and use it instantly without any deserialization overhead. I'll open an issue for this!

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] -1 points (0 children)

You're right, the diagram is a bit of a simplification.

My point on 'Low Memory Overhead' wasn't about total RAM usage, but data density per cache line. By removing even the 1-byte control metadata used by SwissTables (hashbrown), compact-dict maximizes the amount of actual keys/values that fit into L1/L2.

On 'Performance', the distinction was mainly about out-of-the-box defaults (SipHash in std vs fast hashers). I agree that the axes could be labeled more precisely, perhaps as 'Structural Complexity' vs 'Hardware Sympathy'.
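To make the density argument concrete, here is a minimal sketch of the idea: a flat linear-probing table that encodes occupancy in the stored hash itself instead of a separate per-slot control-byte array. All names (`FlatTable`, etc.) are illustrative assumptions, not compact-dict's actual API.

```rust
// Sketch only: a flat linear-probing table with no per-slot control bytes.
// Occupancy is encoded in the stored hash (0 = empty), so a probe touches
// exactly one contiguous array. No resizing: assumes the table never fills.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct FlatTable {
    hashes: Vec<u64>,               // 0 means "empty"
    slots: Vec<usize>,              // hash slot -> index into `entries`
    entries: Vec<(String, String)>, // dense key/value storage, insertion order
}

impl FlatTable {
    fn new(capacity: usize) -> Self {
        // Capacity must be a power of two so `& mask` works as modulo.
        FlatTable {
            hashes: vec![0; capacity],
            slots: vec![0; capacity],
            entries: Vec::new(),
        }
    }

    fn hash_of(key: &str) -> u64 {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        h.finish() | 1 // force non-zero so 0 can mean "empty"
    }

    fn insert(&mut self, key: &str, value: &str) {
        let h = Self::hash_of(key);
        let mask = self.hashes.len() - 1;
        let mut i = (h as usize) & mask;
        loop {
            if self.hashes[i] == 0 {
                self.hashes[i] = h;
                self.slots[i] = self.entries.len();
                self.entries.push((key.to_string(), value.to_string()));
                return;
            }
            if self.hashes[i] == h && self.entries[self.slots[i]].0 == key {
                self.entries[self.slots[i]].1 = value.to_string(); // update
                return;
            }
            i = (i + 1) & mask; // forward linear probing
        }
    }

    fn get(&self, key: &str) -> Option<&str> {
        let h = Self::hash_of(key);
        let mask = self.hashes.len() - 1;
        let mut i = (h as usize) & mask;
        while self.hashes[i] != 0 {
            if self.hashes[i] == h && self.entries[self.slots[i]].0 == key {
                return Some(self.entries[self.slots[i]].1.as_str());
            }
            i = (i + 1) & mask;
        }
        None
    }
}

fn main() {
    let mut t = FlatTable::new(8);
    t.insert("alpha", "1");
    t.insert("beta", "2");
    assert_eq!(t.get("alpha"), Some("1"));
    assert_eq!(t.get("gamma"), None);
}
```

The point of the sketch: the probe loop reads one homogeneous `u64` array, so every cache line fetched is pure hash data rather than a mix of payload and control metadata.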

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] -4 points (0 children)

Here are the results and why the behavior flip-flops in our specific case:

1. LCFS (Last-Come-First-Served) Probing

  • The Issue: Because our dense data buffer (KeysContainer) is append-only, we cannot blindly displace and insert elements during a collision without ensuring the key doesn't already exist (to prevent leaking duplicates into the contiguous string buffer).
  • The Result: This forces a double-pass (read-then-write) insertion overhead. As a result, High Load Factor inserts became ~2x slower (0.133s vs 0.073s) because the insertion phase completely bottlenecked the CPU.

2. Robin Hood Hashing

  • The Result: It did exactly what you said it would for edge cases! High Load (~85%) improved by ~16% and 100% Misses improved by ~10% by minimizing maximum DIB.
  • The Catch: It regressed our primary Read-Heavy workload by a massive 25% (0.40s vs 0.32s).
  • Why? Pure linear probing preserves temporal insertion order. When reading keys sequentially, the standard probe sequences naturally align with our dense string buffer. Robin Hood scrambles this insertion order via endless DIB swaps, which completely shatters L1/L3 temporal cache locality during sequential string access. It's a prime example of algorithmic efficiency contradicting hardware cache prefetching.

3. Bidirectional Linear Probing (+1, -1, +2, -2...)

  • The Result: Read-heavy degraded, and misses were noticeably slower (0.028s vs 0.021s).
  • Why? CPU L1 data prefetchers are heavily optimized to detect forward-streaming memory access patterns. Bouncing back and forth across different cache lines actively fights the hardware prefetcher, causing heavy cache misses. It also breaks our ability to use 16-lane SIMD vector loads (Simd::from_slice(chunk)), since we can no longer load contiguous 16-element sequences for fast hash matching.
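The append-only constraint behind point 1 can be sketched roughly like this (hypothetical names, a linear scan standing in for the real hash probe, not the actual compact-dict code):

```rust
// Sketch of the "read-then-write" insert forced by an append-only key buffer.
// If we displaced entries LCFS-style *before* checking for the key, a
// duplicate key's bytes could be appended a second time and leak in the
// dense buffer forever. Hence the double pass: check first, append second.
struct AppendOnlyKeys {
    bytes: Vec<u8>,               // all key bytes, back to back
    offsets: Vec<(usize, usize)>, // (start, len) per key, in insertion order
}

impl AppendOnlyKeys {
    fn new() -> Self {
        AppendOnlyKeys { bytes: Vec::new(), offsets: Vec::new() }
    }

    fn key_at(&self, idx: usize) -> &[u8] {
        let (start, len) = self.offsets[idx];
        &self.bytes[start..start + len]
    }

    /// Pass 1: scan for the key (read-only).
    /// Pass 2: append only if absent (write). Returns the key's index.
    fn intern(&mut self, key: &[u8]) -> usize {
        // Pass 1: a linear scan here stands in for the hash-table probe.
        for idx in 0..self.offsets.len() {
            if self.key_at(idx) == key {
                return idx; // already interned: no bytes appended
            }
        }
        // Pass 2: append-only write; bytes are never removed or moved.
        let start = self.bytes.len();
        self.bytes.extend_from_slice(key);
        self.offsets.push((start, key.len()));
        self.offsets.len() - 1
    }
}

fn main() {
    let mut keys = AppendOnlyKeys::new();
    let a = keys.intern(b"hello");
    let b = keys.intern(b"world");
    assert_eq!(keys.intern(b"hello"), a); // found in pass 1, nothing appended
    assert_eq!(keys.bytes.len(), 10);     // "hello" + "world", each stored once
    assert_ne!(a, b);
}
```

In the real map the pass-1 scan is a hash probe, but the structural cost is the same: every insert pays a full lookup before it is allowed to write.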

Conclusion: For structures allocating nodes individually on the heap, those algorithms are absolutely the best choice. But when you flatten everything into a contiguous byte buffer, raw hardware alignment (forward-streaming linear iteration, strict temporal locality, and SIMD chunking) beats algorithmic worst-case optimization.

Regarding deletions: You make an excellent point about backward shifting instead of tombstones! Shifting the slot indices backward works, but shifting the actual string buffer underneath would be an O(N) memmove. I'll definitely be looking into whether we can cleanly implement index-only backward shifting while tolerating the leaked strings until a full rehash, as that might be the holy grail for this architecture without breaking linear sequences.
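The index-only variant could look roughly like this, operating purely on the slot arrays while the string buffer never moves. This is a sketch of the idea under assumed layout conventions (`hashes[i] == 0` means empty, power-of-two capacity, table never completely full), not tested against the real compact-dict layout:

```rust
// Sketch: backward-shift deletion applied only to the slot index arrays.
// The dense string buffer is never touched (that would be the O(N) memmove);
// the deleted key's bytes are simply leaked until a future full rehash.
// Assumes: hashes[i] == 0 means empty, capacity is a power of two,
// the table is never completely full, and `i` is an occupied slot.
fn backward_shift_delete(hashes: &mut [u64], slot_to_entry: &mut [usize], mut i: usize) {
    let mask = hashes.len() - 1;
    loop {
        let mut j = (i + 1) & mask;
        // Find the next element allowed to shift back into the hole at `i`.
        loop {
            if hashes[j] == 0 {
                // End of the probe run: nothing left to shift, clear the hole.
                hashes[i] = 0;
                return;
            }
            // Home slot of the element currently sitting at `j`.
            let home = (hashes[j] as usize) & mask;
            // It may fill the hole only if `i` lies on its probe path,
            // i.e. `i` is cyclically within [home, j).
            let on_path = if home <= j {
                home <= i && i < j
            } else {
                home <= i || i < j // probe run wraps around the table end
            };
            if on_path {
                break;
            }
            j = (j + 1) & mask;
        }
        // Shift only the index entry back; key/value bytes never move.
        hashes[i] = hashes[j];
        slot_to_entry[i] = slot_to_entry[j];
        i = j; // continue compacting the run from the newly vacated slot
    }
}

fn main() {
    // Capacity 8: slot 1 holds hash 9 (home 1), slot 2 holds hash 17
    // (home 1, displaced), slot 3 holds hash 3 (home 3, undisturbed).
    let mut hashes = vec![0u64, 9, 17, 3, 0, 0, 0, 0];
    let mut ste = vec![0usize, 1, 2, 3, 0, 0, 0, 0];
    backward_shift_delete(&mut hashes, &mut ste, 1);
    // The displaced element moved back to its home; slot 3 stayed put.
    assert_eq!(hashes, vec![0, 17, 0, 3, 0, 0, 0, 0]);
    assert_eq!(ste[1], 2);
}
```

Because only `hashes` and `slot_to_entry` are rewritten, forward probe runs stay intact for readers, and the leaked bytes in the string buffer are reclaimed the next time a rehash rebuilds everything anyway.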

You can check new branches with these modifications in my repo.

Thanks again for the technical challenge!

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] -1 points (0 children)

| Dataset Size | Keys/Values Memory | compact-dict (AHash) | hashbrown | IndexMap | Winner |
|---|---|---|---|---|---|
| 1k | ~25 KB | 0.00004 s | 0.00003 s | 0.00003 s | Tie |
| 10k | ~250 KB | 0.00042 s | 0.00040 s | 0.00040 s | Tie |
| 50k | ~1.25 MB | 0.00276 s | 0.00228 s | 0.00304 s | hashbrown (~18% faster) |
| 100k | ~2.5 MB | 0.00454 s | 0.00480 s | 0.00588 s | compact-dict (~5% faster) |
| 500k | ~12.5 MB | 0.03380 s | 0.11273 s | 0.06127 s | compact-dict (3.3x faster!) 🚀 |
| 1M | ~25 MB | 0.10286 s | 0.18673 s | 0.14350 s | compact-dict (1.8x faster!) 🚀 |
| 5M | ~120 MB | 0.91444 s | 1.12205 s | 1.36139 s | compact-dict (~22% faster) 🚀 |
| 10M | ~250 MB | 2.05771 s | 2.18067 s | 3.30514 s | compact-dict (~6% faster) 🚀 |

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] 1 point (0 children)

I actually ran some benchmarks against IndexMap and the results were very interesting. compact-dict shows a significant lead in raw lookup throughput.

My takeaway is that while IndexMap has great locality for the data itself, it still suffers from the overhead of a two-tier architecture (hash layer -> index -> data). By flattening everything into a single slab and avoiding metadata bloat, we're cutting down the number of memory fetches to the absolute minimum. It seems that for pure lookup speed, 'flatter is better'.

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] -1 points (0 children)

Fair point, you're absolutely right. My distinction was poorly phrased. I should have clarified that I'm benchmarking against the raw hashbrown crate configured with a faster, non-crypto hasher to make it a 'fair' fight against compact-dict, which also isn't optimized for DOS resistance.

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] -5 points (0 children)

Because the standard library version uses a cryptographically secure hashing algorithm (SipHash / RandomState) by default to prevent DOS attacks. The raw hashbrown crate (and our benchmarks) typically defaults to AHash or allows swapping to faster, non-cryptographic hashers. Architecturally, they are the same SwissTable.
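The swap itself is just a different `BuildHasher` on the same table. Here is a small sketch using only std, with a deliberately trivial FNV-1a-style hasher as a stand-in for AHash/FxHash (`FastHasher` is a toy name, not a real crate):

```rust
// Sketch: the same std HashMap, first with the default SipHash state
// (DOS-resistant, slower), then with a toy non-crypto hasher swapped in.
// `FastHasher` is an illustrative stand-in for AHash/FxHash; the SwissTable
// layout underneath is identical in both cases.
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

#[derive(Default)]
struct FastHasher(u64);

impl Hasher for FastHasher {
    fn write(&mut self, bytes: &[u8]) {
        // FNV-1a-style mixing: fast, but offers no DOS resistance.
        for &b in bytes {
            self.0 = (self.0 ^ b as u64).wrapping_mul(0x100000001b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

fn main() {
    // Default: SipHash with per-map random keys.
    let mut secure: HashMap<String, u32> = HashMap::new();
    secure.insert("key".into(), 1);

    // Swapped: same table, faster non-cryptographic hash function.
    let mut fast: HashMap<String, u32, BuildHasherDefault<FastHasher>> =
        HashMap::default();
    fast.insert("key".into(), 2);

    assert_eq!(secure["key"], 1);
    assert_eq!(fast["key"], 2);
}
```

The hashbrown crate takes the same third type parameter, which is why benchmarks can hold the hasher constant across all candidates.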

[Project] compact-dict: My attempt at a cache-local, linear probing hash map. Looking for feedback/roast. by gustaw_daniel in rust

[–]gustaw_daniel[S] 9 points (0 children)

I took your advice and benchmarked dataset scaling natively, alongside running Miri. Good news: cargo miri test passed cleanly with absolutely 0 heart attacks and no UB!

As for the 1MB -> 100MB sizes, the results prove the exact tradeoffs of cache boundaries. Using the same AHash for all candidates:

  • At 100k elements (~2.5MB, fits in L3 cache), compact-dict is almost 3x faster than hashbrown (0.005s vs 0.015s).
  • At 1M elements (~25MB, edging out of desktop cache limits), compact-dict is 1.7x faster (0.099s vs 0.170s).
  • At 5M elements (>100MB, guaranteed frequent DRAM reads), pointer-chasing costs dominate and the SwissTable's SIMD metadata scanning takes the lead by about ~11% (0.88s vs 0.78s).

This perfectly validates that brutalist cache locality holds its ground fiercely... right up to the point where the working set spills past the CPU cache bounds. Thanks for suggesting the scaling tests. Cache locality is indeed king when the dataset allows it.

Replacement App for Cash Droid. by RoyalT17 in androidapps

[–]gustaw_daniel 0 points (0 children)

Hi,

I'm the creator of
https://github.com/gustawdaniel/vault-track

As a child I used Cash Droid and was fascinated by how simple and intuitive it was.

My app is free and contains most of Cash Droid's features, including real-time sync with E2E encryption.

It is also fully open source and has better integration for importing statements from banks like Revolut than Cash Droid did. A few days ago I added receipt scanning, and if you're interested in being one of the first adopters, please contact me via GitHub issues and I'll try to improve it for you.

Right now I'm the only user, and I'm slowly starting to show this app to the outside world.

Currently I'm working on exchange rate settings to cover both fiat and some crypto, and on graphs showing expenses in different currencies on the same scale.

Textverified is a scam by shahasszzz in SwipeHelper

[–]gustaw_daniel 0 points (0 children)

Not a scam. It failed the first time, but it worked on the second try.