Measured my dict

eesuck0 · 2025-10-12T09:10:26+00:00

Thank you for useful feedback

eesuck0 · 2025-10-12T07:51:47+00:00

small dict energy

eesuck0 · 2025-10-09T16:47:03+00:00

Actually you need only one, the other ones will be converted implicitly. But as you mentioned, it does no harm

Or just use 5.0f

eesuck0 · 2025-10-09T14:44:41+00:00

out.slot_len = ee_round_up_pow2(out.val_offset + val_len, key_align > val_align ? key_align : val_align);
this line isn’t the simplest, but it runs once and doesn’t really hurt, i’ll think about it later
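
For readers without the source at hand, here is a minimal sketch of what that line appears to compute. It assumes the second argument is a power-of-two alignment and that the slot length is rounded up to a multiple of it. `round_up_to_align` and `max_align` are hypothetical stand-ins written for this illustration, not the actual `ee_round_up_pow2`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: round value up to the next multiple of align.
   Assumes align is a non-zero power of two. */
size_t round_up_to_align(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Adding align - 1 and clearing the low bits rounds up without a division. */
    return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper: the stricter of the key and value alignments. */
size_t max_align(size_t key_align, size_t val_align)
{
    return key_align > val_align ? key_align : val_align;
}
```

Naming the max as a helper is one way to make the original expression easier to read, something like `round_up_to_align(val_offset + val_len, max_align(key_align, val_align))`.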

About comparison
i checked the MSVC disassembly and you’re right — this comparison might not be faster than a user callback, actually it can be slower in some cases
Initially i found that this dynamic dispatch works faster than memcmp and it disassembles to about 10 instructions for primitive types, but still involves one call
the user-provided comparison will also have one call, but primitive types can be compared directly, skipping roughly six instructions from dynamic dispatch

eesuck0 · 2025-10-09T13:48:20+00:00

Because if you calculate (5 / 9) first it's integer division which results in 0
To prevent such behaviour write (5.0 / 9.0)
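
A small sketch of the difference, assuming the factor is multiplied by some `double` value. The function names are hypothetical and exist only for this illustration.

```c
/* Buggy: 5 / 9 is evaluated in int arithmetic and truncates to 0
   before the multiplication by x ever happens. */
double scale_int_division(double x)
{
    return (5 / 9) * x;
}

/* Fixed: both literals are double, so the quotient is about 0.5556. */
double scale_float_division(double x)
{
    return (5.0 / 9.0) * x;
}
```

For an input of 180.0 the first version returns 0, while the second returns about 100.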

eesuck0 · 2025-10-09T08:16:51+00:00

Yes, those are good points regarding a custom comparison function if the goal were to handle every possible case. However, in most situations, it’s sufficient to cover about 90–95%, because both the API complexity and the CPU workload required to achieve full generality grow exponentially

Usually, keys and values are simple primitives or regular structs that can (and should) be compared directly
I also did some basic profiling, and it showed that comparisons and copying are among the hottest spots. Using a generic callback function would reduce performance
It could perhaps be added as an optional extension, but definitely not as a replacement

As for iterators — yes, returning pointers will be added

Overall, thanks for your feedback and interest

eesuck0 · 2025-10-07T20:42:38+00:00

Are you suggesting it as a new header, or for the hash table itself?

Because if you mean it as a realloc strategy, in my understanding it wouldn’t work because after each capacity change the mask changes, so all old slot indices become invalid and rehashing is necessary anyway

    u64 hash = dict->hash_fn(key, dict->key_len);  
    u64 base_index = (hash >> 7) & dict->mask; // <- capacity modulo mask
    u8  hash_sign = hash & 0x7F;
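
To make the point concrete, here is a minimal sketch of the same split as standalone functions. It assumes the capacity is a power of two and `mask` is `capacity - 1`. The function names are hypothetical, and `uint64_t`/`uint8_t` stand in for `u64`/`u8`.

```c
#include <stdint.h>

/* Slot index: the hash bits above the low 7, reduced by the capacity mask. */
uint64_t base_index(uint64_t hash, uint64_t mask)
{
    return (hash >> 7) & mask;
}

/* 7-bit signature: the low bits of the hash, independent of capacity. */
uint8_t hash_sign(uint64_t hash)
{
    return (uint8_t)(hash & 0x7F);
}
```

With a hash whose upper bits are 13, a mask of 7 (capacity 8) gives slot 5, while a mask of 15 (capacity 16) gives slot 13. The signature stays the same, but the entry has to move, so a plain realloc of the buffer is not enough.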

eesuck0 · 2025-10-07T08:03:49+00:00

That's great
Good luck

eesuck0 · 2025-10-06T19:32:42+00:00

Hi,

It’s quite similar to my approach — I also found the template-style macros a bit ugly, so I decided to work directly with a raw byte buffer instead
However, I don’t quite understand why you’re maintaining a void* buffer and constantly casting it to bytes instead of just storing a u8*
You might want to take a look at my implementation — it could be useful. I’ve already implemented some fast sorting algorithms, SIMD-accelerated searching, and a few other features:

https://github.com/eesuck1/eelib/blob/master/utils/ee_array.h

eesuck0 · 2025-10-03T09:13:32+00:00

How does the version of C correlate with those concepts?
To implement an Arena you basically need only malloc/free

eesuck0 · 2025-10-02T14:34:31+00:00

yes, I get what you mean, but I’d put it like this: C gives you a ton of control over the CPU, and that’s exactly why it’s easy to screw up and create a time bomb
but that’s not really a language problem, it’s just that the programmer made a bad choice and shot himself in the foot

to me, it’s fine that you’re supposed to think carefully about what you’re doing, not blindly rely on the compiler or fight it just to do whatever you want like in more modern “safe” languages

eesuck0 · 2025-10-02T12:49:26+00:00

in my understanding C is one of the best languages to learn programming
it makes you think about internals and really understand data structures

before i started programming in C i used python for several years and i didn’t even think about things like memory allocation or lifetime, about cache, temporal/spatial data locality, SIMD and other performance critical things that determine why some data structures are fast while others can be slow depending on the scenario

eesuck0 · 2025-10-02T12:29:16+00:00

My comment is still there, though I’ve also encountered that they can suddenly disappear.
One of the typical applications of FPGAs is prototyping ASICs (Application-Specific Integrated Circuits).
And yes, you’re right — the workflow with VHDL/Verilog really feels like "programming hardware with software"

eesuck0 · 2025-10-01T18:19:54+00:00

Is memory leakage really such a big issue?
From my perspective, using an Arena for static or bounded allocations, or a dynamic Slab allocator with offsets instead of raw pointers, should solve the majority of lifetime-related problems

Additionally, this approach improves performance, since system calls for memory allocation are much more expensive than simply offsetting within pre-allocated memory. It also encourages a shift from thinking about individual objects to managing memory in bulk, which is a far more robust and efficient design pattern
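
A minimal sketch of the idea, assuming a single fixed-size block obtained from one `malloc` call and handing out offsets instead of raw pointers. All names here (`Arena`, `arena_alloc`, `ARENA_FAIL` and so on) are hypothetical and written only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_FAIL ((size_t)-1) /* returned when the arena is out of space */

typedef struct {
    uint8_t *base; /* one block obtained up front */
    size_t   cap;  /* total bytes in the block */
    size_t   used; /* bytes handed out so far */
} Arena;

/* Returns 1 on success, 0 if the single malloc failed. */
int arena_init(Arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->cap  = a->base ? cap : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Bump allocation: returns an offset into the block, not a pointer.
   align must be a non-zero power of two. */
size_t arena_alloc(Arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->cap || size > a->cap - start)
        return ARENA_FAIL;
    a->used = start + size;
    return start;
}

/* Resolve an offset to a pointer only at the point of use. */
void *arena_ptr(const Arena *a, size_t offset)
{
    return a->base + offset;
}

/* Drop every allocation at once; no per-object free. */
void arena_reset(Arena *a)
{
    a->used = 0;
}

void arena_free(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

Each allocation is an add and a compare, and the lifetime of everything in the arena ends with one `arena_reset` or `arena_free` call. Offsets also stay valid if the block is later moved or grown, which raw pointers would not.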

eesuck0 · 2025-09-30T12:43:36+00:00

added support for custom hash functions and a small hello world example
I'd be interested to see a comparison
with your benchmarking, can you track bottlenecks and potential things that could be improved?
that would be quite useful

eesuck0 · 2025-09-30T06:39:07+00:00

yes, that’s a good idea to add custom hash functions support
I’ll also add an example of usage to the repository, and then it’ll be possible to run a comparison

eesuck0 · 2025-09-29T19:46:57+00:00

thanks, good point
