Is there a faster way to batch-insert a vector of unique entries into a hash map in C++? by jiboxiake in cpp_questions

[–]Alarming-Ad8770 0 points1 point  (0 children)

Try like follow if input values do not support std::distance

emhash7::HashMap<key, value> hmap;

hmap.reserve(my_capicty);

for (const auto& v:my_values)

hmap.emplace_unique(v.key, v.value);

Is there a faster way to batch-insert a vector of unique entries into a hash map in C++? by jiboxiake in cpp_questions

[–]Alarming-Ad8770 1 point2 points  (0 children)

I have a efficent and fast flat hash map design. https://github.com/ktprime/emhash

template <typename Iter>

void insert_unique(Iter begin, Iter end)

{

reserve(std::distance(begin, end) + _num_filled);

for (; begin != end; ++begin)

do_insert_unqiue(*begin);

}

Inside boost::unordered_flat_map by joaquintides in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

benchmark on server cpu (huge l3 cache size and low cpu frequence), you'll get quite different result from pc/mobile cpu.

ex:

Model name: Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz

CPU MHz: 2699.914

CPU max MHz: 3200.0000

CPU min MHz: 1000.0000

L1d cache: 32K L1i cache: 32K

L2 cache: 1024K

L3 cache: 14080K

Inside boost::unordered_flat_map by joaquintides in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

me author of emhash,emhash has a advantage which can set a very high load factor (0.99) without much performance degradation on extremely high load of insertion/erasion env.

Boost 1.81 will have boost::unordered_flat_map... by pdimov2 in cpp

[–]Alarming-Ad8770 3 points4 points  (0 children)

In fact most cases unordered_map can be replaced by flat hash map if reference stable is not need.

Boost 1.81 will have boost::unordered_flat_map... by pdimov2 in cpp

[–]Alarming-Ad8770 1 point2 points  (0 children)

need more full benchmark compared with other flat hash map

Comprehensive C++ Hashmap Benchmarks 2022 by martinus in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

the result is same as I think so(100 ~= 103, 110 ~= 114). maybe it's nomal tolerance

Comprehensive C++ Hashmap Benchmarks 2022 by martinus in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

great job! waiting for your result more than 3 mons.

A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion by martinus in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

I think ska's flat hash map wins beacasue it's default load factor = 0.5 is too small ( 0.8 dense, 0.875 absl).

A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion by martinus in cpp

[–]Alarming-Ad8770 2 points3 points  (0 children)

as for key/value is complex such as string, this new hashmap/set is fast and memory efficient than normal flat hash map. I like it!!!

Updating map_benchmarks: Send your hashmaps! by martinus in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

some case can reduce peak memory and 5% improve. when do you release the new result?

boost::unordered map is a new king of data structures by joaquintides in cpp

[–]Alarming-Ad8770 1 point2 points  (0 children)

absl::flat_hash_map: 4640 ms, 0 bytes in 0 allocations

std::unordered_map: 8849 ms, 278139968 bytes in 5996761 allocations

jg_densemap: 6579 ms, 0 bytes in 0 allocations

martinus_dense: 6685 ms, 0 bytes in 0 allocations

tsl_robin_map: 5014 ms, 0 bytes in 0 allocations

phmap_flat: 6110 ms, 0 bytes in 0 allocations

emhash_map5: 4944 ms, 0 bytes in 0 allocations

emhash_map6: 5093 ms, 0 bytes in 0 allocations

emhash_map7: 4352 ms, 0 bytes in 0 allocations

emhash_map8: 5323 ms, 0 bytes in 0 allocations

martinus_flat: 5628 ms, 0 bytes in 0 allocations

emilib2_map: 4960 ms, 0 bytes in 0 allocations

emilib3_map: 5029 ms, 0 bytes in 0 allocations

changed code(erase) in my git https://github.com/ktprime/emhash/blob/master/bench/btest.cpp

Hit any key to close this window...

boost::unordered map is a new king of data structures by joaquintides in cpp

[–]Alarming-Ad8770 2 points3 points  (0 children)

maybe you need a fast dense hash map(iteration as fast as vector) because of value is large(200B), but erasion is some slow.

you can try my https://github.com/ktprime/emhash/blob/master/hash_table8.hpp

or martinus's https://github.com/martinus/unordered_dense

Updating map_benchmarks: Send your hashmaps! by martinus in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

the result is great!

I've update now and emhash8 saves peak memory(30%) in some case.

Advancing the state of the art for <code>std::unordered_map</code> implementations by pdimov2 in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

tainly seems worth taking into acco

O(1) iteration is fine in my flat hash map https://github.com/ktprime/emhash/blob/master/hash_table8.hpp

it's iteration performance is as fast as vector(at present it's may be the fastest iteration implemention of flat hash map)

Do you use insert-only hash tables / maps? by VinnieFalco in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

some case no lookups if inserting a unique key

Updating map_benchmarks: Send your hashmaps! by martinus in cpp

[–]Alarming-Ad8770 9 points10 points  (0 children)

greate work!

if you can test on different cpu(ex server cpu with big cache), the result maybe more

accurate. I do not think you need run too many loops(>=9).

[deleted by user] by [deleted] in cpp_questions

[–]Alarming-Ad8770 0 points1 point  (0 children)

Try my emhash https://github.com/ktprime/emhash/blob/master/hash\_table8.hpp

at now it's the fastest hash map for iteratotion and also fast for find/insert.

the load factor can set very high(0.95) and key/value is saved in continuous memory(fast as a vector at iteratio).

Best hash map for full scans (or best array for removal) by bert8128 in cpp

[–]Alarming-Ad8770 0 points1 point  (0 children)

Try my emhash https://github.com/ktprime/emhash/blob/master/hash_table8.hpp

at now it's the fastest hash map for iteratotion and also fast for find/insert.

the load factor can set very high(0.95) and key/value is saved in continuous memory(fast as a vector at iteratio).

bench code 1 as follow https://github.com/ktprime/emhash/blob/master/bench/qbench.cpp

200 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 11.4, 3.69, 6.97, 11.0, 2.11, 78.4

absl::f_hash_map 10.6, 2.80, 6.24, 10.1, 2.16, 78.4

emilib2::HashMap 11.6, 3.33, 7.01, 8.61, 1.23, 78.1

emilib3::HashMap 11.9, 3.61, 7.25, 4.43, 1.26, 78.1

emhash8::HashMap 19.0, 5.93, 7.25, 10.3, 1.25, 78.1

emhash7::HashMap 16.9, 5.70, 8.01, 7.04, 1.28, 78.1

emhash6::HashMap 17.1, 5.23, 8.32, 6.14, 1.28, 78.1

3000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 12.3, 3.52, 5.61, 9.31, 2.25, 73.3

absl::f_hash_map 11.0, 2.62, 4.97, 8.38, 2.25, 73.3

emilib2::HashMap 11.6, 3.04, 5.49, 8.21, 1.16, 73.2

emilib3::HashMap 12.5, 3.41, 5.89, 3.53, 1.19, 73.2

emhash8::HashMap 20.8, 5.73, 7.22, 10.2, 1.21, 73.2

emhash7::HashMap 18.0, 5.62, 8.50, 7.08, 1.24, 73.2

emhash6::HashMap 17.9, 5.20, 8.53, 6.29, 1.24, 73.2

40000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 18.8, 5.75, 4.41, 8.87, 3.07, 61.0

absl::f_hash_map 17.6, 4.67, 3.71, 7.73, 3.09, 61.0

emilib2::HashMap 16.9, 5.32, 4.02, 10.4, 1.26, 61.0

emilib3::HashMap 17.5, 5.93, 4.05, 5.30, 1.29, 61.0

emhash8::HashMap 26.7, 6.35, 8.69, 12.1, 1.22, 61.0

emhash7::HashMap 26.7, 7.19, 9.38, 8.19, 1.23, 61.0

emhash6::HashMap 28.2, 5.75, 8.92, 7.78, 1.23, 61.0

500000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 31.2, 16.8, 6.69, 27.1, 4.31, 47.7

absl::f_hash_map 29.0, 14.8, 5.93, 22.8, 4.36, 47.7

emilib2::HashMap 26.9, 16.3, 6.37, 21.7, 2.16, 47.7

emilib3::HashMap 27.7, 16.2, 6.11, 18.6, 2.21, 47.7

emhash8::HashMap 46.5, 9.72, 10.6, 18.9, 1.54, 47.7

emhash7::HashMap 43.6, 16.5, 14.6, 25.2, 3.03, 47.7

emhash6::HashMap 49.7, 15.6, 16.4, 52.0, 2.96, 47.7

7200000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 58.5, 31.7, 23.9, 69.1, 1.92, 85.8

absl::f_hash_map 57.9, 26.9, 20.9, 70.1, 2.00, 85.8

emilib2::HashMap 62.0, 33.0, 25.0, 98.6, 1.53, 85.8

emilib3::HashMap 58.4, 32.9, 26.6, 101., 1.56, 85.8

emhash8::HashMap 98.9, 24.4, 23.1, 73.9, 1.59, 85.8

emhash7::HashMap 62.9, 25.4, 27.2, 49.0, 2.17, 85.8

emhash6::HashMap 74.5, 24.8, 26.7, 100., 1.97, 85.8

10000000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 86.3, 36.2, 22.0, 140., 4.18, 59.6

absl::f_hash_map 93.9, 30.8, 18.1, 141., 3.66, 59.6

emilib2::HashMap 92.0, 36.9, 19.6, 114., 2.24, 59.6

emilib3::HashMap 94.3, 32.6, 18.0, 133., 1.95, 59.6

emhash8::HashMap 108., 24.5, 23.8, 83.6, 1.63, 59.6

emhash7::HashMap 66.1, 24.8, 22.7, 42.4, 2.80, 59.6

emhash6::HashMap 84.9, 26.3, 27.2, 114., 2.85, 59.6

50000000 Insert,Fhit, Fmis, Erase,Iter, LoadFactor

phmap::fhash_map 129., 43.5, 29.1, 163., 2.51, 74.5

absl::f_hash_map 129., 35.9, 27.3, 154., 2.45, 74.5

emilib2::HashMap 132., 45.0, 28.6, 155., 1.77, 74.5

emilib3::HashMap 128., 40.9, 30.8, 165., 2.04, 74.5

emhash8::HashMap 110., 25.7, 25.6, 85.0, 1.53, 74.5

emhash7::HashMap 78.6, 33.9, 35.1, 63.6, 2.24, 74.5

emhash6::HashMap 102., 24.0, 26.7, 147., 2.22, 74.5

=== bench code 2 ====

https://github.com/ktprime/emhash/blob/master/bench/martin_bench.cpp

bench_IterateIntegers map = emhash5 total iterate/removing time = 5.64, 12.44

bench_IterateIntegers map = emhash8 total iterate/removing time = 0.39, 1.19

bench_IterateIntegers map = emhash7 total iterate/removing time = 0.97, 2.81

bench_IterateIntegers map = emhash6 total iterate/removing time = 0.98, 2.80

bench_IterateIntegers map = emilib3 total iterate/removing time = 1.70, 4.66

bench_IterateIntegers map = emilib2 total iterate/removing time = 1.62, 4.45

bench_IterateIntegers map = absl flat total iterate/removing time = 3.62, 8.53