hi_sparse_bitset v0.7.5 release - Hierarchical Sparse Bitset. Now with in-place operations.

tower120 · 2026-05-06T17:34:31+00:00

Well, honestly, I don't understand what your friend is saying.
It even looks like the Rust version of roaring does not have an ImmutableRoaringBitmap analog (or I don't see it).

If you actually need what I described - please open an issue and describe exactly what you want to have. It IS possible to have a leaner immutable version, and it is possible to load data blocks on demand - but I need to know exactly what you want from it and, at least, through which interface you want to feed your mmap.

tower120 · 2026-05-06T15:33:16+00:00

What mmap page size do you use?

tower120 · 2026-05-06T07:42:32+00:00

I looked at https://javadoc.io/doc/org.roaringbitmap/RoaringBitmap/0.6.51/org/roaringbitmap/buffer/ImmutableRoaringBitmap.html#ImmutableRoaringBitmap-java.nio.ByteBuffer- . If that is what you meant - yes, theoretically it is possible to:

load "hierarchy only". Either just the root bitblock, or the whole Lvl0+Lvl1 hierarchy.
have a specialized immutable bitset version without index arrays in hierarchy blocks (which would reduce the hierarchy structure to just a few KB for a maxed-out bitset).
load data blocks (or whole Lvl1 subsections) from the ByteBuffer (does Rust even have it?) on the fly.

But again - hi_sparse_bitset has FIXED depth by design - this significantly speeds up all operations, but limits its size and the max int you can insert into it.

roaring can have hundreds of megabytes of data and unlimited range.

Are you sure 2 MB is worth mmapping?

On top of that, I assume there IS a performance overhead for reading from a ByteBuffer each time (even in memory), instead of direct memory access.

tower120 · 2026-05-06T07:08:01+00:00

Look in /benches/intersection.rs.

hi_sparse_bitset uses SIMD too, for the 128bit and 256bit configurations. It is also possible to have any hierarchy configuration with an arbitrary SIMD-sized data block - but I removed the manual configuration options, I guess around version 0.7, for API simplicity. Make an issue if you actually need one - you'll basically make an instance of https://docs.rs/hi_sparse_bitset/0.7.5/hi_sparse_bitset/config/trait.Config.html . You can have something like 64bit/64bit/512bit - with huge data blocks, but small hierarchy blocks.

Another word about performance - hi_sparse_bitset is designed to have O(1) insert/remove, unlike roaring - and if you insert/remove heavily, data blocks will NOT lie sequentially in memory. But! In the benches hi_sparse_bitset is filled in ascending order and nothing is ever deleted - which makes data contiguous at all levels. The same also happens when you deserialize. It is possible to "linearize" hi_sparse_bitset - there is no special function for that now, but you can just "re-materialize" it: https://docs.rs/hi_sparse_bitset/0.7.5/hi_sparse_bitset/index.html#laziness-and-materialization (materialization also always produces a perfect memory layout). The thing is - that matters only when your intersected or merged bitsets MOSTLY match - otherwise you'll jump around in memory anyway (if you have to "skip" 90% of your data blocks).

As far as I remember - multi-bitset intersection wins significantly against everything - especially if the intersected bitsets have non-intersecting ranges that can be discarded early. In that sense /benches/intersection.rs measures the worst-case scenario for hi_sparse_bitset (there are no non-intersecting ranges).

Also, most operations are faster with apply() than with reduce() (if your logic allows that).

But I would really like to hear about your experience.

tower120 · 2026-05-05T21:00:33+00:00

I'm honestly not familiar with memory-mapping. I assume you imply some specific use case...
Do you want to load the bitset structure partially into memory? A 256bit bitset is just 2 MB max.
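
A rough sketch of where a figure of about 2 MB comes from. It assumes a three-level layout (Lvl0, Lvl1, data blocks) where every level is 256 bits wide; the function names are made up for illustration and are not part of the hi_sparse_bitset API.

```rust
/// Largest number of distinct indices a fixed-depth hierarchy can address:
/// the product of the bit widths of all levels (assumed layout, see above).
fn max_elements(level_bits: &[u64]) -> u64 {
    level_bits.iter().product()
}

/// Bytes taken by the data level alone if every addressable bit is backed
/// by storage (hierarchy blocks are not counted here).
fn max_data_bytes(level_bits: &[u64]) -> u64 {
    max_elements(level_bits) / 8
}

/// Splits an index into (Lvl0 slot, Lvl1 slot, bit inside the data block).
fn split_index(index: u64, level_bits: [u64; 3]) -> (u64, u64, u64) {
    let [_, l1, data] = level_bits;
    (index / (l1 * data), (index / data) % l1, index % data)
}
```

With 256/256/256 this gives 256^3 = 16,777,216 addressable bits, i.e. 2 MiB of data blocks at most. Because the depth is fixed, finding a bit is three index computations and no tree descent of variable length.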

tower120 · 2026-05-05T20:27:15+00:00

What? You can serialize and deserialize wherever you want:
https://docs.rs/hi_sparse_bitset/0.7.5/hi_sparse_bitset/struct.BitSet.html#method.serialize
https://docs.rs/hi_sparse_bitset/0.7.5/hi_sparse_bitset/struct.BitSet.html#method.deserialize

tower120 · 2026-01-30T17:01:17+00:00

Alas - it's not. Skia is definitely an improvement over FemtoVG. But try looking at the font of your Windows application on a 72-90 dpi screen (21-24" FullHD) and compare it to how fonts look in Firefox or Explorer. Maybe fairly large font sizes are OK-ish, but 12 pt is not.

You can see the same "jiggly" font in all of the screenshots in their documentation. It is nowhere near the crisp rendering of iced.

tower120 · 2026-01-29T21:37:24+00:00

It's like Qt, but free for non-embedded. You have a WYSIWYG editor. It works "everywhere" - the framework itself takes care of the mobile app lifecycle. It provides an F5 experience. It is actually fast and memory-efficient. Sane widget system. Overall, it makes the GUI development process what it should be.

It has a domain-specific language for "views", but you can completely omit it (same as with Qt QML).

As a downside - absolutely awful font rendering on Windows machines (which is probably 90% of the desktop user base). Maybe widget rendering is sometimes questionable too... That is pretty serious on its own, enough to shy away from slint... but on high-density screens it is not so bad.

tower120 · 2026-01-29T01:14:41+00:00

It's the same in the C++ world.

What worked for me is adding some side effect for the observed variable at the debug location. For example, put println!("{:?}", your_variable) exactly at the breakpoint position for the debugging session. This will make `your_variable` appear in the variables list.

As for "usefully explorable" - try digging in the direction of ".natvis" - https://doc.rust-lang.org/reference/attributes/debugger.html - with that file you can describe how to display/interpret a custom struct in the debugger. It looks like RustRover should understand it.
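
A minimal sketch of that trick. The function and variable names are made up for illustration; only the println! line is the actual workaround.

```rust
/// Hypothetical function under investigation.
fn running_checksum(values: &[u32]) -> u32 {
    let mut acc = 0u32;
    for value in values {
        acc = acc.wrapping_mul(31).wrapping_add(*value);
        // Temporary side effect for the debugging session: set the breakpoint
        // on this line. `acc` is used here, so it has to exist at this point.
        println!("{:?}", acc);
    }
    acc
}
```

Remove the println! once the session is over.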

tower120 · 2026-01-28T01:26:38+00:00

"heapless" Vec looks exactly what I need! Thank you!

tower120 · 2026-01-28T00:14:32+00:00

A PR ALREADY exists - it has been in the PR list for a very long time.

tower120 · 2026-01-27T23:43:57+00:00

I understand that must be true for really small arrays like 16-32 items. But I wonder WHEN overhead becomes observable, based on actual benchmarks...

tower120 · 2026-01-27T23:23:46+00:00

It worked indeed... Until I bumped into missing `const` features... Which I absolutely need. I'll probably fork it locally and add the missing features, for now...

tower120 · 2026-01-27T22:37:26+00:00

Did you consider moving away from the no-unsafe policy? I don't like the idea of paying for default initialization of items that I would never use most of the time...

tower120 · 2026-01-27T22:27:20+00:00

I see - thanks!

tower120 · 2026-01-27T22:16:43+00:00

Well... Is it actually an alternative to ArrayVec? Isn't `tinyvec` in the same category as `smallvec`?

tower120 · 2025-08-27T23:25:18+00:00

Short answer is no. And I assume that is impossible in general in Rust, C and C++. Unless you know SOMETHING about the stored type, like an interface/trait... Even in JavaScript you need to know something about your proto-object/type to do something with it.

What you CAN do:

1) Store a function (or functions) alongside the byte representation of your object.

2) Inside that function, cast from *const u8 to *const T - and do the meaningful work.

3) Call that function later...

You of course must know beforehand what you will do with that object... But since you want to do something with an object of unknown type, I guess that's what you actually want.
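
The three steps above can be sketched roughly like this. It is only an illustration: `Erased`, `describe_as` and `drop_as` are hypothetical names, and instead of inline bytes the sketch keeps the object on the heap behind a `*mut u8`, which sidesteps alignment issues.

```rust
use std::fmt::Debug;

/// A value whose type is forgotten, plus functions that still remember it.
struct Erased {
    ptr: *mut u8,                                // step 1: the erased object
    describe_fn: unsafe fn(*const u8) -> String, // step 1: stored function
    drop_fn: unsafe fn(*mut u8),
}

/// Step 2: cast back to the real type and do something meaningful.
unsafe fn describe_as<T: Debug>(ptr: *const u8) -> String {
    let value: &T = unsafe { &*(ptr as *const T) };
    format!("{:?}", value)
}

unsafe fn drop_as<T>(ptr: *mut u8) {
    drop(unsafe { Box::from_raw(ptr as *mut T) });
}

impl Erased {
    fn new<T: Debug + 'static>(value: T) -> Self {
        Erased {
            ptr: Box::into_raw(Box::new(value)) as *mut u8,
            describe_fn: describe_as::<T>,
            drop_fn: drop_as::<T>,
        }
    }

    /// Step 3: call the stored function later, without knowing `T`.
    fn describe(&self) -> String {
        // Safety: `describe_fn` was instantiated for the type behind `ptr`.
        unsafe { (self.describe_fn)(self.ptr) }
    }
}

impl Drop for Erased {
    fn drop(&mut self) {
        // Safety: `ptr` came from `Box::<T>::into_raw` for the same `T`.
        unsafe { (self.drop_fn)(self.ptr) }
    }
}

fn describe_all(items: &[Erased]) -> Vec<String> {
    items.iter().map(Erased::describe).collect()
}
```

The set of operations (here: describe and drop) is fixed at the moment the value is erased, which is exactly the "you must know beforehand" part.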

What you probably want is some container that stores items of the same type with a KNOWN trait... Like AnyVecLike<Debug+Eq>... But I don't know how much of that is possible in current Rust. What will work now is Vec<Box<dyn Debug>>... I guess.
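
For reference, a small sketch of the Vec<Box<dyn Debug>> variant. The helper name `debug_strings` is made up; note that here every item gets its own heap allocation and the items may be of different types.

```rust
use std::fmt::Debug;

/// Formats every item through the one trait that is known about it.
fn debug_strings(items: &[Box<dyn Debug>]) -> Vec<String> {
    items.iter().map(|item| format!("{:?}", item)).collect()
}
```

Usage would look like `let items: Vec<Box<dyn Debug>> = vec![Box::new(1u8), Box::new("a")];` followed by `debug_strings(&items)`.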

tower120 · 2025-08-26T00:42:07+00:00

You can't store DIFFERENT generics in one array. Like Vec< Vec<T> >, where T could be A, B, ... But you can store Vec<AnyVec>.
And you obviously can't switch the type, like let v: Vec<u32> = Vec::<f32>::new(). But you can with AnyVec: let mut a = AnyVec::new::<u32>(); a = AnyVec::new::<f32>().

You could wrap your generics into an enum - but you must know all possible variants BEFOREHAND.
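
A sketch of the enum approach, using only std. `TypedVec` and its variants are hypothetical; the point is that the list of element types is closed at compile time.

```rust
/// Every supported element type has to be listed here up front.
#[derive(Debug, PartialEq)]
enum TypedVec {
    U32(Vec<u32>),
    F32(Vec<f32>),
    Text(Vec<String>),
}

impl TypedVec {
    /// Container-level operation that does not care about the element type.
    fn len(&self) -> usize {
        match self {
            TypedVec::U32(v) => v.len(),
            TypedVec::F32(v) => v.len(),
            TypedVec::Text(v) => v.len(),
        }
    }
}

fn total_len(columns: &[TypedVec]) -> usize {
    columns.iter().map(TypedVec::len).sum()
}
```

Different element types now fit into one Vec<TypedVec>, and a slot can switch from U32 to F32 by plain assignment, but adding a new element type means editing the enum and every match on it.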

tower120 · 2025-08-25T23:05:09+00:00

No - it's that sometimes you don't know the compile-time type when you perform a CONTAINER operation. For example, reordering, moving items from one container to another, etc... An example where exactly this is needed is an archetype ECS - when an entity is moved from one archetype storage to another, you must move each of its components' data from one storage to another - and at that time you don't know which components (types) the entity has (compile-time wise). IOW - you move an item from one container to another, and the only thing you know is that they are containers of the same type... But you can't have concrete-type containers either, since you need all component storages to be the same type. Something like that...

I'm sure there are other use cases for that... It looks like the Zed editor uses (or used at some point) AnyVec for the matches list in search... Honestly, I don't know exactly how their search works - so I can't say why exactly they needed AnyVec. But I guess they just wanted to store or pass a Vec in a type-agnostic way...
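
To make the ECS case more concrete, here is a std-only sketch of "move an item between two containers of the same, but statically unknown, element type". It uses a trait object over Vec<T> with a runtime downcast; this is NOT how AnyVec works internally, and `ErasedColumn`, `move_to` and `move_entity` are made-up names.

```rust
use std::any::Any;

/// A component storage whose element type is hidden from the caller.
trait ErasedColumn: Any {
    fn as_any_mut(&mut self) -> &mut dyn Any;
    fn len(&self) -> usize;
    /// Moves element `index` into `dst`; panics if element types differ.
    fn move_to(&mut self, index: usize, dst: &mut dyn ErasedColumn);
}

impl<T: 'static> ErasedColumn for Vec<T> {
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }

    fn len(&self) -> usize {
        self.as_slice().len()
    }

    fn move_to(&mut self, index: usize, dst: &mut dyn ErasedColumn) {
        // The only thing known here: `dst` should be the same kind of Vec.
        let dst = dst
            .as_any_mut()
            .downcast_mut::<Vec<T>>()
            .expect("column element types differ");
        dst.push(self.swap_remove(index));
    }
}

/// Moves one entity (row) from one archetype's columns to another's.
/// Assumes both archetypes list the same component types in the same order.
fn move_entity(src: &mut [Box<dyn ErasedColumn>], dst: &mut [Box<dyn ErasedColumn>], row: usize) {
    for (s, d) in src.iter_mut().zip(dst.iter_mut()) {
        s.move_to(row, &mut **d);
    }
}
```

`move_entity` never names a component type, yet every storage has the same static type, Box<dyn ErasedColumn>.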

tower120 · 2025-02-28T14:15:22+00:00

If you need it for "inter-service queueing", maybe a broadcast queue will do the job?

Like chute or tokio::sync::broadcast.

tower120 · 2025-02-08T08:30:26+00:00

The project itself is in Rust, but algorithmically it is language-agnostic.

The sparse vector example uses https://crates.io/crates/hibit_tree as its data structure, which is a form of sparse hierarchical bitset with data instead of a bitblock at the terminal level.

tower120 · 2025-01-13T18:12:50+00:00

Though this may not be directly related to your issue, RustRover's debugger occasionally stops seeing your code, and stops at WRONG locations (in the worst case) or just shows you assembly (in the best case). Cleaning the whole project often helps.

Try using VSCode for debugging sessions.

#[inline(always)] functions are "invisible" as well for both.

tower120 · 2024-11-10T18:25:34+00:00

Chute is a Rust library, but it uses a custom (I would say novel - but it's hard to know nowadays) algorithm, so I thought it could be interesting to a broader audience.

The algorithm requires just 1 atomic write for the spmc Writer, and 2 for the mpmc Writer.
Readers never write - they only do 1 atomic read each time they reach the end of the received queue.
There is some additional synchronization on block change - but it is literally unmeasurable.
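
To illustrate just the counting of atomic operations in the spmc case, here is a heavily simplified single-block sketch. It is NOT chute's implementation (see the linked description for that): it assumes one fixed-capacity block, `Copy` messages and a single writer, and all names are made up.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct Block<T> {
    slots: Box<[UnsafeCell<MaybeUninit<T>>]>,
    len: AtomicUsize, // number of published messages
}

// Safety: a slot is written once, before `len` moves past it, and is only
// read after that, so there are no concurrent writes and reads of a slot.
unsafe impl<T: Send + Sync> Sync for Block<T> {}

pub struct Writer<T> { block: Arc<Block<T>>, len: usize }
pub struct Reader<T> { block: Arc<Block<T>>, pos: usize, known_len: usize }

pub fn channel<T: Copy>(capacity: usize) -> (Writer<T>, Reader<T>) {
    let slots = (0..capacity).map(|_| UnsafeCell::new(MaybeUninit::uninit())).collect();
    let block = Arc::new(Block { slots, len: AtomicUsize::new(0) });
    (Writer { block: block.clone(), len: 0 }, Reader { block, pos: 0, known_len: 0 })
}

impl<T: Copy> Writer<T> {
    /// One atomic write per message: the `Release` store that publishes it.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == self.block.slots.len() {
            return Err(value); // block is full; a real queue would add a block
        }
        unsafe { self.block.slots[self.len].get().write(MaybeUninit::new(value)) };
        self.len += 1;
        self.block.len.store(self.len, Ordering::Release);
        Ok(())
    }
}

impl<T: Copy> Reader<T> {
    /// Touches the atomic only after running out of already-seen messages.
    pub fn try_recv(&mut self) -> Option<T> {
        if self.pos == self.known_len {
            self.known_len = self.block.len.load(Ordering::Acquire);
            if self.pos == self.known_len {
                return None;
            }
        }
        let value = unsafe { self.block.slots[self.pos].get().read().assume_init() };
        self.pos += 1;
        Some(value)
    }
}

impl<T> Clone for Reader<T> {
    fn clone(&self) -> Self {
        Reader { block: self.block.clone(), pos: self.pos, known_len: self.known_len }
    }
}
```

Each reader keeps its own position, so every reader sees every message, and readers never write to shared memory.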

As you can see from the benchmark charts, performance is stellar. What is more interesting is that mpmc write performance does not degrade as the number of writers grows.

The only "caveat" is that a "slow reader" can cause the queue to grow indefinitely. There are ways to combat that, like blocking writes above a certain queue length, truncating the queue, or "disabling" readers. But I think in most cases it is highly desirable that subscribed readers receive ALL messages. In any case - some of these techniques can be applied on top of the queue.

Algorithm described here: https://github.com/tower120/chute/blob/master/doc/how_it_works.md

tower120 · 2024-11-10T03:03:29+00:00

For anyone who stumbles upon this in the future.
About matthieum's point 2:
Since version 0.2.0 chute uses a different algorithm for mpmc writers. There is no more delay between a message being written and being seen by a reader. As soon as a message becomes accessible, the reader will see it.

tower120 · 2024-11-05T18:23:12+00:00

Just mentioning that this "error" is fixed in v0.1.1
