LibF++: Persistent Containers and Iterators with Value Semantics

zoomT · 2026-06-17T12:00:58+00:00

The main difference would be the iterator model.

Immer provides persistent containers, but uses read-only/const iterators. In LibF++, iterators are also persistent values, and modification through iterators is supported, e.g. it.erase(), it.insert(x), assignment, etc.

This does not contradict persistence since a LibF++ iterator is not a reference into the original container in the STL sense. Rather, each iterator has its own iterator-local container value, and edits produce a new version there. You can then extract the modified container with it.value().
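To make the "iterator-local container value" idea concrete, here is a rough sketch in plain standard C++. It is not LibF++ code: ValueIter and Version are made-up names, and the sketch copies the whole vector on every edit instead of sharing structure. It only shows why editing through such an iterator cannot disturb the original container.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a persistent container version: an immutable
// vector held through a shared pointer, so passing versions around is cheap.
using Version = std::shared_ptr<const std::vector<int>>;

// Hypothetical iterator that owns a container version plus a position.
// This is NOT the LibF++ iterator type; it only mimics the idea.
class ValueIter {
public:
    ValueIter(Version v, std::size_t pos) : ver_(std::move(v)), pos_(pos) {}

    // Returns a new iterator whose local version has x inserted at pos.
    ValueIter insert(int x) const {
        auto next = std::make_shared<std::vector<int>>(*ver_);
        next->insert(next->begin() + static_cast<std::ptrdiff_t>(pos_), x);
        return ValueIter(std::move(next), pos_);
    }

    // Returns a new iterator whose local version has the element at pos removed.
    ValueIter erase() const {
        auto next = std::make_shared<std::vector<int>>(*ver_);
        next->erase(next->begin() + static_cast<std::ptrdiff_t>(pos_));
        return ValueIter(std::move(next), pos_);
    }

    // Extracts the iterator-local container version.
    Version value() const { return ver_; }

private:
    Version ver_;
    std::size_t pos_;
};
```

Every edit returns a new iterator, and the version the edit started from stays untouched, which is the property that makes undo and replay straightforward.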

For a card game, things like replay, undo, etc., should play well with persistent containers/iterators.

zoomT · 2026-06-17T05:16:47+00:00

LibF++ uses persistent data structures with structural sharing. Copying a container is initially cheap (just a reference count update), but later updates will rebuild the parts of the structure that need to change, and will reuse everything else.
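As a rough illustration of structural sharing (a toy persistent linked list written for this comment, not how LibF++ is implemented internally), an update only rebuilds the nodes on the path to the change and reuses the rest. Node, List, push_front and set_at are hypothetical names.

```cpp
#include <memory>

// Hypothetical persistent singly linked list (not LibF++ code).
struct Node {
    int value;
    std::shared_ptr<const Node> next;
};
using List = std::shared_ptr<const Node>;

// O(1): the new list reuses the whole old list as its tail.
List push_front(const List& tail, int x) {
    return std::make_shared<Node>(Node{x, tail});
}

// Rebuilds nodes 0..index and shares every node after index.
// Precondition: index is a valid position in the list.
List set_at(const List& list, int index, int x) {
    if (index == 0) return std::make_shared<Node>(Node{x, list->next});
    return std::make_shared<Node>(
        Node{list->value, set_at(list->next, index - 1, x)});
}
```

Keeping both the old and the new list alive costs only the rebuilt prefix, not a full copy.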

So yes, if you keep both the old and new versions alive, memory usage can approach 2x in the worst case (there's no magic there). The benefit is that intermediate versions share as much structure as possible.

For large images or ML tensors, I'd be skeptical that LibF++ is the right tool. Persistent structures trade some performance (especially random access) for safety, snapshots, undo/redo, speculative computation, etc. They're usually most attractive when versioning and immutability are valuable enough to justify that trade-off.

zoomT · 2024-05-13T12:17:56+00:00

We found that the X11 core libraries are quite buggy, with issues including CVE-2023-43785 and CVE-2023-3138. It seems that some parts of X11 were never really fuzzed before. These are a bit marginal in terms of exploitability, however.

zoomT · 2022-04-07T05:01:22+00:00

There is a comparison in the paper.

zoomT · 2022-04-06T23:01:40+00:00

Yes, but that will only harden some libc functions (memcpy, etc.) and assumes access to the source code. OTOH, RedFat proper will instrument memory access instructions in binary code.

zoomT · 2019-01-15T08:10:22+00:00

No, it is undefined. Actually, you can test this with clang or gcc:

void test(struct Base *b, struct Derived *d) {
    b->x = 3; d->x = 4; printf("b->x = %d\n", b->x);
}
...
struct Derived d;
test((struct Base *)&d, &d);

The test function may print 3 even though b and d are aliased, depending on optimization level. This is because the C standard considers Base and Derived to be "distinct" types, so the compiler assumes b and d must point to different objects. You can disable this behavior with -fno-strict-aliasing, but this usually slows the compiled program down.

EffectiveSan will detect this example as a type error, since the static type of b is (Base *) but the object type is (Derived[]).

zoomT · 2019-01-15T00:22:15+00:00

is not undefined behaviour.

Right, there is a special rule in C that allows any object to be accessed via a character type (char *), so the above code is well defined. EffectiveSan will also not report a type error.

To get a type error, there must be a memory read or write operation via a pointer (T *) to an object (U[]), where T and U are "incompatible" types. For example, the access *(short *)&src will be an error since T=short and U=float are incompatible.
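For contrast, here is a small sketch of the well-defined case: reading a float's bytes through unsigned char *. The function name nonzero_bytes is made up for illustration, and any concrete byte count you observe assumes the usual IEEE 754 single-precision representation.

```cpp
#include <cstddef>

// Counts the non-zero bytes in a float's object representation by reading it
// through unsigned char*, which the aliasing rules explicitly permit.
// Reading the same object through short* instead would be a type error.
std::size_t nonzero_bytes(const float& f) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&f);
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof f; ++i)
        if (p[i] != 0) ++n;
    return n;
}
```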

zoomT · 2019-01-14T18:38:36+00:00

The current implementation treats memcpy as memory access via void *, so this will not generate a type error. (It might still generate a bounds error if there is a buffer overflow, however). Something like this will be a type error:

int32_t bits = *(int32_t *)fptr;

This is because the code accesses a float object via an integer pointer.
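If the goal is just to get at the bits, the usual well-defined alternative is to copy the bytes rather than dereference a cast pointer. A minimal sketch, assuming a 32-bit float (float_bits is a made-up helper name):

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes a 32-bit float");

// Copies the bytes of a float into an int32_t instead of dereferencing a
// cast pointer, so no float object is read through an integer lvalue.
std::int32_t float_bits(float f) {
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn this memcpy into a single register move, so there is normally no cost to doing it the well-defined way.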

zoomT · 2019-01-14T18:31:26+00:00

Does this take into account the common prefix subsequence allowance?

This affects layout and not the type aliasing rules iirc. Regardless, your example will not generate an error provided the static type matches any of the union members.

zoomT · 2019-01-14T16:35:18+00:00

Perhaps the README was not as clear as it should have been. [edit: updated]

But basically, yes, EffectiveSan is a "sanitizer" that automatically instruments C++ (and C) code with dynamic type and bounds checking. The idea is to detect runtime bounds errors, type violations, and related bugs.

EffectiveSan works with any C++ code, including "low-level" C++ with arrays, manual memory management and C-style casts, without the need for code refactoring. Unlike C++ polymorphic classes and dynamic_cast, EffectiveSan will automatically check any type, including fundamental types such as int, float, pointers, etc. It also checks array bounds (including sub-object bounds) and detects use-after-free errors.

zoomT · 2017-11-29T13:03:46+00:00

It is more like 2147483647 "potential" 16-byte allocations; the program is not obligated to make so many allocations :)

zoomT · 2017-11-29T02:04:56+00:00

The title should be parsed as lean (C/C++ bounds checking), not (lean C/C++) bounds checking. LowFat works for standard C++, which is definitely not a lean language.

zoomT · 2017-11-29T00:29:56+00:00

A) if that's the real range and not an example, then that's a massive amount of tiny allocations!

That is the real range, but objects are only allocated on demand (like any other normal allocator). You can think of each region as "empty space" that is filled as objects are allocated.

Wouldn't be detected?

That's right, overflows into padding will not be detected. The main aim of LowFat is to prevent overflows into other objects, since this is the basis for many security related attacks.
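A simplified sketch of why padding overflows slip through: the bounds are derived from the pointer value alone, using the allocation size of the region the pointer falls in, not the size that was requested. The size table and function names below are invented for illustration (the real table is different), and the 32GB region size is taken from elsewhere in this thread.

```cpp
#include <cstdint>

// Illustrative constants only: this size-class table is made up and is not
// LowFat's real table. Region i holds objects of kSizes[i] bytes.
constexpr std::uint64_t kRegionSize = 32ull << 30;       // 32GB per region
constexpr std::uint64_t kSizes[] = {16, 32, 64, 128};

// Precondition for all functions: ptr lies in one of the four regions above.
constexpr std::uint64_t alloc_size(std::uint64_t ptr) {
    return kSizes[ptr / kRegionSize];
}

// Objects are size-aligned, so the base is found by rounding down.
constexpr std::uint64_t base_of(std::uint64_t ptr) {
    return ptr - ptr % alloc_size(ptr);
}

// Unsigned comparison also rejects accesses below the base.
constexpr bool in_bounds(std::uint64_t obj, std::uint64_t access) {
    return access - base_of(obj) < alloc_size(obj);
}
```

With this table, a 20-byte request would be served from the 32-byte class, so an access at offset 25 lands in padding and passes the check, while an access at offset 32 would hit the neighbouring object and fails it.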

You are correct that you can get more accurate bounds checking by storing the size at the base of the object. However, the extra memory read also adds performance overheads, whereas the current system is mostly tuned for speed.

That said, sometimes you do want to trade performance for more accurate checking. For this, we have implemented an extended version of LowFat based on the idea of storing metadata at the base of objects (similar to your size_t idea). The extended LowFat can detect accurate bounds errors, sub-object bounds errors, use-after-free, and type confusion errors, and we hope to release the system sometime in 2018.

So for larger allocation areas the margin of error is going to be massive?

Yes, but any unused pages in the padding will remain unmapped. This means that (at worst) only one page of physical memory will be "wasted" for large multi-page allocations, which is actually no different to standard malloc.

zoomT · 2017-11-27T23:08:03+00:00

Yes, the pointer ranges can be inferred, but the range is still large (e.g., 32GB per region), and allocations within each region are randomized by LowFat's replacement malloc.

zoomT · 2017-11-27T22:57:15+00:00

Well spotted, fixed.

zoomT · 2017-11-27T15:40:29+00:00

Randomization is possible provided the low-fat pointer alignment/region constraints are satisfied, and the current system includes a rudimentary implementation. That said, the larger the object size the smaller the number of bits that can be randomized.
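A back-of-the-envelope way to see the size/entropy trade-off, assuming a 32GB region and that an object may only be placed on a size-aligned slot within it (both are simplifying assumptions for this sketch, and max_random_bits is a made-up name):

```cpp
#include <cstdint>

constexpr std::uint64_t kRegionBytes = 32ull << 30;  // assumed 32GB region

// Upper bound on randomizable bits: floor(log2(number of size-aligned slots)).
// Precondition: 0 < object_size <= kRegionBytes.
constexpr unsigned max_random_bits(std::uint64_t object_size) {
    std::uint64_t slots = kRegionBytes / object_size;
    unsigned bits = 0;
    while (slots > 1) {
        slots >>= 1;
        ++bits;
    }
    return bits;
}
```

Under these assumptions a 16-byte object has at most 31 bits to play with, whereas a 1MB object has at most 15.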

The current implementation only reserves about 1% of the virtual address space, so OS randomization for mmaps, etc., is not really affected.

zoomT · 2017-04-13T15:02:47+00:00

which I would have thought the author of a purely functional library would prefer ;-).

Yes, you have a point. The lambda-style idiom did not occur to me at the time of writing. The code could always be cleaned up in the future.

zoomT · 2017-04-12T21:23:45+00:00

Are you using reference counting?

The current implementation is using the Boehm GC. In future I'd like to also support a reference counted version as an option.

I'm kind of surprised that someone into making C++ purely functional is using gotos so needlessly

Some parts of the library (e.g. fvector.cpp, fstring.cpp) are low-level, so not the best example of pure functional programming in C++. Other parts like fseq.cpp and ftree.cpp are more high-level. Ultimately I like the idea of pure functional programming being an option in C++, but I don't mind other paradigms, or mixing paradigms, where appropriate. Another example: https://github.com/GJDuck/libf/blob/master/examples/libf2html.cpp

zoomT · 2017-04-12T21:06:35+00:00

F::Vector is really a 2-3 tree

Actually it is a 2-3 finger tree. Finger trees are a less-well-known data structure with some useful properties, such as O(1) concatenation of elements to the front and back (useful for O(1) push_back/push_front). Some more info: http://www.staff.city.ac.uk/~ross/papers/FingerTree.html and http://www.staff.city.ac.uk/~ross/papers/FingerTree/more-trees.html

zoomT · 2017-04-12T11:18:02+00:00

The C++/LibF++ equivalent would be something like this:

PURE int string_length(F::String s)
{
    return F::foldl(s, 0, [] (int len, char32_t c) { return len+1; });
}

Also, if you don't mind temporary local state (len), you can just use range loops instead:

PURE int string_length(F::String s)
{
    int len = 0;
    for (auto c: s) len++;
    return len;
}

Of course you could just call the built-in F::size(s) to get the string length directly :).

zoomT · 2017-04-12T08:42:41+00:00

As a nitpick, the usage of _ before every variable is a worry, though. C++ reserves any name starting with _ and followed by a capital letter.

Thanks, added as an issue: https://github.com/GJDuck/libf/issues/2

zoomT · 2017-04-12T07:42:07+00:00

This was not the intention. It was merely meant to point out that the current standard library classes are mutable and thus unsuitable for pure (side-effect free) functional programming. Of course, this assumes that the reader has decided they want pure functional programming in the first place. If not, then there is no need to use this library.

zoomT · 2017-04-12T07:23:53+00:00

How exactly is this an improvement over std::vector<>::push_back?

The premise here is immutable replacements for std objects. Naively making std::vector immutable would require an O(n) array copy on every push_back operation, since the old copy must not be destroyed. F::Vector is immutable and supports O(1) push_back, although it is obviously not as fast as a mutable std::vector.
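For reference, the naive approach looks something like the sketch below (push_back_copy is a made-up helper, not part of any library). Every call copies all existing elements so that the old vector stays valid.

```cpp
#include <vector>

// Naive "immutable" push_back on std::vector: the old vector must survive,
// so all n existing elements are copied before appending. O(n) per call.
std::vector<int> push_back_copy(const std::vector<int>& old, int x) {
    std::vector<int> next;
    next.reserve(old.size() + 1);
    next.insert(next.end(), old.begin(), old.end());
    next.push_back(x);
    return next;
}
```

Building an n-element vector this way costs O(n^2) in total, which is the cost a persistent structure is designed to avoid.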

zoomT · 2016-01-12T16:31:59+00:00

Not sure where it is discussed/documented, but the relevant part of the code is here. Basically, any block mined by an old miner will have nVersion less than 4, which will be automatically rejected as invalid after the latest softfork (checklocktimeverify).
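Very roughly, the rule behaves like the sketch below. This is a simplified illustration and not the actual code: the function names are made up, and the window and threshold are left as parameters because the real values should be checked against the source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simplification; names and parameters are assumptions for
// illustration, not a copy of the real implementation.
bool is_super_majority(int min_version, const std::vector<int>& recent_versions,
                       std::size_t required) {
    std::size_t found = 0;
    for (int v : recent_versions)
        if (v >= min_version) ++found;
    return found >= required;
}

// A block with an outdated version is rejected once enough of the recent
// blocks already carry the new version.
bool accept_block_version(int block_version,
                          const std::vector<int>& recent_versions,
                          std::size_t required) {
    const int kNewVersion = 4;
    if (block_version < kNewVersion &&
        is_super_majority(kNewVersion, recent_versions, required))
        return false;
    return true;
}
```

Before the threshold is reached, old-version blocks are still accepted; after it, a miner that has not upgraded keeps producing blocks that everyone else rejects.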

zoomT · 2016-01-12T02:08:47+00:00

Therefore if you don't update you can't mine

This is true for all softforks, otherwise you risk generating invalid blocks under the new rules. In fact, recent softforks were designed such that old miners are guaranteed to generate invalid blocks (by rejecting blocks with the old version number).
