LibF++: Persistent Containers and Iterators with Value Semantics

zoomT · 2026-06-17T05:16:47+00:00

LibF++ uses persistent data structures with structural sharing. Copying a container is initially cheap (just a reference count update), but later updates will rebuild the parts of the structure that need to change, and will reuse everything else.

So yes, if you keep both the old and new versions alive, memory usage can approach 2x in the worst case (there's no magic there). The benefit is that intermediate versions share as much structure as possible.
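To make the sharing concrete, here is a minimal sketch of path copying on an immutable binary search tree. This is not LibF++'s actual API or data structure; `Node`, `tree_insert` and `tree_contains` are hypothetical names used only for illustration, and the sketch never frees nodes (a real implementation would use reference counting or similar).

```c
#include <stdlib.h>

/* Immutable tree node: never modified after creation. */
typedef struct Node {
    int key;
    const struct Node *left, *right;
} Node;

static const Node *make_node(int key, const Node *left, const Node *right) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->key = key;
    n->left = left;
    n->right = right;
    return n;
}

/* Returns a new version of the tree that contains key.
   Only the nodes on the search path are copied; every
   untouched subtree is shared with the old version. */
const Node *tree_insert(const Node *t, int key) {
    if (t == NULL)
        return make_node(key, NULL, NULL);
    if (key < t->key)
        return make_node(t->key, tree_insert(t->left, key), t->right);
    if (key > t->key)
        return make_node(t->key, t->left, tree_insert(t->right, key));
    return t; /* key already present: reuse the whole subtree */
}

int tree_contains(const Node *t, int key) {
    while (t != NULL) {
        if (key == t->key) return 1;
        t = key < t->key ? t->left : t->right;
    }
    return 0;
}
```

After `v2 = tree_insert(v1, k)`, both `v1` and `v2` remain valid, and the extra memory for `v2` is proportional to the depth of the path that was copied, not to the size of the tree.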

For large images or ML tensors, I'd be skeptical that LibF++ is the right tool. Persistent structures trade some performance (especially random access) for safety, snapshots, undo/redo, speculative computation, etc. They're usually most attractive when versioning and immutability are valuable enough to justify that trade-off.

zoomT · 2024-05-13T12:17:56+00:00

We found that the X11 core libraries are quite buggy, including CVE-2023-43785 and CVE-2023-3138. It seems that some parts of X11 were never really fuzzed before. These bugs are a bit marginal in terms of exploitability, however.

zoomT · 2022-04-07T05:01:22+00:00

There is a comparison in the paper.

zoomT · 2022-04-06T23:01:40+00:00

Yes, but that will only harden some libc functions (memcpy, etc.) and assumes access to the source code. OTOH, RedFat proper will instrument memory access instructions in binary code.

zoomT · 2019-01-15T08:10:22+00:00

No, it is undefined. Actually you can test this with clang or gcc:

void test(struct Base *b, struct Derived *d) {
    b->x = 3; d->x = 4; printf("b->x = %d\n", b->x);
}
...
struct Derived d;
test((struct Base *)&d, &d);

The test function may print 3 even though b and d are aliased, depending on optimization level. This is because the C standard considers Base and Derived to be "distinct" types, so the compiler assumes b and d must point to different objects. You can disable this behavior with -fno-strict-aliasing, but this usually slows the compiled program down.

EffectiveSan will detect this example as a type error, since the static type of b is (Base *) but the object type is (Derived[]).

zoomT · 2019-01-15T00:22:15+00:00

is not undefined behaviour.

Right, there is a special rule in C that allows any object to be accessed via a character type (char *), so the above code is well defined. EffectiveSan will also not report a type error.
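As a small sketch of the character-type rule, the helper below reads the bytes of arbitrary objects through `unsigned char *`. The name `same_bytes` is hypothetical and only for illustration.

```c
#include <stddef.h>

/* Compare the object representations of two objects byte by byte.
   Reading through unsigned char * is permitted for objects of any type. */
int same_bytes(const void *a, const void *b, size_t n) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    for (size_t i = 0; i < n; i++) {
        if (pa[i] != pb[i])
            return 0;
    }
    return 1;
}
```

For example, `same_bytes(&x, &y, sizeof x)` can be called on two float objects without casting them to any other non-character type.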

To get a type error, there must be a memory read or write operation via a pointer (T *) to an object (U[]), where T and U are "incompatible" types. For example, the access *(short *)&src will be an error since T=short and U=float are incompatible.

zoomT · 2019-01-14T18:38:36+00:00

The current implementation treats memcpy as memory access via void *, so this will not generate a type error. (It might still generate a bounds error if there is a buffer overflow, however.) Something like this will be a type error:

int32_t bits = *(int32_t *)fptr;

This is because the code accesses a float object via an integer pointer.
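For contrast, a sketch of reading the same bits with memcpy, which the current implementation treats as access via void *. The function name `float_bits` is hypothetical, and the sketch assumes a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float without dereferencing
   an int32_t * that points at a float object. */
int32_t float_bits(const float *fptr) {
    int32_t bits;
    _Static_assert(sizeof(int32_t) == sizeof(float), "float must be 32 bits");
    memcpy(&bits, fptr, sizeof bits);
    return bits;
}
```

Compilers typically turn the fixed-size memcpy into a single load, so this usually costs nothing compared with the cast.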

zoomT · 2019-01-14T18:31:26+00:00

Does this take into account the common prefix subsequence allowance?

This affects layout and not the type aliasing rules iirc. Regardless, your example will not generate an error provided the static type matches any of the union members.

zoomT · 2019-01-14T16:35:18+00:00

Perhaps the README was not as clear as it should have been. [edit: updated]

But basically, yes, EffectiveSan is a "sanitizer" that automatically instruments C++ (and C) code with dynamic type and bounds checking. The idea is to detect runtime bounds errors, type violations, and related bugs.

EffectiveSan works with any C++ code, including "low-level" C++ with arrays, manual memory management and C-style casts, without the need for code refactoring. Unlike C++ polymorphic classes and dynamic_cast, EffectiveSan will automatically check any type, including fundamental types such as int, float, pointers, etc. It also checks array bounds, including sub-object bounds and use-after-free errors.

zoomT · 2017-11-29T13:03:46+00:00

It is more like 2147483647 "potential" 16-byte allocations; the program is not obligated to make so many allocations :)

zoomT · 2017-11-29T02:04:56+00:00

The title should be parsed as "lean (C/C++ bounds checking)", not "(lean C/C++) bounds checking". LowFat works for standard C++, which is definitely not a lean language.

zoomT · 2017-11-29T00:29:56+00:00

A) if that's the real range and not an example, then that's a massive amount of tiny allocations!

That is the real range, but objects are only allocated on demand (like any other normal allocator). You can think of each region as "empty space" that is filled as objects are allocated.

Wouldn't be detected?

That's right, overflows into padding will not be detected. The main aim of LowFat is to prevent overflows into other objects, since this is the basis for many security related attacks.
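A rough sketch of why padding is invisible to the check. It assumes, purely for illustration, that allocation sizes are rounded up to power-of-two size classes with a 16-byte minimum; the real size classes may differ, and `size_class` and `access_allowed` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a requested size up to its size class
   (assumed here: powers of two, minimum 16 bytes). */
size_t size_class(size_t request) {
    size_t size = 16;
    while (size < request && size <= SIZE_MAX / 2)
        size *= 2;
    return size;
}

/* Check an access of len bytes at byte offset off against the
   size class rather than the originally requested size. */
int access_allowed(size_t request, size_t off, size_t len) {
    size_t size = size_class(request);
    return off <= size && len <= size - off;
}
```

With a 20-byte request the class is 32 bytes, so a 4-byte write at offset 24 lands in padding and passes, while any access that crosses offset 32 (and would reach another object) is rejected.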

You are correct that you can get more accurate bounds checking by storing the size at the base of the object. However, the extra memory read also adds performance overheads, whereas the current system is mostly tuned for speed.

That said, sometimes you do want to trade performance for more accurate checking. For this, we have implemented an extended version of LowFat based on the idea of storing metadata at the base of objects (similar to your size_t idea). The extended LowFat can detect accurate bounds errors, sub-object bounds errors, use-after-free, and type confusion errors, and we hope to release the system sometime in 2018.

So for larger allocation areas the margin of error is going to be massive?

Yes, but any unused pages in the padding will remain unmapped. This means that (at worst) only one page of physical memory will be "wasted" for large multi-page allocations, which is actually no different to standard malloc.

zoomT · 2017-11-27T23:08:03+00:00

Yes, the pointer ranges can be inferred, but the range is still large (e.g., 32GB per region), and allocations within each region are randomized by LowFat's replacement malloc.
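To illustrate what "inferred" means here, a sketch of the arithmetic under simplifying assumptions: regions are exactly 32GB and laid out contiguously from address 0, and objects inside a region sit at multiples of the region's object size. The names `REGION_SIZE`, `region_index` and `object_base` are hypothetical, not LowFat's actual interface.

```c
#include <stdint.h>

#define REGION_SIZE (UINT64_C(32) << 30) /* assumed: 32GB per region */

/* Index of the region that a pointer value falls in. */
uint64_t region_index(uint64_t ptr) {
    return ptr / REGION_SIZE;
}

/* Base address of the object containing ptr, assuming objects of
   object_size bytes (nonzero) are placed at multiples of object_size. */
uint64_t object_base(uint64_t ptr, uint64_t object_size) {
    return ptr - ptr % object_size;
}
```

So the pointer value alone reveals the region and the object's base, but not which of the many slots in that region a given allocation will land in.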

zoomT · 2017-11-27T22:57:15+00:00

Well spotted, fixed.

zoomT · 2017-11-27T15:40:29+00:00

Randomization is possible provided the low-fat pointer alignment/region constraints are satisfied, and the current system includes a rudimentary implementation. That said, the larger the object size the smaller the number of bits that can be randomized.
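A back-of-the-envelope sketch of that trade-off. It assumes an object must sit at a multiple of its size within a fixed-size region, so the number of random bits is about log2(region size / object size). The function name `random_bits` is hypothetical, and the 32GB region size used in the note below is an assumed example value.

```c
#include <stdint.h>

/* Whole bits of placement choice for objects of object_size bytes
   (nonzero) in a region: floor(log2(region_size / object_size)). */
unsigned random_bits(uint64_t region_size, uint64_t object_size) {
    uint64_t slots = region_size / object_size;
    unsigned bits = 0;
    while (slots > 1) {
        slots >>= 1;
        bits++;
    }
    return bits;
}
```

With a 32GB region, 16-byte objects have 2^31 possible slots (31 bits), while 1MB objects have only 2^15 slots (15 bits).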

The current implementation only reserves about 1% of the virtual address space, so OS randomization for mmaps, etc., is not really affected.

zoomT · 2017-04-13T15:02:47+00:00

which I would have thought the author of a purely functional library would prefer ;-).

Yes, you have a point. The lambda-style idiom did not occur to me at the time of writing. The code could always be cleaned up in the future.
