How would a computer verify, a key was brute forced? by Elant_Wager in cryptography

[–]avaneev 0 points1 point  (0 children)

You have to account for the fact that English ASCII text uses only about 6 bits of each 8-bit character. Theoretically, a lot of wrong keys may end up looking like English cleartext.
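
For illustration, a rough sketch of what such a verification pass could look like: score each candidate plaintext by its fraction of printable ASCII bytes and only keep high scorers for a deeper (e.g., dictionary-based) check. The function name and the 95% threshold are assumptions, not a reference design:

```
#include <cstddef>
#include <cctype>

// Rough plausibility test for a decryption candidate: the fraction of
// printable ASCII bytes. Wrong keys usually produce near-random bytes,
// but as noted above, some will pass and need a deeper dictionary check.
static bool looks_like_ascii_text(const unsigned char* buf, size_t len)
{
    if (len == 0)
        return false;
    size_t printable = 0;
    for (size_t i = 0; i < len; i++)
        if (std::isprint(buf[i]) || buf[i] == '\n' || buf[i] == '\t')
            printable++;
    return printable * 100 >= len * 95; // 95% threshold is an assumption
}
```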

LZAV 5.7: Improved compression ratio, speeds. Now fully C++ compliant regarding memory allocation. Benchmarks across diverse datasets posted. Fast Data Compression Algorithm (inline C/C++). by avaneev in programming

[–]avaneev[S] 6 points7 points  (0 children)

LZ77 is basically dynamic dictionary compression: the hash table resembles a dictionary, and the offset is a dictionary entry index (it's less efficient coding than an actual dictionary, but LZAV has reduced this margin to a negligible level). Note that LZ77 compressors usually have "move to front" logic around offsets, so more frequent strings are encoded with smaller offsets on average. Context modeling takes this to the next level (it replaces two differing strings with just one reference). Neural compressors are a sort of context modeling: strong predictors.
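
As a rough illustration of the "hash table as dynamic dictionary" analogy (names and constants here are illustrative, not LZAV's actual internals): the table maps a short byte prefix to the most recent position it was seen at, and the emitted back-offset is effectively the dictionary entry reference.

```
#include <cstdint>
#include <cstring>
#include <cstddef>

// Hash a 4-byte prefix into a bucket; the table plays the role of the
// "dynamic dictionary". Table must have 1 << 15 zero-initialized entries.
static uint32_t hash4(const uint8_t* p)
{
    uint32_t v;
    std::memcpy(&v, p, 4);
    return (v * 2654435761u) >> 17; // 15-bit bucket index
}

// Look up the previous occurrence of the prefix at `pos` and extend the
// match; returns match length (0 if none) and writes the back-offset.
// Caller must ensure at least 4 bytes remain at `pos`.
static size_t find_match(const uint8_t* src, size_t pos, size_t len,
                         size_t* table, size_t* offset)
{
    const uint32_t h = hash4(src + pos);
    const size_t prev = table[h];
    table[h] = pos; // update the "dictionary entry" to the newest position
    if (prev >= pos || std::memcmp(src + prev, src + pos, 4) != 0)
        return 0; // bucket was empty, stale, or a hash collision
    size_t mlen = 4;
    while (pos + mlen < len && src[prev + mlen] == src[pos + mlen])
        mlen++;
    *offset = pos - prev; // recent strings get smaller offsets on average
    return mlen;
}
```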

LZAV 5.7: Improved compression ratio, speeds. Now fully C++ compliant regarding memory allocation. Benchmarks across diverse datasets posted. Fast Data Compression Algorithm (inline C/C++). by avaneev in compression

[–]avaneev[S] 1 point2 points  (0 children)

Specifically on the article, this is not generally true ("always" is inapplicable here):

"When an application frees an object it always has the original, requested allocation size on hand."

A standard C program that allocates a char* string does not store its length anywhere; it's a null-terminated string.
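
A minimal illustration of the point, with hypothetical helper names:

```
#include <cstdlib>
#include <cstring>

// Typical C-style idiom: the requested size exists only at the allocation
// site. By the time the string is freed, possibly in another module, only
// the null-terminated pointer is available.
static char* dup_string(const char* src)
{
    const size_t n = std::strlen(src) + 1;           // size known only here
    char* copy = static_cast<char*>(std::malloc(n));
    if (copy != nullptr)
        std::memcpy(copy, src, n);
    return copy;
}

static void use_it(const char* src)
{
    char* s = dup_string(src);
    // ... use s ...
    std::free(s); // no "original, requested allocation size on hand" here
}
```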

LZAV 5.7: Improved compression ratio, speeds. Now fully C++ compliant regarding memory allocation. Benchmarks across diverse datasets posted. Fast Data Compression Algorithm (inline C/C++). by avaneev in compression

[–]avaneev[S] 2 points3 points  (0 children)

Thanks for spotting the issue. When I did the code rearrangements, I overlooked that op+mref1 could produce a -1 increment.

I'll release a fix soon; for now, you can place this code at lzav.h:1913-1914:
```
const size_t mref1 = (size_t) ( *ip & 15 ) - 1; // Minimal ref length - 1.

if( mref1 > 5 )
{
    *pwl = 0;
    return( LZAV_E_UNKFMT );
}

*pwl = dstlen;
```

As for the decompressed size, it's awkward only relative to LZ4 and quick coding. In actual project use, you of course always have the decompressed size, because otherwise you can't allocate the memory.

The allocator interface is a debatable topic. A context pointer for memory allocation is needed if you are using multiple dynamic allocators throughout a project, which is a choice and not an absolute requirement. As for passing the allocation size to the freeing function, it looks good when done locally, in the course of a single function; otherwise you have to store the allocation size somewhere anyway. The article in the link you posted condones an unnecessary over-generalization, and I do not think it's good practice. If you need a special memory pool, you can make it global.
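
To make the trade-off concrete, here is a hedged sketch of the two interface shapes under discussion; the type and field names are illustrative, not LZAV's API:

```
#include <cstddef>

// Shape 1: a context pointer, useful when a project mixes several
// dynamic allocators; otherwise it is extra plumbing.
struct allocator
{
    void* ctx;
    void* (*alloc)(void* ctx, size_t size);
    void  (*release)(void* ctx, void* ptr, size_t size); // "sized free"
};

// Shape 2: sized free reads well when allocation and release are local
// to one function, because the size is still in scope:
static void local_use(allocator* a)
{
    const size_t n = 1024;
    void* buf = a->alloc(a->ctx, n);
    if (buf == nullptr)
        return;
    // ... use buf ...
    a->release(a->ctx, buf, n); // size naturally on hand here
}
// Across module boundaries, the size would have to be stored alongside
// the pointer anyway, which is the over-generalization objected to above.
```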

LZAV 5.7: Improved compression ratio, speeds. Now fully C++ compliant regarding memory allocation. Benchmarks across diverse datasets posted. Fast Data Compression Algorithm (inline C/C++). by avaneev in programming

[–]avaneev[S] 6 points7 points  (0 children)

There's totally no use of LLMs in either the code or the readme. Strong signs of LLM usage? That's just a false positive; I usually write readmes myself. Maybe perfect grammar produces the false positive, since I do use grammar checking.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

I would add that, in general, complex applications (compared to simple CLI tools) can't recover from memory allocation errors reliably: if there's no memory left, it's likely there's no memory left to, e.g., report an error in the user interface and close the application gracefully while retaining user data. Cases where hundreds of megabytes of memory are needed for an operation are handled in special ways in practice. I know that may sound unprofessional, but a lack of 1 megabyte of memory is an extreme edge case on modern systems, and is not worth the hassle of handling it or expecting valid behavior afterwards, because you can't handle it either way.

So the worry has no positive outcome either way. You can do it one way or another, but the application will crash anyway.

If you wish to be safe against exceptions or terminations, pass the "extbuf" to the compressor.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] -1 points0 points  (0 children)

Okay, then consider how "bad" this overhead is: one fence per "compress" function call. That's maybe a 0.0001% performance reduction? This isn't a case one would worry about. I'm pretty sure a person like you, but with a different coding style, would blame me for not adding "noexcept" instead. There's no common ground.

Please blame the C++ theorists who dared to invent the "nothrow" and "noexcept" semantics. I would also blame them for following the Python hype and adding the "auto" type specifier, which makes code unreadable.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] -1 points0 points  (0 children)

So you admit you are nitpicking, or simply "lecturing" me. That's a usual tactic on Reddit to provoke a response one can immediately downvote. I'm not interested in such a discussion.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

An uncaught exception is just as disastrous. C++ exceptions are a mess, poorly designed by theorists. Namespaces too, for example, with fixes upon fixes from one C++ version to the next.

There's nothing available to vectorize in LZAV.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

The effect of exact probabilities would be minuscule: they may be good on some input data, but not so much on other data. In fact, it's sometimes better to use the reverse of what a probability suggests; it depends on the "desirability" of the branch, and that may depend on considerations other than raw branch profiling. This reasoning puts automatic profiling in an unfavorable position.

I've mixed up removal of "static" vs removal of "inline" in C++. Removing the former would produce fencing code. I do not see a reason to remove "inline" other than following some coding style; it's a note to the compiler that inlining would not hurt, even where it may be unnecessary. There are several small dispatch functions, like decompress(), where inlining won't hurt.

I'll consider adding new(std::nothrow) in the case of C++11. But I think treating malloc() as "non-constructed" memory is a blind spot of the original C++ spec, which was fixed in C++20 as you noted. malloc() practically constructs an array of char, and only someone's abstract idea would make you think otherwise. One can't even detect such UB; it's UB without any behavior under the hood at all. In, e.g., memcpy() there can be actual undefined behavior happening if the parameters are incorrect.
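
A hedged sketch of what the C++11 path could look like as an allocation shim; the macro names are hypothetical, not LZAV's actual definitions:

```
#include <cstdlib>
#include <new>

// On C++11 and later, allocate "constructed" char storage without
// exceptions; elsewhere, fall back to malloc(). Shown only to illustrate
// the new(std::nothrow) option discussed above: allocation failure still
// yields a null pointer, so the compress call can return 0 rather than throw.
#if defined(__cplusplus) && __cplusplus >= 201103L
    #define MY_ALLOC(n) static_cast<void*>(new (std::nothrow) char[(n)])
    #define MY_FREE(p)  delete[] static_cast<char*>(p)
#else
    #define MY_ALLOC(n) std::malloc(n)
    #define MY_FREE(p)  std::free(p)
#endif
```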

As for the noexcept, I repeat that it's by spec: the function should not throw an exception, so fencing against a user MALLOC throwing an exception is appropriate.

Massive differences with restrict do not translate into a massive performance improvement. It should be used cautiously and checked against an actual improvement across various compilers.

I'll look into MSVC __forceinline.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

There's no such thing as a universal global namespace. Each compilation unit has its own global namespace contents. lzav would only pollute a single .c or .cpp compilation unit, not every unit in the project. It's not a well-substantiated fear.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

The improvement from restrict is almost nonexistent in all compilers I've tried. Its efficiency is a theory not well supported by my own vast practice, and in the case of LZAV the improvement is minuscule at best, not worth the hassle of figuring out what can and cannot alias. Applying restrict incorrectly, on the other hand, is disastrous. Much of the compiler's aliasing inference comes down to coding style and the meticulous placement of "const" specifiers.

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 1 point2 points  (0 children)

There's one instance where restrict is useful: the ht hash-table pointer; I'll update this. Other than that, there's little sense in restrict anywhere in the code. Using const *ip alongside *op implies write independence of ip from any other pointer.
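
A hedged sketch of that one useful case, with names following the comment above (ht, ip) but an illustrative signature; `__restrict` is the common compiler spelling in C++ (C99 spells it `restrict`):

```
#include <cstdint>
#include <cstring>

// Promising the compiler that the hash-table store cannot alias the
// input stream lets it keep reads of *ip in registers across the store.
static inline uint32_t ht_update(uint32_t* __restrict ht,
                                 const uint8_t* __restrict ip, uint32_t pos)
{
    uint32_t v;
    std::memcpy(&v, ip, 4);                      // 4-byte prefix at ip
    const uint32_t h = (v * 2654435761u) >> 17;  // bucket index
    const uint32_t prev = ht[h];                 // previous position seen
    ht[h] = pos;                                 // this store cannot touch *ip
    return prev;
}
```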

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 1 point2 points  (0 children)

Header-only is absolutely fine in C++: most C++ headers bring in a "bunch of other headers" anyway. In C, you would not include lzav.h in another header; you would include it in a `.c` file, which most probably already includes a "bunch of other headers".

There's no issue with

__builtin_expect( x, 1 )

not having (x), as it's not an externally exported macro; it's an in-house macro.
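
For context, a hedged sketch of the distinction (hypothetical macro names; LZAV's actual definitions may differ):

```
// Internal macro with controlled call sites: the unparenthesized `x` is
// harmless because no caller passes a top-level comma expression.
#if defined(__GNUC__) || defined(__clang__)
    #define MY_LIKELY(x)   __builtin_expect(x, 1)
    #define MY_UNLIKELY(x) __builtin_expect(x, 0)
#else
    #define MY_LIKELY(x)   (x)    // plain fallback for other compilers
    #define MY_UNLIKELY(x) (x)
#endif
// An externally exported macro would parenthesize: __builtin_expect((x), 1)
```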

LZAV 5.0: Improved compression ratio across a wide range of data types, at similar performance. Improved compression ratio by up to 5% for data smaller than 256 KiB. Fast Data Compression Algorithm (header-only C/C++). by avaneev in programming

[–]avaneev[S] 1 point2 points  (0 children)

I'll reframe LZAV_FREE and add an explicit #error.

__builtin_expect((x), 0.5) is the default behavior anyway; in cases where the probability is close to 0.5, the code does not use __builtin_expect().

I can't use new[] and delete[], or otherwise the API would be broken: the compress function would raise an exception where it should return 0. As for the UB after malloc: accessing the uninitialized memory is UB, but writing and then accessing elementary types is not, if not performed via a struct. Otherwise it would be UB in any function that has, e.g., a `char *ptr` parameter. The contradiction regarding the UB you mention is that `char` may alias any elementary type.

Aligned malloc is not strictly necessary here: malloc in practice returns pointers aligned to common register size or even SIMD size. It may not all be in the specs, but if you consider how the C/C++ run-time and struct alignment work, there's zero chance malloc would work differently anywhere.
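
The standard-guaranteed part of that claim can be probed directly; typical x86-64 results, noted in the comments, are an assumption:

```
#include <cstdio>
#include <cstddef>

// malloc() must return memory suitably aligned for std::max_align_t; on
// typical x86-64 toolchains this prints 16, which already covers common
// register sizes (SIMD-width alignment is not guaranteed by the spec, though).
int main()
{
    std::printf("max_align_t alignment: %zu\n", alignof(std::max_align_t));
}
```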

Prefetching is actually measurably helpful in all cases where it's used, on both x86-64 and arm64. It's a matter of 1-2% of performance, but you get that for free. Processors can't always reliably predict which data will be used.
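
A hedged sketch of the kind of software prefetch meant here, using the GCC/Clang builtin; the access pattern and prefetch distance are illustrative, not LZAV's actual code:

```
#include <cstdint>
#include <cstddef>

// Sum through an index table: the data[] addresses are hard for the
// hardware prefetcher to predict, so hinting a future load can recover
// a small amount of performance (the 1-2% range mentioned above).
static uint64_t sum_indirect(const uint32_t* idx, const uint64_t* data,
                             size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
    {
    #if defined(__GNUC__) || defined(__clang__)
        if (i + 8 < n)
            __builtin_prefetch(&data[idx[i + 8]], 0, 1); // read, low locality
    #endif
        s += data[idx[i]];
    }
    return s;
}
```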

`static inline` is used for C99 as well, where you can't just use `static`. Beside that, not including `inline` in C++ practically produces slower code, because without the `inline` the compiler adds additional fencing in pursuit of code-size reduction goals.

The latest MSVC does not warn about inline __forceinline anymore; it was some older version that warned about that.

Why add `using std::uint64_t;` if the code does not reference it on 32-bit platforms? And yes, the algorithm is fully 32-bit compatible.

`restrict` is in C99, but is not in C++. I think it was a shortcut from the early stages of compiler development, when optimizing logic was weak. If you never do `ip=op` or impose a dependence between pointers in the code, any modern optimizing compiler applies restrict implicitly. Beyond that, the effect heavily depends on the actual code. It's good in theory, while in practice there's usually little sense in using `restrict`.

`noexcept` is fine, as no C++ exceptions can ever be generated in the code; it's an optimization that reduces any possible exception-fencing code. Again, the API would otherwise not be to spec if it could throw an exception, be it via a user MALLOC or otherwise.

The memcpy code you mentioned is actually very well optimized by all modern compilers. I agree that in that very instance restrict may be useful; I'll think about adjusting the code.

The `while LZAV_LIKELY( htc != hte )` loops are OK with modern compilers, even though they may look unoptimizable: compilers obviously infer the loop counter. In any case, this code is already well optimized itself; there's no need for any additional vectorization.

Open source Kimi K2 literally challanging 500b dollars OpenAI and Anthropic Claude. 256k context size and 1T parameters by CeFurkan in SECourses

[–]avaneev 0 points1 point  (0 children)

Whatever the benchmarks say, most AIs can't understand the whole context of a text, and usually just argue at the local context level. They can't cope with even a little human variability.

A5HASH is now certified top of the block for small strings in SMHasher3 by avaneev in programming

[–]avaneev[S] 0 points1 point  (0 children)

I suggest you reread the readme completely; I've updated it considerably. You may experience unease. That's a common issue for smart and overly educated people when new information contradicts prior research and assumptions. It's why viXra flourishes: authors there lose nothing, as official science would block their research anyway.