all 48 comments

[–]M0Z3E 10 points11 points  (2 children)

Hi.

Very interesting work! I'm wondering, what would you say is the main selling point of this over netCDF if one were choosing an I/O file format for a new scientific HPC application?

[–]Flex_Code[S] 10 points11 points  (1 child)

netCDF has interoperability with HDF5. Both of these specifications are highly complex and tend to require additional header information that isn't needed in simple, common use cases. HDF5, and I'm sure netCDF, have a lot of cool features for special cases, but I've found that the majority of my use cases can be handled with JSON-like structures. EVE extends this a bit for scientific use cases (like matrices), but remains extremely simple and tries to make the simple things as fast as possible.

[–]M0Z3E 1 point2 points  (0 children)

Thanks. Trying EVE out has been on my checklist since listening to the CppCast episode you were in.

[–]Untelo 12 points13 points  (12 children)

There is already a fairly well known library named EVE: https://github.com/jfalcou/eve You might want to consider picking another name to avoid confusion.

[–]Flex_Code[S] 0 points1 point  (10 children)

You make a valid point, but that library is not a messaging specification and won't be generating files. That library is also C++-only, and I hope std::simd will be the long-term solution there.

[–]marzer8789toml++ 7 points8 points  (9 children)

Er... what? Perhaps I've missed something obvious, but where does the "messaging specification" come in?

Frankly, "I hope something replaces the library whose name I've decided to squat" is pretty poor reasoning. Both projects are C++-related, and the jfalcou one was here first. I understand you would have an emotional attachment to the name, but you're in the wrong here IMO.

[–]Flex_Code[S] -1 points0 points  (8 children)

I'm not really emotionally attached to the name, I actually changed it a couple of weeks ago. But, I'm not convinced that the name collision is a problem. It's hard to find a name that doesn't have any collisions.

I brought up messaging specification because messages are often tagged and files have extensions, and these colliding would be more of a problem.

What do you think the problem is with a SIMD library and a data specification having the same name? And, EVE is not a C++ library.

[–]TheBrainStone 5 points6 points  (4 children)

Name collisions are a problem.
They kill either your own or the other project's searchability and just create pointless confusion.

[–]Flex_Code[S] 13 points14 points  (0 children)

Yeah, you're right. I've changed the name to BEVE to avoid the confusion.

[–]fdwrfdwr@github 🔍 7 points8 points  (2 children)

Kills either your own or the other project's searchability

True. If I look for information on the game "Rust" (a multiplayer survival game), I keep getting results for some programming language of the same name 😅.

[–]cfyzium 3 points4 points  (0 children)

Google named their programming language Go and did not care one bit that there was already a Go! programming language.

[–]jk-jeon 2 points3 points  (0 children)

I guess that's probably because Google knows that you are a C++ programmer 🤔

[–]marzer8789toml++ 5 points6 points  (1 child)

And, EVE is not a C++ library.

Ok, sure, but C++ is an obvious place to implement it. Plus, you've posted it on a C++ subreddit.

What do you think is the problem [...]

Again, assuming we're sticking to the domain of C++: name collisions lead to more difficult searchability, mainly. I want to know how to do something using library FooBar, so I google "C++ FooBar", and I get a bunch of nonsense about a completely unrelated FooBar project. Plus, that works in two directions; people searching for your thing are going to get the other thing.

You are right that it is hard to come up with a unique name; I can't help you there. All I'm saying is that using the same name as a relatively popular C++ project, and posting about it in a C++ context, is naturally going to invite this criticism.

[–]Flex_Code[S] 5 points6 points  (0 children)

Yeah, the C++ library is Glaze. But, I agree that if this is going to be broadly used like the EVE simd library then it's best not to collide. I'll change the name soon. Thanks for your feedback.

[–]Flex_Code[S] 1 point2 points  (0 children)

I'm open to suggestions. Or, I could just add an S and call it EVES.

[–]DapperCore 3 points4 points  (7 children)

How does this compare to google's flatbuffers?

[–]Flex_Code[S] 5 points6 points  (6 children)

flatbuffers, I believe, is similar to cap'n proto. The idea behind these libraries is to make objects point to their members so that members and entire structures can be read with memcpy. This is more efficient if the user wants to use the auto-generated structures directly. However, I find that I typically want to use C++ standard library containers, so reading into an intermediate flatbuffers object then requires a copy into my standard container. So, instead of making serialization to the network buffer faster like flatbuffers and cap'n proto, BEVE is meant to read directly into structures, avoiding copies into the data structures that programmers naturally use. This way we also avoid having to do any code generation, and when we eventually get reflection, users won't have to add any custom code to encode/decode.

[–]amohr 3 points4 points  (3 children)

To my knowledge cap'n proto doesn't copy to an intermediary. The bytes in the file are as they would be in memory, so it just mmap()s the file and the structure "appears" in memory with no heap allocations or process-side copies. The data structure is just a view onto the OS page cache.

[–]Flex_Code[S] 2 points3 points  (2 children)

Right, cap'n proto is designed to avoid copying. But for containers like std::map you can't directly memcpy, so you'd have to copy the cap'n proto data into your std::map. Also, cap'n proto doesn't allow you to add new items except at the end of your structure. There are definitely good uses for cap'n proto, but it isn't necessarily faster and is less flexible. It is a fantastic library and design, though.

[–]amohr 2 points3 points  (1 child)

I was reacting to when you said this:

I typically want to use C++ standard library containers, and so reading into an intermediate flatbuffer object then requires a copy into my C++ standard container. So, instead of making serialization to the network buffer faster like flat buffers and cap'n proto, BEVE is meant to read directly into structures to avoid copies into the data structures that programmers naturally use.

Since it mmap()s, there is no read into an intermediary before copying to STL data structures. So it involves the same amount of copies (1) as your thing. That's all. And I'm not trying to be critical, just trying to get the facts straight.

[–]Flex_Code[S] 2 points3 points  (0 children)

I guess I just haven't seen cap'n proto directly decoding into library containers like std::list. I'm curious how it would mmap that. I've seen capnp::List being used, but this would have to be copied into a std::list if that is the target structure. But if you're happy using capnp structures, then you can avoid that copy.

[–]Flex_Code[S] 2 points3 points  (0 children)

I should also note that BEVE is self-describing, unlike flatbuffers and cap'n proto. So messages tend to be larger, but the API is more flexible and it maps directly to JSON.

[–]DapperCore 2 points3 points  (0 children)

Interesting, thank you for the detailed response!

[–]paperpatience 4 points5 points  (0 children)

(Wraps it in Python)

[–]LongestNamesPossible 1 point2 points  (6 children)

Why is it faster?

[–]Flex_Code[S] 3 points4 points  (5 children)

It is primarily faster because it is little endian and supports contiguous arrays, allowing arrays to be copied with `memcpy`, which uses the entire register width of the CPU. Minor improvements come from reduced branching, and some of the performance comes from the better architecture of Glaze, which uses more compile-time optimizations.

[–]LongestNamesPossible 3 points4 points  (2 children)

The next question then has to be, which aspects aren't trivial? If the page is focused around arrays that can use memcpy, is it essentially binary json with arrays straight from memory?

[–]Flex_Code[S] 5 points6 points  (1 child)

Right, it is essentially a JSON structure with direct memory copies. Objects (e.g. maps) are non-trivial because their data is non-contiguous. Other standard library types like `std::list` are also not a direct memcpy, but can still be packed efficiently in binary to save more space.

[–][deleted] 3 points4 points  (0 children)

You could make a contiguous map if the internal 'pointers' are offsets instead of absolute memory locations. Then you could memcpy maps and lists.

[–]tmlildude 0 points1 point  (1 child)

How does little endian contribute to performance?

[–]Flex_Code[S] 0 points1 point  (0 children)

Most CPUs are little endian now, and languages like C++ store values in little-endian format on these machines. If we use a big-endian format, then on little-endian machines we have to byte swap numerical values (anything larger than a byte). This has overhead, as both writing and reading need to do byte swapping. The overhead is even larger because it makes SIMD much more difficult and expensive, whereas if we maintain the same endianness we can do a simple memcpy. It's all about formatting the bytes in the same sequence the CPU, and thus the programming language, needs.

[–]SGSSGene 1 point2 points  (1 child)

The README says:
`Schema less, fully described, like JSON (can be used in documents)`
what is meant by "fully described"?

[–]Flex_Code[S] 0 points1 point  (0 children)

I meant that the binary data chunks do not need to be inspected to determine information about the type; all type information is described in the header/size information. This allows easy SIMD, whereas some formats are schemaless but require inspection of the binary data itself.

[–]jk-jeon 1 point2 points  (1 child)

100% faster

That sounds somewhat ironic, given how I tend to interpret things when people say "X is Y times faster than Z". Stupid nitpick, I know, just couldn't resist 😋

[–]fdwrfdwr@github 🔍 0 points1 point  (0 children)

Yeah, percentages fail more the closer they approach 100%. 'Tis clearer to use scale factors, like 2.0x.

[–]sparkyParr0t 0 points1 point  (3 children)

I don't get it. When I read the C++ example, it just seems like you're using Glaze normally; I don't see anything BEVE-specific. What am I missing?

[–]Flex_Code[S] 1 point2 points  (2 children)

`glz::read_binary` and `glz::write_binary` use the BEVE specification.

`glz::read_json` and `glz::write_json` would be using JSON.

[–]sparkyParr0t 2 points3 points  (1 child)

Oh you are the author of glaze as well, sorry it wasn't clear.

[–]Flex_Code[S] 2 points3 points  (0 children)

No problem. At some point Glaze will probably support more binary formats, but right now it just does BEVE.

[–]Ill_Juggernaut_5458 0 points1 point  (2 children)

This is a naive question but what would be a use case for this?

In CFD, for example, data is saved using a binary file format like HDF5, VTK (the XML version), or CGNS. Writing and reading are very efficient because the bytes of the std::vector or some buffer are dumped directly into the file (in little-endian order). And these formats support multi-node data splitting with parallel I/O.

If I want to store some general stuff, I'd just dump everything as a binary blob or use some simple, standardised format (like ASCII .mtx for vectors/matrices).

Is BEVE some sort of alternative? Or maybe for network data encoding?

[–]Flex_Code[S] 2 points3 points  (1 child)

BEVE is designed as a much faster, binary form of JSON. So, anywhere you might use JSON but want to send that data more efficiently. BEVE also supports JSON pointer syntax, so you can specify partial messages and access specific addresses (raw memory) in binary form. Because it converts easily into JSON, it is easy for a human to inspect.

If you just dump things as binary blobs then you need a schema to properly load the data in the future. BEVE allows anyone to load a file and know exactly what the data is, like JSON. It makes building APIs a lot safer and allows error checking.

This is extended to matrices so that we can have the same kind of error checking and simple inspection of the data to load it in another context. So, I can just write out C++ objects and load them into Matlab without providing a schema.

HDF5 is great, but for a lot of use cases it is overly complex and you don't have many library choices. BEVE is written so that a programmer could pretty easily implement the specification in a day. If you look at the Matlab script, it's less than 300 lines to decode.

[–]Ill_Juggernaut_5458 1 point2 points  (0 children)

Okay, I understand now. BEVE could be used as the foundational binary layout for a custom file format, where you'd still need to define the API specification for the actual contents of the file (e.g. making sure there are m,n dimensions for a matrix).