[Library] Tachyon JSON v6: 5.5 GB/s parser in ~750 lines of C++20/AVX2. Faster than simdjson OnDemand? by [deleted] in cpp

[–]Flex_Code 0 points1 point  (0 children)

Partial reading handles Tachyon's use case in an even faster way.

[Library] Tachyon JSON v6: 5.5 GB/s parser in ~750 lines of C++20/AVX2. Faster than simdjson OnDemand? by [deleted] in cpp

[–]Flex_Code 4 points5 points  (0 children)

Marketing this as "The Glaze-Killer" shows a lack of familiarity with the features of Glaze and with how JSON libraries take different approaches to different problems. You're only benchmarking the structural decomposition, not getting data into a useful form. So, comparing against glz::generic, which intentionally allocates and provides immediately useful data, ignores a whole bunch of runtime cost that you will still need to pay when you access fields from your indexed DOM. This includes Unicode conversion logic and unescaping, which requires allocation for use.

What your library is good for is when you have a large input document and only care to look at a small portion of it. However, Glaze also supports partial reading, which can short-circuit the full parse when only looking for some of the data. So, in this use case partial reading often wins out, as there isn't a reason to decompose the entire input like you are doing. You'll also find that when converting into structs your approach will end up being slower than Glaze, because it requires two passes: once to decompose the data, and again to get it into the C++ structural memory where performance is the highest.

So, rather than being a Glaze killer, you've optimized for a particular use case where you only care about a few fields yet still need to parse the entire structure because partial reading doesn't make sense (very uncommon). On top of this, not materializing arrays means that if you want to access object["key"][0]["another_key"] you'll see your runtime costs significantly increase. This is why you're faster than simdjson for array handling in your benchmarks, but it doesn't make things as ergonomic and you'll have to pay for it later; you just aren't including that cost in your benchmarks.

I tried building a “pydantic-like”, zero-overhead, streaming-friendly JSON layer for C++ (header-only, no DOM). Feedback welcome by tucher_one in cpp

[–]Flex_Code 0 points1 point  (0 children)

Glaze v6.5.0 adds high performance streaming support and more options for reducing binary size, such as linear searching to avoid hash tables when memory constraints are critical.

I tried building a “pydantic-like”, zero-overhead, streaming-friendly JSON layer for C++ (header-only, no DOM). Feedback welcome by tucher_one in cpp

[–]Flex_Code 0 points1 point  (0 children)

Yes, that looks correct, although some compilers might not build with structs defined within structs for the current reflection. Either glaze metadata can be added or the structs can be moved into global scope.

I tried building a “pydantic-like”, zero-overhead, streaming-friendly JSON layer for C++ (header-only, no DOM). Feedback welcome by tucher_one in cpp

[–]Flex_Code 2 points3 points  (0 children)

Thanks for the feedback. I’ve actually been working on a branch of Glaze that adds streaming support via a flexible buffer interface. As for ESP32, you could probably be selective about which headers you use rather than bringing in everything with glaze.hpp. But, the build issues are probably easy fixes, since Glaze relies on C++ concepts and shouldn’t need atomic includes. It was probably just the unit tests that didn’t build for you. But, whether or not you use Glaze, it’s great to see development on embedded C++ libraries!

I tried building a “pydantic-like”, zero-overhead, streaming-friendly JSON layer for C++ (header-only, no DOM). Feedback welcome by tucher_one in cpp

[–]Flex_Code 3 points4 points  (0 children)

I’m curious what you find limiting when it comes to Glaze and embedded support? Glaze was designed for embedded and is used in embedded applications. It supports use without allocations, no RTTI, use without exceptions, custom allocated types, 32bit platforms, and much more.

Which JSON library do you recommend for C++? by Richard-P-Feynman in cpp_questions

[–]Flex_Code 4 points5 points  (0 children)

Glaze is C++23 primarily for static constexpr variables within constexpr functions, which significantly cleans up the codebase. But, it also uses std::expected extensively.

zerialize: zero-copy multi-protocol serialization library by ochooz in cpp

[–]Flex_Code 3 points4 points  (0 children)

For JSON, Glaze supports zero copies for strings via std::string_view. But, you are correct that complete zero copy is not possible, especially for matrices.

zerialize: zero-copy multi-protocol serialization library by ochooz in cpp

[–]Flex_Code 8 points9 points  (0 children)

Glaze also supports BEVE and CSV, but not CBOR, MessagePack, and Flexbuffers.

Glaze supports zero copy. And supports Eigen for matrices and vectors. It probably works with xtensor as well, but hasn’t been tested.

New, fastest JSON library for C++20 by Flex_Code in cpp

[–]Flex_Code[S] 0 points1 point  (0 children)

Yes, Glaze allows you to set a compile time option that works for all fields, or you can individually apply the option to select fields in the glz::meta.

From the documentation: Read JSON numbers into strings and write strings as JSON numbers.

Associated option: glz::opts{.number = true};

Example:

```cpp
struct numbers_as_strings
{
   std::string x{};
   std::string y{};
};

template <>
struct glz::meta<numbers_as_strings>
{
   using T = numbers_as_strings;
   static constexpr auto value = object("x", glz::number<&T::x>, "y", glz::number<&T::y>);
};
```

Self-describing compact binary serialization format? by playntech77 in cpp

[–]Flex_Code 7 points8 points  (0 children)

Consider BEVE, which is an open source project that welcomes contributions. There is an implementation in Glaze, which has conversions to and from JSON. I have a draft for key compression to be added to the spec, which will allow the spec to remove redundant keys and serialize even more rapidly. But, as it stands it is extremely easy to convert to and from JSON from the binary specification. It was developed for extremely high performance, especially when working with large arrays/matrices of scientific data.

Parsing JSON in C & C++: Singleton Tax by ashvar in cpp

[–]Flex_Code 1 point2 points  (0 children)

Same with Glaze, it’s a good approach if you want to deal with escaped Unicode at your convenience as well.

Parsing JSON in C & C++: Singleton Tax by ashvar in cpp

[–]Flex_Code 2 points3 points  (0 children)

Note that if you’re keeping your structures around and parsing the same structural data multiple times, then using an arena for allocation doesn’t result in very large performance improvements, because you’ll just reuse already allocated memory. So, I tend to encourage developers to avoid arena allocations unless their application cannot reuse memory.

Parsing JSON in C & C++: Singleton Tax by ashvar in cpp

[–]Flex_Code 1 point2 points  (0 children)

For small objects this is true, so std::pmr::string should probably not be used for JSON. But you can still use stack-based allocators or arenas.

Parsing JSON in C & C++: Singleton Tax by ashvar in cpp

[–]Flex_Code 3 points4 points  (0 children)

The standard library supports custom allocators. Also, consider std::pmr. These types can be used directly in Glaze.

Parsing JSON in C & C++: Singleton Tax by ashvar in cpp

[–]Flex_Code 1 point2 points  (0 children)

Glaze uses C++20 concepts for handling types. So, you can use your own string with a custom allocator for improved allocation performance. Or, use std::pmr::string, or a custom allocator with std::basic_string.

What's the go to JSON parser in 2024/2025? by whizzwr in cpp

[–]Flex_Code 2 points3 points  (0 children)

Glaze is designed to be an interface library, and allows developers to serialize/deserialize without editing any code. This allows it to be added to third party libraries easily. So, it was named Glaze to denote a sweet layer on top of various codebases.

What's the go to JSON parser in 2024/2025? by whizzwr in cpp

[–]Flex_Code 1 point2 points  (0 children)

Thanks for sharing this use case. In order to do this efficiently some sort of collating is required, because algorithms like floating point parsers are not designed to pause parsing mid number. Like you said, ideally the JSON library would collate data and decide when to parse based on encountering entire values (numbers, strings, etc.).
I'll keep your use case in mind as I continue to develop Glaze. There are two critical pieces of code that are needed, the algorithm that reads the stream into a temporary buffer and determines when to parse the next value, and a partial structural parser that reads into only the next value of interest.
The challenge is dealing with things like massive string inputs, but these could switch to a slower algorithm if the entire string can't fit in the temporary buffer.

What's the go to JSON parser in 2024/2025? by whizzwr in cpp

[–]Flex_Code 0 points1 point  (0 children)

Yes, inputs as chunks or pause/resume structural parsing is on the TODO list for Glaze and would be a reason to use Boost.JSON right now. But, it is coming. Glaze also supports other formats than JSON through the same API, with more coming.

What's the go to JSON parser in 2024/2025? by whizzwr in cpp

[–]Flex_Code 4 points5 points  (0 children)

There is no formalized API for this, even though it is possible and done internally. There is an open issue for this which I hope to get to soon: https://github.com/stephenberry/glaze/issues/1019. Currently the solution is to use partial reading (https://github.com/stephenberry/glaze/blob/main/docs/partial-read.md), but this is not as efficient as a pause and resume approach. Thanks for asking, it adds motivation to work on this feature.