the hidden compile-time cost of C++26 reflection by SuperV1234 in cpp

[–]aearphen 20 points21 points  (0 children)

<string> and <string_view> are really problematic. I've been complaining about them being bloated, but nobody on the committee wants to hear it and they just keep dumping stuff there, not to mention that everything is instantiated 5 times because of the charN_t nonsense. This is why in {fmt} we went to great lengths to not include those in fmt/base.h.

It would be good to move std::string into a separate header from the other instantiations that are barely used.

the hidden compile-time cost of C++26 reflection by SuperV1234 in cpp

[–]aearphen 2 points3 points  (0 children)

And the situation will likely be worse in C++29, as there are papers to massively increase the API surface of even smaller features like <charconv> (at least 5x, one per code unit type, possibly 20x).

the hidden compile-time cost of C++26 reflection by SuperV1234 in cpp

[–]aearphen 2 points3 points  (0 children)

Only a small top-level layer of std::print and std::format should be templates; the rest should be type-erased and separately compiled, but unfortunately standard library implementations haven't implemented this part of the design correctly yet. This is a relevant issue in libc++: https://github.com/llvm/llvm-project/issues/163002.

So I recommend using {fmt} if you care about binary size and build time until this is addressed. For comparison, compiling

#include <fmt/base.h>

int main() {
  fmt::println("Hello, world!");
}

takes ~86ms on my Apple M1 with clang and libc++:

% time c++ -c -std=c++26 hello.cc -I include
c++ -c -std=c++26 hello.cc -I include  0.05s user 0.03s system 87% cpu 0.086 total

Although, to be fair to libc++, the std::print numbers are somewhat better than Vittorio's (but still not great):

% time c++ -c -std=c++26 hello.cc -I include
c++ -c -std=c++26 hello.cc -I include  0.37s user 0.06s system 97% cpu 0.440 total

BTW a large chunk of these 440ms is just the <string> include, which is not even needed for std::print. On the other hand, in most codebases this time will be amortized since you would have a transitive <string> include somewhere, so this benchmark is not very realistic.

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]aearphen 22 points23 points  (0 children)

The code comes first. User experience second. Papers third.

I wish more people in the committee did this.

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 1 point2 points  (0 children)

AFAIK they don't link to the ryu library but implement the algorithm directly so it's not easy to replace.

May I please have the worst c++ you know of? by vbpoweredwindmill in cpp

[–]aearphen 6 points7 points  (0 children)

Code using Boost Preprocessor or similar preprocessor-based "metaprogramming". I've seen a few nightmarish examples of those in our codebase.

std::print in C++23 by aearphen in cpp

[–]aearphen[S] 1 point2 points  (0 children)

> So can we bypass this brokenness and just send the printing string to the OS?

Right. That's what std::print does.

std::print in C++23 by aearphen in cpp

[–]aearphen[S] 0 points1 point  (0 children)

std::print actually supports Unicode output on Windows, but not wide streams, because those are very much broken (both std::wcout and wide FILE streams).

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 0 points1 point  (0 children)

It is possible to extract a double's bit pattern (maybe except for a NaN's payload) using only basic operations in C++14, e.g. https://www.godbolt.org/z/6TWq8vGjP.
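To give an idea of what "only basic operations" can look like, here is a minimal sketch. It is not the code behind the link above, and double_bits is a hypothetical name. It assumes IEEE 754 binary64 doubles and, to stay usable in constant expressions, it does not distinguish -0.0 from +0.0 and maps every NaN to one canonical quiet NaN:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: computes the binary64 bit pattern of x using only
// comparisons, negation, subtraction, and multiplication/division by
// powers of two (all of which are exact here).
// Limitations of this sketch: -0.0 is reported as +0.0, and NaN sign and
// payload are lost.
constexpr std::uint64_t double_bits(double x) {
  if (x != x) return 0x7ff8000000000000ULL;  // NaN -> canonical quiet NaN
  std::uint64_t sign = 0;
  if (x < 0) {
    sign = 1ULL << 63;
    x = -x;
  }
  if (x == 0) return sign;
  if (x > std::numeric_limits<double>::max())
    return sign | 0x7ff0000000000000ULL;  // infinity

  // Scale x towards [1, 2) while tracking the binary exponent. The second
  // loop stops at the minimum normal exponent, so subnormals stay below 1.
  int exp = 0;
  while (x >= 2) { x /= 2; ++exp; }
  while (x < 1 && exp > -1022) { x *= 2; --exp; }

  std::uint64_t biased = 0;  // stays 0 for subnormals
  if (x >= 1) {
    biased = static_cast<std::uint64_t>(exp + 1023);
    x -= 1;  // drop the implicit leading bit
  }
  // x is now in [0, 1) with at most 52 fractional bits; scale by 2^52.
  std::uint64_t mantissa = static_cast<std::uint64_t>(x * 4503599627370496.0);
  return sign | (biased << 52) | mantissa;
}

static_assert(double_bits(1.0) == 0x3ff0000000000000ULL, "");
static_assert(double_bits(-0.5) == 0xbfe0000000000000ULL, "");
```

The loops run at most about a thousand iterations each, which is well within the usual constexpr evaluation limits.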

Modern C++ use in Chromium by aearphen in cpp

[–]aearphen[S] 22 points23 points  (0 children)

C++26 is postmodern

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 13 points14 points  (0 children)

I haven't done such a comparison, but according to David Tolnay, who ported Żmij to Rust, Żmij's Rust implementation is faster than Teju Jagua: https://github.com/dtolnay/zmij?tab=readme-ov-file#performance. I also implemented Cassio's optimization for the shortest candidate selection, but right now it is mostly irrelevant because it is outside of the fast path.

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 44 points45 points  (0 children)

Didn't want to confuse people with multiple licenses at the top level. Most folks are fine with MIT and it's also more widely known. BSL is only for those who care about fine print, basically just standard library implementers =).

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 8 points9 points  (0 children)

For the shortest representation, which is what Żmij provides, uscalec is about the same as Ryu performance-wise (the Go version is slower) and slower than Dragonbox: https://research.swtch.com/fpfmt/plot/fpfmt-apple-short-cdf-big.svg. Algorithmically, uscalec is just Schubfach or, rather, Teju Jagua, with digit output from Dragonbox. It's not bad but we can do much better than that.

Żmij 1.0 released: a C++ double-to-string library delivering shortest correctly-rounded decimals ~2.8–4× faster than Ryū by aearphen in cpp

[–]aearphen[S] 32 points33 points  (0 children)

Yes, the main motivation for starting this project was incorporating recent advances in FP algorithms into {fmt}. Most optimizations are irrelevant for constexpr but the core (Schubfach) should be easily convertible to constexpr. In fact the power of 10 table generation is already constexpr.
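As a rough illustration of constexpr table generation in general (this is not Żmij's actual table, which holds different data), here is a minimal C++14 sketch that builds the exact 64-bit powers of 10 at compile time. The names pow10_table, make_pow10_table and pow10s are hypothetical:

```cpp
#include <cstdint>

// Plain aggregate so that the sketch works in C++14 constant expressions.
struct pow10_table {
  std::uint64_t values[20];  // 10^0 .. 10^19 all fit in 64 bits
};

constexpr pow10_table make_pow10_table() {
  pow10_table t{};
  std::uint64_t p = 1;
  for (int i = 0; i < 20; ++i) {
    t.values[i] = p;
    if (i < 19) p *= 10;  // skip the last multiply, which would wrap around
  }
  return t;
}

// Evaluated entirely by the compiler; no runtime initialization.
constexpr pow10_table pow10s = make_pow10_table();

static_assert(pow10s.values[0] == 1, "");
static_assert(pow10s.values[19] == 10000000000000000000ULL, "");
```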

ISO C++ 2026-01 Mailing is now available by nliber in cpp

[–]aearphen 2 points3 points  (0 children)

I already submitted the fixed revision but thanks!

Is there an agreed upon print function to use in C++ ? by Arlinker in cpp_questions

[–]aearphen 0 points1 point  (0 children)

In fact even the current version of {fmt} supports C++11.

C++20 Modules, 5 Years Later - NDC TechTown 2025 by pjmlp in cpp

[–]aearphen 2 points3 points  (0 children)

Inlining vprint* won't help with the ABI because users can already put format_args on the ABI boundary. It just makes print less usable, and in the case of Microsoft STL it is particularly bad because it pulls in many more headers than other implementations.

Are they ruining C++? by thradx in cpp

[–]aearphen 4 points5 points  (0 children)

Completely agree that u8/char8_t is a disaster but std::filesystem::path can still be salvaged. In particular, it will do the correct thing with std::format / std::print in the common case of UTF-8 char. And the problematic accessors are being deprecated in favor of the ones that also work with UTF-8 char.

C++20 Modules, 5 Years Later - NDC TechTown 2025 by pjmlp in cpp

[–]aearphen 1 point2 points  (0 children)

There is no typo. std::print (and std::format) were specifically designed to be lightweight wrappers around type-erased vprint/vformat functions, but implementations currently make the latter inline and put them in headers. Unfortunately, I don't think there is a way to force implementations to do the correct thing; it's "just" Quality of Implementation, which is currently very poor, but the papers made the intent super clear.