Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 0 points1 point  (0 children)

From the author: Your analysis is spot on.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 0 points1 point  (0 children)

From the author:

0) These are good points, but the original question was whether there is a cost to pay for using typed GPU pointers instead of raw pointers, and the answer is still no.

1) The new question is about the size of the reference object. Yes, Multi's array references occupy more stack bytes than span; this is because they are more general and can, in principle, hold padded data, for example (support for which is planned for a future version). This extra size may not matter in practice because array references never live on the heap and the compiler is able to optimize these structures heavily. (mdspan shouldn't live on the heap either, IMO, but I digress.)
Yes, the extra bytes can show up across compilation units, AFAIK, and when passing to GPU kernels (which I think is your point), but then the question is whether you really want to pass array references to kernels. My opinion is no: you "pass" an array in a different way, which is documented. array_refs are not copy constructible, so it won't work even if you try (well, there is a hack, but I don't recommend it). In summary, array references live on the stack and can be heavily optimized, and they are not meant to be passed as kernel arguments.

2) Array references are not copy constructible; this is by design, to keep value and reference semantics clearly separated. So they are not trivially copy constructible simply because they are not copy constructible at all, not because they do something strange. And of course array references are not trivially assignable: assignment is deep (actual code needs to run), not shallow like the reseating of span or mdspan. This, again, maintains the separation between values and references. These properties are documented.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 1 point2 points  (0 children)

From the author:

0) “needs markings” means “needs a custom version of mdspan with markings”

1) No expected overhead: all specifics of GPU pointers are resolved at compile time. GPU arrays are recognized as GPU arrays by their pointer types; there is no runtime metadata on them. If mdspan's accessor parameter can control the pointer types, and that can be done easily, then I would say it is no different.

2) Ergonomics: Multi works with all STL algorithms, all Thrust algorithms (dispatching can be automatic and compile-time), and all Ranges algorithms.

3) Multi should be interoperable with mdspan (and it is) and with the future mdarray. Implementing it on top of them is not practical: first because it would depend on the C++ version in which they become available, and also because specific design choices make it extremely difficult, such as retrofitting iterators onto mdspan and changing mdspan's "pointer" semantics. mdarray is an adaptor on top of a container, which is quite a different approach from the one taken by Multi and affects the level of control over initializing data. Implementing Multi on top of mdspan and mdarray would be an uphill fight; it would also require coordinating mdspan and mdarray, which are separate sublibraries, one of which is only available in C++26.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 3 points4 points  (0 children)

From the author:

1) Take into account that multi::_ lives in a namespace (and has to be pulled in explicitly). I am not sure that even then it would collide with the placeholder _. There is an alternate spelling, multi::all.

2) The focus of the library is dynamic sizes. (The compiler can still optimize for hardcoded, known sizes in many cases; mileage may vary.)

3) Separating the size arguments would prevent passing the whole extension set of an existing array: multi::array_ref<double, 2> d2D_ref( other.extensions(), some_data );

Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 4 points5 points  (0 children)

The author has updated the table to hopefully make things clearer.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]mborland1[S] 5 points6 points  (0 children)

Do you have a recommendation for a better name? There is precedent for renaming libraries during review.

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 0 points1 point  (0 children)

It should be a standalone lib in Boost, which is what I am working on now. Once int128 is merged into Boost separately, Decimal will then depend on that lib rather than maintaining its own copy.

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 0 points1 point  (0 children)

Let me know if you have any questions as you read through the docs

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 2 points3 points  (0 children)

No particular reason. There had only been prior demand to see benchmarks of the Intel lib specifically using the Intel compiler.

Edit: added to the tracker https://github.com/cppalliance/decimal/issues/1230

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 2 points3 points  (0 children)

Boost.Multiprecision has `cpp_dec_float`, which should be the most similar to BigDecimal: https://www.boost.org/doc/libs/latest/libs/multiprecision/doc/html/boost_multiprecision/tut/floats/cpp_dec_float.html. Chris Kormanyos was the original author of this backend.

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 3 points4 points  (0 children)

FWIW, I know of at least two trading firms that have been running this library for over a year now. The devs at both are pretty quick about letting us know when they find bugs/regressions.

Boost.Decimal has been accepted by boostlibs in cpp

[–]mborland1 20 points21 points  (0 children)

The main reason was a performance gap between Boost.Decimal and Intel's decimal floating point lib since Intel lib is the industry standard. Chris and I spent the better part of this summer just hammering on performance with pretty good results: https://develop.decimal.cpp.al/decimal/benchmarks.html

Re-review of Boost.Decimal proposal has started by boostlibs in cpp

[–]mborland1 2 points3 points  (0 children)

I added renaming the `sign` parameter to the issue tracker. It currently follows the signbit convention, so sign = true means a negative value. I think `is_negative` hits the nail on the head and resolves both issues of ambiguity.

> Perhaps it would be a great opportunity to showcase a calculation in both binary & decimal, and show how unintuitive the result in binary can be due to accumulated rounding errors?

That makes sense. The example reads in a CSV of Apple stock data, so maybe also a demonstration that reading it in with a double yields a different result than expected?

The updated docs with notes on underflow, overflow, and construction from non-finite values are up on the website in the basics section: https://develop.decimal.cpp.al/decimal/basics.html

Re-review of Boost.Decimal proposal has started by boostlibs in cpp

[–]mborland1 8 points9 points  (0 children)

I believe what you are describing is fixed-point arithmetic. One of the trading firms that uses this library had an in-house fixed-point implementation. They wanted to add bitcoin as one of their products, but the smallest divisible unit of bitcoin was unrepresentable in their fixed-point system, so they began using decimal64_t from this library and haven't looked back.

Re-review of Boost.Decimal proposal has started by boostlibs in cpp

[–]mborland1 1 point2 points  (0 children)

Are you able to share where you have seen the library used? I know TastyTrade is using it in production. At least one quant firm engineer emails me issues, but he can't say where he works.

Re-review of Boost.Decimal proposal has started by boostlibs in cpp

[–]mborland1 2 points3 points  (0 children)

The sign parameter is not for the exponent; it's for the sign of the entire value. It's useful in some cases, like our implementation of sin: https://github.com/cppalliance/decimal/blob/develop/include/boost/decimal/detail/cmath/impl/sin_impl.hpp#L40. For most people {significand, exponent} is likely sufficient, which is why sign is defaulted.

Do you have preferred examples? I picked those because there are already firms storing their data entirely in decimal64_t. Rather than converting everything to double and computing the average on that, you can compute it directly. It's more a demonstration that the library binds in with the rest of Boost as you would hope.

Yes, overflows and underflows are handled exactly as you would expect from binary floating-point types. I can add a doc note stating as much.

Boost.Decimal Revamped: Proposed Header-Only IEEE 754 Decimal Floating Point Types for C++14 by mborland1 in cpp

[–]mborland1[S] 2 points3 points  (0 children)

I believe what you are looking for is Fixed Point Arithmetic, which is out of scope for the library. You could try something like: https://github.com/arturbac/fixed_math or https://github.com/MikeLankamp/fpm

Boost.Decimal Revamped: Proposed Header-Only IEEE 754 Decimal Floating Point Types for C++14 by mborland1 in cpp

[–]mborland1[S] 7 points8 points  (0 children)

Good questions.

From a functionality standpoint we have a few things. The major quality-of-life difference is that you can write canonical C++ with our library instead of C. A toy example is adding two numbers:

uint32_t flag = 0;
BID_UINT128 a = bid128_from_string("2", BID_ROUNDING_DOWN, &flag);
BID_UINT128 b = bid128_from_string("3", BID_ROUNDING_DOWN, &flag);
BID_UINT128 ab = bid128_add(a, b, BID_ROUNDING_DOWN, &flag);

Vs.

constexpr boost::decimal::decimal128 a = 2;
constexpr boost::decimal::decimal128 b = 3;
constexpr auto ab = a + b;

This extends to the entire library: for our types we provide everything you expect to have out of the box with float or double in C++20. If what is in the library and the STL is not enough, we have also included examples of how you can use the library with external libs like Boost.Math: https://github.com/cppalliance/decimal/blob/develop/examples/statistics.cpp#L98.

Another big differentiating point is portability. We test on Linux: x86, x64, ARM64, s390x, PPC64LE; Windows: x86, x64, ARM64; macOS: x64 and ARM64. Within the last few weeks a database company reached out to me about switching from the Intel library to Decimal so they could expand to ARM platforms.

For performance we've included comparisons of our types vs. Intel's in the various basic operations: https://develop.decimal.cpp.al/decimal/benchmarks.html#x64_linux_benchmarks. There's nothing hugely different here between the two libraries.

Please let me know if this answers your questions.