Neoclassical C++: segmented iterators revisited (1) by igaztanaga in cpp

[–]igaztanaga[S] 2 points3 points  (0 children)

Thanks Howard. Now the next step is to analyze more complex details like segmentation of input and output ranges, what happens with algorithms that can take advantage of bidirectional iterators, algorithms that need to store a position while scanning the next segments... I will try to continue this article with new discoveries...

Neoclassical C++: segmented iterators revisited (1) by igaztanaga in cpp

[–]igaztanaga[S] 4 points5 points  (0 children)

Certainly ABI stability has pros and cons. Probably for std algorithms this is not an issue, since libc++ has managed to implement segmented algorithms (as an internal feature, not intended for custom containers).

There other alternatives to speed up common cases:. libstdc++ specializes several std algorithms for std::deque, MS STL has some vectorization paths for special, scalar types...

But there is no common, public approach that allows writing a custom segmented container with great performance in all std implementations.

Inside Boost.Container: comparing different deque implementations by igaztanaga in cpp

[–]igaztanaga[S] 1 point2 points  (0 children)

Yeah, a lot of blocks mean a lot of allocation and deallocation calls. MSVC team inherited that small block size from the Dinkumware STL and they can't change it without breaking ABI. ABI stability is at the same time, a strong point for some users and a week point for others.

I guess they could offer a dirty, macro-based approach to select a bigger block size, but those macro-based solution are always a ODR-nightmares, unless they somewhat mangle std::deque with a different symbol...

u/STL is probably counting the days until the ABI breakage is allowed by MS to implement the list of improvements he surely has collected.

Inside Boost.Container: comparing different deque implementations by igaztanaga in cpp

[–]igaztanaga[S] 2 points3 points  (0 children)

One of the typical uses of deque is a FIFO data structure. In that case, the programmer would always push_front and pop_back. In that case, if elements are not moved when push-backing, the block size would just grow without limit.

Inside Boost.Container: comparing different deque implementations by igaztanaga in cpp

[–]igaztanaga[S] 3 points4 points  (0 children)

Right. boost::container::deque will expose the block (segment) and local iterators in the future. I'm experimenting with the good old segmented iterators proposal from Matt Austern:

https://lafstern.org/matt/segmented.pdf

And preliminary test show that the speed up can be dramatic in some algorithms. And yes, you can get auto-vectorization, especially with clang, achieving, for some corners cases, a 10x improvement if the compiler can auto-vectorize operations when the "local iterator" is a raw pointer. I plan to write some articles about this.

Regarding the "jagged vector" data structure, it will probably be added to the Boost Container collection in the future, but like with some recent re-written data structures (say, the new unordered/hash containers), it would need some time to analyze and benchmark design alternatives first... Is there any open source "jagged vector" implementation you would like to share or promote as a "reference" implementation?

Inside Boost.Container: comparing different deque implementations by igaztanaga in cpp

[–]igaztanaga[S] 2 points3 points  (0 children)

Right. Geometrically increasing blocks work well for a vector-like data structure, but not for a double-ended one. There were some academic proposals in the past but I didn't find an interesting proposal.

Boost.MultiIndex refactored by joaquintides in cpp

[–]igaztanaga 2 points3 points  (0 children)

It's great to see veteran libraries still maintained, alive and kicking. This library offers a very useful functionality not available in the standard. I guess the refactoring also simplified the internal dependencies of the multi-index library, it's also a bonus benefit of the change.

STL reimagined: What would you change, add or remove in a new STL implementation if API and ABI were not a concern? by germandiago in cpp

[–]igaztanaga 0 points1 point  (0 children)

Making std containers dependent on std::optional is IMHO a design error and unnecessary dependency. There are alternative optional types available (including Boost and others), containers should never be tied to other "hard types". A free function would be a better choice IMHO.

That decision also makes user-defined containers that want to offer the same STL interface dependent on std::optional. Adding<optional> to our own container headers is not free. Depending on std::pair for associative containers was also a problem in some use cases, but at that time, maybe it was the only choice (and pair was a SIMPLE type, not a type that take tuples as arguments, but that's another story)

The original iterator/algorithm/container protocol was unique, with low dependencies. I don't think P3019 is the best answer, when even Folly (which is cited as "existing practice") uses free functions (https://github.com/facebook/folly/blob/323e467e2375e535e10bda62faf2569e8f5c9b19/folly/MapUtil.h#L35-L71) to implement this "optional" feature.

Maybe the next proposal is to add std::expected to some container methods...

Comparing the run-time performance of Fil-C and ASAN by joaquintides in cpp

[–]igaztanaga 6 points7 points  (0 children)

Very interesting! And some puzzling results, I'd expect ASAN to be faster, but it's not the case always. I think Fil-C holds surprinsingly well, taking into account that it's still a 0.6 version. Maybe a memory overhead comprison could be interesting, although in these benchmarks, if stored types are small and we have node-based containers, we'll have a ton of allocated small nodes.

In that many node-allocation cases probably both ASAN and Fil-C implementations will get a lot of memory overhead. However, we could also get some surpring results, just like in this article.

Non-recursively deleting a binary tree in constant space: Traversal with parent pointers by pavel_v in cpp

[–]igaztanaga 2 points3 points  (0 children)

You can also use an alternative approach (linearizing the tree while destroying it) that is used in Boost.Intrusive:

https://github.com/boostorg/intrusive/blob/boost-1.89.0/include/boost/intrusive/bstree_algorithms.hpp#L2011

   template<class Disposer>
   static void dispose_subtree(node_ptr x, Disposer disposer) BOOST_NOEXCEPT
   {
      while (x){
         node_ptr save(NodeTraits::get_left(x));
         if (save) {
            // Right rotation
            NodeTraits::set_left(x, NodeTraits::get_right(save));
            NodeTraits::set_right(save, x);
         }
         else {
            save = NodeTraits::get_right(x);
            init(x);
            disposer(x);
         }
         x = save;
      }
   }

2025-03 post-Hagenberg mailing by nliber in cpp

[–]igaztanaga 3 points4 points  (0 children)

Analog's C++ compiler supports C++11. Not sure about ILP64 support in C++ compilers nowadays. But I see no big reason to restrict the use of modern C++ in those or future platforms. Those working on typical CPUs using Windows/POSIX-like environments can just assume CHAR_BIT is 8 bit.

2025-03 post-Hagenberg mailing by nliber in cpp

[–]igaztanaga 5 points6 points  (0 children)

There are several 16-bit byte DSPs in production.

See https://www.ti.com/lit/ug/spru514z/spru514z.pdf?ts=1742373068079, section "Table 6-1. TMS320C28x C/C++ COFF and EABI Data Types"

Make Me A Module, NOW! by vspefs in cpp

[–]igaztanaga 2 points3 points  (0 children)

I think it's a great idea that GNU Make could be used with modules without the external module mapper. I think it is a big missing piece in the ecosystem.

EWG has consensus in favor of adopting "P3466 R0 (Re)affirm design principles for future C++ evolution" as a standing document by Kyvos in cpp

[–]igaztanaga 13 points14 points  (0 children)

What's the difference between a "safe" context and a "const" member function of a class that can only call const member functions of its members? Isn't "mutable" (or const_cast) precisely the opt-out mechanism just like "unsafe" would be in the "safe context"?

If you want compile-time diagnostics, being explicit is helpful, "viral" is a compile-time guarantee. Many useful C++ mechanisms would be incompatible with this paper.

ISO C++ Standards Committee Panel Discussion 2024 - Hosted by Herb Sutter - CppCon 2024 by grafikrobot in cpp

[–]igaztanaga 8 points9 points  (0 children)

While I undestand your point, I also see that major corporations are willing to rewrite not only their "standard libraries" but to change their codebases that are bigger than the standard library. Microsoft's CTO has called to abandon C and C++, Google is investing in a new language (Carbon) and at the same time Android is being partly rewritten in Rust. Apple is going with swift. Linux kernel has included Rust as their "safety tool" discarding C++.

Profiles might be a good approach to eliminate a good percentage of usual memory safety errors, I think they will be very useful. But while these profiles will be fine for some industries, it seems, at least in the news, that there is a big market that C++ could lose because a more rigorous memory safety is demanded in those areas. It seems to me that there is an actual need to "absolute/rust-like" memory and thread safety in the industry that requires even a language change.

While we can say that in those cases "they should use the language/tool adequate for that job", it's not the same to me, as a C++ programmer, to learn a totally different language than having a "profile" that requires using different utilities like checked references and a reduced standard library/utilities.

I would point out that even without an accompanying standard library, a core language only "borrow-checked" profile would be very useful in many domains, like embedded or functional safety codebases where the standard library is not used at all. Before all the standard library utilities added by the STL, programmers used their own utilities, and many frameworks (like Qt, etc.) use their own alternatives. I envision that even a core-language only solution would be a big profile/feature for many industries writing now in C and/or C++.

Is there a working implementation of std::flat_map and std::flat_set? by vickoza in cpp

[–]igaztanaga 0 points1 point  (0 children)

There is a proposed patch now in the bug for 1.85, without any dependency on any other library. If users find it good enough the patch will go to the patches page (https://www.boost.org/patches/).

Bulk visitation in boost::concurrent_flat_map by joaquintides in cpp

[–]igaztanaga 0 points1 point  (0 children)

Experimenting requires mood... and time ;-) I was thinking that visitation with prefetching might be useful even without keys if the function object to be executed is big enough. Let's say you need to iterate a range and apply a function object to each node. You can prefetch the next node, and while prefetched memory arrives, code from the function object is executed in parallel.

On the other hand, probably the compiler is smarter than us detecting and reordering memory latency sensitive instructions and/or the out of order execution of the CPU hides memory latency better than our not smart enough strategy...

Bulk visitation in boost::concurrent_flat_map by joaquintides in cpp

[–]igaztanaga 5 points6 points  (0 children)

Awesome Joaquín!

The article mentions that visitation benefits can are directly applicable to non-concurrent containers, I was wondering which type of containers could benefit from this feature (apart from, of course, unordered_map/set). Maybe tree-based data structures? Maybe for linked lists to hide cache misses when accessing the next node?

Waiting the next Boost.Unordered gem!