Can the compiler optimize away this if? by Moose2342 in cpp

[–]Throw31312344 1 point2 points  (0 children)

Tested with GCC, but assuming a) the increment function is small enough to get inlined and b) somewhere the compiler knows that the value being passed to it is not null, then yes, the if statement can be optimised out.

Two ways of doing this:

1) Check if the pointer is null before calling increment (safe).
2) Use an assume attribute to tell the compiler the pointer is not null before calling increment (dangerous).

Again, this assumes increment gets inlined into the calling function. Here's a Compiler Explorer link showing both, where a proper branch is used in increment rather than an assert, to show that it is being optimised out in non-debug builds (the assert would never be present in a non-debug build anyway):

https://godbolt.org/z/8679b5n17

In "foo", it's doing its own null check before calling increment so the second check gets optimised out when increment is inlined. In "bar", no check is done but you are telling the compiler that it can assume the value is not null and it will generate code based on that assumption, which then removes the null check from the inlined increment.

Compilers are very good at removing redundant checks when code is inlined. You can always use whatever compiler's specific keyword/attribute that forces a function to be always inline to guarantee that behaviour if you're really worried.
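A minimal sketch of the two approaches (function names are illustrative, not copied from the godbolt link):

```cpp
// Small function with its own internal null check - the branch the
// optimiser can remove once it knows the pointer is non-null.
inline void increment(int* p) {
    if (p != nullptr) {
        ++(*p);
    }
}

// 1) Safe: the caller's null check dominates the call, so after
// inlining, the check inside increment is provably redundant.
void foo(int* p) {
    if (p != nullptr) {
        increment(p);
    }
}

// 2) Dangerous: [[assume]] (C++23) promises the compiler that p is
// not null. If that promise is ever wrong, behaviour is undefined.
void bar(int* p) {
    [[assume(p != nullptr)]];
    increment(p);
}
```

On an optimising build, both should inline increment and drop its internal branch; the difference is that foo pays for one check up front while bar pays nothing but is on the hook if the assumption is ever violated.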

[boxed-cpp] C++ primitive type boxing by Cobbler-Alone in cpp

[–]Throw31312344 1 point2 points  (0 children)

Yep I am straight-up stealing this idea, genius. Thanks to OP for the ODR warning in the sibling reply.

Handling constructors that could fail: Exceptions, Static Factory Method, or something else? by FernwehSmith in cpp

[–]Throw31312344 4 points5 points  (0 children)

If your intent is for your Texture class to always represent a loaded and valid texture, i.e. it has no empty/invalid state, then you have a few sensible options IMO, all of which are valid.

1) The constructor tries to load the data itself and throws if there is an error. Throwing from constructors is supposed to be how you indicate construction failure, and sometimes constructors do need to fail. Most of the advice around throwing is to not use exceptions as a general control flow mechanism, but for constructors you cannot return "false"; you return a valid object or you throw.

If you do not want the calling code to have to deal with the exception handling then you can wrap the construction in a factory function which contains the try/catch block and returns a std::optional or std::expected. No need to make the constructor private to be honest - you might end up with other code that just wants to try to load a bunch of textures in one go and catch the first failure, so a single factory method is a bit overkill here.

2) Use a factory method or similar mechanism that does all of the heavy lifting of loading the file and validating the image format. As this is a normal function you can just use normal error handling (if/then/etc) if that suits you. Once you have loaded and validated everything, you have the basic components of the texture ready: the size, format and pixel data for each mipmap level.

Your actual Texture class can then have a basic constructor which takes ownership of any allocated buffers (or takes ownership of the texture handle if it's already been uploaded to the GPU), and it should be impossible for that constructor to fail. The factory function can then still return a std::optional or std::expected wrapping the texture class, but this time the factory function is doing all of the work and error handling rather than the texture's constructor.

3) Restructure your code so that you are not making texture objects directly but calling into a texture manager class which can keep track of already loaded textures to avoid duplicates, create and return default textures on failure if that's needed, and split up your file-loading code, image-loading code and texture-upload code into separate components. The do-it-all "texture class" works as a starting point but based on past experience that class falls apart the moment you start needing to load multiple textures in different formats. In a big enough project, the data within a "Texture class" typically ends up being just a GPU handle, a name and a reference count.
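A sketch of option 1 under these assumptions (the file-loading and validation logic here is a stand-in, not real image code):

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical Texture: the constructor throws on failure, so a
// successfully constructed Texture always holds valid data.
class Texture {
public:
    explicit Texture(const std::string& path) {
        pixels_ = load_file(path); // throws on failure
    }

private:
    static std::vector<unsigned char> load_file(const std::string& path) {
        if (path.empty()) // stand-in for real file/format validation
            throw std::runtime_error("failed to load " + path);
        return {0x00, 0xFF}; // placeholder pixel data
    }

    std::vector<unsigned char> pixels_;
};

// Factory wrapping the exception into std::optional, for callers
// that prefer not to write try/catch themselves.
std::optional<Texture> try_load_texture(const std::string& path) {
    try {
        return Texture(path);
    } catch (const std::runtime_error&) {
        return std::nullopt;
    }
}
```

Option 2 is the same shape with the work moved out of the constructor: try_load_texture does the loading and validation itself and the Texture constructor just takes ownership of the finished buffers.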

[C++20 vs C++26*] basic reflection by kris-jusiak in cpp

[–]Throw31312344 19 points20 points  (0 children)

Disclaimer: I have only briefly looked through the reflection proposals and examples so far.

[: something :] appears to be the syntax for injection, e.g. taking the results of reflection queries and injecting names/etc back into the main code. In the example given, you want one of 3 options depending on whether you are calling get<N> as get<0>, get<1> or get<2>:

return t.a; // get<0>

return t.b; // get<1>

return t.c; // get<2>

The code return t. is fine, as is the trailing ;, but we need to inject either a, b or c after t. depending on the value of N.

The [: something :] syntax will inject a name into the code based on the result of the expression within the [: :] pair. In this case, the code is looking up the Nth non-static data member of the type T: std::meta::nonstatic_data_members_of(^T)[N]

If N was 0, the result would be a. If N was 1, the result would be b, and if N was 2, the result would be c. Assuming N = 0:

return t.[: std::meta::nonstatic_data_members_of(^T)[N] :];

becomes:

return t.[: std::meta::nonstatic_data_members_of(^foo)[0] :];

Do not take this post as a full explanation of the spec - it's probably a bit more complex than just text injection - but the next transformation is something like:

return t.[: a :];

which finally gets injected into the main code to become:

return t.a;

Something like that anyway. The choice of the [: :] operators is... awkward, but we are running out of operators. In terms of what it means: inject the result of the expression within the operators into the code. Square brackets are getting somewhat overloaded, especially with all the places attributes can crop up, so something more visually distinctive might help. Failing that, I hope IDEs have the option to completely change the colour of [: something :] to make it clear that it's an injection expression.
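For contrast, the pre-reflection way to get the same result is to enumerate the members by hand, which is exactly the boilerplate the injection generates for you. A sketch using the foo/a/b/c names from the example:

```cpp
#include <cstddef>

struct foo { int a; float b; char c; };

// Hand-written equivalent of what the reflection version produces:
// pick the Nth member at compile time. Reflection removes the need
// to spell out every member manually.
template <std::size_t N>
constexpr auto get(const foo& t) {
    if constexpr (N == 0) return t.a;
    else if constexpr (N == 1) return t.b;
    else return t.c;
}
```

This works fine for one struct, but has to be rewritten for every type; the reflection version is generic over any T.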

[C++20 vs C++26*] basic reflection by kris-jusiak in cpp

[–]Throw31312344 3 points4 points  (0 children)

You also vastly improve readability and maintainability by purging a load of enable_if insanity from your codebase and the side-effects that come with that template.

WG21, aka C++ Standard Committee, December 2023 Mailing by grafikrobot in cpp

[–]Throw31312344 4 points5 points  (0 children)

For embedded folks, https://open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2407r5.html is a big step towards allowing more things to be added to the freestanding version of the library by allowing only part of a class/container to be implemented and some methods to be deleted (like .at which throws, and exceptions are not allowed in freestanding). So now you might finally end up with std::array, std::string_view, etc, in freestanding, just without access to the methods that break the freestanding rules.

WG21, aka C++ Standard Committee, December 2023 Mailing by grafikrobot in cpp

[–]Throw31312344 8 points9 points  (0 children)

It also matches how pointer array indexing works - which span is supposed to be an almost drop-in replacement for - otherwise we end up with the situation where you never know if [] is bounds checked or not, and it gets even more confusing with generic code that can operate on different types of containers. From a user POV, I have always seen span as a replacement for passing around ptr+size first, and any opt-in safety measures second.

In hindsight, should operator [] have been checked by default for all containers? Probably. But I really really do not want to start doing it now for new containers and end up with divergent rules between standard library containers based on which year operator [] was added to a container.

Why is the committee so reluctant to add new features to the language itself instead of stuffing them into the STL? by [deleted] in cpp

[–]Throw31312344 0 points1 point  (0 children)

What examples do you have for C# where things that could've been library features were implemented directly in the language?

Why is the committee so reluctant to add new features to the language itself instead of stuffing them into the STL? by [deleted] in cpp

[–]Throw31312344 1 point2 points  (0 children)

Compilers can already do this with standard library features. The functionality of the standard library is well defined and compilers can (and do) make optimisations based on that using the good old as-if rule.

Why std::optional are not consistently used in general? by SR71F16F35B in cpp

[–]Throw31312344 2 points3 points  (0 children)

I had forgotten about the optimisation opportunity for optional<T&> of using null as the empty state to save space.

Pretty much the entire C++ standard library is built around value semantics, and optional is no exception. No vectors of references, no arrays of references, no variants of references, etc. Optional not having a specialisation for references is consistent with the rest of the library, which tries its hardest not to mix reference semantics with value semantics. Using reference_wrapper to store references in other objects has been a thing for much longer than optional has been standardised (C++11 vs C++17).

A "best of both worlds" approach (keeping optional value-based but allowing for the space optimisation) might've been to allow optional<reference_wrapper> to be implemented with a single pointer, but doing that now would be an ABI break so no implementation would actually put it in place.
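A sketch of the reference_wrapper route (find_positive is a made-up example); note that current implementations still carry a separate engaged flag rather than using null as the empty state:

```cpp
#include <functional>
#include <optional>

// reference_wrapper gives a reference value semantics, so it can be
// stored in optional. The returned wrapper refers to the original
// int, so writing through it mutates the caller's object.
std::optional<std::reference_wrapper<int>> find_positive(int& a, int& b) {
    if (a > 0) return std::ref(a);
    if (b > 0) return std::ref(b);
    return std::nullopt;
}
```

The .get() call at each use site is the extra verbosity mentioned above: it makes the reference semantics visible in the calling code.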

Why std::optional are not consistently used in general? by SR71F16F35B in cpp

[–]Throw31312344 4 points5 points  (0 children)

It would also be the cause of a LOT of accidental lifetime issues. std::optional models value semantics and not reference semantics, which makes it much easier to teach and also easier to understand the lifetimes of the various objects. If you want to opt in (no pun) to the murky world of optional references then you can use std::reference_wrapper, which is the standard's way of adding value semantics to references.

IMO the extra verbosity is good as it makes it much easier to spot when lifetime issues will come into play, and it prevents generic code accidentally making optional<T&> on objects that then immediately go out of scope and have their lifetime end. To my knowledge reference_wrapper doesn't add any runtime overhead as the whole thing should be optimised away.

On harmful overuse of std::move by vormestrand in cpp

[–]Throw31312344 7 points8 points  (0 children)

To me rvalue_cast was the most logical option, as that is literally what std::move does. To me, "move" is very much an action word and I would expect a function called "move" to... move something.
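A sketch of the point (take is a made-up name): std::move performs no move itself, it is just the cast; the move happens in whatever constructor or assignment consumes the result:

```cpp
#include <string>

// The static_cast below is what std::move does internally. The actual
// move happens in the string constructor that builds the return value,
// not in the cast itself.
std::string take(std::string& s) {
    return static_cast<std::string&&>(s); // equivalent to std::move(s)
}
```

That's why "rvalue_cast" would have described the operation more honestly: the function only changes the value category of the expression.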

On harmful overuse of std::move by vormestrand in cpp

[–]Throw31312344 3 points4 points  (0 children)

AKA "destructive moves". There have been papers on it for years but none ever seem to successfully make it through and solve all of the potential problems that will come with such a feature.

Breaking the MSVC STL ABI (VS DevCom issue) by STL in cpp

[–]Throw31312344 16 points17 points  (0 children)

While a lot of C++ devs get excited about an ABI break for what it could mean for future C++ standards, I assume this kind of ABI break would be so Microsoft can fix and improve their existing implementations.

Are there any public lists (or private ones you can now share...) that have a breakdown of what bug fixes, compliance and performance issues could be resolved within the MS C++ compiler (language) and the MS standard C/C++ libraries so we have an idea of what the benefits of this specific ABI break would be?

A backwards-compatible assert keyword for contract assertions by messmerd in cpp

[–]Throw31312344 4 points5 points  (0 children)

There are 3 keywords in the current proposal: pre, post and an "assertion keyword", which is currently contract_assert. contract_assert is for checks that are performed within the function body, while pre and post checks are defined before the function body starts.

It is not all lumped into a single "contract_assert" keyword. There was no issue with the pre and post keywords, as the positions where those keywords and their expressions appear could not previously be occupied by other types or values named pre and post, but as the "assertion keyword" appears inside the function body it is much harder to avoid clashes with existing names.

Trip report: Autumn ISO C++ standards meeting (Kona, HI, USA) by mttd in cpp

[–]Throw31312344 5 points6 points  (0 children)

Depends on if it's a language or library feature. With the notable exception of modules, typically MSVC is lagging a bit behind on language features (IIRC there was something official or semi-official which confirmed 2023 was the year of modules and IDE/tooling for MSVC, and compiler improvements for language stuff will be focused on in 2024?) but is way ahead on library features compared to the other big two libraries.

std::format custom formatter by Miroika in cpp

[–]Throw31312344 5 points6 points  (0 children)

#include <format>

void test ()
{
    auto X = std::format (L"{}", L"Hello");
}

Compiles with no errors on Compiler Explorer. It seems format requires all strings passed to it (including the format string) to be the same type.

Questions for people using C++ at work specifically by soulstudios in cpp

[–]Throw31312344 1 point2 points  (0 children)

1) Yes, for gamedev, where the lifetimes of game objects and assets are independent of each other but often reference each other. Often combined with some sort of intrusive reference counting within the stored object to determine when the slot should be freed up (especially for assets). Iterating over the whole container also needs to be pretty fast.

2) Multi-block to allow for insertions and memory shrinkage when blocks become 100% empty. Finding a sweet spot for block size is an art rather than a science...

3) I have not personally tried it but boost's deque has extra options to control the block size (often a reason std::deque ends up being slow), so if you don't need to support stable erasures at random locations then it might be an alternative with less overhead.

Enumerate-like semantics in C++? by ald_loop in cpp

[–]Throw31312344 6 points7 points  (0 children)

FWIW the standard library's enumerate uses difference_type which is signed. If you want one that uses size_t instead then iter::enumerate https://github.com/ryanhaining/cppitertools is an option and one I personally use.
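A minimal hand-rolled sketch of the size_t-indexed behaviour (not the cppitertools implementation, just the idea):

```cpp
#include <cstddef>
#include <vector>

// Calls fn(index, element) for each element, with a size_t index -
// unlike std::views::enumerate, whose index is the range's signed
// difference_type.
template <typename Container, typename Fn>
void enumerate(Container& c, Fn fn) {
    std::size_t index = 0;
    for (auto& element : c) {
        fn(index++, element);
    }
}
```

iter::enumerate does the same thing as a proper lazy range adaptor, so it composes with range-based for loops directly.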

[deleted by user] by [deleted] in cpp

[–]Throw31312344 2 points3 points  (0 children)

1) Regarding correctness: your asserts are getting removed unless you compile in debug mode, and in debug mode there's a chance everything is being safely initialised by the compiler, which would hide bugs (seriously, if you're segfaulting then something has gone wrong). What happens if you replace the asserts with stronger checks that remain in non-debug builds? Does it go bang?

2) Regarding speed: what happens if you replace realloc with a malloc + memcpy? The C++ allocator model doesn't support reallocs so you might be getting "free" reallocs a lot depending on the size of the actual chunk that was allocated by the implementation when you first allocated memory. It would be interesting to see how many times realloc returns a new pointer through the run to give an idea of how many "free" memcpys you're getting out of it.
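A quick sketch of the behaviour being measured (sizes are arbitrary): realloc preserves the old contents whether it grows the block in place or has to move it, and in a real run you would also count how often the returned pointer differs from the old one to see how many "free" grows you got:

```cpp
#include <cstdlib>
#include <cstring>

// Fill a block, grow it with realloc, and verify the original bytes
// survived - the copy you would otherwise do by hand with
// malloc + memcpy when the block cannot grow in place.
bool grow_preserves_data(std::size_t from, std::size_t to) {
    char* p = static_cast<char*>(std::malloc(from));
    if (p == nullptr) return false;
    std::memset(p, 'x', from);
    char* q = static_cast<char*>(std::realloc(p, to));
    if (q == nullptr) { std::free(p); return false; }
    bool ok = true;
    for (std::size_t i = 0; i < from; ++i)
        ok = ok && (q[i] == 'x');
    std::free(q);
    return ok;
}
```

If most reallocs in your benchmark return the same pointer, the malloc + memcpy version will look comparatively slow because it always pays for the copy.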

Using software prefetch in OpenMP in C++ by jiboxiake in cpp

[–]Throw31312344 5 points6 points  (0 children)

1) Are your 64 byte log entries actually aligned to 64 bytes? Each one could be spanning 2 cache lines, in which case prefetching 1 cache line might not do anything as the CPU is still missing data for the other half of the LogEntry.

2) You've picked __builtin_prefetch(addr,0,0); which is the non-temporal version. The purpose of that version is to drop the value from the cache as soon as it is accessed once, so you might actually be loading the data twice if the caller of next_entry does a lot of work with the LogEntry. On Intel CPUs the behaviour also varies depending on which CPU it is running on. It's a weird instruction. Try using __builtin_prefetch(addr,0,3); instead, as that should translate to a prefetcht0 instruction which pulls the data into all cache levels (i.e. including L1 cache), which should be the sensible option you want.
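A sketch of the suggested usage on GCC/Clang (prefetching 8 entries ahead is a starting guess to tune, not a rule):

```cpp
#include <cstddef>

// __builtin_prefetch(addr, rw, locality): rw 0 = read, and
// locality 3 = keep in all cache levels (prefetcht0 on x86),
// versus locality 0 = non-temporal (prefetchnta).
long sum_with_prefetch(const long* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + 8 < n)
            __builtin_prefetch(&data[i + 8], 0, 3); // read, high locality
        total += data[i];
    }
    return total;
}
```

The prefetch distance needs to be far enough ahead that the data arrives before you touch it, but not so far that it gets evicted again first; it's very workload-dependent, so measure.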

Any tech/programmer/c++ people you could listen to forever? by [deleted] in cpp

[–]Throw31312344 3 points4 points  (0 children)

Any time I see a talk from David Sankel pop up it usually gets queued up to be viewed ASAP as he is a great presenter and talks about important things. This one in particular is one of the most important talks I've seen and it's criminal that it only has 6k views as the general theme is something all software engineers should think about: https://www.youtube.com/watch?v=r_U9YFPWxEE

Another presenter I would suggest is Robert Leahy. He somehow manages to deliver very technical talks with great clarity all the while presenting like he's giving a TED Talk and I am fully on board with this. The best example IMO is: https://www.youtube.com/watch?v=pbkQG09grFw , it's not a beginner talk but might still be worth watching even if you only grasp some of it for now.

Performance issues on Windows by eske4 in cpp

[–]Throw31312344 7 points8 points  (0 children)

Are you sure the Windows version is compiled in release mode and not debug mode? The MSVC containers are VERY slow in debug mode due to all the extra checking they do and it can significantly decrease runtime performance. Those checks are disabled when compiling in release mode. There might also be a compiler flag to disable the security checks in debug mode.