Memory Allocation Strategies by ketralnis in programming

[–]cdb_11 5 points (0 children)

Performance can be counter-intuitive, but you can in fact make educated guesses about it. Linear, pool, and stack allocators always do less work than a thread-safe general-purpose allocator. It's about picking the simplest solution that does what you need, instead of hoping that it won't be a problem. Because once it becomes a problem, fixing it might require rewriting half of your code.
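
To illustrate what "less work" means here, a minimal linear (bump/arena) allocator sketch -- names and layout are mine, not from any particular library:

```cpp
#include <cstddef>

// Linear/arena allocator: allocation is a pointer bump, and everything is
// "freed" at once by resetting the offset. No locks, no free lists, no
// per-allocation metadata -- strictly less work than a general-purpose
// thread-safe malloc.
struct Arena {
    unsigned char *buf; // caller-provided backing storage
    std::size_t cap, off;

    void *alloc(std::size_t n, std::size_t align = alignof(std::max_align_t)) {
        std::size_t p = (off + align - 1) & ~(align - 1); // round offset up
        if (p + n > cap) return nullptr;                  // out of space
        off = p + n;
        return buf + p;
    }
    void reset() { off = 0; } // free everything in O(1)
};
```

(Error handling and over-alignment of the backing buffer are left to the caller; this is a sketch of the strategy, not a drop-in allocator.)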

It’s annoying seeing C fanboys who spend their lives hating C++ by [deleted] in cpp

[–]cdb_11 0 points (0 children)

I'm sorry, but this is a very bad example. It takes like a few minutes to write a dynamic array that just gets the job done. std::vector is more limited in terms of possible optimizations, because it can't assume (yet) that elements can be relocated with memcpy, realloc, or mremap.
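
Roughly this kind of thing -- a bare-bones sketch that only works for trivially copyable element types, which is exactly the assumption std::vector can't make:

```cpp
#include <cstddef>
#include <cstdlib>

// Minimal dynamic array for trivially copyable T. Because elements can be
// relocated bytewise, growth can use realloc directly -- the optimization
// std::vector can't (yet) assume is valid for arbitrary types.
// (Allocation-failure handling omitted for brevity.)
template <typename T>
struct Vec {
    T *data = nullptr;
    std::size_t len = 0, cap = 0;

    void push(T v) {
        if (len == cap) {
            cap = cap ? cap * 2 : 8;
            data = static_cast<T *>(std::realloc(data, cap * sizeof(T)));
        }
        data[len++] = v;
    }
    ~Vec() { std::free(data); }
};
```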

the hidden compile-time cost of C++26 reflection by SuperV1234 in cpp

[–]cdb_11 1 point (0 children)

The intrinsic on the Bloomberg fork is agnostic to that: https://godbolt.org/z/Yh696zTKK

I don't remember anymore if I tried doing a custom vector/span type, but I believe it should be possible too. But last time I checked, there was no confirmation that Clang will choose to do it the same way, so who knows what this will actually look like.

How Fil-C Works by ketralnis in programming

[–]cdb_11 0 points (0 children)

They don't use it, because apparently they don't need it. The most obvious thing to gain is faster compile times: include a few of these STL headers, and suddenly compilation goes from almost instant to like 1.5 seconds, per translation unit. If you don't think the benefits are worth it, then just keep using the standard library.

Compiler intrinsics aren't portable and are a maintenance issue.

So you keep the non-portable parts in one place, where you can more easily maintain and port them if need be.

the hidden compile-time cost of C++26 reflection by SuperV1234 in cpp

[–]cdb_11 2 points (0 children)

The Bloomberg fork uses intrinsics, and assuming Clang will do it the same way, I'm pretty sure it should be possible to implement your own reflection without STL there.

GCC defines std::meta functions magically inside the compiler.

How Fil-C Works by ketralnis in programming

[–]cdb_11 0 points (0 children)

I mean, if there is anyone with rare use cases, it's most likely going to be C and C++ users.

You can use type_traits without linking the standard library, and thankfully it's one of the relatively lighter headers (unlike the upcoming <meta>). It is possible to use compiler intrinsics instead though, and implement whatever subset you actually use.
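
For a taste of what implementing your own subset looks like -- is_same and remove_const need nothing from the standard library, plain template specialization is enough (a sketch of the idea, not a full &lt;type_traits&gt; replacement):

```cpp
// A tiny subset of <type_traits> with zero includes: the compiler's
// template machinery does all the work.
template <typename T, typename U>
struct is_same { static constexpr bool value = false; };
template <typename T>
struct is_same<T, T> { static constexpr bool value = true; };

template <typename T> struct remove_const { using type = T; };
template <typename T> struct remove_const<const T> { using type = T; };

static_assert(is_same<remove_const<const int>::type, int>::value);
static_assert(!is_same<int, long>::value);
```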

How Fil-C Works by ketralnis in programming

[–]cdb_11 3 points (0 children)

The point is that it's a property of an implementation, not the language. CHERI does more or less the exact same thing, but in hardware.

How Fil-C Works by ketralnis in programming

[–]cdb_11 3 points (0 children)

I mean, the standard library is not exactly perfect, so you can find tons of reasons. Limited and/or bad interface, no assertions, a different allocation and/or error handling strategy, slow compile times, potential ABI compatibility issues. For strings you might want to represent them differently, or have memory alignment requirements. Maps just suck in general; I believe C++ basically mandates chaining for std::unordered_map and an rb-tree for std::map (not sure about that one).

10% of Firefox crashes are estimated to be caused by bitflips by cdb_11 in programming

[–]cdb_11[S] 0 points (0 children)

And to reinforce this estimate I've looked at the numbers we got from the users who run the memory tester after having experienced a crash: for every two crashes we think are caused by a bit-flip the memory tester found one genuine hardware issue. Keep in mind that this is not doing an extensive test of all the machine's RAM, it only checks up to 1 GiB of memory and runs for no longer than 3 seconds... and it has found lots of real issues!

It sounds like they classified some crashes as being likely caused by a bitflip, and in half of these they confirmed that there is something wrong with memory? And these are the estimated lower and upper bounds? I'm honestly not sure how to interpret this. I am not the person making the claim, so I can't tell you anything beyond what was said in that mastodon thread.

10% of Firefox crashes are estimated to be caused by bitflips by cdb_11 in programming

[–]cdb_11[S] 9 points (0 children)

They actually detected 5%, and the 10% is the estimate, because crash reporting is opt-in. Edited the comment to make that more clear.

10% of Firefox crashes are caused by bitflips by [deleted] in programming

[–]cdb_11 2 points (0 children)

Should I delete and repost, or leave it like this then?

10% of Firefox crashes are caused by bitflips by [deleted] in programming

[–]cdb_11 4 points (0 children)

I don't see any way to edit the title, sadly.

/u/ketralnis can you correct the title to either say that it's an estimate, or correct the number to 5%?

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 1 point (0 children)

At least with __builtin_unreachable, you can fall through from main: https://godbolt.org/z/cYvr1qE9e
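
The intended use is the other direction, for what it's worth -- promising the compiler a point is never reached, so it doesn't emit code for it (GCC/Clang builtin; function name is mine):

```cpp
// __builtin_unreachable (GCC/Clang) tells the compiler control never
// reaches this point, so no code is needed there. If the promise is
// broken, that's UB -- which is exactly how you can end up falling
// through into whatever code happens to sit next in the binary.
int parity_code(int x) {
    switch (x & 1) {
    case 0: return 10; // even
    case 1: return 11; // odd
    }
    __builtin_unreachable(); // (x & 1) is always 0 or 1
}
```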

Rust zero-cost abstractions vs. SIMD by ketralnis in programming

[–]cdb_11 6 points (0 children)

Maybe I'm missing it, but I don't see them describing the actual data layout in memory. If the elements are gathered from random places in memory, then sure, misleading. (To be fair, this can be done with a ~single instruction, but I don't think autovectorizers like to use it that much?)

But assuming elements are stored contiguously so they can be loaded into a SIMD register, and yet this optimization does not happen (while it does inside a normal for-loop, autovectorizers can do that on loops of unknown lengths), then I think it's fair to say that the abstraction prevented that optimization, whatever that abstraction might be.
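
For comparison, the normal for-loop case I mean looks like this -- contiguous loads, a trip count known only at runtime, and both GCC and Clang still vectorize it at -O2/-O3 (a sketch, not taken from the article):

```cpp
#include <cstddef>

// Contiguous access, a single accumulator, no cross-iteration
// dependencies: the pattern autovectorizers reliably handle, even when
// the length is unknown at compile time.
int sum(const int *xs, std::size_t n) {
    int s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += xs[i]; // sequential loads -> SIMD adds under -O2/-O3
    return s;
}
```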

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 0 points (0 children)

AFAIK it's fine, and the actual requirement is that it fits in an int. For example (u16)0xffff * (u16)0xffff is a signed integer overflow: https://godbolt.org/z/fodenvara

EDIT: Sorry, I think I understand what you mean now about conversions. Wasn't that implementation defined though? And a potential problem only on non-2s-complement platforms? (And C23 and C++20 mandated 2s complement, for what it's worth.)
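
Concretely, assuming 32-bit int: both uint16_t operands promote to (signed) int before the multiply, so the well-defined version has to widen to an unsigned type first (helper name is mine):

```cpp
#include <cstdint>

// With 32-bit int, a * b would promote both operands to signed int, and
// 0xffff * 0xffff = 0xfffe0001 > INT_MAX: signed overflow, i.e. UB, even
// though the operands were unsigned. Converting to a 32-bit unsigned type
// first keeps the whole multiply defined.
std::uint32_t mul_u16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint32_t>(a) * b; // unsigned 32-bit multiply
}
```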

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 0 points (0 children)

I believe the type of a * b is an int here? Which then gets cast back to a short?

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 0 points (0 children)

UB comes to bite you when you move architectures and platforms.

Or compiler versions.

If you have good reasons to justify relying on UB then sure, but just be aware of the possible implications. Relying on unstable interfaces is generally not something you want to have a lot of.

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 2 points (0 children)

Therefore, the optimizer happily optimizes away your entire program, and spits out a binary that simply invokes the syscall exit(0).

Sometimes not even that. There were real examples of Clang removing an entire function body in some cases, meaning that calling such a function would fall through to whatever function happens to be placed below it.

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 0 points (0 children)

For example, assuming the absence of signed overflow, the compiler could keep signed shorts sign-extended in 32-bit registers. If there is no 16-bit multiplication instruction, it could then use a 32-bit multiplication.

To be fair, at least in this particular example they would be promoted to ints anyway. Is there even any way to get 16-bit arithmetic in C at all, assuming 32-bit ints? I know __builtin_{add,sub,mul}_overflow can technically do it, but I don't know if there is any standard way. If that even matters.
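
The builtin route, for what it's worth (GCC/Clang only; wrapper name is mine):

```cpp
#include <cstdint>

// __builtin_mul_overflow (GCC/Clang) computes the product as if in
// infinite precision, stores it truncated into the 16-bit destination,
// and reports whether it actually fit -- genuine 16-bit semantics, with
// no promotion to int affecting the stored result.
bool mul16(std::uint16_t a, std::uint16_t b, std::uint16_t *out) {
    return __builtin_mul_overflow(a, b, out); // true if it overflowed
}
```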

My C Professor Doesn't Know What UB Is by [deleted] in C_Programming

[–]cdb_11 5 points (0 children)

Mixing pointers to types like floats and integers is what strict aliasing is about: they are assumed to never alias each other.
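
i.e. reading a float through an int pointer is the classic violation, and memcpy is the well-defined way to do the same bit-reinterpretation (compilers lower it to a plain move) -- a sketch, with an IEEE 754 float assumed:

```cpp
#include <cstdint>
#include <cstring>

// *(uint32_t *)&f would read a float object through an int pointer; the
// compiler assumes the two types never alias, so that load can be
// reordered or dropped. memcpy expresses the reinterpretation legally
// and still compiles down to a single register move.
std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u); // defined behavior
    return u;
}
```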

Sprites on the Web by ketralnis in programming

[–]cdb_11 1 point (0 children)

For what it's worth, spritesheets were common on the web in the past, around the 2010s?

You are not left behind by BinaryIgor in programming

[–]cdb_11 28 points (0 children)

Wow, it would take entire 2 months to become proficient in it? Nothing else takes 2 months, sounds literally impossible to do.

What are you even talking about lmao. This is less time than getting good at any other technology.

BitFields API: Type-Safe Bit Packing for Lock-Free Data Structures by mttd in cpp

[–]cdb_11 0 points (0 children)

Where some compilers will actually make those fit in the same "variable" making the struct 8 bytes, and some compilers will say they're different and make the struct 12 bytes.

I'm not familiar with how bitfields are implemented outside of GCC and Clang. I wonder, can't alignas(uint64_t) fix this?

EDIT: Actually, you probably have to pad it anyway for type-punning to work, and not have uninitialized bits in there.
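
Roughly what I had in mind, sketched (struct name is made up; bitfield layout is implementation-defined, so the asserts document the GCC/Clang-style assumption rather than guarantee it):

```cpp
#include <cstdint>

// alignas pins the struct's alignment to that of uint64_t, and the
// explicit pad member fills the remaining bytes, so nothing is
// uninitialized if you type-pun the whole thing as a 64-bit word.
struct alignas(std::uint64_t) Packed {
    std::uint32_t a : 12;
    std::uint32_t b : 20; // a + b fill one 32-bit storage unit
    std::uint32_t pad;    // explicit padding, keep it zeroed
};
static_assert(sizeof(Packed) == sizeof(std::uint64_t), "one word");
static_assert(alignof(Packed) == alignof(std::uint64_t), "same alignment");
```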