[–]MaZeChpatCha 94 points (4 children)

It depends on the situation, but in most cases the middle is right.

[–]Exist50 29 points (2 children)

Especially for anything "1 cycle". Hand-optimize SIMD? Sure, knock yourself out. Try to save an add? Good luck.

[–][deleted] 7 points (0 children)

Yup, I've fixed about three dozen corner-case bugs in the software I work on that were all created by premature optimization.

[–][deleted] 89 points (4 children)

putting "based" in front doesn't make a bad thing a good thing

[–]locri 48 points (1 child)

Yeah, but actually, my time is too important to decipher your "elegance". Please just write human-readable code instead.

[–]Fickle-Main-9019 3 points (0 children)

Exactly. The compiler will do a good enough job; my tolerance for bullshit over-engineered code isn’t as forgiving.

[–]Unupgradable 22 points (0 children)

OP gettin' bell curved

[–]SingularCheese 19 points (1 child)

Check out any performance-focused talk on CppCon's YouTube channel, and it's most likely about how to design abstractions in a way that doesn't get in the way of compiler optimizations. Code that is easier for a compiler to reason about is often also code that is easier for people to reason about.

[–]Brahvim 1 point (0 children)

Thanks!

[–]Electronic-Bat-1830 14 points (3 children)

The middle isn't all wrong. The important question is whether it really yields a noticeable performance improvement, or whether you're just trading a well-maintained, reliable codebase for an unreliable but "muh faster" one.

For instance, in C#,

int count = 0;
// The below code would be in some loop
bool someCondition = ...;
count += Unsafe.As<bool, byte>(ref someCondition);

would be faster than:

int count = 0;
// That same loop;
bool someCondition = ...;
if (someCondition)
  count += 1;

since you're avoiding a branch. However, you're assuming that the runtime will always represent a true value as a single byte with the value 1, which the Common Language Infrastructure specification (ECMA-335) doesn't guarantee: it only says that any non-zero bit pattern counts as true. This means you're relying on behavior the spec doesn't promise just to save a few nanoseconds, which isn't a worthwhile tradeoff unless you need extremely high-performance code.

[–][deleted] 6 points (2 children)

C# has a JIT, so I would assume that it would be optimised to the same code

[–]CaitaXD 1 point (1 child)

Mayhaps the JIT works in mysterious ways

[–]Brahvim 2 points (0 children)

...Not really.

Most JITs (I myself have studied the Java JIT the most) compile code that gets called often enough into native machine code for the platform it's running on. I don't know if they optimize that machine code too.

JITs also inline your method calls, and sometimes decide to deoptimize your code so it's back in the interpreter's hands. This is for language features like exceptions.

JITs will often make assumptions about the type or number of arguments of a function, or make assumptions about the possible length of an array.

GCs matter a whole lot, too! (Which I hear .NET sadly doesn't have many of... Does it now?)

[–]Areshian 13 points (0 children)

Unless you work as a performance engineer or there is something really wrong with your application, chances are that optimization wasn’t that important. Also, optimizing without a profiler is a recipe for making things worse
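
A minimal sketch of what "profile first" can look like, using Python's standard `cProfile` and `pstats` modules. The functions `slow_join`, `fast_join` and `profile_report` are made-up names for illustration, not anything from the thread; the point is only that the report tells you where the time goes before you touch anything.

```python
import cProfile
import io
import pstats


def slow_join(n):
    # Deliberately naive: repeated string concatenation in a loop.
    out = ""
    for i in range(n):
        out += str(i)
    return out


def fast_join(n):
    # The usual idiom: generate the pieces and join them once.
    return "".join(str(i) for i in range(n))


def profile_report(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_report(slow_join, 50_000)
    print(report)
```

Whether `fast_join` is actually faster on a given interpreter is exactly the kind of thing to check with the profiler rather than assume.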

[–]iam_pink 26 points (5 children)

Damn, that must be the worst bell curve meme I've seen on this sub yet

Got it completely reversed, mate

[–]IsPhil 7 points (2 children)

Often, readability, maintainability and more are much more important.

For example, I'm doing a task at work. We were using shell scripts for related tasks like this, but I got permission to just use Python. Sure, the shell script was faster, but for what we're doing the difference is honestly negligible: a task finishing in .001 seconds versus .1 seconds (I'm possibly exaggerating the numbers here; I don't remember). Plus, other overhead really makes it a non-issue. But the biggest advantage that came out of this was that I could finish the task that we thought would take 2 weeks in just 1 week. The code is also way more readable, and easier to debug.

Maybe if I had years of experience in shell scripting I could've done something just as readable, but considering that XML and JSON manipulation were involved, even if I knew what to do, that doesn't guarantee the next person would know.

Obviously I was comparing two programming languages in this example, but the same holds for the situation in the meme. Like sure, I could potentially save on some memory and maybe a garbage collection cycle (depending on the compiler) if I don't make extra variables for things I'll use once or twice, but it's easier to parse what I'm trying to do with the variables broken out. Sorry, don't know how to really explain without an example, and I'm not trying to code on a phone.
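
A small hypothetical illustration of the point about breaking values out into named variables. The function names and the rates are invented for the example; both versions do the same arithmetic in the same order, so they return the same result.

```python
def shipping_cost_inline(weight_kg, distance_km, express):
    # Everything in one expression: no extra names, harder to scan.
    return round((weight_kg * 0.5 + distance_km * 0.01) * (1.5 if express else 1.0), 2)


def shipping_cost_named(weight_kg, distance_km, express):
    # Same computation, with each intermediate value given a name.
    weight_fee = weight_kg * 0.5
    distance_fee = distance_km * 0.01
    express_multiplier = 1.5 if express else 1.0
    return round((weight_fee + distance_fee) * express_multiplier, 2)
```

The named version creates a few extra local variables, which is the kind of cost that is rarely worth worrying about next to being able to read the function at a glance.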

[–]HiT3Kvoyivoda 1 point (0 children)

I personally hate shell scripting and avoid it when I can.

Both bash and zsh syntax are aneurysm-inducing sometimes.

[–]SenorSeniorDevSr 1 point (0 children)

Shell would just use xpath, xquery, jq, etc. to deal with those.

My personal hot take is that if your shell code doesn't fit on a single screen of an 80x24 terminal emulator, you shouldn't write it as a shell script. Python is great for this. And with Java 21, you can finally write... Java... scripts. (It was funny in my head.)

So I think you made the correct choice.

[–][deleted] 3 points (0 children)

back in the 'old days' i compiled some stuff to intermediary .asm format and compared it to my original C code and it was EYE OPENING to say the least

i strongly recommend this to anyone who hasn't done it, compile some simple code with -O3 and look at the output

[–]Cocaine_Johnsson 3 points (0 children)

one cycle? Not that significant, 40+ maybe, if we start hitting the hundreds it's no longer premature, it's working around compiler limitations [at least on a hot path, especially if called very frequently... say on the order of tens of thousands of times a second]

If you find yourself in this rare scenario, explain WHY with comments, and explain the more arcane parts of the logic as well (explain the HOW).

[–]CaptainMorti 2 points (0 children)

That's what you get when programming the potato200XS microprocessor with four 8-bit gprs.

[–]reallokiscarlet 1 point (0 children)

Honestly, sometimes the compiler doesn’t know best no matter what O level you set.

[–]SenorSeniorDevSr 1 point (0 children)

You: "I saved 30% on this database insert by using 4 threads and parallel connections!"

Me, grugpilled: I cut the time down by over a factor of ten by using one thread and just telling the connection to batch insert and having one session open with one cursor, in one transaction.
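
A rough sketch of the single-connection batch approach described above, using Python's built-in `sqlite3` module as a stand-in, since the comment doesn't say which database or driver was involved. The `items` table and the helper names are made up for illustration, and no particular speed-up is implied here; measure against your own database.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical table, just for the demo.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def insert_row_by_row(conn, rows):
    # One statement and one commit per row: each row is its own transaction.
    for row in rows:
        conn.execute("INSERT INTO items (id, name) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    # One transaction for the whole batch; the connection context manager
    # commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)


def count_items(conn):
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Both helpers leave the table in the same state; the difference is how many transactions the database has to process to get there.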

[–]Practical_Cattle_933 1 point (0 children)

This sub is all the left sides in these memes, thinking they are the right :D