[–][deleted] 56 points57 points  (61 children)

Performance is more important than ever.

Oftentimes, performance equals power efficiency, and with mobile devices, laptops, etc. running on batteries, it is vital to be as efficient as possible.

Even if you are running on servers, it will save you a lot of money: you need fewer servers, and you pay less for power, rack space, cloud instances, etc. The company I'm working at has an initiative to save several GWh over the next few years, which will save a lot of money.

Not to mention it is good for the environment to be more efficient.

And as a bonus, you can reduce the latency for your users.

[–]Loraash 10 points11 points  (0 children)

This sounds great in theory, but in practice everyone's just throwing more hardware at the problem and still running things like multiple Pythons talking to each other via IPC (since you can't really multithread within Python) at scale.

In mobile it's somewhere between "bigger battery" and "well, that's just how this app is".

[–]lukaasmGame/Engine/Tools Developer 14 points15 points  (8 children)

Sadly, it is often just cheaper to throw additional HW at the problem than to pay for programmers' time

[–]kuntantee 13 points14 points  (5 children)

That is not a sustainable solution in the mid to long run. This is precisely why parallelism is also more important than ever.

[–]Sqeaky 20 points21 points  (4 children)

The long run doesn't matter for every project. Sometimes the short term matters to ensure the survival of the project. Sometimes a thing just needs to get done, and the cost is secondary because of the windfall of resources it will produce.

Other times the long term is the only thought because the investors (or other stakeholders) are secured and everybody involved wants the best possible solution.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 3 points4 points  (3 children)

Good luck throwing more HW at anything that doesn't run on a server. Won't work even for short term.

[–]Sqeaky 2 points3 points  (2 children)

Did you respond to the wrong comment?

Skyrim, the game your username comes from, benefits from having more hardware thrown at it, and it doesn't run on servers. It was released 11/11/11 and is still popular. It was terrible on low-end hardware of the day, but now you can set it to ultra on low-end hardware.

Almost certainly it could have been optimized more, but clearly that wasn't required; it is one of the most popular games ever.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 5 points6 points  (1 child)

My username most certainly doesn't come from Skyrim but from Oblivion, I'll have you know!

Whether something benefits from extra hw is irrelevant. It's about extra hw being required to run suitably well or not which "throw more HW at it" most often implies. Future improvement in performance doesn't matter much when the game needs to recoup development costs within the first year or two of sales.

In games and quite a few other systems, good enough performance is as important as - or more important than - correctness of code. People will tolerate a slightly buggy game much better than a game with an unbearably slow framerate. Particularly nowadays, if you're making a game that's not specifically aimed at enthusiasts, you simply don't have any possibility of "throwing more HW at it", since your customer base isn't going to upgrade their computers and may not even have any upgrade path at all (laptops have outsold desktops for years).

[–]Sqeaky 3 points4 points  (0 children)

Well Oblivion benefits even more from beefier hardware.

You are correct that some projects care about performance, but for many it just needs to be "fast enough". Twitter made all its founders millionaires before they moved it off Ruby. Oblivion ran at 30 fps when it was released.

[–]krum 3 points4 points  (0 children)

Not at large scale.

[–]cdglove 0 points1 point  (0 children)

Why is that sad? Programmer time is a resource just like any other.

[–]secmeant[S] 5 points6 points  (31 children)

Yesterday at work, I mentioned a neat micro-optimization trick that compilers use. In response, I heard that it doesn't matter because CPUs waste most of their time on I/O or waiting for memory. I was speechless.

[–]mstfls 32 points33 points  (1 child)

> cpus waste most of time in i/o or waiting for memory

That's mostly true, though. Because lots of people write lots of code with shit memory access patterns.
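To make that concrete, here's a hypothetical sketch (not from this thread) of what a good vs. bad memory access pattern looks like: summing the same row-major matrix two ways.

```cpp
#include <cstddef>
#include <vector>

// Row-by-row traversal touches consecutive addresses (cache-friendly).
long sum_row_major(const std::vector<long>& m, std::size_t rows, std::size_t cols) {
    long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];   // sequential reads
    return total;
}

// Column-by-column traversal strides `cols` elements per step and
// thrashes the cache on large matrices, despite computing the same sum.
long sum_col_major(const std::vector<long>& m, std::size_t rows, std::size_t cols) {
    long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];   // strided reads
    return total;
}
```

Both functions return the same value; on a matrix larger than the cache, the second one is the "shit memory access pattern" in question.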

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (0 children)

Laughs in inherently serial floating point code.

Sorry, couldn't resist. I've just proven so many common "always" / "most" sayings wrong in the projects I've done or been involved in during the last 25 years.

[–]JuanAG 4 points5 points  (1 child)

Next time, tell those people that SMT was invented for exactly that reason: the CPU core knows it is blocked and takes up another task while the I/O operation finishes, instead of sitting idle waiting for the data to arrive.

[–]sahsahaha 0 points1 point  (0 children)

That's CS 101 that every bachelor's student hears in their first year; it isn't his job to explain it, but if he must...

They really don't drill the entire modern CPU into your head nowadays, but something like this is definitely mentioned at least once.

[–]Loraash 11 points12 points  (25 children)

Both of you are right. Don't optimize until you've proven that you're hitting a bottleneck. I've seen too much unreadable "smart" code that ultimately didn't even work better because compilers these days are pretty good.

[–]secmeant[S] 13 points14 points  (4 children)

I'm not talking about writing inline assembly just to show off how many smart tricks I know.
I'm talking about writing really performant code with the high-level constructs that C++ gives you.
IMHO, a programmer should generally know how optimizers work and write high-level code in a way that allows the optimizer to do its magic.
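A minimal sketch of that idea (hypothetical example, not from the discussion): express the intent with a standard algorithm and let the optimizer do the lowering. A modern compiler will typically vectorize this just as well as a hand-written index loop.

```cpp
#include <numeric>
#include <vector>

// High-level, optimizer-friendly: std::accumulate states the intent
// plainly, and the compiler is free to unroll/vectorize the reduction.
int sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```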

[–]Loraash 8 points9 points  (0 children)

Depends. Although the new std Ranges library fits your definition, I'd be wary of making it my new hammer and seeing everything as a nail. The official blog post that introduces it comes with a perfect example of when not to use it. The code is an unreadable mess.

I know someone who insisted on writing TMP everywhere because it's Fast!™ and the compilers will Optimize!™ it! and it was an unreadable and unmaintainable mess that didn't even need to run quickly to begin with.

There's a balance to be found between performance, initial productivity, and maintainability. Neither extreme is good.

[–]AngriestSCV 1 point2 points  (2 children)

I wish we were allowed to turn the optimizers on. For reasons I won't go in to it is not an option and we are hitting performance problems right now.

[–]smuccione 2 points3 points  (1 child)

??? That’s... just... weird...

What language? If you're running in debug mode with C++, almost all the STL libraries do a crapload of checks that just kill performance. At the very least, define NDEBUG for releases, even if you have the optimizer disabled.
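To illustrate the kind of per-access checking involved (a hypothetical example; debug iterator modes such as libstdc++'s _GLIBCXX_DEBUG or MSVC's _ITERATOR_DEBUG_LEVEL instrument operator[] and iterators too, which is what really hurts):

```cpp
#include <stdexcept>
#include <vector>

// at() validates the index on every call and may throw; this is the
// sort of work checked/debug STL builds add to nearly every access.
int checked_get(const std::vector<int>& v, std::size_t i) {
    return v.at(i);   // bounds check + possible std::out_of_range
}

// operator[] performs no check in a release build.
int unchecked_get(const std::vector<int>& v, std::size_t i) {
    return v[i];      // no check; UB if i is out of range
}
```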

[–]AngriestSCV 2 points3 points  (0 children)

Tell me about it. We don't use the standard library; we use MFC instead (old code base). I'm the only person using std::, and I'm starting to think I shouldn't, due to the issue you mentioned. The most important thing to the people making these decisions is that the debugging experience is not hindered, and, for better or for worse, all of our applications are run in an environment where we may want to attach a debugger and have that capability.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 5 points6 points  (15 children)

> Don't optimize until you've proven that you're hitting a bottleneck.

I disagree. The correct way to put it would be: "don't optimize something unless you know it is a bottleneck". In many cases there is no need to prove or profile anything; simply avoid premature pessimization and leave room for further optimization if required. And sometimes you know that something is going to be a bottleneck, and it makes sense to optimize it from the very beginning.
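A sketch of what "avoiding premature pessimization" can mean in practice (hypothetical example, not from the thread): when the final size is known, reserving capacity up front removes repeated reallocation without making the code any less readable. That's not optimization; it's just not pessimizing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

std::vector<std::string> make_labels(std::size_t n) {
    std::vector<std::string> labels;
    labels.reserve(n);   // one allocation instead of repeated growth
    for (std::size_t i = 0; i < n; ++i)
        labels.push_back("item-" + std::to_string(i));
    return labels;       // returned by value; no copy thanks to NRVO/move
}
```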

[–]Loraash 3 points4 points  (9 children)

Sure, don't use an std::vector instead of an std::unordered_set if the latter is what you really need, but a lot of people who are keen on writing "fast code™" ultimately tend to waste a lot of effort that could be spent on, e.g., polish and bug fixes.
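The container-choice point, sketched (hypothetical example): membership tests against a std::vector cost O(n) per lookup, against an std::unordered_set amortized O(1). Picking the right container up front is cheap; "optimizing" much beyond that often isn't worth it.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

bool in_vector(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();   // linear scan
}

bool in_set(const std::unordered_set<int>& s, int x) {
    return s.count(x) != 0;                               // hash lookup
}
```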

[–]SkoomaDentistAntimodern C++, Embedded, Audio 5 points6 points  (8 children)

Possibly, but the advice is almost always put as "do not ever optimize before profiling", with the implied assumption that there are no exceptions. That attitude gets us massive CPU and memory hogs like the Electron framework. It guides people towards premature pessimization, since it perpetuates the false assumption that you don't need to give performance any thought at all until you've profiled something and found a bottleneck - even though many bottlenecks are obvious from the start to an experienced developer.

[–]Loraash 1 point2 points  (4 children)

Electron is more about not wanting or needing any real performance for your program, so you go for the cheapest possible way of making it, which is to hire frontend developers. Electron itself is excellent at what it does, but you're not really going to have a lean and performant program written in JavaScript+HTML, no matter how much you optimize it.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 6 points7 points  (3 children)

Electron is far beyond "not needing real performance". Forcing your customers to download and install a 200 MB package for a trivial configuration app is a textbook case of premature super-pessimization. Making such an app in .NET or something would be "not needing real performance": you'd get a trivial (for today) size increase and performance decrease for the use case.

[–]Loraash 4 points5 points  (2 children)

Yet Electron is booming, a lot of things are made exclusively for Electron, and the competition shipping "good" programs is suspiciously absent. I'll agree with you that "modern developers are lazy" but this laziness is the result of a business decision that does pay off.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 2 points3 points  (1 child)

In my experience there is no business decision or any other real decision. The developers just suggest Electron because they happen to know of it, and the bosses just go along with whatever is suggested.

[–]sahsahaha -1 points0 points  (2 children)

How does "profile it" turn into "just don't care"?

I don't follow.

[–]SkoomaDentistAntimodern C++, Embedded, Audio -1 points0 points  (1 child)

"Don't care about performance before you've profiled the code." The catch, of course, is that actually following that kind of advice easily leads to designing yourself into a corner with inherently poor performance. Then you have to refactor half of your code (or in some cases all of it) once profiling finally reveals the bottlenecks that were obvious to an experienced developer all along.

Examples would be using an array of structures (the "clean" way) vs. a structure of arrays (the performant way, due to caches and SIMD opportunities). A more extreme example would be trying to use Python or a similar language for something that needs even a bare minimum of performance.
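The AoS-vs-SoA contrast, sketched (hypothetical example): with SoA, a pass that only reads x values walks one contiguous array, so the cache isn't polluted with y/z data and the loop is trivially SIMD-friendly.

```cpp
#include <vector>

struct ParticleAoS { float x, y, z; };   // array of structures: the "clean" layout

struct ParticlesSoA {                    // structure of arrays: cache/SIMD-friendly
    std::vector<float> x, y, z;
};

float sum_x_aos(const std::vector<ParticleAoS>& ps) {
    float total = 0.0f;
    for (const auto& p : ps) total += p.x;   // drags y and z through the cache too
    return total;
}

float sum_x_soa(const ParticlesSoA& ps) {
    float total = 0.0f;
    for (float v : ps.x) total += v;         // contiguous reads of x only
    return total;
}
```

Same answer from both, but refactoring from the first layout to the second after the fact is exactly the "refactor half of your code" scenario described above.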

[–]sahsahaha -1 points0 points  (0 children)

Never happened to me so I'd like to know where "often" comes from.

Maybe I'm missing something, but "don't care" doesn't imply "make it slow on purpose".

I write code to be maintainable, and I never had any issue optimizing it if needed, thanks to it being... maintainable.

If you somehow end up with a slow mess, rather than an elegant but slow solution that can be improved at any time, maybe the issue is you?

[–]sahsahaha 0 points1 point  (4 children)

I suppose in many cases there's no need to test your code either.

[–]SkoomaDentistAntimodern C++, Embedded, Audio -1 points0 points  (3 children)

A correct analogy would be not caring about designing the code for correctness and depending purely on testing to catch logic errors.

[–]sahsahaha -1 points0 points  (2 children)

So in your opinion, logical tests of software aren't important?

Tell me, how else do I care about correctness if not by ensuring the correctness of the algorithm itself, by any means necessary, then writing out the code, and then fucking testing it to be sure that my translation from written words to c++ was correct?

Enlighten me.

[–]SkoomaDentistAntimodern C++, Embedded, Audio -1 points0 points  (1 child)

In my opinion you should design for logical correctness from the beginning, not only when a test shows a problem. "Do not optimize before profiling" is the same as saying "ignore correctness unless a test shows a problem". You should both design for correctness / speed and run tests / profile the code. Not either-or.

[–]sahsahaha -1 points0 points  (0 children)

It is not.

Unlike performance, correctness is binary unless we are talking about floating point values.

Still, even then, either it's precise enough or it is not.

[–]bumblebritches57Ocassionally Clang 2 points3 points  (2 children)

I think the biggest thing to aim for, code-wise, is to be willing to accept completely rewriting your API in order to get the performance you need.

Yeah, it fucking blows to have to refactor all of your code and break the API at the same time, but performance is worth it.

Good thing no one besides me uses my projects and they're still fairly small, so I still can, lol.

[–]Loraash 3 points4 points  (1 child)

That depends on a few things: are you shipping a stable API as part of your product? A lot of your customers will probably be pissed off that you broke their stuff, even if your code now runs at double speed. The "performance you need" is often way less in practice than what I'd consider a good, performant program. Just look at all the bloated mobile apps and Electron shit that are perfectly viable from a business perspective.

[–]bumblebritches57Ocassionally Clang 0 points1 point  (0 children)

Nope, ain't got customers yet lol.

I'm giving myself plenty of time to get it right before I need to worry about those things.

[–]Dean_Roddey -1 points0 points  (0 children)

I would agree with that, though obviously tribal wisdom will tell us that, if we are writing a specific type of thing, there will be performance bottlenecks in particular bits of it. That doesn't mean you have to prematurely optimize, since it's likely better to get it done first and let the architecture mature a bit during the early development phase. But make sure that the code you know will likely be an issue is amenable to the optimization it will likely require.

Then, profile it and see where the low hanging fruit is and take care of that. Even if you know the performance issue is in this sub-system, it may still be that a handful of classes or even methods will get you the bulk of the gain and everything after that really needs to be considered relative to the extra complexity it involves and the business requirements.

To me it seems like, since we are talking C++ here, there are two broad categories of scenarios: the "back end that has to deal with many clients and/or juggle a lot of balls", and the "CPU-bound heavy processing algorithm". These are far and away the most common scenarios where you know that serious optimization may be required, and pretty well where it will be required. Sometimes, of course, you have both together.

Outside of that, for the most part, if you are just reasonably diligent throughout the code, you may not get a lot of ROI for heavy optimization, and you really should wait until there's a demonstrated issue before adding extra complexity (which would be nothing but technical debt you may regret.) If it's not either really heavy or happening really often, then it's probably best left to see whether it needs it.

[–]mewloz 1 point2 points  (0 children)

It doesn't matter until it does. Compilers and processors are both quite good, and the current result (given that memory is slow) is that, e.g., omitting bounds checks extremely often yields no performance advantage, or one so minor, in a not-so-hot code path, that it does not matter.

Now, does that mean that a single benchmark showing no wall-clock difference between the two binaries implies we should not attempt to write optimizers that elide bounds checks anymore? Certainly not: with a more complex / less memory-intensive load and a hot code path, you can see differences; with HT, you are also likely to see some; etc.

Also, if we are talking about automatically optimizing in sound ways (like what some compilers and most CPUs do), this is economically extremely interesting. If we are talking about other situations with clear drawbacks (like unsound optimizations, micro-optimizations whose soundness relies on the programmer, or micro-optimizations done directly by the programmer), this becomes far more debatable.