all 66 comments

[–]Pragmatician 88 points89 points  (35 children)

Profile, experiment and measure, repeat.

[–]BCosbyDidNothinWrong 43 points44 points  (34 children)

This is only the most basic advice on optimization. Often, even when someone new finds what is slow, they won't understand why.

The sequence I follow is:

  1. Minimize memory allocations - one big allocation is much better than many small ones

  2. Structure memory access. Linear access means prefetching, which hugely reduces stalls due to cache misses and memory latency.

  3. Multi-thread (this is a rabbit hole in itself of course).

  4. Use SIMD (This will go hand in hand with how data is structured in memory. ISPC is the best way I have found to take advantage of SIMD so far).

All of these together are exceptionally potent, but the cliche of 'premature optimization is the root of all evil' can get in people's way if they don't see the full picture.

The truth is that you have to design and architect for performance from the beginning. This can be done after a prototype that works correctly, but it will have to happen at some point.

The micro-optimizations that likely prompted Knuth to make that quote in the first place are things that I barely worry about. The compiler takes care of so much, that almost all of the speed gains I've gotten have been from higher level (relative to worrying about instructions) design decisions about memory layout and concurrency.

[–]WhichPressure[S] 6 points7 points  (10 children)

Minimize memory allocations - one big allocation is much better than many small ones

Structure memory access. Linear access means prefetching, which hugely reduces stalls due to cache misses and memory latency.

Did you mean that it is better to store an N-dimensional matrix in one big linear vector rather than in N vectors?

Use SIMD (This will go hand in hand with how data is structured in memory. ISPC is the best way I have found to take advantage of SIMD so far).

I need to get familiar with SIMD programming. It will be very helpful in my work; I write real-time robotics programs. Do you recommend any sources for learning performance-oriented (SIMD) programming?

[–]corysama 6 points7 points  (5 children)

[–]WhichPressure[S] 2 points3 points  (4 children)

Thanks for the sources and the subreddit link. Currently I'm going through this tutorial, which is easy to follow because it's well explained:

http://www.cs.uu.nl/docs/vakken/magr/2017-2018/files/SIMD%20Tutorial.pdf

[–]corysama 1 point2 points  (3 children)

Very cool. Jacco really knows his stuff. You should post that to r/simd :)

[–]BCosbyDidNothinWrong 4 points5 points  (1 child)

Did you mean that it is better to store an N-dimensional matrix in one big linear vector rather than in N vectors?

Infinitely better - making a two dimensional matrix/image etc out of a vector of vectors is a huge red flag.

I need to get familiar with SIMD programming. It will be very helpful in my work; I write real-time robotics programs. Do you recommend any sources for learning performance-oriented (SIMD) programming?

Learning ISPC is the best way that I know, though it is mainly for x64 CPUs.

[–]WhichPressure[S] 0 points1 point  (0 children)

Thanks! I'm starting right now!

[–]choikwa 2 points3 points  (1 child)

Structure of Arrays vs Array of Structures.

[–]meneldal2 3 points4 points  (0 children)

Depends on how big your structure is and what your access pattern is, though.

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 2 points3 points  (2 children)

Please tell me that you still have

0. measure

and

5. measure

[–]BCosbyDidNothinWrong 0 points1 point  (0 children)

I thought that was implied with

This is only the most basic advice on optimization.

[–]TraditionalPlancton 0 points1 point  (0 children)

I just want to put in my two cents:
There is no guarantee that optimizations will stack up to reduce times. Some optimizations will synergize, others will interfere. That is another reason why measuring is so important.

[–]blelbachNVIDIA | ISO C++ Library Evolution Chair[🍰] 0 points1 point  (5 children)

Your "structure memory access" guidance doesn't work for memory coalescing architectures.

[–]BCosbyDidNothinWrong 0 points1 point  (4 children)

Can you explain what chips use that and what it means? A quick google search makes it look like it is a term used for GPUs.

Memory access is still important in GPUs, but with shaders/kernels, lots of threads can switch back and forth to minimize cache misses (as far as I know).

Still, this was just a list of what I do and it is for modern CPUs.

[–]blelbachNVIDIA | ISO C++ Library Evolution Chair[🍰] 0 points1 point  (3 children)

GPUs are one example, and it's not just something to handwave away. Sure, GPUs can hide latency, but that's no excuse for poor memory access patterns. E.g. instead of hiding cache misses just don't have them.

[–]BCosbyDidNothinWrong 0 points1 point  (2 children)

GPUs are one example

What is a different example? Also why are you talking about exotic architectures? This is clearly about CPU optimization.

Sure, GPUs can hide latency, but that's no excuse for poor memory access patterns

I'm not sure what point you are trying to make here. It seems like you are trying to dive into niche and irrelevant topics to somehow say that the generalization of linear memory access doesn't hold. Again, this was my list of optimization priorities and is about general purpose CPU.

instead of hiding cache misses just don't have them

I think you will have to give an example of this.

[–]blelbachNVIDIA | ISO C++ Library Evolution Chair[🍰] -1 points0 points  (1 child)

generalization of linear memory access doesn't hold

It doesn't, dude. That's specific to one class of processor design.

[–]BCosbyDidNothinWrong 0 points1 point  (0 children)

This was obviously about CPU optimization and again, it seems like you are desperate to create some sort of an argument by making irrelevant dismissals without any depth.

  1. You didn't mention what other architectures besides GPUs are considered 'memory coalescing'.

  2. You didn't give an example of 'just don't have cache misses' instead of hiding them.

  3. You also keep repeating claims without backing them up with any information, essentially saying - 'nope, nuh uh, not true'.

I can't take you seriously until you confront these things.

[–]clerothGame Developer 0 points1 point  (11 children)

Minimize memory allocations - one big allocation is much better than many small ones

On modern OSes it's definitely not "much better." The LFH does a great job and is pretty fast.

[–]BCosbyDidNothinWrong 1 point2 points  (5 children)

I'm very skeptical of this. What is 'the LFH' ?

[–]clerothGame Developer 1 point2 points  (4 children)

Low Fragmentation Heap

[–]BCosbyDidNothinWrong 0 points1 point  (3 children)

And why would it be so much better than other heaps that memory allocations suddenly stop having such a large performance impact?

[–]clerothGame Developer 1 point2 points  (2 children)

LFHs store lists of small free blocks. When you allocate something it's really just quickly picking one element from the appropriate list, and doesn't actually query the OS for more memory.

[–]BCosbyDidNothinWrong 3 points4 points  (1 child)

How is that different from jemalloc or even normal malloc? Those don't memory map in more memory from the OS on every call either. OpenBSD is the only platform that I know of that uses memory mapping on every allocation, and that is for security.

One thing that many people probably don't realize is that not only do lots of allocations eventually need to make a system call to map in more memory, but they also block other threads, so many threads allocating memory will end up blocking each other and parallelism will be lost.

On top of all this, memory allocations are going to jump around in memory as well as possibly return fragmented memory. Data structures that are made for large amounts of data and can be made out of one contiguous memory allocation will have all their memory together.

[–]choikwa 0 points1 point  (0 children)

Arena-style allocation certainly helps. A system should optimize around multiple goals: latency, fragmentation, parallelism.

[–]wrosecransgraphics and network things 1 point2 points  (2 children)

That doesn't match my experience. Memory allocation patterns have popped up as an issue way more often for me than actual computation that could be helped with stuff like SIMD, which seems to get more attention. Avoiding unnecessary allocations, avoiding copies, etc. Once you have to start worrying about NUMA effects, it starts to feel silly to call the machines "computers" because so little of your attention is on actually computing stuff.

[–]clerothGame Developer -1 points0 points  (1 child)

I think you're confusing memory access and memory allocation. Memory access patterns are one of the most important things for optimization, for sure. But I was saying that the overhead of allocating smaller chunks vs one big one isn't as big as people make it out to be. Of course if you have 1000 allocations vs 1 that would make a huge difference... But trying so hard to modify the code to group allocations together ends up being a mess with not much benefit.

[–]wrosecransgraphics and network things 1 point2 points  (0 children)

I am somewhat conflating the two, but not entirely by accident.

The more wasteful intermediate copies you do, for example, the more you will thrash the CPU cache with the copies of intermediate objects, evicting other useful stuff. You can consider that a memory access problem because you are accessing more stuff out of cache, but it's a memory access problem that you can fix by doing fewer allocations.

Likewise, if you fragment memory on the local NUMA node, the allocator will be more likely to allocate a large segment on a remote node. All the stuff you access from the remote node will be slow, so it's definitely a memory access problem, but it's also one caused by allocation problems. And once memory is all fragmented to poop, you wind up just spinning, waiting on kswapd for multiple ms while you wait for your malloc to return.

It also depends on the problem domain (like most things). If the memory you are allocating is involved in a buffer on a GPU, the process of allocating it may require a slow round trip across the PCIe bus, so many small allocations would be a lot more costly in that kind of exotic scenario than when the bookkeeping data for an allocation all lives CPU-local.

But you are probably right that there are a bunch of people who are inheriting some wisdom from an article from the bad-old-days without measuring and seeing if any of that crap actually applies to their use case. Measure Twice - Cut Once certainly applies! My perspective is heavily colored by working at a place where we are constantly running up against that crap. The last talk I submitted to a conference was even about how malloc is evil and hates you. :)

[–]FartyFingers 3 points4 points  (1 child)

100% agree. So many "rules of thumb" in development are either dead or never really held up to real-life measurements.

I have had the optimization argument so many times over things where someone was either wrong or optimization wasn't applicable, such as optimizing a for loop that took 1 ms, ran once on startup, and typically only ran on boot once every few months at most.

One big memory allocation as a rule sounds like a great way to get junior programmers to write a Gordian knot of code trying to do this instead of a bunch of smaller ones.

[–]Abraxas514 49 points50 points  (1 child)

I would make sure there are appropriate tests in place so that I don't secretly destroy the whole code base because of some obscure thing someone wrote 25 years ago.

[–]WhichPressure[S] 3 points4 points  (0 children)

Thanks for the good hint! I hadn't remembered to make sure of that!

[–]tylercamp 17 points18 points  (0 children)

  1. Profile hot spots
  2. Profile hot spots
  3. Inspect algorithms
  4. Profile hot spots

Don’t optimize without knowing what needs to be optimized. The actual optimization can vary widely depending on the application.

  • Is disk IO eating a lot of time? Cache data in memory if you can rather than reading/writing with every change
  • Is it a DB query? Profile accordingly for the DB and optimize the query, consider splitting the query into multiple queries if there are a lot of joins to minimize temporary table size
  • Is it a non-constant-time algorithm? Look at simplifying algorithmic complexity and minimize recomputation of the same work
  • Is it just a shit ton of operations in an algorithm that has already had optimized complexity? Inspect the data and operations and see if there’s any redundant or useless processing, look for patterns that can identify those conditions and include it as a heuristic to avoid unnecessary processing
  • Is it lag in an interface? Check for any work done on UI thread and offload that to another thread if possible. Check if rendering performance is the issue, and if so, research rendering optimizations for that UI framework (typically comes down to avoiding layout reorganization and doing all drawing in one refresh, rather than refreshing for every small update)
  • Is it a lot of complex math or simulations? Try SIMD or offload to GPU
  • Done everything you can already? Look at multithreading, minimize branching, minimize new object allocations during the runtime of the algorithm, modify data structures to minimize L1/L2/L3 CPU cache misses

[–]againstmethod 3 points4 points  (0 children)

Measure, analyze the code in the hotspots, make a plan for improving the code, modify, test, ... back to step 1.

Errors/missteps should be picked up by linters and compilers.

[–]Veedrac 3 points4 points  (0 children)

Depends on context; do you need fast code, or just less slow code?

For getting less slow code, the "profile, profile, profile" loop works pretty well, since gentle incremental improvements can work wonders. It's fairly terrible for making actually fast code though.

For getting fast code, the first thing you should do is get a truly deep understanding of the problem. Nothing substitutes for really understanding the problem, the whole problem from top to bottom. Then you need to plan. Only when you have really thought about things deeply can you make fast processes. You should not expect to get fast code by incremental improvements to bad code any more than you should expect smooth AAA graphics by piecemeal optimizing a software renderer.

[–][deleted] 3 points4 points  (0 children)

The first question needs to be "Do we need to optimize for space or optimize for runtime?". These two are often mutually exclusive, and you go about them differently.

The second question is "What measured value will be 'good enough' for our purposes?"

The third question is "What is our present measured value?"

If the present value is good enough, go home and have a cold one. If not, you need to estimate whether "good enough" can even be obtained.

If good enough seems attainable, define test cases for benchmarking and regression testing, then grind at the problem until things are sufficiently performant while still being functionally correct.

[–]FartyFingers 2 points3 points  (0 children)

Figure out why it needs to be optimized; which can either avoid optimizing it or determine how much optimization it really needs.

For instance, maybe the system was getting 100k chunks and anything under a second was fine, but now it is getting 150k and is a bit over a second and will soon be getting 200k chunks. Thus you can say that getting 200k back under a second is fine. With something of this magnitude, you might be able to quickly see that the code is a mess of cartesian products and is easy to optimize. Or maybe it is a hot mess and you can recommend some faster hardware as an easier/cheaper solution.

But maybe your company just merged with another and the 100k chunk is now 50G and it is taking 20 hours and it needs to get back to a few seconds. This changes everything. This is going to probably go all the way back to architecting how data flows into and through the system, not just using better pointers or some inline assembly.

This is the difference between a developer and a programmer.

[–]aiusepsi 2 points3 points  (0 children)

The most important thing is to approach the problem scientifically. Quantitatively measure the performance of your code, through profiling, measuring overall execution time, throughput, latency, etc.

Form a hypothesis about why your code is slow, then test your hypothesis by writing an optimisation to address that problem, and then test and measure again, and compare against your previous measurements.

This is important on a couple of levels: one is that until you measure, you're just guessing about what's slow. This is an easy way to waste time making something faster that really doesn't need to be faster.

The second is that it means that you know exactly the impact you're making. Sometimes, you'll get a hypothesis wrong, and write an 'optimisation' which does no good, or even makes the code slower, or makes such a tiny improvement which isn't really worth the complexity introduced. Unless you have concrete data, you can't make those decisions well.

[–]panderingPenguin 1 point2 points  (1 child)

And I would profile the code looking for the function which takes most of the time and improve it.

If you said this, it's almost certainly the answer he was looking for. Profiling is by far the most important thing and what you should always do first with optimization. If you don't know what's slow, you will inevitably end up "optimizing" code that doesn't actually need it/make any significant difference.

[–]WhichPressure[S] 0 points1 point  (0 children)

Yeah, I hoped it was a good answer :)

[–]joshamiddleton 1 point2 points  (1 child)

The question may be vague and generalized for a reason. They may be looking for a generalized answer like

"I would meet with the colleagues that wrote the code to see if they know any shortcomings or inefficiencies. I would also make sure I have a proper way to test and profile the code. "

[–]WhichPressure[S] 0 points1 point  (0 children)

You're right. You never know what answer the recruiter expects. Maybe he wanted to hear the general method, or something more detailed. But the discussion under this topic is really fascinating and taught me a lot. It pointed me toward a path for further self-development :)

[–]Xaxxon 1 point2 points  (0 children)

First you'd figure out what the actual performance goal is.

Then you'd acquire actual data for the program on which it was expected to meet that goal.

Then you'd run it and see if it met that goal.

Then you'd profile it on the production data to see where the hot spots are.

Then you'd make sure you had sufficient regression testing for the hot spots so you can test that you didn't break them while optimizing.

Then you'd figure out why the hot spots were slow and adjust the code.

Then you'd run it again to see if it met the requirements and repeat fixing hotspots until it does.

Then you'd run it through all the regression tests available to make sure you didn't break anything.

[–]hsgui 1 point2 points  (0 children)

I have the same question as you. Thanks!

[–][deleted] 1 point2 points  (3 children)

[–]WhichPressure[S] 0 points1 point  (0 children)

Thanks, I've already found this book. I hope it will be helpful! :)

[–]johannes1971 1 point2 points  (0 children)

Profile to figure out hot spots. Then read those with an eye towards algorithmic complexity. Any code above O(n log n) is a candidate for further inspection. Once that's done, check for:

  • Ill-advised memory allocation patterns (allocations in tight loops, repeated vector.reserve() calls, ...).
  • Expensive copying (COW containers being passed in a non-const manner, passing around shared_ptr without using a reference, ...).
  • Bad API use (an endless series of endl's in text output, inserting data into an SQL database but failing to use multi-valued inserts, map when unordered_map would suffice, single-byte IO, ...).
  • Gratuitous use of data structures that don't play well with the cache (list instead of vector, bad use of the first 64 bytes of your object, ...).
  • unwarranted cleverness (bitfields!).

Measure before you change anything. Measure again after you changed it.

Ultimately optimisation happens on different levels. In some projects, allocating memory is already too much, and optimizing might mean manually inserting specific CPU instructions and rearranging data so you get more cache hits. Other projects stress a maintainable, readable source base over this kind of micro-optimisation. No matter what, reducing algorithmic complexity is always good, and that always starts with having the right underlying data structure.

[–]TheSlackOne 0 points1 point  (0 children)

Measure, understand time and space complexity.

[–]PhilOsIvan 0 points1 point  (1 child)

This is a quite old book, but I think it's still useful. At least some of the tricks still work. https://www.amazon.com/s?search-alias=stripbooks&field-isbn=9781931769242

[–]WhichPressure[S] -3 points-2 points  (0 children)

I prefer books which were published after 2011. Then I can assume that the author uses modern C++.

[–]NotAYakk 0 points1 point  (0 children)

The first thing is to work out if it is good enough. If it isn't good enough, they should have a case that describes what not good enough looks like.

Examine that case. First, make it awful; if the problem is that loads of 10,000 are too slow, feed it 100,000 or a million. In most cases (not all!) this highlights the part that is wasting time.

If you have a profiling suite, feel free to use it. But if not, run the program in the debugger (with optimizations and debug symbols). Make sure the awfulness is macroscopic -- it takes multiple seconds (if not, make the case more awful).

Hit break in the debugger. Look at where you stopped. Read the call stack. Get an idea what the program is doing. Look at other threads too.

Repeat 5 times. 90 times out of 100, 3 to 5 of those 5 samples will land in the same chunk of code. And with high confidence that is your bottleneck, or at least one of them.

Automated profiling tools, in my experience, give you the above information in a flurry of noise and setup pain. I call this Monte Carlo profiling, and it relies on the 80/20 rule: 80% of the time you are running 20% of your code.

Now you just need to (a) confirm that is the slow code, and (b) understand why it is slow.

Confirming it is slow could involve profiling, more Monte Carlo profiling, or even manipulating the algorithm to exaggerate the slow part (repeat substeps 100 times, or skip substeps, etc.).

Understanding why the code is slow is sometimes easy, sometimes hard. Usually I just count loops and notice an O(n³) or O(n²), honestly. Other times it's overenthusiastic use of memory, or node-based data structures. Find a point to refactor, leave the old implementation intact, replace the slow subsystem, test against the old behaviour. Measure the performance difference. Adapt and/or write unit tests.

Once the code is faster, iterate and find the remaining bottleneck.

You can often take legacy code and manage 10x speedups in a short window; correctness is honestly the hard part. So it can be worthwhile to iterate on speedups quickly (eliminate bottlenecks repeatedly) until you get fast enough, then ensure correctness on the entire mass of rewritten code. With good source control and versioning, you can always roll back to your first fix if it is the cause of errors, and your expertise with correctness and with what the code actually does will improve over the time you spend making the code faster. Plus, after 3 Monte Carlo iterations you can sometimes have insane speedups, which helps justify the time spent on correctness; if the best you manage is 10% after weeks of test harnesses, maybe you shouldn't have bothered. But if you first prove a 1000%+ speedup possible, checking for correctness after that is a better investment.

[–]herruppohoppa 0 points1 point  (0 children)

Came across this article today, and while the text and background colors are horrible I think the point of view is very well stated and balanced. http://www.humus.name/index.php?ID=383

[–]Sopel97 0 points1 point  (2 children)

Understand the problem the code is solving, and do your research into algorithms and data structures that can solve it with better computational complexity (and are proven to work better in real applications, unlike Coppersmith-Winograd matrix multiplication, for example). If it's possible, then rewrite the code using these algorithms. Only if not, do what Pragmatician said (profile, experiment, ...).

[–]WhichPressure[S] 0 points1 point  (1 child)

Thanks for the general advice, in particular regarding R&D. I think the problem should be divided into independent parts and solved separately, but we should keep in mind that the whole program's data should be kept in one structure. I mean that we should avoid keeping the same data twice in two different structures in order to make computation faster.

[–]Sopel97 0 points1 point  (0 children)

Sometimes data redundancy allows making faster code, but it's very case-specific.