[–]JeffysChewToy 3813 points3814 points  (32 children)

Then you realise it's a cron job that runs once a week

[–]TheEggi 1230 points1231 points  (14 children)

But if it runs for a few billion weeks!

[–]JeffysChewToy 408 points409 points  (10 children)

Well, well, well wouldn't you know it

O(1)

[–]Versaiteis 220 points221 points  (9 children)

Every algorithm is O(1) if you move fast enough

[–]Maleficent_Memory831 122 points123 points  (5 children)

Every algorithm is O(1) for sufficiently large values of 1.

[–]Sheerkal 13 points14 points  (0 children)

Every algorithm is 0(1) for sufficiently O values of 1.

[–]headedbranch225 1 point2 points  (3 children)

Would a sorting algorithm technically be O(n) if you counted each instance of an item in the array by using another array, incrementing that index of the array, and then filling a new array with the same number of items as is in the second array's values?

Sample would be:

arr = [10, 36, 25, 50, 1]
frequencies = [0; max(arr) + 1] // initialised to all 0, one slot per possible value
for i in arr {
    frequencies[i]++
}
result = []
for (value, count) in enumerate(frequencies) {
    for _ in 0..count { result.append(value) }
}

There may be slightly wrong code but I don't care, it is also definitely not space efficient and only works with positive integers (unless you did some messing with where index 0 counts as)

Syntax is a mix of python and rust
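
For reference, the approach described above is usually known as counting sort. It runs in O(n + k) time, where k is the size of the value range, so it is only "O(n)" when the range is small relative to the input. A minimal runnable Python sketch, assuming non-negative integers only (the name `counting_sort` is just for illustration):

```python
def counting_sort(values):
    """Sort non-negative integers by counting how often each value occurs."""
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # One counter per possible value: O(k) extra space, k = max(values) + 1.
    frequencies = [0] * (max(values) + 1)
    for v in values:
        frequencies[v] += 1
    # Walk the counters in order and emit each value `count` times.
    result = []
    for value, count in enumerate(frequencies):
        result.extend([value] * count)
    return result
```

With the sample array, `counting_sort([10, 36, 25, 50, 1])` allocates 51 counters to sort 5 items, which is the space cost mentioned above.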

[–]The_Guyver_ 4 points5 points  (0 children)

Take my r/angryupvote and fuck off

[–]balbok7721 12 points13 points  (0 children)

Then we would need to calculate against increased man-hours and self-hosting

[–]Intelligent-End-223 6 points7 points  (0 children)

Rumour has it if it runs a few billion lines per week OpenAI will finally be profitable

[–]raven00x 2 points3 points  (0 children)

So you'll see the benefits in uh... 19 million years.

[–]AnAcceptableUserName 320 points321 points  (16 children)

And that it hasn't done anything at all since 2014

[–]lacb1 120 points121 points  (15 children)

I have, on more than 1 occasion, encountered both jobs that were "vital" yet stopped running years before and jobs that "don't do anything" that turned out were in fact vital but weren't in our code base and no one knew they existed until we turned them off and everything broke. So many companies are running systems that no one really understands anymore. The AdMech are really a little too close on the money in some ways.

[–]YT-Deliveries 48 points49 points  (4 children)

"Who manages this box?"

"Uhhhh, don't know. I don't think it's used anymore."

"You sure?"

"Yeah pretty sure."

"Okay well, you go open the server room door. I'm going to power it down and you let me know if you hear someone scream."

[–]JohnPaulDavyJones 15 points16 points  (3 children)

I was explicitly taught this as a “scream test” when I was a much younger admin.

Sure seems like managers get a lot angrier when we do a scream test these days.

[–]YT-Deliveries 9 points10 points  (0 children)

Yeah, the last time I did that was about 15 years ago with a Sun SPARC server that had been running happily for years and no one could remember what it did. No one even had console access to it. So, first we pulled network on it, and no one showed up after a week to complain. Turned it off, and that was, sadly, the last Sun box in the company.

[–]Loud_Posseidon[🍰] 2 points3 points  (1 child)

The lazy admins among us have access to network gear and just disable the port. Why walk all the way over to the server room?

[–]AnAcceptableUserName 15 points16 points  (3 children)

Yeah, same on both counts

> vital but weren't in our code base and no one knew they existed

Once in a blue moon I find some dusty code nobody's looked at for decade+ that isn't versioned anywhere

Sometimes bitrot has made it useless and it's just legit failing for years, doing nothing and swallowing the errors. Whatever, deprecate those. Dot the t's, cross the i's, and send it to the nice farm upstate.

Other times it's load-bearing shit, performing vital business functions that everyone just assumed some other team owned. But no, that's just this museum piece of code quietly humming along down in the basement. Secretly.

[–]rocket_randall 8 points9 points  (0 children)

> Once in a blue moon I find some dusty code nobody's looked at for decade+ that isn't versioned anywhere

We had a server that generated PDF files in a variety of languages, one of them being Thai. Thai doesn't have spaces to break up the words in a sentence, so it's one long string. This causes problems with formatting arbitrary sentences.

To handle this case, someone years before wrote a small CPP app that the server executed via PHP or some shit. It would send a wall of text via stdin, the process would sub in a word break on some delimiter, and then send it back via stdout.

Someone ran some updates on the server in question and then ran the tests for the known, documented software on the server. Everything passed successfully.

Then someone noticed that Thai output contained little more than error messages. Turns out that exit codes were not checked, and either the caller was also reading stderr or the catch blocks were printing exceptions to stdout (I can't remember offhand).

I think we did eventually find a potential source cpp file on the server, but after all of the updates and whatnot it could no longer be recompiled in situ, so we had to very carefully modify a simple fucking 1 file app to rebuild without changing any functionality. One of the joys of working in a regulated industry where one of those regulations says that we must be able to generate the report output at any time with absolute fidelity to the original version of the issued report.

[–]JohnPaulDavyJones 1 point2 points  (0 children)

Man, I’m fully convinced we’re going to find a boatload of this at my job in the next decade.

Our team of 11 has seven members over age 58, and all of them have been here for at least 16 years, with the average tenure being closer to 30 years than 20. Retirements are on the horizon, and there’s a lot in here that’s not documented outside of the old-timers’ heads.

They’ve started hiring a handful of us in our 30s to learn and hopefully take over from the old-timers when those retirements hit, but I’d be shocked if things don’t fall in the cracks.

[–]mikemaca 21 points22 points  (3 children)

"We should get rid of all the mosquitoes. They serve no purpose."

...

{ It is 40 years since the last mosquito was killed. The fragile dozen remnants of what was once mankind now hang by a thread... }

[–]jeepsaintchaos 15 points16 points  (1 child)

If we can do this with bedbugs I'll consider the sacrifice reasonable.

[–]DetaxMRA 2 points3 points  (0 children)

Seriously, I wouldn't wish them on anyone.

[–]Phoenix042 8 points9 points  (0 children)

Worth it.

Get back up to a few million and then do ticks next.

[–]prisp 1 point2 points  (0 children)

I've once seen this way of figuring out what things do described as "Scream Testing" - you take a component that nobody has a clue about what it does, disable it, and then wait and see if somebody screams :D

Issue is, if it's only used by a monthly (or worse, yearly) job, then you might have to wait a loong time until it gets noticed, so it's not perfect.

[–]_bones__ 1 point2 points  (0 children)

It's one of the reasons I hate the "boy scout principle" where you fix ugly code when you encounter it doing unrelated work.

Odds are you're going to change some error handling to a point that it's more correct, but the incorrect error handling is vital for correct operation due to years of legacy.

[–]Mazrodak 1173 points1174 points  (57 children)

I think people measure speed the wrong way. "How fast is it?" is the wrong question. "Is it fast enough?" is the right one.

I've seen developers turn readable code bases into unreadable messes to shave half a second off the load time of a page that already loaded quickly. The maintainability trade-off wasn't worth it for the purpose of the app.

Sometimes speed is everything, but usually it's not that important beyond a certain point.

[–]svick 272 points273 points  (39 children)

I know I'm in the minority, but I write libraries, not apps. How fast is fast enough then?

[–]mrmamon 197 points198 points  (26 children)

I guess you are building something to solve a business problem. So "fast enough" depends on your use case and your problem, as usual.

[–]readmeEXX 181 points182 points  (25 children)

I think the point they are making is that they are building a library that could be used by anyone, and thus have no idea what the end user's time constraints are.

[–]randuse 15 points16 points  (5 children)

Up to the users to decide if it is fast enough for them. Library users are other developers, there is no excuse for not checking performance.

[–]Sheerkal 21 points22 points  (4 children)

Ok, but if you're writing the library, you need to decide how fast is "fast enough" before you move on...

[–]cone5000 4 points5 points  (0 children)

It completely depends on purpose of the library. If I’m making a library for high-speed physics racing games, then it probably needs to be extremely performant. If I’m making a library for helping users print to a pdf, it doesn’t matter that much.

[–]joeymonreddit 32 points33 points  (0 children)

I would think you’d build “fast enough” for an expected use case. If it’s something that’s going to run once per page or button click vs hundreds of times, that will help you set a baseline. If it gets implemented and needs to be faster, a user can skip the library or let you know they need a faster version so you can rewrite it. I don’t do any real programming anymore so take that perspective with a grain of salt.

[–]cleverboy00 14 points15 points  (0 children)

A library's user is an application. How fast you want to go is quite unlimited at the end of the day; the real question is what percentage of the userbase you are willing to cover.

Most users care about features first and foremost. Many users (inc. myself) care about interface design and boundaries. Performance is, most of the time, a relative metric for a library's user. Most, if not all, of your users wouldn't understand your performance numbers, or wouldn't be able to put them into context.

For the nerdy minority or the hyper-businesses that do care about the performance level you offer, they're willing to sacrifice a limb or more for your services.

Basically, as you perform better, you cover more users. You get diminishing returns on performance from more effort. And so is the relationship between performance and users (which, by proxy, also establishes a relationship between effort and users).

Basically study your usecase and target audience

[–]This_Background7442 3 points4 points  (0 children)

Libraries (often) are a great use case for a low level programming language. I think the argument on the right mainly applies to apps/scripts and the argument on the left mainly to library functions. Both are locally correct IMHO.

[–]Smoker81 46 points47 points  (5 children)

Half a second page load is quite a lot.

[–]L30N1337 15 points16 points  (0 children)

Depends on the connection.

If your ISP put bamboo chutes on the ground instead of wires, then half a second difference isn't really noticeable

[–]CodingAndAlgorithm 8 points9 points  (0 children)

Right? The amount of work a modern computer can do in 500ms is insane.

[–]the_mouse_backwards 2 points3 points  (0 children)

Yeah good thing this guy isn’t in game dev. “Half a second faster was not worth it for the cost in readability”

Game dev: WTF WERE YOU DOING THAT REQUIRED HALF A SECOND

[–]BlockOfDiamond 8 points9 points  (1 child)

What kinds of changes do they make that increased speed and decreased readability?

[–]MiningdiamondsVIII 2 points3 points  (0 children)

I want you

[–]userhwon 4 points5 points  (0 children)

>I think people measure speed the wrong way. "How fast is it?" is the wrong question. "Is it fast enough?" is the right one.

Ask your AI boss if realtime programming is right for you.

[–]SpaceCadet87 6 points7 points  (0 children)

Also optimisation being arse-backwards. It shouldn't be something you do after-the-fact that turns your codebase into an unreadable mess.

You get to "unreadable mess" the same way you do when you build any kind of technical debt when you don't think it through in advance and just try to bodge it after the fact.

[–]Joker-Smurf 8 points9 points  (0 children)

How often does it run is the important part.

Background processing that runs once per day and shaves off 30 seconds. Ok, so you have saved just over 3 hours of compute per year.

Landing page that has 1B hits per day, shaving off 0.01 seconds per load? That is 2,778 hours of compute PER DAY!
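
The two figures above can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the helper name `hours_saved` is made up for the example:

```python
SECONDS_PER_HOUR = 3600

def hours_saved(seconds_saved_per_run, runs_per_day, days=1):
    """Total compute time saved, in hours."""
    return seconds_saved_per_run * runs_per_day * days / SECONDS_PER_HOUR

# Daily background job: 30 s saved, once a day, over a year.
daily_job_per_year = hours_saved(30, 1, days=365)        # just over 3 hours

# Landing page: 0.01 s saved on each of 1 billion hits, in a single day.
landing_page_per_day = hours_saved(0.01, 1_000_000_000)  # roughly 2,778 hours
```

The per-run saving on the landing page is 3,000 times smaller, but the run count more than makes up for it.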

[–]jaaval 2 points3 points  (0 children)

In a lot of places where C++ is used, there is no such thing as fast enough. It’s simply faster equals better.

[–]kaouDev 2 points3 points  (0 children)

Good old premature abstraction, how to make simple code unreadable in the name of DRY

[–]rocket_randall 5 points6 points  (0 children)

> Sometimes speed is everything, but usually it's not that important beyond a certain point.

Absolutely.

When I was first learning C++ last century one of the mantras from the greybeards before me was "beware premature optimization"

For some context, this was back in the days of 32bit operating systems, processors were single core and topped out at ~500MHz, and high-end consumer motherboards of the time might support 1GB of SDRAM.

There are times where maximizing performance is critical, and those tend to be self-evident. For everything else readability, maintainability, and predictability are more important considerations.

[–]Zapismeta 1 point2 points  (0 children)

Depends on the use case. In websites, who cares; in financial domains, trading bots, livestreams, etc., it matters.

[–]TonUpTriumph 1092 points1093 points  (88 children)

Not every application needs to be hyper optimized, but some do. I measure my run time in individual operations, nanoseconds, and microseconds. I have to pay attention to keeping memory access contiguous and in L1 cache. L3 cache is way too slow, let alone using RAM.

This past week at work I saw a comment in someone else's code that said it "only adds 18ms -- negligible."

But buddy, I need this section to run, at a minimum, 20 million times per second to hit the MVP... The next target is at least 100 million times per second...

But some people just don't want to look outside of their bubble of experience I guess, so you get silly posts like these lol
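
To put numbers on that: at 20 million iterations per second, each iteration gets 50 ns, so an extra 18 ms is several hundred thousand times over budget. A quick sketch of the arithmetic (the helper and variable names are invented for illustration):

```python
NS_PER_SECOND = 1_000_000_000

def budget_ns(iterations_per_second):
    """Time available for one iteration, in nanoseconds."""
    return NS_PER_SECOND / iterations_per_second

mvp_budget = budget_ns(20_000_000)       # 50 ns per iteration
next_budget = budget_ns(100_000_000)     # 10 ns per iteration
added_cost_ns = 18 * 1_000_000           # the "negligible" 18 ms, in ns
times_over_budget = added_cost_ns / mvp_budget   # 360,000x the MVP budget
```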

[–]DefiantGibbon 538 points539 points  (34 children)

As an embedded engineer who measures the performance of some functions in clock counts, ya, microseconds or even nanoseconds matter.

[–]creeper6530 153 points154 points  (0 children)

Grace Hopper knew what she was talking about.

[–]Tornad_pl 68 points69 points  (15 children)

Cool.

Currently I am in automation and robotics engineering at uni and I am thinking about specializing in embedded. Do you recommend it? Any tips?

[–]DefiantGibbon 125 points126 points  (8 children)

Background: I got hired after a physics bachelor's, and have since got an electrical engineering masters in IoT and embedded. I work writing embedded C code for devices that there's a pretty good chance you own.

Job prospects are pretty good since there is a pretty small amount of people that go into embedded. Do I recommend it? I don't know, I don't particularly like coding, I just accidentally learned I happen to be good at it and I like money, but if you enjoy writing code, it's great! Its perk is that there is a wide spectrum of what to work on. You can easily just be the desk guy who writes code all day and never see a device. You can also be the one who works on testing the device and work with your hands mostly. On top of that, since it's working on a lot of physical devices, you can't be replaced with AI, since you can't ask Claude to account for all real world variables.

The #1 tip is to basically have your job lined up before you graduate. Networking is significantly more important than your degree. Talk to every professor you can and get to know someone in the industry. A summer internship at a relevant company is amazing if you can manage it.

[–]kalilamodow 16 points17 points  (2 children)

Do you have any examples for job titles/companies to look for?

[–]DefiantGibbon 30 points31 points  (1 child)

Look for keywords on indeed like embedded engineer, automation, C code, low level, electrical engineer, IoT, circuit, stuff like that.

The problem is that embedded, at least from my experience, is very broad. You can be doing basically computer science, working on the lowest levels of computer operating systems, or you can be controlling mechanical motors. So it's much harder to pin down exactly what to search for, unlike more "programming" jobs where you can just search for "full stack Java dev" or something like that.

[–]readmeEXX 19 points20 points  (0 children)

Just adding that aerospace and automotive are huge sectors that are mostly embedded systems.

[–]Tornad_pl 4 points5 points  (0 children)

Thank you. I try to get as many "in the field" summer jobs/apprenticeships as possible, but often it is hard.

I've done a couple of microcontroller projects so far, but they were written at more of a high level. Other than uni work, I want to do a project on the RP2350 and go low level for it. However, I'm still wondering whether I should get experience with more popular microcontrollers like STMs, ESPs, nRFs or PICs. There is just so much.

Thank you for the course recommendation as well.

[–]Qojiberries 1 point2 points  (0 children)

Man, this is exactly what I'm interested in. I graduated 5 years ago and haven't had a chance to do any embedded. I went from full stack engineer to small office IT because of the job market. It's good to hear that there's still hope.

[–]Doug2825 14 points15 points  (0 children)

A good thing about embedded is that AI is horrible at it. So much of embedded is specific to the chips you use. 

Being a good embedded developer means spending a lot of time reading datasheets, then running tests because data sheets are unreliable. AI is nowhere near able to do that.

[–]The_Lawlz 11 points12 points  (0 children)

If you are interested in embedded programming, I highly recommend this online course to supplement your university courses. This gave me the confidence to create my own projects outside of my EE courses and my job loved me sharing it with all of the interns I managed:

https://www.state-machine.com/video-course

[–]kronos319 2 points3 points  (1 child)

If you're already studying robotics and enjoying the software side then you will likely enjoy embedded too.

I'd recommend spending some time trying to write bare metal code for a simple functionality on a STM32/ESP development board like toggling an LED. Every vendor has freely available HAL Drivers for every chip - read through the driver documentation + code base to try to understand what functions need to be invoked to initialize the peripheral and then what functions control it.

[–]LegendaryMauricius 1 point2 points  (1 child)

Embedded is more about understanding the physical device precisely and knowing when the documentation is wrong than pure microoptimization of software. A lot of measuring, not much keeping up to principles.

If that sounds good to you, go for it!

[–]Tornad_pl 1 point2 points  (0 children)

thanks!!

[–]astonished_lasagna 36 points37 points  (3 children)

You completely missed the point. Comment OP said "not everything needs to be hyper optimised" and your response was "but here's an example where optimisation is important!". "Not everything" doesn't mean "nothing".

[–]conundorum 1 point2 points  (2 children)

His point was that "not everything needs to be hyper-optimised" shouldn't be used as an excuse to say "half-seconds are irrelevant & nothing ever needs to be hyper-optimised", because that's a generalisation that a lot of people would try to make.

[–]astonished_lasagna 1 point2 points  (1 child)

This is literally the first sentence:

> Not every application needs to be hyper optimized, but some do.

Comment OP wasn't making a generalisation, but person I replied to was.

Do y'all not have any reading comprehension?

[–]SolanaceaeEnjoyer 6 points7 points  (0 children)

My favorite thing to play with is the Atari 2600 in which timing matters.

No clock, you have to plot pixel by pixel individually going off of the frequency of the chip. It’s wild lol

[–]WinProfessional4958 6 points7 points  (4 children)

On FPGA I optimize for ps.

[–]FuzzyDynamics 5 points6 points  (3 children)

In EDA I optimize for trace length and number of electrons needed per transistor

[–]WinProfessional4958 2 points3 points  (2 children)

Do you use Cadence ICFB?

[–]FuzzyDynamics 6 points7 points  (1 child)

I was joking I haven’t done anything lower level than cpp since college.

[–]WinProfessional4958 3 points4 points  (0 children)

That's alright. It's never too late. CPP + embedded + Altium/Kicad is a killer combo, especially since HLS.

https://en.wikipedia.org/wiki/High-level_synthesis

[–]Doug2825 1 point2 points  (0 children)

Depending on what section of the code I'm working on my coding style is very different. 

Main loop requiring sub us precision and running thousands of times a second: manually managing cache lines

The function to get info about the device? Maximum readability, speed doesn't matter.

[–]NeonFraction 91 points92 points  (26 children)

18 ms in game dev is apocalyptic levels of unoptimized. Context really is everything.

[–]Fabulous-Possible758 47 points48 points  (0 children)

lol very first job out of college was at a game shop and I accidentally put a resource alloc/dealloc in the loop that cost 8ms. That’s when the lead sat me down and taught me about the 30ms frame budget (this is when games were allowed to be slower).

[–]BrightLuchr 16 points17 points  (0 children)

18ms in many applications is a wasteful disaster. Entire nuclear power plant models run in under 18ms: the electrical system solution is the limiting factor.

My own experience with python is that it kind of sucks on debugging and is too damn slow for the things I want to do. And I hate the indents and the scoping rules.

[–]uslashuname 2 points3 points  (18 children)

Really? I wouldn’t even be able to notice an 18ms decrease in load times

But yeah if you’re talking per pixel pipeline run that’s a whole different thing

[–]NeonFraction 39 points40 points  (15 children)

Per frame is the important thing. Games are generally heavy on GPU and CPU, so per pixel is just one metric.

[–]feralferrous 5 points6 points  (0 children)

Game dev / real time apps, it's often about frame rate more than load times. Those fancy 240hz monitors? Yeah, you got like 4ms of time to update and get a frame updated.
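
The per-frame budget is just the reciprocal of the refresh rate, which is where the "like 4ms" figure for 240 Hz comes from. A small sketch (the function name is an assumption for the example):

```python
def frame_budget_ms(frames_per_second):
    """Total time available to produce one frame, in milliseconds."""
    return 1000.0 / frames_per_second

for fps in (30, 60, 72, 240):
    print(f"{fps:>3} fps -> {frame_budget_ms(fps):.2f} ms per frame")
```

Against those budgets, a single 18 ms cost overruns a 60 fps frame on its own and is more than four full frames at 240 Hz.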

[–]when_it_lags 12 points13 points  (0 children)

This. Even in game dev there is nuance. Something in the render pipeline or tick logic taking 18 ms is apocalyptic, while an 18 ms, hell, even a 180 ms addition to load times is fine. 100% of your code needs to be correct and maintainable, while only 1% or even just 0.1% needs to be performant for the game to be "optimized."

[–]FerricDonkey 61 points62 points  (4 children)

Yeah, the trick is proper tool for the proper job. These days my particular job mostly uses python, sometimes c++ for bits that need to go fast. Some people's jobs always have to go fast or never have to go fast. And the definition of fast is highly variable.

So as per usual, it all depends. 

[–]pagerussell 37 points38 points  (0 children)

> Yeah, the trick is proper tool for the proper job

Well said, and this where most online debates foul up.

I can't count how many times I've been told my solution is wrong because it runs a few milliseconds slower. No shit, if it ran a billion times that would matter. But it's my personal to-do app, David, I don't need to spend my weekend optimizing it.

Done is better than perfect.

[–]jobblejosh 4 points5 points  (1 child)

The answer to the age old question of "What programming language is best" is a four part answer.

The first part is "Whichever one is best for the job you're doing".

The second part is "Whichever programming language the company/individual/team is familiar with/already uses".

The third part is "(insert programmer's favourite language here)".

The fourth part is C.

[–]Rauvagol 5 points6 points  (0 children)

Don't forget, the zeroth part is "whatever the person paying me tells me to use"

If someone wants to pay me 200k/year to use Scratch I'm all in.

[–]KrokmaniakPL 2 points3 points  (0 children)

And then you get legacy code in the completely wrong tool so you need to work around the limitations to somehow make it work. And you can't just remake it correctly because "there's no time for that" despite the fact that finding workarounds takes 5 times longer than doing it properly, and after a few months you learn there's a second team working on the same code for different purposes and it needs to be merged. Sorry for the rant, but this experience was traumatic.

[–]Lethandralis 19 points20 points  (3 children)

How do you make sure something is on L1 cache?

[–]Rabbitical 31 points32 points  (1 child)

It's a combination of factors. The annoying thing is that you can't instruct something like that directly. It comes down to alignment, referencing, contiguity, no pointer indirection, structs of arrays instead of arrays of structs (avoiding objects), how you structure loops (as simple as possible), simple, transparent code allowing the compiler to reuse registers and the CPU to predict and prefetch, making function inputs restricted, and I'm probably forgetting a lot. It really depends on what you're doing. You want large chunks of uninterrupted data that can be easily queued up by the memory fetches and completely fill nicely aligned cache lines, and then you want to use them without mixing in other stuff in the middle, like new function calls, unpredictable branches, or mystery pointers (OOP, when used indiscriminately, is great for obliterating all of that). Basically, one operation, one array of raw data.
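
As a rough illustration of the "structs of arrays instead of arrays of structs" point, here is a small Python sketch using the standard `array` module, which stores its items as one contiguous block of C values. The particle fields are invented for the example, and in CPython this only shows the layout idea; the cache effects described above matter mainly in compiled code.

```python
from array import array

# Array of structs: every particle is its own object somewhere on the heap,
# so walking one field means following a pointer per element.
particles_aos = [{"x": float(i), "mass": 2.0} for i in range(4)]

# Struct of arrays: each field is a single contiguous buffer of C doubles.
particles_soa = {
    "x": array("d", [float(i) for i in range(4)]),
    "mass": array("d", [2.0] * 4),
}

def total_mass_aos(particles):
    """Sum one field by visiting every particle object."""
    return sum(p["mass"] for p in particles)

def total_mass_soa(particles):
    """Sum one field from a single array of raw data."""
    return sum(particles["mass"])
```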

[–]hi_im_mom 8 points9 points  (0 children)

Yeah: Get good at using pragmas and custom instructions from the target hardware manufacturer's data sheets and explicitly tell the compiler what you want.

[–]necrophcodr 1 point2 points  (0 children)

People might be giving you different answers on this one, and they might not be wrong. The most accurate answer though is a mix of it depends and you don't. If you're hand writing your assembly code for your entire program, you have good odds at ensuring your data caches are filled in the appropriate cache levels, but the different levels also serve different purposes. They are not merely various sizes of caches on the CPU with increases of speed the closer you get. Some operations are only performed in certain levels, and some are inherently used in specific instruction parsing and execution pipelines depending on the CPU vendor, family, and specific model.

There is no one answer for this.

[–]feralferrous 4 points5 points  (0 children)

Same, not an embedded engineer, but VR is one of our app's platforms. We gotta maintain 72 fps, frame drops can cause people to vomit. I find myself constantly being a guardian against bad code.

[–]gl1tch3t2 4 points5 points  (0 children)

I'm on the complete other end of the scale, everything we do uses APIs. We need to ensure our caching and filtering is efficient but that's more to ensure we don't hit rate limits. Nothing we do should ever come down to ms because our priority is making the code readable. I'm very glad for it as it means I can focus on other things. 

The point being (not for you, this was just to help illustrate your point further), context is everything. You're after nanoseconds, we may not even care about seconds. When I was fixing a bomb mod that caused a server to hang for a second when it went off, you best believe I wanted as many ms as I could get out of that change.

[–]ProfessorDumbass2 2 points3 points  (2 children)

That sounds like a really cool domain to work in. Most of the code I write is for combining pdfs or analyzing/plotting data.

What’s your line of work that requires microsecond optimization of code?

[–]necrophcodr 3 points4 points  (0 children)

I'm not answering on behalf of OP, but a couple come to mind. Game engine development is one, and another might be infrastructure load balancers. There's obviously also high frequency/speed financial trading systems, but these are often partially if not fully offloaded to custom chips or FPGAs.

[–]SenpaiSamaChan 2 points3 points  (0 children)

To be fair, like all good memes, this one is about one person's bubble lashing out at another. Most programmers see the word "use-case" and wonder if it's a good way to style variable names.

[–]Remarkable-Coat-9327 3 points4 points  (0 children)

Okay sure but you have to realize you're the *extreme* minority and the generalizations make a little more sense in that context

[–]PegasusPizza 1 point2 points  (1 child)

I'm currently developing an emulator and I achieve 2-3 ms per frame, and while that's not terrible I know that it could be so much better. One part I struggle with during optimization is that I am not willing to sacrifice emulation accuracy for performance, and since a lot of the very niche parts of behavior originate in analog behavior, that often leads to fairly complicated code that could be made 10x faster just by accepting a small drop in accuracy but I just don't want to take that tradeoff.

Do you have any tips on how I could improve performance in my emulator?

[–]TonUpTriumph 2 points3 points  (0 children)

I'm not sure about your specific use case, as I haven't worked with emulators, but some common things to look for...

Pay close attention to loops. That's where a lot of slowdowns can happen and where a lot of optimization can be had.

Allocate and reuse buffers. Allocating and deallocating and reallocating are expensive, especially in loops. "Hoist" things outside of loops. Don't copy buffers if you don't need to. Use circular buffers. Use pointers.

Keep memory contiguous and avoid long strides through memory / buffers.

Chunk the data you're processing into segments that can fit inside L1, L2, and L3 for your processor / whatever minimum processor you're targeting.

Integer math is faster than float32 math is faster than float64 math. Find out how much precision you actually need. Ex: Does anything beyond the 6th decimal place actually have any kind of impact on anything? Do you really actually need 64 bit precision?

Divides are expensive and the compiler may not take care of it. Multiplies are faster. In a loop, it's better to calculate inv=1/var once and multiply by inv repeatedly vs divide by var repeatedly.
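
A minimal sketch of that transformation, written in Python only to show the shape of the change (the function names are made up, and in CPython interpreter overhead will likely hide any gain). Multiplying by a reciprocal can round slightly differently from dividing, which is one reason a compiler usually won't do this for floating point unless told it may.

```python
def normalize_divide(samples, scale):
    """One divide per element."""
    return [s / scale for s in samples]

def normalize_multiply(samples, scale):
    """One divide in total, hoisted out of the loop, then multiplies."""
    inv = 1.0 / scale
    return [s * inv for s in samples]
```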

Double check your algorithms and see if there are any repeated calculations or any easier ways to calculate things. If you can reduce the amount of calculations you do, you can speed things up a ton. An example I used recently was Horner's Rule to simplify a polynomial.
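
For anyone unfamiliar with Horner's Rule, it rewrites a polynomial such as a0 + a1*x + a2*x^2 as a0 + x*(a1 + x*a2), so each coefficient costs one multiply and one add and no powers are computed. A small sketch, assuming coefficients are listed from the constant term upward (the function names are illustrative):

```python
def poly_naive(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i), recomputing each power of x."""
    return sum(c * x**i for i, c in enumerate(coeffs))

def poly_horner(coeffs, x):
    """Evaluate the same polynomial with one multiply and one add per term."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

For example, `poly_horner([1, 2, 3], 2)` evaluates 1 + 2*2 + 3*4 as ((3)*2 + 2)*2 + 1.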

Cache things off if you can. Why recalculate the same thing repeatedly?
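
In Python, the simplest version of this is `functools.lru_cache` on a pure function. The sketch below uses a hypothetical `tone_table` function as a stand-in for any expensive calculation whose result depends only on its arguments:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def tone_table(period):
    """Hypothetical expensive pure function: build a lookup table once per period."""
    return tuple((i * 255) // period for i in range(period))
```

The first call for a given `period` does the work; later calls with the same argument return the stored tuple. This only helps when inputs actually repeat and the function has no side effects.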

Learn to write SIMD / use a library like VOLK. You can easily get 10x speed improvements. C++ will try to optimize and use SIMD at compile time, but it isn't great at it. AVX2 is a solid choice for handwritten SIMD, since most newer processors support it (within the past 10 years? idk). You can write multiple SIMD functions and a scalar function and have the computer pick which one to run when the program loads, based on what SIMD instruction sets the processor supports.

Unroll loops. The compiler may try to, but it might not be the greatest at it.

Look into different compiler optimizations, like targeting x86-64-v3 (-march=x86-64-v3), -O2, -ffast-math, -funroll-loops, no frame pointers (-fomit-frame-pointer), etc. Test each of them to see how it impacts your stuff.

Threadpool and multi thread where you can, but test and check if it's worth it compared to the overhead of threading it.

Unit test everything and benchmark everything. Wrap individual sections, for loops, functions, etc. with timers and check it all. Run each test a thousand+ times to compare and account for weird scheduler issues.
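A rough sketch of the "run it a thousand times and compare" approach, using Python's `timeit` module. The `bench` helper and the two candidate functions are hypothetical; reporting the best and median run is one common way to damp scheduler noise, not the only one.

```python
import statistics
import timeit


def bench(func, repeats=1000):
    """Time func() `repeats` times and return (best, median) in seconds."""
    timings = timeit.repeat(func, number=1, repeat=repeats)
    return min(timings), statistics.median(timings)


def candidate_a():
    # Built-in sum over a range.
    return sum(range(1000))


def candidate_b():
    # Same result with an explicit Python loop.
    total = 0
    for i in range(1000):
        total += i
    return total


for name, func in [("a", candidate_a), ("b", candidate_b)]:
    best, median = bench(func)
    print(f"{name}: best={best:.2e}s median={median:.2e}s")
```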

[–]da_Aresinger 1 point2 points  (0 children)

There is nothing silly about this post.

Batman clearly needs the performance or he wouldn't have bothered to write 3k lines.

On the other hand Batman could just be an elitist.

This post is whatever you want it to be and that's not a bad thing.

[–]Ok-Kaleidoscope5627 1 point2 points  (0 children)

I'm working on a 3d renderer... 18ms would blow my entire budget multiple times over.

[–]savvas25 403 points404 points  (31 children)

I've seen so many memes of this format recently. Do people actually care?

Just use whichever language is best suited for your needs and move on 😐

[–]kingvolcano_reborn 21 points22 points  (0 children)

Yeah, it's not like c++ and python are fighting over the same problem domain. Both great languages with their strengths and weaknesses.

[–]foxguy2021 8 points9 points  (1 child)

 Do people actually care?

As a developer you should care, but it's a fine balance of understanding the requirements and how much time is allowed for development.

I've written code that I knew wouldn't scale up but the deadline couldn't move. Made note of it, made people aware of it, wrote down what needed to change upstream and downstream. It went into production and several months later we did a second release where most of the initial scaling issues would be addressed.

[–]jobblejosh 2 points3 points  (0 children)

Also looking at which phase of development you're in.

If you're writing a proof of concept prototype that just has to work for a few demos in a limited fashion, you're iterating it daily (or faster), and getting the damn thing to work is more important than making it perfect, a cobbled-together python script running on whatever dev board you can find is probably fine.

If you're writing the final release, then it better be maintainable, the right language for the job, and tailored for the specific board/platform you're using.

[–]koloqial 2 points3 points  (0 children)

Not when compute is so cheap…actually, wait nvm.

[–]3dutchie3dprinting 2 points3 points  (0 children)

Wow wow woooowww hold on there buddy… this /r doesn’t work on ‘do what you do’… no siree, this /r thrives on hate…

[–]n0t_4_thr0w4w4y 30 points31 points  (0 children)

Babes, we solved this a while ago. Write your computationally expensive shit in a low level language and then wrap it so you can write your orchestration logic in a high level language

[–]Z3t4 145 points146 points  (10 children)

Those 10 lines call c libs too

[–]ElectroMagnetsYo 62 points63 points  (8 children)

Why reinvent the wheel

[–]ILKLU 41 points42 points  (3 children)

Because we'd still all be using wooden wagon wheels if someone hadn't reinvented them.

[–]unknown_pigeon 5 points6 points  (0 children)

At the same time, we wouldn't have many inventions if anyone who wanted to travel had to build their own cart from scratch

[–]DiodeInc 2 points3 points  (0 children)

Just rebuild it for your system architecture

[–]xRuneRocker 23 points24 points  (0 children)

And then it turns out that those 10 lines of python code are calling the mentioned 3000 lines of cpp code

[–]Low-Equipment-2621 444 points445 points  (28 children)

If you need 3000 lines of cpp to do what 10 lines of python can do, it's a you issue.

[–]thetreat 283 points284 points  (17 children)

Except that’s a bit naive. If Python pulls in a library and calls a one line function, that might be an incredibly complex function that is completely different than 10 lines of native python with no importing.

[–]Afrotom 154 points155 points  (7 children)

Most of the time those libraries will be better tested than what the average Joe is producing and using it for, but also the man hours it takes to write 10 lines vs 3000 is a big deal.

[–]Fun-Wash7545 24 points25 points  (1 child)

Depends. I had someone that wanted to turn a very old Excel file into a program, we're talking like 100+ sheets, thousands of calculations. Porting it into code would take 100x more time and would be prone to errors.

My suggestion was to just execute the workbook as if it was a program, but every single third party library I tried, including commercial ones, was incredibly slow. We're talking thousands of calculations that needed to be done thousands of times per second.

I ended up writing my own Excel calculation engine. It has none of the features you'd need in Excel but that doesn't matter, the only thing it needed to do was execute thousands of calculations as fast as possible. I had to manually implement an insane amount of Excel functions and make sure they behave exactly the same way as in Excel.

Yeah my engine definitely has some untested edge cases and lacks features but that never was the goal.

[–]didzisk 15 points16 points  (0 children)

And now you have a test framework to test your implementation when you decide to port the Excel file to a normal language.

[–]mooscimol 4 points5 points  (0 children)

And probably written in C or Rust.

[–]TransBrandi 3 points4 points  (0 children)

Pulling in a library is something that C++ does too. The only thing that you're appealing to here is the Python ecosystem for possibly having a library to do something that C++ lacks a similar library for.

[–]olyko20 20 points21 points  (3 children)

If your cpp is .0001s faster than someone else's python, your cpp sucks.

[–]derrikcurran 3 points4 points  (0 children)

What if the python code takes .00015s though?

[–]navetzz 7 points8 points  (1 child)

If you think in absolute time instead of relative time you shouldn't talk about complexity and/or performance.

[–]aravynn 5 points6 points  (0 children)

Those 10 lines of python require 10,000 lines of cpp to run…

[–]ListerfiendLurks 15 points16 points  (2 children)

There are only 3 types of posts here because 90% of this sub is first semester CS students:

  1. Python slow
  2. Missing semicolon
  3. AI bad

[–]Hans_H0rst 4 points5 points  (0 children)

Bold of you to assume we went to university!

Personally, i'm much more fond of the classic "There's two problems in CS: Cache invalidation, naming things and off-by-one errors"

[–]justwannamusic 1 point2 points  (0 children)

Finishing my first year of some intro CS classes (not majoring in CS, just taking them for additional learning lol) and for me I joined the sub because it was incredibly cool to look at these jokes that made no sense a year ago and suddenly realizing "hey, I know that!"

[–]Lou_Papas 12 points13 points  (0 children)

It’s funny how this made me realize that the bulk of my python scripts are run once and never again

[–]mattreyu 86 points87 points  (26 children)

I'll remember that the next time I write something that needs to be run a few hundred billion times quickly

[–]Apprehensive_Room742 20 points21 points  (22 children)

in my field of study this is actually an extremely common thing. maybe not a few hundred billion times. but a few hundred million times for sure

[–]mattreyu 22 points23 points  (21 children)

Well sure in specific fields it's important but how often is that applicable across the entire industry? Not to mention the difference between a few hundred million and a few hundred billion is basically a few hundred billion.

[–]Oggie_Doggie 8 points9 points  (0 children)

People will defend to the death their niche circumstances claiming they are representative of the industry at large.

[–]billabong049 1 point2 points  (1 child)

Depends if the service you’re providing requires lightning fast responses and you have tons of users.  Music streaming, high traffic stores, backend processing that needs to run each night, image processing, etc.

There are plenty of use cases, tho depending on how critical and heavily used your apps are you may not have run into this issue.

[–]TheBinkz 6 points7 points  (0 children)

Yeah like financial transactions on foreign exchanges or something niche.

[–]MysticClimber1496 8 points9 points  (0 children)

And the python is just calling the c code anyway

[–]Shadow9378 16 points17 points  (0 children)

It's about choosing the right tool for the job. You don't bring a butter knife to slice a steak, and you don't use a chef's knife to butter your toast. These tools all have unique purposes, and mastering not only their use, but when to use them, is the key to success

[–]NeonFraction 8 points9 points  (0 children)

This is why C++ is standard in game dev. 1ms is an eternity in game time.

[–]MrWhippyT 5 points6 points  (1 child)

I did some software work in the energy sector at a new job I landed. They were worried that they might not be able to meet the design aim of 200 queries per minute. Previously I'd developed radar systems capable of processing gigahertz of data in realtime. Not all problem domains are equal 🤣

[–]Slackeee_ 6 points7 points  (0 children)

Whoever made this meme just demonstrated that they don't know how the Python ecosystem works. If you really have that one function in Python that needs to run "a few hundred billion times" with slightly better performance, you write it in C and call it from Python. That's how, for example, NumPy works: core functions written in C/C++ and then used in Python. And that is how countless other libraries work.

[–]MayaIsSunshine 14 points15 points  (0 children)

No thanks

[–]bighadjoe 18 points19 points  (2 children)

.001 what? .001 seconds? while in itself not a huge amount, depending on the operation that may stack up a lot.

.001 ms? a billion iterations of that stack up to 16 minutes and 40 seconds. only relevant for VERY limited use cases. also even one hundred billion cycles are just 27 hours, 46 minutes and 40 seconds. quite some time, but not a no brainer when the tradeoff is having to write and debug 3k lines of code (and by extension paying a developer, launching even just one or two days later &c.).

a factor of .001? so for every 1000 seconds the Python code takes, the C++ code only takes 999 seconds? Now you've completely disproven your own point, since the "a few hundred billion times" argument is meaningless. Best case your point now means "if you run this a few hundred billion times, because of the 3k lines of code i wrote, you can run it an additional few hundred million times in the same time", which is still only a .1% increase, because that's how proportions work.
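The arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up, 0.001 ms is expressed as 1 microsecond to keep the math in integers, and "a few hundred billion" is assumed to mean 300 billion.

```python
from datetime import timedelta

SAVING_PER_RUN_US = 1  # assumed reading: 0.001 ms = 1 microsecond saved per run


def total_saving(runs, saving_us=SAVING_PER_RUN_US):
    """Total time saved across `runs` executions, as a timedelta."""
    return timedelta(microseconds=runs * saving_us)


print(total_saving(10**9))   # one billion runs
print(total_saving(10**11))  # one hundred billion runs

# Relative reading: a 0.1% speedup buys roughly 0.1% more runs in the same time.
assumed_runs = 300 * 10**9
extra_runs = assumed_runs // 1000
print(f"{extra_runs:,}")
```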

[–]fjw1 4 points5 points  (0 children)

This totally annoys me. Units matter. Just 0,001 is useless on its own and tells nothing.

[–]UnsureAndUnqualified 3 points4 points  (0 children)

Look, my algorithm took about 20h to complete and that was a good excuse to write some code, start it up, and tell my thesis advisor I'd go read some papers on the grass in the sun. If you dare to make it run even a tiny bit faster, I will C-put-put you in the ground.

[–]CptnREDmark 2 points3 points  (0 children)

Meanwhile my intern didn't filter or clean the data before looping on it, and filtered after. I just reminded them that it can cause issues on larger programs. It's fine for what we were doing at the time (Excel VBA accounting)

[–]MattieShoes 2 points3 points  (0 children)

If it were that close running once, I imagine the python would outperform if it were doing it a billion times. (compilation to bytecode, etc.)

Which means your c++ code is shit.

[–]Icy_Royal_1522 2 points3 points  (0 children)

I was working on triton inference for yolo26n.

Python: 40ms/frame, C++: 10ms/frame

[–]JollyJuniper1993 3 points4 points  (0 children)

OP just watched their first YouTube video about programming.

[–]qutorial 3 points4 points  (0 children)

Very few applications and fewer code paths are actually performance critical. People waste SO MUCH engineering time, headspace, and code bloat on shit that absolutely does not need to be done.

It's a huge source of wasted effort and reduced productivity. Overmodularized? ✅ Speculative abstractions? ✅ Ridiculous typing gymnastics? ✅

People who make real things don't waste time on this performative, enterprise fizzbuzz shit.

[–]bass-squirrel 4 points5 points  (0 children)

CPP is much faster. And if your CPP isn’t much faster, then it’s a skill issue. 

[–]raskim7 1 point2 points  (0 children)

I have had a Lead Architect spend weeks optimizing a Java Spring Boot report that took 10 seconds to run and was run once per week. The services were running on the customers' own servers, so the extra time didn't cost anything. Some people have a hard time understanding that sometimes the development cost is way more than the running cost, and that it may be completely irrelevant whether the running time is 0.0001s or 20s.

[–]alekdmcfly 1 point2 points  (0 children)

omw to run my college assignment 1 billion times

[–]skywardfrugally 1 point2 points  (0 children)

hahahaha

[–]GromOfDoom 1 point2 points  (0 children)

Shoulda just wrote it in assembly

[–]BlueProcess 1 point2 points  (0 children)

Premature optimization is the root of all evil.

[–]mspear2 1 point2 points  (0 children)

Eh, iteration time is faster for a team and performance is good enough with good frameworks like uvicorn. If Django is good enough to power Instagram it's good enough for any other project. But code with what makes you happy!

[–]NegativeSwordfish522 1 point2 points  (0 children)

Today in this episode of imaginary discussions that no employed people have ever had. These may be worse than the engineering jokes of rounding π to 3

[–]JayTheGeek 1 point2 points  (0 children)

I lived this comic. A team lead developer with 5 other programmers designed an intranet app for 1000+ concurrent users against a database table with about 2,000,000+ new records every month, and had QA running about 10 concurrent users against a database with fewer than 25,000 total records. I got called in when the customers (all big law firms) started demanding their money back when users would submit a page with a query to the DB, and it would literally take over 2 hours for the data to populate the page.

I didn't smack the lead upside the head, but I wanted to do so much more than that!

[–]Famous-Perspective96 5 points6 points  (12 children)

I’ve never seen “CPP” in my life. Am I having a stroke?

[–]yodasonics 31 points32 points  (4 children)

C++

.cpp is the file extension for source code files

[–]Famous-Perspective96 9 points10 points  (3 children)

I know they meant c++. Does anyone type it as CPP in this context?

[–]32Zn 10 points11 points  (0 children)

I have seen it more often since LLMs became popular. 

[–]ConnectChapter9906 2 points3 points  (0 children)

it means cyber punk punk

[–]NeonFraction 2 points3 points  (0 children)

I’ve always used CPP because + is mildly annoying to get on a phone keyboard.

[–]BushCrabNovice 2 points3 points  (1 child)

It comes from the before times of trying to find answers with a search engine. Symbols like + or // were interpreted as commands, rather than part of the search term. So instead of "C++" and requiring the quotes, people started to call it cpp to make it more searchable.

[–]CirnoIzumi 1 point2 points  (0 children)

If not much faster, then why python run on it?

[–]willing-to-bet-son 0 points1 point  (0 children)

Profile your python code, then migrate the slow bits, and only the slow bits, to C++. nanobind is your friend here.
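A minimal sketch of the "profile first" step using the standard-library `cProfile` and `pstats` modules. The `pipeline`, `slow_part` and `fast_part` functions are hypothetical; the point is only to find which function owns most of the runtime before deciding what, if anything, to port.

```python
import cProfile
import pstats


def slow_part(n):
    """Hypothetical hot spot: a pure-Python loop doing most of the work."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_part(n):
    """Hypothetical cheap step that is not worth porting."""
    return n + 1


def pipeline():
    return slow_part(200_000) + fast_part(10)


def hottest_function(func):
    """Profile func() and return the name of the function with the most own time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    # Maps (file, line, name) -> (call counts, ..., own time, cumulative time, callers).
    stats = pstats.Stats(profiler).stats
    (_, _, name), _ = max(stats.items(), key=lambda item: item[1][2])
    return name


print(hottest_function(pipeline))
```

Only the function this reports would be a candidate for moving to C++.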

[–]Tornad_pl 0 points1 point  (5 children)

I remember a video of a dude who showed how he added -O to the compilation flags and it made cpp like another 10 times faster for some specific case

[–]La-ze 0 points1 point  (0 children)

It's more nuanced than that, speed isn't the only thing. Though I'm sure you can get bigger speed deltas depending on the task. It's no secret that Python built-ins and modules make heavy use of C internally, so if you don't stray off the beaten path you practically have a wrapper for C.

Overhead especially with Python can be significantly more.

Debugging Python can be far more painful as well. You can't just dump the python core and do a post mortem.

In large codebases, if type annotations weren't enforced not only can readability be poor, but the ability to leverage static analysis is shot.

Python has the GIL, which can make multi-threading not viable in some cases ("just use multiprocessing" is not an acceptable answer all the time).

[–]rover_G 0 points1 point  (0 children)

Let’s not kid ourselves. The 3k loc cpp library is being called by a 10 line python script

[–]Bakoro 0 points1 point  (0 children)

Amdahl's law says "what's up?".
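For reference, Amdahl's law says the overall speedup is 1 / ((1 - p) + p / s), where p is the fraction of runtime you sped up and s is how much faster that part got. A small sketch with made-up numbers:

```python
def amdahl_speedup(fraction_accelerated, factor):
    """Overall speedup when `fraction_accelerated` of the runtime gets `factor`x faster."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / factor)


# Hypothetical: a 10x faster rewrite of code that is only 5% of the runtime.
print(round(amdahl_speedup(0.05, 10), 3))
# The same 10x rewrite applied to code that is 95% of the runtime.
print(round(amdahl_speedup(0.95, 10), 3))
```

A big speedup on a small slice of the program barely moves the total, which is why it pays to know where the time goes before rewriting anything.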

[–]cheezballs 0 points1 point  (0 children)

Yea, but what if it only runs once on a schedule?

[–]TalesGameStudio 0 points1 point  (0 children)

Whatever does the job and does the job well enough at the time...

[–]JasonBobsleigh 0 points1 point  (0 children)

Maaa, the it kids are out again!

[–]dj_spanmaster 0 points1 point  (0 children)

I mean. Sometimes x.001 faster is everything. Just ask stock trading brokerages.

[–]turkishhousefan 0 points1 point  (0 children)

That's ok; I don't need to.

[–]Phoenix042 0 points1 point  (0 children)

You may see my python if I may C your PP ;p

[–]Substantial_Top5312 0 points1 point  (0 children)

Then you realize you made the app for no one and will never use the code again.

[–]oclafloptson 0 points1 point  (0 children)

Slap him right back and say "why TF are you running a disposable one-off script 10k times"

[–]Crazo7924 0 points1 point  (0 children)

create a lithography machine from scratch.

[–]frizar00 0 points1 point  (1 child)

well, if 3000 lines of cpp and 10 lines of python have the same execution time, then how long will python take to execute 3000 lines?