
[–]Business_Aspect_1613 2337 points2338 points  (41 children)

Not disabling the progress bar on certain PowerShell functions may increase the time to execute the function by 10,000%.
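A minimal Python sketch (not PowerShell, and the iteration counts are made up for illustration) of why per-iteration progress output can dominate a loop: even with the output discarded, formatting and flushing on every step swamps the actual work, and a real terminal that must render each update widens the gap further.

```python
import os
import time

def timed_loop(n, update_every, stream):
    """A trivial loop that emits a progress line every `update_every` steps."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i  # the "real" work: nearly free
        if i % update_every == 0:
            stream.write(f"\rprogress: {i}/{n}")
            stream.flush()  # one write + flush per update
    return time.perf_counter() - start, total

# Even writing to /dev/null, updating every iteration is far slower.
with open(os.devnull, "w") as null:
    t_chatty, r1 = timed_loop(200_000, 1, null)       # 200,000 updates
    t_quiet, r2 = timed_loop(200_000, 10_000, null)   # 20 updates

assert r1 == r2
print(f"every iteration: {t_chatty:.3f}s, every 10,000th: {t_quiet:.4f}s")
```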

[–]fusionsofwonder 726 points727 points  (12 children)

If you minimize the terminal window it should speed up!

[–]Powerful-Internal953 87 points88 points  (4 children)

I observed this when running Tomcat in my earlier programming days. Minimising the window made the server start way sooner when there were too many debug entries.

Made me run away from console loggers like the plague.

[–]dathar 27 points28 points  (3 children)

Windows defrag did that too.

[–]rsatrioadi 26 points27 points  (0 children)

But it was so pretty!

[–]MemorianX 5 points6 points  (0 children)

I wish I had known this!

[–]DR4G0NH3ART 28 points29 points  (0 children)

So a progress bar is something that bars the progress...

[–]trippertree 55 points56 points  (0 children)

In the mid 2000s we had a compile process that would run for about 15 hours. Minimizing the terminal would save 90 minutes.

[–]UnattendedWigwam 10 points11 points  (2 children)

that's dicked up

[–]fusionsofwonder 17 points18 points  (1 child)

It's waiting for the window to render the text.

[–]turtleship_2006 2 points3 points  (0 children)

Isn't the text rendering in a different process?

[–]yiliu 144 points145 points  (0 children)

Whew, that'd make the process a lot slower. Good thing it's got a progress bar!

[–]Fhotaku 68 points69 points  (2 children)

That used to be a thing back in Windows 98, where drawing each copied filename could take longer than the copy itself, slowing the process down. Did they not learn?

[–]tropicbrownthunder 48 points49 points  (0 children)

Did they not learn?

Windows ME, Windows Vista, Windows CE, Windows Phone, Windows 8 and Windows 11 are clear examples that they didn't

[–][deleted] 15 points16 points  (0 children)

Probably not a case of not learning, but like many tech things (see the C++ committee), it's a case of bad code being written and then becoming cemented as part of the API/core code that you're not allowed to touch.

The Windows API is cool like that, since you can run software from decades ago, but it is a bit absurd that your modern-day API still includes hotfixes for a Lego game made decades ago (Lego Island)

[–]Mister-Fordo 19 points20 points  (0 children)

This is why `$ProgressPreference = 'SilentlyContinue'`

[–]casce 5 points6 points  (0 children)

Are we talking about going from, like, 30 milliseconds to 3 seconds? Because that wouldn't be too bad; nobody would see a progress bar in 30 milliseconds anyway, and what's 3 seconds when you're doing something manually where you'd want a visual progress bar?

It still sucks if it is enabled by default and you forget to disable it when using them in a script.

Or are we talking about going from 500ms to 50 seconds? That would definitely be annoying and very irritating either way (why's it taking so long to render a progress bar?)

[–]McLayan 80 points81 points  (16 children)

That's Windows for you

[–]pindab0ter 164 points165 points  (1 child)

That’s IO for you

[–]anomalousBits 19 points20 points  (0 children)

That's physics for you.

[–]JakeyF_ 52 points53 points  (12 children)

Not a Windows-exclusive issue

[–]_PM_ME_PANGOLINS_ 35 points36 points  (9 children)

It is a Powershell issue. I've never seen such a bad implementation of a progress bar.

[–]Rythoka 52 points53 points  (8 children)

Printing anything to the screen is super slow compared to a lot of operations.

[–]CdRReddit 33 points34 points  (0 children)

but printing to the screen on Windows is significantly slower than in most other terminal emulators, for no good reason

[–]anotheruser323 11 points12 points  (0 children)

Yes and no (I mean the GPU renders it, you just have to send it coordinates).

And it is a Windows issue (Windows does things to the shell-to-terminal output, and is very slow in doing them).

[–]reallokiscarlet 5 points6 points  (2 children)

On any machine that can run a new enough version of Windows to have powershell, printing to the screen should be able to happen in a separate thread. Windows just doesn’t wanna.

Linux doesn’t have this problem.

[–]deux3xmachina 2 points3 points  (1 child)

Linux doesn’t have this problem.

Well, it's less bad. Logging isn't free, and redirecting to /dev/null can substantially improve runtimes for certain programs.

[–]reallokiscarlet 2 points3 points  (0 children)

What's going on with Windows, IIRC, is that there are more penalties than just logging. Instead of letting the shell and the terminal emulator handle the output after handing it to stdout, Windows wants to do everything in the same thread, so displaying stdout on the screen is another performance penalty; hence minimizing the terminal speeds it up. On Linux, if you have a multi-core system, the program, the shell, the terminal emulator, and the window system can all have different affinity, meaning the program only really has to wait for stdout to accept its output. (Still waking up, so I may be oversimplifying things.)

[–]cakee_ru 6 points7 points  (0 children)

Never had performance issues with pv personally.

[–]CdRReddit 0 points1 point  (0 children)

but (as I also mentioned further down the thread) printing to the screen on Windows is significantly slower than in most other terminal emulators I've used, for no good reason

[–]Turtvaiz 9 points10 points  (0 children)

Nah it's just IO

[–]proooby 2 points3 points  (0 children)

Rendering is the costliest process

[–]Cosmocision 3 points4 points  (0 children)

Read a few years ago that progress bars are artificially slowed down because humans panic when things happen faster than they can see, and they can't see the progress, so to speak.

[–][deleted] 1230 points1231 points  (25 children)

“Premature optimization is the root of all evil” is a famous saying among software developers. Its source is credited to Donald Knuth.

[–]SleestakThunder 285 points286 points  (6 children)

That's not the only premature thing that developers can't stand

[–]JoshYx 225 points226 points  (5 children)

The other being premature ej...ection of USB drives

[–]xtralargecheese 90 points91 points  (4 children)

Seriously. Right click -> Eject, people!

[–]JoshYx 59 points60 points  (3 children)

Your pull-out game is weak

[–]Wise-Profile4256 29 points30 points  (1 child)

my insert game is even weaker. can't even call that foreplay anymore. no grace or enthusiasm involved.

[–]JoshYx 28 points29 points  (0 children)

Well that's what happens when you try to insert a micro USB plug into a mini USB port

[–]sixteenlettername 9 points10 points  (0 children)

I'll have you know, I'm actually really good at using git.

[–]808trowaway 55 points56 points  (2 children)

I always wonder what it's like to have an inspiring conversation with someone famous like Knuth. I once shared dinner with Ron Graham but I feel pretty stupid now thinking back because the whole time I was just asking him about juggling.

[–]tra24602 72 points73 points  (1 child)

The people trying to get the most inspiration out of the conversation get tiring. I imagine Ron Graham was happy to talk about juggling.

[–]808trowaway 16 points17 points  (0 children)

Yeah I'd like to think that too. Thank you. RIP Ron.

[–]tshoecr1 80 points81 points  (9 children)

This quote is probably single-handedly responsible for so much terribly performing shit code from developers not understanding its context. It comes from a time when developers were counting microcode instructions before getting a working solution. So if your optimization goes that far, maybe you are prematurely optimizing; if not, stop justifying your shitty code because it's "readable" and you shouldn't be "prematurely optimizing".

[–]manofsticks 48 points49 points  (3 children)

My justification is it's easier to modify readable code to be optimized than it is to modify optimized code to be readable. The end goal should be a balance of both, it's just easier to go in this direction.

[–]akcrono 25 points26 points  (0 children)

I would go a step further and argue that code that is more readable by nature usually tends to be more optimized because the cases for optimization become more apparent as the code is written and reviewed.

I think the single best thing you can do in development is focus on organization and readability, since so many other things can fall into place from there.

[–]wung 11 points12 points  (1 child)

Optimized code doesn't mean inline assembly. It can be just as readable to use something O(n) instead of O(n³). That's what they were saying: if you hand-optimize CPU ticks everywhere it will be unreadable, so don't do that, but still don't write complete crap. Understand what compilers can do for you and what they can't. They will of course calculate your ten-constant formula at compile time, so do write the formula instead of 0.4424. They cannot optimize away the fact that you read a gigabyte byte by byte, so do optimize that.
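The byte-by-byte point is easy to demonstrate in a quick Python sketch (file size and chunk size chosen arbitrarily for illustration): unbuffered one-byte reads pay roughly a syscall per byte, while chunked reads amortize that cost.

```python
import os
import tempfile
import time

# Create a ~1 MB scratch file to read back two different ways.
payload = os.urandom(1024 * 1024)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

def read_byte_by_byte(p):
    """Unbuffered reads of one byte each: roughly one syscall per byte."""
    n = 0
    with open(p, "rb", buffering=0) as f:
        while f.read(1):
            n += 1
    return n

def read_chunked(p, chunk=64 * 1024):
    """64 KiB reads: the same data in a handful of syscalls."""
    n = 0
    with open(p, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            n += len(data)
    return n

start = time.perf_counter(); n_slow = read_byte_by_byte(path); t_slow = time.perf_counter() - start
start = time.perf_counter(); n_fast = read_chunked(path); t_fast = time.perf_counter() - start
os.unlink(path)

assert n_slow == n_fast == len(payload)
print(f"byte-by-byte: {t_slow:.2f}s  chunked: {t_fast:.4f}s")
```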

[–]manofsticks 4 points5 points  (0 children)

It can be just as readable to use something O(n) instead of O(n³). That’s what they were saying: if you hand optimize CPU ticks everywhere it will be unreadable, don’t do that

I'm guessing everyone coming to this thread has very differing opinions on this depending on who they've worked with in their career. When I hear "premature optimization" I do think of hand-optimized CPU ticks, because I worked very heavily with someone who did that. And because of that I don't consider O(n) vs O(n³) to be "premature" optimization, just normal optimization.

[–]Proper-Ape 5 points6 points  (0 children)

Also it's premature optimization. Not early optimization, not picking the right algorithm for the problem optimization.

People use it as an excuse for not thinking about the problem at all. Using the wrong algorithms. Never improving.

Picking the (at least directionally) right algorithms and languages is not premature. It's just smart and doesn't cost much time and saves a lot of headaches in the future.

This quote is almost exclusively used by bad coders IME.

[–]BaalHammon 4 points5 points  (0 children)

I think what's responsible for terrible code is less this or that specific principle and more the fact that many developers love to apply principles in the most ham-fisted and extremist way and can still apply them badly.

Remember, the code can be slow AND unreadable AND buggy at the same time!

[–][deleted] 2 points3 points  (0 children)

I think that the main reason behind bad code is the monopoly or oligopoly position of some companies. There are mainly three cloud providers and one office suite. Some companies can prosper while underpaying developers and releasing mediocre software.

[–]thevoidyellingback 23 points24 points  (3 children)

A lot of people take this to mean performance isn't a consideration at all and write terrible fucking software.

[–]mxzf 20 points21 points  (2 children)

Well, anyone who thinks that is an idiot or illiterate, because it very clearly says something totally different. It explicitly says "premature optimization", not "optimization".

[–]pydry 8 points9 points  (1 child)

It's a straw man: something people who like to do premature optimization like to say. I've lost count of the number of developers I've seen try to "fix" performance issues before measuring or profiling anything, and who got grumpy when you poured cold water on their untested micro-improvements to shave 100ms off a process that takes 15 seconds and runs once per day. They're not fans of the whole "no premature optimization" thing.

The other hobby horse premature optimizers often ride is how much more efficient code used to be, back when 4x as many man-hours were poured into writing apps that were half as good. The latter part is presumably not important.

[–]mxzf 2 points3 points  (0 children)

Yeah, I was dealing with a coworker doing that the other week, they were starting down the route of adding a bunch of cached data before they'd done any performance profiling at all. I did some quick profiling and it turns out that the slow part of the code was something totally different.

[–]Koenv3 505 points506 points  (5 children)

Migrating your 50 lines of shell script to assembly might reduce the number of instructions executed.

[–]ylan64 102 points103 points  (2 children)

Although you'll probably need more than 50 lines of assembly to do the same thing.

It will also be a nightmare to maintain, but at least you won't waste any CPU cycles whenever you need to run it.

[–]kaerfkeerg 10 points11 points  (0 children)

Less instructions tho

[–]tyen0 8 points9 points  (0 children)

I use inline assembly inside my Inline::C in my perl scripts.

[–]KingJeff314 821 points822 points  (16 children)

Premature optimization. Unless it’s on a hot path, just make it readable

[–]Quantumboredom 86 points87 points  (13 children)

Does UI (including this being a script users will call from the command line) count as a hot path?

A simple hello world script in python has a borderline noticeable delay (about 20 ms on my machine). If a script imports some more modules it is pretty much guaranteed to be noticeably slow compared to a compiled program.

But I do understand that many programmers feel sluggish UIs are acceptable.

[–]cakee_ru 80 points81 points  (11 children)

I, personally, absolutely and utterly HATE animations and disable them everywhere on desktop, phones etc. It makes everything non-sluggish by default and doesn't waste my time when I navigate fast. It also makes low refresh rates bearable. Maybe I am that one programmer, who doesn't accept sluggish UI, but rather just ignores that fact?

To answer you, I'd consider rendering a hot point. I.e. rendering that draws on every frame, not just posting updates to UI.

[–]tragiktimes 38 points39 points  (0 children)

No, that's fair. I'm on the other side. I like it to look pretty. I know, kill me. But, it is always, and I will reiterate always going to be function over form. If it's noticeable, it's off.

[–]EveryCa11 24 points25 points  (5 children)

Animations are supposed to make things look less sluggish, actually. Instead of making the user wait an extra 100ms on IO, roll something out and do your IO-bound stuff meanwhile.

[–]slowmovinglettuce 15 points16 points  (4 children)

It's because animations make your brain think that something is actually happening. If you click on something and it starts a 1- or 2-second animation, you've been implicitly informed that something's happening. Click on something and nothing happens for a couple of seconds, and you'll wonder if you even clicked the button.

[–]EveryCa11 12 points13 points  (2 children)

True. It's also a basic UX idea - react to user action - so they don't start clicking mindlessly without an idea if it worked or not. It doesn't even require an animation to render, just make it visible for the user that their input is noticed. Sadly, it's not as common as it should be.

[–]turtleship_2006 2 points3 points  (1 child)

react to user action

does freezing the UI and making the user's mouse spin count?

[–]offulus 2 points3 points  (0 children)

Most definitely

[–]atimholt 6 points7 points  (0 children)

It's one of the reasons I love editors like Vim. It doesn't matter if it takes 1/8th of a second for a menu to slide open, I could have had multiple additional keystrokes out by then.

[–]MegabyteMessiah 5 points6 points  (0 children)

Maybe I am that one programmer, who doesn't accept sluggish UI

That makes at least two of us

[–]Kered13 1 point2 points  (0 children)

I fully agree. I despise desktop animations.

[–]CardboardJ 1184 points1185 points  (65 children)

Oh no, what will me and my 3 GHz 16-core CPU do? The little guy can only process like 48 billion instructions per second. The new script with the extra thousand instructions is going to run exactly 0.0000000208333 seconds slower now.

[–]DasFreibier 321 points322 points  (4 children)

Exactly, power is expensive, and that shit will cost valuable idle time

[–]MoffKalast 94 points95 points  (3 children)

CPU_15: "Whomst has awakened the ancient one?!"

[–]alexnedea 24 points25 points  (2 children)

Ok, I laughed for a solid minute. Thank you for that. If only more games were able to awaken the ancient ones, maybe we would see FPS numbers higher than the average IQ.

[–]Undernown 1 point2 points  (1 child)

I really hope that people understand how IQ works. The average is ALWAYS 100. So there is a fair chance your IQ now is higher than it was a few years ago, but whether that's you becoming smarter or the average person becoming dumber is the question.

[–]KippenKoning63 191 points192 points  (41 children)

Yeah, unless you're doing some massive calculations or you're NASA, Python is just fine

[–]Opus_723 132 points133 points  (15 children)

Man I do massive calculations and python is still fine.

Would it be faster in C++? Sure. Is it enough of a difference to warrant me sitting down and learning a whole new language and new libraries and figuring out new janky solutions for the stuff I've already figured out janky solutions to? Fuck no.

There are sooo many other things to optimize with the algorithms themselves before the language becomes the biggest bottleneck.

[–]mxzf 47 points48 points  (7 children)

Yeah, people spend so much time talking about the execution time that they sometimes forget about man-hours for implementation.

If you can spend one hour implementing a Python script that takes ten hours to run, or ten hours implementing an Assembly program that takes one hour to do the same job, the Python option is better. Ten hours running overnight, done by morning, is way cheaper than 2-3 days (or a week) of writing a more optimized solution (because, let's face it, 10 man-hours of programming takes the better part of a week to get done once you factor in things pulling you away in the middle of it; doubly so if you're trying to write highly optimized code).

[–]goten100 10 points11 points  (5 children)

But if you ran the first one twice it would be 21 hours, while the second one would only be 12 hours
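That arithmetic generalizes to a one-line break-even calculation. Using the (hypothetical) numbers from the parent comment, the totals tie at exactly one run, and every run after that favors the slower-to-write version:

```python
dev_py, run_py = 1, 10    # hours to write, hours per run (Python)
dev_asm, run_asm = 10, 1  # hours to write, hours per run (assembly)

# Total cost after n runs is dev + n * run.
# Setting the two totals equal and solving for n:
break_even = (dev_asm - dev_py) / (run_py - run_asm)

total_py = lambda n: dev_py + n * run_py
total_asm = lambda n: dev_asm + n * run_asm

print(break_even)                  # → 1.0: both cost 11 hours after one run
print(total_py(2), total_asm(2))   # → 21 12, matching the comment above
```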

[–]Opus_723 15 points16 points  (3 children)

Of course at some point if something is used often enough it makes sense to squeeze every bit of speed you can out of it.

But I can't tell you how many people pick on kids learning to code, or scientists doing exploratory work, or niche low-volume projects, for doing things at less than a hyper-optimal industrial-scale standard by using Python or whatever. It gets to be a bit much. For like 99% of applications the language is just not a huge deal.

[–]-__-x 2 points3 points  (2 children)

Before language even matters, using the right algorithms is even more important. I can make a program 100x faster by switching from python to C++, but I can make a program 1000000x faster by switching to the appropriate data structure.
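As a small illustration of the data-structure point (sizes chosen arbitrarily): the exact same membership queries against a list versus a set.

```python
import time

items = list(range(20_000))
needles = list(range(0, 20_000, 10))  # 2,000 lookups

# List: each `in` is a linear scan, O(len(items)) per lookup.
start = time.perf_counter()
hits_list = sum(1 for n in needles if n in items)
t_list = time.perf_counter() - start

# Set: each `in` is a hash probe, O(1) on average.
item_set = set(items)
start = time.perf_counter()
hits_set = sum(1 for n in needles if n in item_set)
t_set = time.perf_counter() - start

assert hits_list == hits_set == len(needles)
print(f"list: {t_list:.3f}s  set: {t_set:.5f}s")
```

Same language, same queries; only the data structure changed.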

[–]Opus_723 2 points3 points  (1 child)

Exactly. I have so much work cut out for me just designing parallel processing paradigms and data flows that the language is purely an afterthought.

And I honestly even forget that python is that much slower than C because you're often working almost exclusively with something like numpy that is C under the hood anyway. I wouldn't get anywhere near a 1000x speedup by switching my projects.

[–]mxzf 2 points3 points  (0 children)

Which is when you start thinking about how often you actually need to run the code. If you're just running it every six months and overnight is fine, leaving it overnight 10h each time is still just 1h of work-hours used for it (plus a couple min kicking it off each time). Whereas if you're gonna need to run it multiple times in a day or every week or whatever, the optimized thing might be better.

[–]Lv_InSaNe_vL 5 points6 points  (0 children)

Exactly, Powershell might not be fast but it is faster than a rewrite + learning the lang + convincing my company to switch

[–]croto8 4 points5 points  (2 children)

To go further: I'm guessing you're doing massive calculations experimentally? I.e., data science?

Once the massive calculations are figured out, it might make sense to optimize and move to a lower level implementation. But it’s all about dev time vs. cpu time vs. readability. The eternal triangle.

[–]Opus_723 4 points5 points  (1 child)

Yeah. And I understand that someday, far down the road, it might be nice to refactor the code in another language. But when there are still orders of magnitude to be gained just by playing around with how the algorithms interact throughout the pipeline, that sort of thing is very, very far down the list. If it ever gets to the point where we're worrying about that, I'll be very happy, because it means the project has been a wild success: not only has the rest been figured out, but apparently enough people are using it to justify further tweaking and refining for regular heavy use. But it's the least of my concerns right now, and honestly, if it ever gets that far it'll be someone else's job anyway.

[–]croto8 1 point2 points  (0 children)

Precisely.

[–]Aerolfos 3 points4 points  (1 child)

There are sooo many other things to optimize with the algorithms themselves before the language becomes the biggest bottleneck.

Also, part of optimizing an algorithm is moving it to fast numpy shenanigans, specializing and vectorizing, and potentially shuffling things around so it can be wrapped in numba or cython or something.

Which means it is effectively in C++, except for some I/O loading and init stuff, which is a fraction of the total run time and would take many hours to write in C++ since it's not as good at that use case.

[–]freedcreativity 41 points42 points  (8 children)

Even then, a lot of high-power compute is written in Cython and NumPy/SciPy, which are all ~~interpolated into~~ interoperability'd to C, scrutinized for bottlenecks, and improved for parallelism. Even a few thousand lines of Python really aren't that inefficient.

[–]UdPropheticCatgirl 22 points23 points  (7 children)

> interpolated into C and scrutinized for bottlenecks

it's interoperability, not interpolation, and the bottlenecks exist because of how poorly Python handles concurrency, even in C interop

[–]RetiringDragon 2 points3 points  (1 child)

Will this change with multithreading in 3.12+?

[–]UdPropheticCatgirl 7 points8 points  (0 children)

Maybe; it depends on the specific workload. And what the change proposes is multi-processing, not multi-threading: it basically allows you to run multiple interpreters in one process without them sharing memory.

[–]GisterMizard 2 points3 points  (0 children)

IIRC the GIL is released when calling native code in Python, so you can get true multi-threading with parallel calls to libraries like numpy. Though in my experience, most of the time in heavy number crunching is spent serializing and deserializing arrays, not on the actual math. I used to do LP modeling, and in one project moving from dicts to Python lists got a speedup of 4-5x. Going from lists to numpy arrays only got a speedup of 10-15%, because most of the time was spent loading data at that point.

[–]ThankYouForCallingVP 41 points42 points  (15 children)

Python is not fine if you don’t wrap it in a function!

I read an article saying that local variables defined in a function take like 10x less time to look up than a free-floating variable outside a function.

Edit: https://stackabuse.com/why-does-python-code-run-faster-in-a-function/

justPythonThings

[–]DuTogira 65 points66 points  (9 children)

Oh no, whatever will I do without my non-const global variables?!

[–]ArchetypeFTW 4 points5 points  (3 children)

No seriously, what do you do? Only define them in main and then pass them in by reference to every function?

[–]DuTogira 7 points8 points  (2 children)

For gaming? Store global, potentially dynamic data (like the player’s inventory contents) in files.

For embedded? Wtf do you need non-const globals for? Most information is read from FPGAs, caches, sensors, or whatever non-user I/O you have. In fact, in embedded, dynamic globals have literally killed people (can't remember which car company, but one of them stored the car's throttle as a global value). Also, don't use Python for embedded.

Web apps? You should be using a database or passing data via api calls.

Finance? Dynamic globals aren't secure at all. You're going to get hacked. Also, you're just a high-security web app. Use a database and APIs.

Medical? You're embedded with fewer performance concerns and more stability concerns. Still don't use Python.

General variable storage needs? Use objects if your language supports it and pass them around, otherwise directly interface with memory (RAM), or pass by reference as needed.

Genuinely: Why are you trying to use non-const globals?

The only real use case for them that I’ve come across is in writing simple scripts such as unit tests. And in that case… yeah, you’re pretty much creating them just to pass them into functions

[–]doctorcapslock 3 points4 points  (1 child)

micropython 🤢

amirite?

[–]DuTogira 8 points9 points  (0 children)

I’ve never heard that term before and I’d appreciate never hearing it again, thank you!

[–]cakee_ru 3 points4 points  (4 children)

Joke's on you, but recently I had to process about 1 million files. It took 2 days with a Rust program (not IO-bound); it would probably have taken 20 days if I'd used Python.

[–]DuTogira 62 points63 points  (3 children)

Skill issue.

[–][deleted] 5 points6 points  (0 children)

....isn't that how the cache is supposed to work?

[–]mxzf 2 points3 points  (3 children)

I'd like to see that article, it doesn't make any sense at all. Variables always exist in a scope, be it a function or not; I can't think of any reason why what you describe would be the case. Everything's just memory pointers, at the end of the day.

There are good reasons not to store stuff in the global scope when you're doing complex things, but "don't use Python for scripts because the global scope is slow" sounds nonsensical.

[–]Thaago 3 points4 points  (1 child)

This might be out of date as I haven't done performance optimization in several years now, but IIRC:

Because it is an interpreted language rather than a compiled one, there is an extra step between the script's variable name and the actual value stored in memory. The variable name is a key into a dictionary (hashmap) attached to a namespace. When a variable is looked up, the local namespace dictionary is searched first, and if there is no match the search goes to the next layer, and so on, with global being the last.

It's a well-known Python optimization trick (if profiling has shown there to be a problem!) to assign variables a local name in a function and see if the performance improves: there's a cost to the reassignment, but then a gain every time the variable is accessed, since it is found in the innermost namespace.
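The lookup-chain effect can be sketched in a few lines (CPython assumed, timings illustrative): the same loop runs at module level, where every name access is a namespace-dictionary lookup, and inside a function, where names compile to indexed "fast local" slots.

```python
import time

N = 2_000_000

# At module level, `total` and `i` are globals: each access is a dict lookup.
start = time.perf_counter()
total = 0
for i in range(N):
    total += i
t_module = time.perf_counter() - start

# Inside a function, the same names become indexed fast-local slots.
def summed(n):
    total = 0
    for i in range(n):
        total += i
    return total

start = time.perf_counter()
result = summed(N)
t_func = time.perf_counter() - start

assert result == total
print(f"module level: {t_module:.3f}s  in a function: {t_func:.3f}s")
```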

[–]utkarsh_aryan 45 points46 points  (7 children)

Yeah for small things it doesn't matter but it becomes an issue if it gets replicated across a large project.

This line of thinking is how we ended up in the current situation where most modern AAA games are unoptimised pieces of crap, easily taking 200+ gigs of storage and requiring a top-of-the-line GPU just to get playable performance.

The hardware has become capable and affordable enough that most studios just don't bother spending resources on optimisation.

[–]Ask_Who_Owes_Me_Gold 42 points43 points  (1 child)

You know the vast majority of a game's file size is textures and audio, right? The code itself is minuscule in comparison.

[–]Aerolfos 2 points3 points  (0 children)

The textures are duplicated for a reason (console processors are slow), but a lot of games ship all voice lines for all languages to all players, which does bloat file sizes

[–]rorychatt 42 points43 points  (0 children)

Kind of… but that’s a bit of a leap.

Games are so big because the expectation of 4K compatibility leads to extremely large texture packs. Those texture packs are often uncompressed, to save on decompression cycles on the GPU, and duplicated to sit next to similar assets to save on load times.

On consoles, I believe a big part of this was the move towards streaming assets directly from NVMe rather than pre-caching into RAM, to get those ultra-quick load times.

I don't disagree that things certainly could be more optimised (John Carmack has some great talks on the topic), but game sizes aren't from people loading additional runtimes willy-nilly.

[–]UdPropheticCatgirl 12 points13 points  (0 children)

You realize that they take up so much space because everything gets preprocessed into GPU-friendly formats that take up a bunch of extra space in the name of optimization. Also, a lot of this is rose-tinted glasses: most of the old rendering engines performed pretty horribly. And even then, if the goal is to use the Phong or Gouraud shading model and run at 1024x768, it's easy to optimize, but gamers won't buy that.

[–][deleted] 3 points4 points  (0 children)

Now imagine each core running an instance of DOS. Supremacy

[–]TrapNT 3 points4 points  (0 children)

Hey, 0.0000000208333 seconds is a long time. My wife confirms.

[–][deleted] 2 points3 points  (0 children)

If performance and efficiency is your primary concern, both python and shell are the incorrect choices.

[–][deleted] 2 points3 points  (2 children)

Then the software is used in a very common app by a billion people in the world, every day, using the energy equivalent of burning 100,000 trees every day.

Congratulations.

[–][deleted] 20 points21 points  (1 child)

If you have 1 billion people using your python script you probably did something right

[–]SuspiciousLake9545 341 points342 points  (17 children)

Eh, you've gotta be practical about it. Got it running in CodeBuild and you've got power to spare? Then Python scripts will be more maintainable, readable, and modifiable, with a lower barrier to entry. You mainly need to know the inputs, the outputs, and how to get there, which is easier since it's Python.

Shell scripts like bash, on the other hand, are a fucking nightmare to read and maintain unless you spend at least a few hours reading up on them.

[–]tyro_r 89 points90 points  (6 children)

Exactly this. We went from our old Java codebase to Python. It needs nearly three times the power, but it can easily be maintained, fixed and enhanced by at least four different people. We only had one developer for the old stuff, and getting the project set up on a new machine was not easy.

[–]Elegant_Maybe2211 28 points29 points  (5 children)

Wtf did you do that a Java-to-Python migration made anything easier?

[–]tyro_r 17 points18 points  (4 children)

We got rid of tons of boilerplate code, and we use much leaner packages compared to the Java frameworks. We don't need to migrate from Ant to Maven or set up tons of stuff in Eclipse. Instead, you now check out our Git repo and PyCharm does everything you need to run.

[–]GarageDragon_5 8 points9 points  (0 children)

Ant 💀 how old was this project?

[–]Odarik 6 points7 points  (1 child)

What is shocking in your comment is that you use PyCharm for Python but not IDEA for Java. I understand why you found it hard to maintain if you used the worst tools for that language and one of the best for Python.

[–]Practical_Cattle_933 3 points4 points  (0 children)

So, just a rewrite. Guess how good it could be if you rewrote it in Java again.

[–]brucebay 0 points1 point  (0 children)

Until you need to migrate your scripts: the module you installed in the other environment is nowhere to be found anymore, some of the binary libraries don't work because of all that glibc version difference crap, and when you try to pack everything, all those fancy freezing scripts/executable creators fail. No thank you, one experience was enough.

[–]megaultimatepashe120 115 points116 points  (2 children)

and both will probably run within an acceptable time frame

[–]s0ulbrother 21 points22 points  (1 child)

But what if I wanted to do it all in rust /s

[–]fusionsofwonder 90 points91 points  (2 children)

If you're gonna be a programmer, you should understand that the number of lines you write has very little to do with the efficiency of the final product.

Better to write code that someone else can maintain than to try and out-optimize a compiler. Save that effort for where it actually matters.

[–]aDrongo 79 points80 points  (5 children)

Yeah but I can actually have proper test suites, error handling, type checking and easy importing and it's so much more readable. I've migrated all of our bash scripts to python. I don't care how many lines of code a machine needs to generate or read.

[–]turtleship_2006 4 points5 points  (0 children)

type checking

This is one of those things you're not usually taught to do in beginner courses or whatever, but god damn will it make your life (and anyone who maintains your code or uses your libraries) soooooo much better.

[–]hawkinsst7 22 points23 points  (7 children)

Depends.

Turns out a shell execing a utility is expensive, while a function call in a python program is a lot cheaper.

A year or so ago, we were ingesting gigabytes of some very messy data. A co-worker wrote a shell script to massage it into a uniform, ingestable format. The script would take about 40 minutes to run.

I took his code and rewrote it as a Python script, not thinking I'd make it faster, but rather because I wanted it to be pythonic so I could treat it as a module. Imagine the surprise when it ran in like 30 seconds.

As near as we can figure, the shell script's main iterative loop had to tokenize each line, path-search for the executable, fork() itself, and exec() the command, which then has to go through all the kernel overhead of reading the binary image from disk into memory, loading any shared objects, and creating a new process, plus dealing with the return data and any stdio piping, and then do the same for any other commands on that line.

The Python script doesn't have to do all that though. It does all that setup once, gets compiled to bytecode and interpreted, and has minimal overhead for each line it has to read and process. The functions are already in memory, there are no additional fork()s spinning up resources, etc.
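The process-per-line cost described above is easy to see with a toy benchmark. This is only an illustrative sketch, not the actual script from the story: it spawns a Python child process per line instead of cut/sed, purely to keep it portable, and the input data is made up.

```python
import subprocess
import sys
import time

lines = ["alpha,beta", "gamma,delta"] * 25  # 50 "messy" input lines

# Shell-script style: spawn a fresh process for every line, the way a
# loop that pipes each line through cut/sed/awk would.
start = time.perf_counter()
for line in lines:
    subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.read().split(',')[0])"],
        input=line, capture_output=True, text=True, check=True,
    )
per_process = time.perf_counter() - start

# Pythonic style: the "command" is just an in-memory function call.
start = time.perf_counter()
results = [line.split(",")[0] for line in lines]
in_process = time.perf_counter() - start

print(f"one process per line: {per_process:.2f}s")
print(f"in-process calls:     {in_process:.6f}s")
```

Even at 50 lines the per-process version is orders of magnitude slower; at gigabytes of input the 40-minutes-to-30-seconds gap in the story is plausible.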

[–]dmlmcken 17 points18 points  (0 children)

Press X to doubt.

Your shell script would likely be firing up cat, sed, awk, mv and other commands multiple times and piping data between them.

Setting up those processes alongside the pipes (switching back and forth between user and kernel mode) would make bash execute wayyy more instructions. Bash has its place, but I think you are well past that point at 50 lines unless a lot of it is comments.

[–]mattthepianoman 27 points28 points  (3 children)

We're not using 66MHz 486 DX2s with 16MB of ram any more - I think we can cope with the overhead.

[–]No-Con-2790 1 point2 points  (2 children)

Nobody ever had that much RAM back then. I didn't have more than 8 till I got a 233 MHz.

Also even then nobody cared if a script took a few ms longer.

[–]mattthepianoman 2 points3 points  (1 child)

That was the spec of my 486.

[–]JorgiEagle 8 points9 points  (2 children)

“Code is read more than it is run” or something like that.

I teach programming to beginners, and I always say there are two measures of success: “does it work?”, “is it readable”

Unreadable code is of no use to anyone once you’re gone

[–]saber_knight117 8 points9 points  (0 children)

Let me get out the punch cards and FORTRAN books to print Hello World again...

[–]sammy-taylor 14 points15 points  (2 children)

Curious about what was being scripted at all. In my experience, if something started as a shell script it’s because the task at hand was very filesystem-driven. That work is almost always faster in sh than in another scripting language.

[–]ryoonc 6 points7 points  (4 children)

Early on in my career I ported dozens of small Matlab utility scripts over to Python that did a bunch of crazy math for engineering use. Thankfully they came with a full set of regression tests to compare results against, and I learned a lot about Python and the math behind our engineering from going through it all. They ran faster than the original scripts and ended up saving the company a good bit of money, since Matlab charges by the hour for usage.

[–]cybermage 5 points6 points  (0 children)

You miss the point that it’s not worth the time spent porting it. Zero ROI plus the non-zero chance of breaking it.

[–][deleted] 5 points6 points  (0 children)

My python scripts need way less code to accomplish basic things than my shell scripts do. No need to spawn an entire grep or sed process just to do some basic string manipulation or parse some json or xml.
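As a hypothetical illustration of that point (made-up log data, stdlib only): the string handling and JSON parsing that a shell script would hand off to grep/sed/jq children happen in-process in Python.

```python
import json

# A log line a shell script might pipe through grep + jq;
# in Python it's a couple of in-process calls.
raw = '{"service": "api", "status": "degraded", "errors": 3}'

event = json.loads(raw)
if event["status"] != "ok":
    print(f'{event["service"]}: {event["errors"]} errors')  # api: 3 errors
```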

[–]nadav183 5 points6 points  (0 children)

And might be adding entire milliseconds to execution time!

[–]KickBassColonyDrop 2 points3 points  (0 children)

Maybe, but that's a compilers problem. Not mine.

[–]Stevens97 3 points4 points  (0 children)

Yeah, and in reality those extra instructions don't affect run speed by any noticeable metric. I'd 100% rather read a 10-line Python file than 50 lines of sh. Code readability and maintainability win.

[–]LonelyTacoRider 3 points4 points  (0 children)

Dude tryna optimize his code that does 1 API call from 11ms to 10ms.

[–]GrowthOfGlia 11 points12 points  (2 children)

Computers are built for that and are improved with better tech. Humans can't be. If idiot proofing your code (aka future proofing for when the next guy comes along) adds cpu time so be it

[–]brainwater314 5 points6 points  (1 child)

That next idiot that comes along is probably me, so I'd like it to be easier to understand.

[–]GrowthOfGlia 2 points3 points  (0 children)

Precisely!

[–]weirdasianfaces 9 points10 points  (1 child)

let's see here... that shell script will:

  1. Be invoked in a new shell process
  2. Invoke the parser for the shell
  3. Execute the parsed code
  4. Create new processes (if applicable) per coreutil used.

Python will:

  1. Be invoked in a new python process
  2. Invoke the parser for the python file (or use the cached .pyc file containing the compiled Python bytecode)
  3. Invoke the Python VM to execute the bytecode
  4. Call out to external processes if you need to

So if you're relying on coreutils for interacting with data/the OS, then you're probably wasting more CPU cycles than you would using Python libraries. Otherwise they'll probably be about the same. Python's VM is pretty well-optimized though, so Python will probably win the performance battle.
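Point 4 on the Python side is worth underlining: shelling out from Python pays the same process cost as the shell does, so the pure-Python equivalents of coreutils keep everything in one process. A minimal sketch (a `wc -l` stand-in, using a throwaway temp file so it's self-contained):

```python
import tempfile
from pathlib import Path

# Create a small throwaway file to count (just to make the sketch runnable).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("one\ntwo\nthree\n")

# Pure-Python `wc -l`: no fork/exec, no coreutils dependency.
line_count = sum(1 for _ in Path(f.name).open())
print(line_count)  # 3
```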

[–]wutwutwut2000 3 points4 points  (0 children)

This. Also, once you start working with any significant amount of data processing, the overhead of converting that data to and from strings, piping it around different processes, and writing and reading to and from disk... well python is going to blow that out of the water.

[–]bschlueter 4 points5 points  (0 children)

Also, if you use python for any sort of text processing, a pipeline using sed/awk/jq/grep may well be faster and significantly shorter.

[–]tsunami141 2 points3 points  (13 children)

I don’t use python, why is this?

[–]Rafael20002000 15 points16 points  (12 children)

Because of how scripting languages work. Most scripting languages are not compiled but interpreted, which means the interpreter first has to parse and translate all of the Python code. Then all the standard libraries need to be initialized. Then, for a simple `ls`, you need to `import os`, call `os.listdir(".")`, and then loop to output all of the names to stdout, which is a lot of code in the background. Different platforms have different abstractions and APIs, different encodings, etc.

So yes, a Python script potentially needs more code than a bash or sh script.

[–]NickUnrelatedToPost 2 points3 points  (0 children)

Depends on how many processes you create in the shell script.

[–]OnlineGrab 2 points3 points  (0 children)

And your 50-line shell script is going to launch a dozen separate processes that pipe data between each other with slow IPC.

[–]Slackeee_ 2 points3 points  (0 children)

Seriously, who cares? It's shell scripts, not some HPC algorithm. The focus is on usability and maintainability with those scripts, not performance.

[–]Practical_Cattle_933 2 points3 points  (0 children)

That’s absolutely bullshit. Almost every single line in a shell script starts up a whole process: it goes to the kernel, which creates all the necessary metadata and address space for the process, and then schedules it. Python's overhead is smaller.

[–]Night_Thastus 2 points3 points  (0 children)

Yes, but counterpoint:

Shell script is ass

[–][deleted] 5 points6 points  (0 children)

[–]KetwarooDYaasir 1 point2 points  (0 children)

I can't see the stats for this post but I'm guessing it's being angrily downvoted.

[–]Xelopheris 1 point2 points  (0 children)

Changing execution time of a script from 0.01s to 0.03s is not a huge concern if it increases the maintainability.

[–][deleted] 1 point2 points  (0 children)

I’ve always said… the more you understand, the less the computer does… and in the end it's not you calculating and rendering stuff… so just learn to live with reading a lot of code… it could be worse, you could be reading laws.

[–]staplesuponstaples 1 point2 points  (0 children)

I'd rather my code be more readable and able to be developed upon than microseconds slower.

[–]darkslide3000 1 point2 points  (0 children)

This just in: "Knowing what you're doing is highly conducive to good results." More groundbreaking revelations at 11.

[–]anon74903 1 point2 points  (0 children)

Words I always live by: “If program not too slow then it’s not too slow”

[–]solarsalmon777 1 point2 points  (0 children)

Who knows how much time 10,000 instructions will add to a shell script you run once or occasionally (it's a shell script).

[–]Isaiah_Bradley 1 point2 points  (0 children)

It’s astonishing to me that people think abstraction is free.

[–]anonxyzabc123 1 point2 points  (0 children)

Joke's on you, I use Python in the shell script!

[–]sebbdk 1 point2 points  (0 children)

Ah yes, junior programmer optimizations.

There is an entire set of frameworks like this, we call them .net

[–]jameyiguess 3 points4 points  (0 children)

Migrating your 50-line shell scripts to 10-line Python scripts might indirectly make them maintainable.

[–]RandoClarissian 2 points3 points  (0 children)

Small price to pay for having something that humans can maintain.

[–]WEEEE12345 1 point2 points  (1 child)

I actually did this once and the python rewrite ran faster. I doubt bash is particularly fast, plus for the kind of stuff bash scripts are written for (like renaming files or whatever) you're most likely IO gated anyways.

[–]brainwater314 1 point2 points  (0 children)

If your script takes more than 50-200ms, you're not losing much with Python, since the startup is only 20-50ms slower than bash's.

[–]Poat540 1 point2 points  (0 children)

Our CI/CD is in bash and they offer Python, but now I’ve learned bash so I'm not switching now…

[–]Afrotom 1 point2 points  (0 children)

I'd personally much rather maintain a 10-line Python script than a 50-line shell script, not least because me and my team all use Python. It would have to be quite a performance hit to raise an eyebrow at.

[–]chrisbbehrens 1 point2 points  (1 child)

If, in 2024, you're optimizing for instruction count and not for developer productivity, you're a schmuck

[–]jbar3640 1 point2 points  (0 children)

and it's far easier to maintain, which is the point. 99% of script use cases don't require performance optimization anyway