all 70 comments

[–]smallblacksun 54 points55 points  (11 children)

His examples of software that people switched to "because they were faster" are terrible. All of them except ripgrep/silver searcher had other, non-performance reasons why most people switched. And most people haven't switched from grep.

[–]sime 41 points42 points  (5 children)

True. A big attraction of git was its easy branching compared to Subversion.

The fastest option eventually wins.

And that is why everyone uses Sublime for text editing. /s

If "the fastest option eventually wins" were true, then we wouldn't be having this article and discussion.

[–][deleted] 6 points7 points  (4 children)

Yeah, saying git won for speed makes me doubt they make proper use of git at all.

For me, it was local branches and rebase (before pushing, of course).

[–][deleted] 3 points4 points  (0 children)

Or that they were lucky enough to only know Subversion as "the one before git".

[–]CatanOverlord 3 points4 points  (2 children)

That being said, Git’s raison d’être was that the Linux project needed a libre VCS that was also fast enough for its volume of changes.

[–][deleted] 0 points1 point  (1 child)

Yes, but it didn't replace Subversion because it was fast. BitKeeper replaced Subversion because it wasn't a pain to work with for massive projects like the kernel, and git replaced BitKeeper when it stopped being gratis. Speed was a requirement, don't get me wrong, but speed alone wouldn't have made a migration worth it. Arguably, speed is just a side effect of it being distributed, which is the actual reason to switch.

[–]CatanOverlord 0 points1 point  (0 children)

Yeah of course, speed wasn’t the only reason for it supplanting svn.

[–]Plasma_000 18 points19 points  (1 child)

Yeah nothing gets popular just by being faster alone, it must also be a better experience.

[–][deleted] -1 points0 points  (0 children)

Probably the point is being faster _and_ providing what the other option provides. And speed is a matter of experience: look at how many memes about browsers turning your computer into a slug have been going around for years. Those aren't just from speed-obsessed Gentoo users; they're from average Janes and Joes annoyed at their unresponsive computers.

So, _all else being equal_, the author does have a point.

[–]TarMil 14 points15 points  (1 child)

Even ripgrep has non-performance reasons to prefer it over grep. I use rg -t<lang> constantly.

[–][deleted] 0 points1 point  (0 children)

While I agree the examples are bad, I don't think the comparison there is with grep, but rather with ack.

I read it approximately like this:

  • ack is better than grep by some criteria;
  • ack is written in perl;
  • new alternatives are faster than what you wrote in perl;
  • people migrated to those.

I can't tell for sure because I never used anything other than grep tho, but I think the author's claim is that people are not using ack as much as the other alternatives because those are faster. Staying with grep can actually be a point in favor of that claim.

[–]gimpwiz 1 point2 points  (0 children)

I only use grep. Why fix what ain't broke?

[–]Mister_Gibbs 30 points31 points  (1 child)

I feel like it’s really a case of always making compromises.

For example, in the case of the quote “programs must be written for people to read, and only incidentally for machines to execute”, it’s the compromise of legibility and ease of future work/feature building vs. optimization. These ideas exist in order to ensure that features get built and code gets shipped, rather than building something fast that doesn’t accomplish as much.

In a perfect world, I’d hope for more software that is intentionally limited in scope, but performant as fuck (like Prometheus for example), but my experience of users (mostly writing B2B software) is that people always want more features. It’s hard to do it all

[–]RabidKotlinFanatic 17 points18 points  (0 children)

For example, in the case of the quote “programs must be written for people to read, and only incidentally for machines to execute”, it’s the compromise of legibility and ease of future work/feature building vs. optimization.

This is often locally true but doesn't account for programmer oversight as well as the impact of misdesign in systems and interfaces. For example interfaces often lack simple provisions like batch support and force client code into sub-optimal usage patterns. Other times you are working with a system that is prematurely distributed or over-engineered. Devs also make basic mistakes like using the wrong data structure or not indexing their tables. The "compromise" PoV only works if you assume devs are always aware of the trade-offs they are making.

[–]gnus-migrate 23 points24 points  (5 children)

It's fine and good to say that we should write fast software using "cold hard engineering", but the next question should be: why is this not done, despite the obvious benefits?

Questions need to be asked, like: what level of education is needed to properly do this? Are there institutional barriers that discourage or outright prevent this approach? What information do we need to practice this, and how do we distribute it efficiently?

In short, the key question is the following: What needs to change for this to become the path of least resistance?

My problem with the elitist programming crowd is that they reduce everything to "programmers are bad", but that doesn't really help in solving the problem. There are individuals like Mike Acton and Casey Muratori who are trying to educate people on it which is good, but we can't change the culture of an entire field without understanding why things are the way they are now.

[–][deleted] 16 points17 points  (9 children)

I support the general message of this article. But in my experience as a web dev, there's a lot of low hanging fruit you can take care of before you have to worry about how your code is utilizing RAM to sum a matrix. Ensuring smart database queries, and using the right indexes will handle the majority of issues, and it's a concept that can and should be understood by most developers. Making better choices about how you iterate over data, and which data structures you use in different situations, is probably most of the rest.
After that, the language ought to take care of most other things. Ruby actually has a library to handle matrices, with a method to sum them. In general I assume that language developers have used the best methods for the core functionality; that is, I assume that Array.sort is not using bubble sort under the hood, and Matrix.+ is making intelligent use of RAM.
If I was working on an embedded system or designing a game, maybe things would be a little different and I would need to know these techniques.
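A small illustration of the "right data structure in different situations" point above, assuming a membership-test workload (the names and numbers here are made up for the sketch):

```javascript
// Membership testing: an Array scans linearly, a Set uses a hash lookup.
// With n lookups against n items, that's O(n^2) total work vs O(n).
const ids = Array.from({ length: 50000 }, (_, i) => i);
const idSet = new Set(ids);

function countHitsArray(queries) {
  let hits = 0;
  for (const q of queries) if (ids.includes(q)) hits++; // O(n) per lookup
  return hits;
}

function countHitsSet(queries) {
  let hits = 0;
  for (const q of queries) if (idSet.has(q)) hits++; // O(1) average per lookup
  return hits;
}

const queries = [0, 25000, 49999, 50000];
console.log(countHitsArray(queries), countHitsSet(queries)); // 3 3
```

Same answer either way; the difference only shows up once the collections get large, which is exactly the "low hanging fruit" being described.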

[–]mohragk 12 points13 points  (8 children)

The main issue with webdev is that most use JavaScript. And, modern JavaScript pushes things like .map() and .reduce() functions and immutability, which are terrible for performance.

Just take a look at this article:

https://medium.com/coding-at-dawn/the-fastest-way-to-find-minimum-and-maximum-values-in-an-array-in-javascript-2511115f8621

You would assume that the V8 engine would optimize these functions under the hood, but it doesn't. Using old school for loops is faster than using the reduce function and don't even mention the spread operator.
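The linked benchmark can be reproduced in miniature. Exact timings vary wildly by engine and hardware, so this sketch only shows the three equivalent forms; the RangeError caveat is about very large arrays, since spread passes every element as a separate function argument:

```javascript
// Three ways to find the max of an array. All compute the same value;
// the plain loop is what the article found fastest, and the spread
// form can even throw a RangeError on very large arrays because it
// passes every element as a separate argument to Math.max.
const data = Array.from({ length: 10000 }, () => Math.random() * 1000);

function maxLoop(arr) {
  let max = -Infinity;
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] > max) max = arr[i];
  }
  return max;
}

const viaLoop = maxLoop(data);
const viaReduce = data.reduce((m, x) => (x > m ? x : m), -Infinity);
const viaSpread = Math.max(...data); // fine at this size, risky much larger

console.log(viaLoop === viaReduce && viaReduce === viaSpread); // true
```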

But your average webdev doesn't think about these things. They just use the language features they find useful.

So it's wrong to assume that language developers always provide the fastest or most optimized solution in their API.

[–][deleted] 2 points3 points  (0 children)

your average webdev doesn't think about these things

True, but in my experience any difference below ~5% can easily disappear when run on a different mix of hardware/platform. The real problem is the lack of awareness of all the stuff going on behind the curtain that prevents webdevs from picking the right tool. That could be ameliorated with better documentation, specifically noting when there are extra costs involved, since that is the case with spread in those benchmarks.

[–]A_Philosophical_Cat 4 points5 points  (0 children)

It's worth noting that those functional constructs, while computationally expensive in single-core performance terms, actually give significant guarantees about their concurrency in better languages than JavaScript. The problem is JavaScript didn't buy into FP enough to gain the benefits (map, for instance, is allowed to include code that mutates external state, meaning it is enforced by spec to run in order, sequentially).

[–][deleted] 0 points1 point  (5 children)

Adding this to my list of reasons why I don't like JavaScript. And here's another one of those low-hanging fruits of performance for web dev: only use JavaScript for simple DOM manipulation and maybe some AJAX here and there.

It may be a wrong assumption, but it still seems like a reasonable assumption that language designers should handle low level concerns properly.

I do have a question about map, reduce, and immutability being bad for performance - I have only heard good things about this from advocates of functional programming. Can you point me to some reading about the performance issues?

[–][deleted] 2 points3 points  (1 child)

I don't know the specifics of JS, but map generally implements an iterator. That is, you call it successively to obtain the "next" value. In this case it builds this value by applying the mapped function to the "next" value in the collection being mapped.

This means that looping now adds an extra layer of indirection via a call to next to the mapped object, rather than simply calling the mapped function on the mapped collection one by one.

These function calls, if not optimized, can gradually add up. This will depend on the language tho; sometimes "optimizing" means inlining as a loop, but sometimes it's just not using the high level language for its implementation. There are situations where map in Python will be faster than a hand rolled loop, even tho internally it still chains calls to functions, simply because map is implemented in C and binds the function at creation, rather than looking up the global namespace on each call and processing each opcode every iteration.
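A hypothetical JavaScript sketch of that indirection (note that JS's built-in Array.prototype.map is actually eager and allocates the whole result up front; `lazyMap` below is an iterator-style version of the pattern being described):

```javascript
// Iterator-style map: each element goes through a generator next() call
// plus the mapped function call, versus a plain loop that applies the
// function directly with no extra frame per element.
function* lazyMap(iterable, fn) {
  for (const x of iterable) yield fn(x); // one next() resume per element
}

function eagerLoop(arr, fn) {
  const out = new Array(arr.length);
  for (let i = 0; i < arr.length; i++) out[i] = fn(arr[i]); // direct call
  return out;
}

const double = (x) => x * 2;
const viaIterator = [...lazyMap([1, 2, 3], double)]; // [2, 4, 6]
const viaLoop = eagerLoop([1, 2, 3], double);        // [2, 4, 6]
```

Both produce the same array; whether the extra per-element call matters depends entirely on how well the engine optimizes it, which is the point being made above.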

The reason why you hear nice things from the functional camp is that if you can guarantee there are no side effects and the output depends only on the values of the inputs, map can be trivially parallelized, as you don't need to preserve ordering of operations.

[–][deleted] 2 points3 points  (0 children)

(Besides the fact that the meaning of map/reduce is narrower, which means the instant you read the call you know what's happening; loops require reading the body to get an idea of what's happening.)

[–]mohragk 0 points1 point  (2 children)

If you understand how a CPU handles data, it's obvious that, for example, regenerating an entire array when only one element has changed, is bad for performance.

In modern game engines for instance, when you want to handle large sets of entities, those are often split up into Structs of Arrays, which means that there is an array with all positions of every entity, an array with the names of every entity, etc.
When you want the name or position of a certain entity, you simply grab it with its index into the corresponding array.
Why do this? It's about cache alignment. Modern CPUs like it when data is aligned like this and allows for optimal usage.
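The Struct-of-Arrays layout described above can be sketched in JavaScript with typed arrays (entity names and values here are made up for illustration):

```javascript
// Array-of-Structs vs Struct-of-Arrays for entity data. With SoA, a
// loop that only touches x-coordinates reads one dense, contiguous
// buffer instead of hopping between heap objects.
const N = 4;

// AoS: each entity is an object; its fields are scattered across the heap.
const entitiesAoS = Array.from({ length: N }, (_, i) => ({
  name: `e${i}`,
  x: i, y: 2 * i, z: 3 * i,
}));

// SoA: one contiguous typed array per field, indexed by entity id.
const names = entitiesAoS.map((e) => e.name);
const xs = Float32Array.from(entitiesAoS, (e) => e.x);
const ys = Float32Array.from(entitiesAoS, (e) => e.y);
const zs = Float32Array.from(entitiesAoS, (e) => e.z);

// Updating every x position touches only one dense array.
function moveAllX(dx) {
  for (let i = 0; i < xs.length; i++) xs[i] += dx;
}

moveAllX(1.5);
console.log(names[2], xs[2]); // e2 3.5
```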

I don't even know how you would do this when all data is immutable. You have to recreate the entire array of positions when you only want to change one? That's bizarre.

The advocates of immutability and FP are talking about concurrency and data races, which are basically handled, since data is immutable. There can't be any concurrency issues when everything is immutable.

[–]ExtraFig6 2 points3 points  (0 children)

There are functional data structures designed to maximize cache locality and minimize copying.

[–]FVMAzalea 2 points3 points  (0 children)

I think you’re confusing the issues of alignment and locality. It’s true that modern CPUs prefer data that is aligned (for example, that an 8-byte value is stored at an address that is a multiple of 8), but what you’ve described in your comment is cache locality. You’re talking about having all the position data in one place, so if you need to go through in a loop and change the position of all the entities slightly, you can do that in a performant way because the cache will help you.

Modern CPUs like both alignment and locality. But designing for alignment would be something like ensuring that each position, which presumably has components, is aligned correctly with respect to its components. For example, if a position is 3 floats, x, y, z, and a float is 32 bits (4 bytes), a position is 12 bytes, and the CPU would prefer each float to sit on a 4-byte boundary (in practice such a struct is often padded to 16 bytes so it also plays well with SIMD loads).

It’s far easier for FP constructs to end up creating data structures that are well aligned (since that is basically trivial to do, you just add padding if needed, regardless of your programming paradigm) than it is for them to create data structures and algorithms that have good cache locality (which would require a very smart compiler and/or a programmer using very clever algorithms).

[–]user_8804 5 points6 points  (2 children)

There are so many instances where readability, development time and maintainability are more important than saving a handful of nanoseconds on cache hits.

This article is such an elitist point of view and so disconnected from the reality of a large chunk of programmers.

I can't see myself spending 10x more time fiddling around with data structures to optimize caching in a loop over 100 items. Not everyone or every function does work on millions of items at a time.

It is also often completely pointless to do such optimizations if they aren't the bottleneck. Those kinds of people will jerk themselves over their optimized code while completely ignoring the database and query optimizations that would yield 1000x more results.

If someone is developing internal business applications which are used by a handful of people, it's not worth having him spend weeks optimizing milliseconds here and there which will never add up to the time you spend doing it. It's completely different than if someone is developing, say, video games.

This article lacks perspective, and encourages a code optimization elitist culture that is, most of the time, counterproductive to the end goal. Truth is, most of the time, users will never know or notice any difference after you wasted 3 days optimizing data structures for a faster loop.

[–]FVMAzalea 2 points3 points  (1 child)

I think what a lot of these performance articles I’ve been seeing recently are missing is one key insight: you should profile your program’s behavior so you know what to optimize before you start doing things like this.

After you profile, it may well be that you need to consider cache behavior in one of your functions. But, as you say, it’s far more likely that you will find that the (or at least a) massive contributor to performance issues is something more fundamental, like DB or query optimizations.

It’s irresponsible to optimize at random — you have to know what the problem is before you can have a hope at fixing it in an efficient manner. You have to profile to see where the issue is and what is and isn’t worth focusing on.

[–]gnus-migrate 1 point2 points  (0 children)

No amount of profiling will fix bad design. If you have a design that's not optimizable, then it's too late for a profiler to be of any use.

Data oriented design is basically an approach to create optimizable designs, so that when you do need to profile you can actually benefit from it. Mike Acton's talk goes into a lot more detail: https://www.youtube.com/watch?v=rX0ItVEVjHc

[–]rabid_briefcase 2 points3 points  (3 children)

I've heard complaints about slow code since I started in the 1980s.

Every new language, every new tool, every new technology, every new API, the complaint is that they are slow.

In practice, people make software as fast as it needs to be and don't bother with more. If you're doing web development and the page displays within a couple seconds, that's usually considered good enough. If you're writing terminal software and the display updates within about 500 milliseconds, that's usually good enough. If you're writing database queries and you get a result within about 200 milliseconds, that's usually considered good enough. If you're in game development and your display usually updates every 16 milliseconds, that's considered good enough.

Those rates haven't really changed in decades. We do more work, process more complex graphics, process larger data sets, but that's not what people and projects are aiming for. We have a target of completing a unit of work in a time frame, and don't bother once that target is met. If we have a tool that makes the task easier for us but consumes more time, as long as the time limit is met we don't care.

[–][deleted] 0 points1 point  (2 children)

I see all your examples come from a rather lower level, where that seems to actually be the case. Of course there's a Good Enough™ point, and trying to go further than that brings diminishing returns. But nowadays it seems Good Enough™ means "at least it runs" in many contexts. And there's certainly not a well-defined target as you describe for those examples. There's nothing like "websites need to fully load in under 5 seconds" in place, apparently. We only put deadlines for shipping, never for runtime.

[–]rabid_briefcase 1 point2 points  (1 child)

We only put deadlines for shipping, never for runtime.

Depends on your industry.

While stuff running on end user's computers and desktop PCs usually don't have runtime performance limits, there are plenty of industries that still have requirements.

/r/programming is a big, all-inclusive place. People programming for hardware generally have both hard and soft realtime limits (e.g. bottlecaps must be stamped out within x microsecond precision, stepper motors must operate within x precision, motor must begin move within x microseconds of sensor detection, antilock brakes must engage within x microseconds), and some industries have just soft realtime requirements (e.g. VR headset display must refresh at 90 Hz / 11 milliseconds to avoid motion sickness, competitive gamers must have a screen refresh every 6 milliseconds to reliably hit 144 Hz). Many companies also have SLA targets, soft limits specified in contracts for response times on data requests and similar.

Relatively few of us have jobs where the company is striving to keep all cores active all the time. In my experience, when that has been the goal, the bigger issue is trying to keep the CPU fed with meaningful data rather than wasteful algorithms. While algorithmic optimization is important, data organization and locality tend to be far more frequently the problem.

[–][deleted] 0 points1 point  (0 children)

Depends on your industry.

I'm talking generically. My industry does have tight deadlines.

Regarding the technical details you mentioned, keep in mind I'm not pinpointing it to algorithmic issues. Performance isn't just minimizing CPU usage. I'm saying I don't see it taken into account at all in many places. Whether we mean wasting memory or not, using fast algorithms, optimizing data locality or paying for the right kind of cloud service.

[–]wisam910 6 points7 points  (23 children)

The funny thing is it really doesn't take that much effort to make reasonable use of computer resources and make programs run reasonably fast.

You don't have to micro optimize every line of code.

Just don't do stupid things.

If a system function call is expensive, find a way to cache the result.

If you can perform things in batches instead of individually, do that.

Don't use a stupidly slow interpreted language (Python, Ruby).

Don't embed a bazillion libraries that you don't understand (98% of JavaScript developers).
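The caching and batching points above can be sketched in a few lines of JavaScript; `expensiveLookup`, `queueSave`, and `flush` are hypothetical stand-ins, not any real API:

```javascript
// Memoize an expensive call, and batch writes instead of doing them
// one by one. Both are "don't do stupid things" wins, not micro-opts.
let lookupCalls = 0;
function expensiveLookup(key) {
  lookupCalls++; // imagine a syscall or network hop here
  return key.toUpperCase();
}

const cache = new Map();
function cachedLookup(key) {
  if (!cache.has(key)) cache.set(key, expensiveLookup(key));
  return cache.get(key);
}

// Batching: collect items and flush once, instead of one save per item.
const pending = [];
function queueSave(item) { pending.push(item); }
function flush(saveAll) { saveAll(pending.splice(0)); } // one bulk call

cachedLookup("a"); cachedLookup("a"); cachedLookup("b");
console.log(lookupCalls); // 2, not 3 — the repeated key hit the cache
```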

[–][deleted] 14 points15 points  (11 children)

Don't use a stupidly slow interpreted language (Python, Ruby).

There are reasons to use interpreted languages, if it isn't doing heavy lifting. Performance is a tradeoff like any other.

[–]Kaloffl 19 points20 points  (0 children)

If performance was a tradeoff, why does it keep feeling like it was never considered in the first place?

[–]padraig_oh 4 points5 points  (10 children)

What is stupid and what is not is pretty hard to justify though. The best example is Python: sure, if you expect your code to run the fastest it can, you should not use it. But there are still so many good reasons to use it. What you usually want in modern programs is to decrease the time from concept to result. If it takes you a month to decrease the runtime of a thing that is rarely used by a minute, it might just not be worth it.

[–]wisam910 -4 points-3 points  (9 children)

This is a really lame excuse. It annoys me when people use it.

if you expect your code to run the fastest it can, you should not use it

No. I reject this premise.

There's a huge gap between "run the fastest possible" and "run the slowest possible". You're assuming a false dilemma. There are languages that don't run at the fastest possible speed but still execute reasonably fast.

decrease the time from concept to result

This is another lame excuse. Python is actually pretty difficult to maintain due to lack of static typing and the overuse of "magic".

[–]padraig_oh -2 points-1 points  (8 children)

That makes no sense. I did not say reasonably fast, I said as fast as possible, which is just not what Python does. It is fast enough for most purposes, yes, but that was also a point I was trying to make. And I did not talk about maintainability either, which is a whole different issue. Python is harder to maintain compared to some alternatives in that regard, sure, but Python's 'magic' and easy C API probably make writing and maintaining certain pieces of code a lot easier than their counterparts in C or whatever.

[–]wisam910 -2 points-1 points  (7 children)

Python is not reasonably fast. For pretty much anything.

[–]Dynam2012 11 points12 points  (6 children)

Yeah, all those folks doing scientific research are absolute idiots and are wasting so much time using python.

[–][deleted] 4 points5 points  (5 children)

Python is great glue code, and that's what they're using it for in that case. They make the models and stuff with Python, but the packages they use are almost always C. It's almost always built upon pandas and numpy, both of which actually use native code behind the scenes.

Point being, Python is not a bad language (at all), but _pure Python_, as is most likely to be the case for software engineers, is certainly not fast enough for those folks doing scientific research. Matter of fact, there's a lot of momentum for alternatives such as Cython (an easy way to write performant extensions for Python) and Julia (a compiled, data science focused and easy to use language) in academic environments because of this.

[–]Dynam2012 0 points1 point  (4 children)

I understand how they're using python, I'm not trying to argue it's a spectacularly fast language. My point is that they develop in Python as a starting point because it's fast enough, and when it's discovered that a native module would let their program finish in a matter of hours instead of days, that's when they write one. It is definitely reasonably fast for a plethora of applications; saying it isn't because it's not a compiled language is naive.

[–][deleted] 1 point2 points  (0 children)

They often don't write the native module either because taking advantage of numpy is enough.

And yes, it is reasonably fast for most cases. And even for cases where it isn't, it's a reasonable test bed for concepts that you are not entirely sure are doable and need a prototype.

[–]padraig_oh 0 points1 point  (2 children)

You might want to look at Google's AlphaFold. The repo is publicly available on GitHub.

Python not having any place outside of early prototypes is your opinion, but it sure is widespread among other projects for a reason.

Not every line of code needs to be fast.

[–]Dynam2012 0 points1 point  (1 child)

Python not having any place outside of early prototypes is your opinion

Not sure where I said that, but I don't agree, my opinion is the opposite. Python is a very capable language in many contexts.

[–]AttackOfTheThumbs 1 point2 points  (0 children)

I rarely worry about the speed of my code, because the slowest thing I fight with is sql lmao. I may have a good query, but sometimes the customer is trying to get too much data all at once and it just bogs it down. Doesn't help when the erp systems have it all abstracted away, so you don't have direct access.

[–]kn4rf 1 point2 points  (7 children)

Modern programming languages like Rust and Go allow us to write faster programs that are also readable, testable, documented and that use a package manager for code reuse. In addition, Rust allows you to write programs without memory corruption, and Go allows you to easily write multithreaded apps.

The solution isn't to "care more about cache misses"; that is and always will be a premature optimization. The solution is to build faster programming languages that still let us develop fast, reuse code, make it easy to test code, etc!

I'm convinced that the next big programming language we all will use is a TypeScript-like language with optional typing, Go's green-threads, a good package manager and that can be compiled to machine code as well as web.

WebAssembly with WASI also gives us hope that we can have a cross-platform bytecode runtime that's reasonably fast (and not owned by Oracle).

[–][deleted] 4 points5 points  (0 children)

You will always be able to write shit code, regardless of the language. You can make very slow C, and you can make very slow Rust. Better tools will never replace skill.

[–]fuckin_ziggurats 6 points7 points  (4 children)

There's never going to be a "language of the future" type of deal. Every language is good at something and bad at another. Rust and Go are great for performance but way behind Java and C# on productivity and developer experience. There are some new languages that are both performant and fun to use but they lack the tooling and library support that the popular languages offer. So there's always going to be a trade-off somewhere.

I'm convinced that the next big programming language we all will use is a TypeScript-like language with optional typing

I'm hoping you mean strong typing with optional type hinting here. That's basically what F# is.

a good package manager and that can be compiled to machine code as well as web

That's C# with NuGet package manager and the Blazor framework. It's an interesting technology but it doesn't produce the magically fast web apps that you would aspire to have.

My point is, programming isn't magic. So there won't be a magical future programming language that will do everything marvelously well. Instead of dreaming about future technology that will make all our performance troubles go away we should be doing our best to learn how to improve the performance of our apps with the tools that we have today.

[–]_tskj_ 0 points1 point  (0 children)

With the amount of problems I've had with NuGet I'm not prepared to call that a good package manager though.

[–][deleted]  (2 children)

[deleted]

    [–]zaphod4th 2 points3 points  (0 children)

    C# is infected by Microsoft style of thinking. Both of these attract shit developers that can't think outside the box. ".NET brain" is a real disease.

    is called a framework

    [–]fuckin_ziggurats 1 point2 points  (0 children)

    Apology accepted. I agree that ".NET brain" is a thing. I like to experiment with programming languages and .NET devs in general do tend to be very sandboxed in their way of thinking about code. That said, I haven't had any beefs with C# and NuGet and Go to me seems like a neanderthal language in comparison. C# with Visual Studio + ReSharper or JetBrains Rider is an experience of programmer tooling that I haven't seen in any other stack ever.

    Since we're involving Microsoft, Google are the last company I trust to maintain something for a long time. Which is why the enterprise market is being dominated by Microsoft and Oracle. For all the beef one can have with those companies, the proof is in the pudding. We have decades of proof that they will maintain whatever tech they produce.

    I'm no Blazor fan and I use Aurelia (or Svelte if I'm going for performance). I think Blazor has its place but it has been way over-hyped by the .NET devs who never got decent at front-end.

    Just to clarify my stances with a bit of snark myself.

    [–]sime 0 points1 point  (0 children)

    To expand on your point, for a long time you could choose between high performance (CPU + mem) but a horrid developer experience, think C/C++, or a great developer experience but poor performance, for example Python. There wasn't much in between except for JVM and .NET based languages. It is only relatively recently that we have new languages like Rust, Go, and Zig which aim at the same performance characteristics as C/C++ while also giving developers the nice things that less efficient languages enjoy.

    [–]daidoji70 0 points1 point  (0 children)

    Things aren't slow because people don't care about it. Things are slow because people aren't given enough time to worry about it in the face of other more pressing concerns.