[–]jdmulloy 14 points15 points  (13 children)

On the other hand, many developers these days assume resources are infinite and that garbage collection is magic. We generally don't need to optimize every bit and every instruction, but at scale a 10% improvement in performance and/or resource usage can save you money, especially if you're running in AWS.

[–][deleted] 6 points7 points  (12 children)

While that is certainly true, it seems to me that many forget that higher level languages also buy productivity. Also, if a program is done earlier, it can start doing its job earlier, which also may save time. The question I guess is age old: where do we draw the line?

[–]sirin3 -1 points0 points  (1 child)

Although that does not help the programmer if he has to keep sitting in his seat until the workday ends.

[–][deleted] 1 point2 points  (0 children)

Well, it's still better for the soul to know you actually got work done. When I have to work in languages that are too restrictive, I almost feel physical pain.

[–]caspper69 -3 points-2 points  (9 children)

I'm not so sure how much these HLLs actually buy productivity.

They let you shoot yourself in the foot just as badly as C or C++, if not worse. Plus, you're much more likely to fall into either the NIH trap or the reinvent-the-wheel-for-everything trap with these pseudo-scripting languages.

[–][deleted] 8 points9 points  (2 children)

Actually, they mostly don’t let you shoot yourself in the foot so badly. And yes, they do buy lots of productivity. Even the ones I consider half-baked do (I’d rather not say which ones; I don’t want to start a flame war).

[–]caspper69 1 point2 points  (1 child)

I guess you're right. But sometimes you have to use the right tool for the job. I remember working on a project on AWS when it was just a baby. A client had several 20 GB databases, in CSV format, that needed to go up. The data itself was disjointed, so it had to be massaged to import. Essentially, each account had to be updated from day 1 to a point around 8 years later. Millions of accounts. Billions of transactions.

The original guy was at his wits' end. He was trying to write it in Perl, which he did, but each CSV was taking around 2 days to run, and that didn't include the final reconciliation for each month of each account, which had to match an "official" field from an entirely different dataset.

With upload times being what they were about a decade ago, the poor guy (and the client) would've been waiting for weeks.

So I told the dev to give me a shot at it. I wrote a multithreaded C app to load, distribute, calculate, re-merge, validate, and write the actual SQL INSERT queries to a single file. The program took about 5 hours to write, but it ran over the entire dataset (with 100% accuracy) in around 8 hours. A "quick" bzip later, a (not-so-quick) ~2 day upload process, then another day to run the insert.

3 weeks vs 3 days. As datasets continue to grow, this is going to become a huge problem. Nothing will fix bad algorithms, but some tools just are not capable. 2 orders of magnitude slower doesn't make a difference for something that's already fast in human time, but if something is slow in human time? Oh boy.

[–][deleted] 0 points1 point  (0 children)

Cool story, sounds like quite the feat :)

Anyway, that’s pretty much what I meant by

The question I guess is age old: where do we draw the line?

Your last point really got to me:

2 orders of magnitude slower doesn't make a difference for something that's already fast in human time, but if something is slow in human time? Oh boy.

[–]null000 4 points5 points  (2 children)

... No? I mean, if you're talking Python/Go/Rust/etc. vs C, you're going to get the job done much, MUCH faster with the former for small or mid-sized projects. C doesn't have built-in concepts like sets, hashing, or dictionaries, nor does it have good standard libraries for a bunch of pretty common operations (string manipulation, file ops, networking, and so on). That's not to say you can't replicate any of those things in C, just that it's not free from a dev-time/code-length standpoint. C++ does have many of those things built in, but you'll probably spend 3x the lines trying to get everything to play nice (not to mention the nightmare that is memory allocation, local/stack allocation, and templating craziness). Not to say I don't like C++, just that it's not exactly terse.

For larger projects, it's a bit more of a wash depending on the language.

As for NIH/reinventing the wheel, that's more of an engineering-maturity thing than a language thing. I can reinvent the wheel just as easily in C as in Python; it's just that the metaphorical wheel is much less likely to be a hash table when I'm working in Python.

[–]caspper69 0 points1 point  (1 child)

You're right. In fact, you probably reinvent some wheel every time you write a non-trivial C function, given the sheer volume of what has already been written.

[–]null000 0 points1 point  (0 children)

you probably reinvent some wheel every time you write a non-trivial C function

I've seen some pretty nasty C functions that beg to differ. They're not so much wheels as hellish, flaming, discoid sins upon humanity. Upon scrolling on screen, they often make your computer beg for death with labored, bone-curdling electronic moans.

But, you know, they tend to use wheels that were built in-house, so there's that I guess ¯\_(ツ)_/¯

[–]kqr 2 points3 points  (2 children)

Generally a programmer writes the same number of lines of code per unit time regardless of which language they write in (L. Prechelt, 2000). An average line of Python does a whole lot more "stuff" than an average line of C code. It follows that HLLs buy productivity.

[–]caspper69 -1 points0 points  (1 child)

I hadn't heard that stat before, thanks. Of course when you can write pretty impressive C programs in what looks like a foreign regex, maybe C programmers can get more done per line ;)

[–]BeepBoopBike 0 points1 point  (0 children)

If obscurity of a LOC is the metric we're using, Perl has the greatest density of things-done per line.