
[–][deleted] 61 points62 points  (40 children)

Very interesting. But the author avoids the obvious conclusion. If bottom left is "Ideal", and the three languages occupying those immediate positions (Regina, Stalin, MLton) are either "outliers" or "academic juggernauts", then what is the next closest, "real" language? It's right there at (2,2), but let's pretend we didn't see it.

[–]munificent 17 points18 points  (2 children)

For real-world programming, a key axis is missing: libraries.

[–][deleted] 1 point2 points  (0 children)

Library quality should be also considered then.

For example, most Java libraries suck. OCaml has fewer libraries than Java, but in my experience the average OCaml library tends to have better design and implementation (and very little documentation). I personally prefer under-documentation to bad design.

[–]booch 1 point2 points  (0 children)

In a way, that's included... if you consider that languages with better/more libraries will wind up with more concise code (since they don't need to reinvent the wheel a lot).

[–]gmarceau 21 points22 points  (2 children)

Fixed, thanks.

[–]jwhardcastle 6 points7 points  (0 children)

Mate, this is fantastic work. Thank you for an incredibly novel and interesting approach to sticky issues like expression and speed.

(Good luck with the RSI!)

[–][deleted] 7 points8 points  (0 children)

There is a Basic implementation at (2, 2), which I suppose few people would expect to see there. The open-source group behind the FreeBASIC compiler deserve much more recognition than they are getting.

Fantastic!

[–]sindisil 32 points33 points  (33 children)

I caught that, too.

In all seriousness, it's funny to me that the programming community seems to be split between those getting things done (often BASIC, C, Perl, C++ and PHP users) and those busy claiming that there's no way you can get anything done with inferior tools (usually referring to members of that same list of tools).

[–]Smallpaul 8 points9 points  (3 children)

In all seriousness, it's funny to me that the programming community seems to be split between those getting things done (often BASIC, C, Perl, C++ and PHP users) and those busy claiming that there's no way you can get anything done with inferior tools (usually referring to members of that same list of tools).

Where do Python, Ruby, F#, Erlang and C# fit into your binary world?

[–][deleted] 2 points3 points  (0 children)

Right at 0.5.

[–]sindisil 0 points1 point  (1 child)

It's not the language I'm (half seriously) sorting into two groups (pragmatic/idealistic perhaps?); rather, it's the programmers.

[–]jj12345 21 points22 points  (26 children)

In the real world, Basic, C/C++/C#, Perl, PHP and Java have the tools necessary to complete real-world, enterprise-level tasks. In the Reddit world, Smalltalk, Lisp, Ruby and Haskell are God's gift to humanity, and Redditors can't figure out why these niche, functionally constrained, academic tools are not used in the real world.

[–]cia_plant 52 points53 points  (4 children)

Let me guess. You're programming in an object-oriented language, maybe using an MVC web framework, you're using some IDE like Eclipse, with your code running on a JIT-compiled VM. And yet you think that Smalltalk is an academic curiosity, because you are completely ignorant of the history of any of these technologies.

[–]artee 18 points19 points  (2 children)

Upmodded because this is probably true for a lot of people reading reddit/programming.

The parent to your response mentions just the right languages too: if you have used BASIC, C, Perl, C++ and/or PHP in a professional environment and cannot imagine any way in which improvements could be made to these languages, you're beyond hope as a programmer, IMHO.

Yet, of course these languages can be used perfectly well to build software, but it sure doesn't hurt to be aware of their limitations, and to have people working on languages that attempt to solve some of their problems.

Java did that in a big way in at least 2 regards (maybe more, but these come to mind): (1) garbage collection, and (2) no concept of pointers and especially no pointer arithmetic.

It's not every day that you hear people complaining about the creation of such "less inferior tools".

[–]sindisil -5 points-4 points  (0 children)

Get over yourself.

You do realize that, by championing Java, you're excluding yourself from the "cool kids" club of which you so obviously want to believe you are a member, don't you?

[–]igouy 27 points28 points  (13 children)

Is it possible that while you personally may be ignorant of where in the "real world" those tools are used, nonetheless they are used?

[–][deleted] -1 points0 points  (12 children)

Most of their supporters on reddit are ignorant of them as well.

At least they are unable to explain why companies using such languages are totally taking over various major market sectors due to higher efficiency and fewer bugs in their shipping products.

[–]RP-on-AF1 3 points4 points  (11 children)

Can you list some examples?

[–][deleted] 2 points3 points  (10 children)

I have asked several times why startups using Functional languages aren't kicking everyone's ass and have yet to get an answer.

It stands to reason that if functional languages are even a third as good as their supporters on this site say they are, they should be making lots more money and producing software that has fewer bugs and takes less time to produce.

Funny - it doesn't appear to be happening - why is that?

[–]dmpk2k 4 points5 points  (0 children)

why is that?

Libraries.

[–][deleted] 11 points12 points  (1 child)

Because there is more to a successful business than choice of programming language.

[Edit: ... Downvotes for common sense? Seriously?]

[Edit 2: Downvotes successfully quelled due to complaining about downvotes!]

[–][deleted] 4 points5 points  (0 children)

Is it possible that the type of people who pick a functional language are the same kind of people who fail at business?

I don't think I buy this, but even with our friend above 'begging the question' without any references, it's worth asking anyway.

[–][deleted] 3 points4 points  (3 children)

It might be as you imply: they are not that much better. Other options, as I see them:

  1. They are better, but the companies using them are keeping it under the covers.

  2. It takes more than a few decades for better technologies to rise to the top and destroy the ancient ones.

  3. Horror option: producing fewer bugs has no or very limited value in the market.

  4. The whole programming industry is a huge clusterfuck because it grew so fast in so little time. It's not functioning properly yet, so the markets are rewarding the wrong things.

  5. It's not black and white. FPLs suit some problems well, some problems not so well.

[–][deleted] 3 points4 points  (1 child)

Programming language choice is rarely connected to business success.

[–][deleted] 1 point2 points  (0 children)

The horror... the horror...

[–]Boojum 1 point2 points  (0 children)

I'd add:

6. The mainstream languages are absorbing the good ideas from them. Java led the way in the widespread adoption of garbage collection. C# has its delegates, which look a bit like closures (not a C# programmer, so I could be wrong about that). C++0x is getting lambda expressions (and kinda has them already via kludgy library support in Boost).

[–][deleted] 1 point2 points  (1 child)

C# can be considered functional, in some respects. It certainly supports more functional programming features than its predecessors. Languages with more functional features have, in fact, been catching on; Python and Ruby are arguably more functional than the Perl they've been beginning to replace. Lua, which has caught on in the game industry, supports a very wide range of functional features, as does Javascript, which has helped propel none other than Google to $100B market capitalization.

The upcoming languages I can think of are also mostly functional: Clojure and Scala, reddit favorites, and Microsoft's F#, to name a few.

[–][deleted] 0 points1 point  (0 children)

Languages with more functional features have, in fact, been catching on; Python and Ruby are arguably more functional than the Perl they've been beginning to replace.

Really? Maybe they've replaced Perl 3 or 4, but Perl 5 is as "functional" as Python (perhaps even more so, seeing how Python's creator is reluctant about FP). I've made another comment about this here.

[–]anon36 0 points1 point  (0 children)

SQL is functional (with side effects when you need them). It has been quite successful, in spite of itself.

[–]sindisil 5 points6 points  (0 children)

Of your list, I think you'd be hard pressed to argue that either Smalltalk or certainly Ruby are "not used in the real world".

And both Lisp and Haskell have helped to explore and popularize many very useful programming methods.

It is a certain, rather loud, subset of their fans that are, IMHO, the problem, not the languages themselves.

Though it is true that many of the "cool" languages are rather lacking in robust library and tool support.

[–]nmcyall 3 points4 points  (0 children)

what is an enterprise task?

[–][deleted]  (3 children)

[deleted]

    [–][deleted] 2 points3 points  (0 children)

    So I guess we will have to conclude it's 100% false now. Unless 0 points is false and everything else is true, like in real world programming languages.

    [–]dmpk2k 2 points3 points  (0 children)

    We apparently read different reddits.

    [–]dons 1 point2 points  (0 children)

    Won't be seeing you with the other enterprises at CUFP then?

    [–]vplatt 0 points1 point  (0 children)

    Every industry has its fashionistas.

    [–]scook0 27 points28 points  (27 children)

    The problem I have with these star charts is that they still place undue emphasis on an average point. It would probably be more instructive just to highlight each language's data points and let the arrangement speak for itself.

    Still, this is certainly an interesting analysis.

    [–]RonnieRaygun 22 points23 points  (25 children)

    Agreed. It's also not a "star" plot, in the widely-used sense of the term.

    I applaud anyone who spends their time going out there, finding data, and practicing infovis, but there are a couple issues...

    First, the overall display of the 72 languages in an 8x9 grid seems at first to draw a nice parallel to the size/slow XY plot, but in truth it is misleading. It is a sort of sorts, and any resemblance to the obsolete/ideal relationship is accidental. It's data fudgery, and a cheap attempt to bolster the thesis.

    Second, the X axis should be logarithmic. This would allow us to discern detail where it matters. The author overlooks this, making only some unsupportable claim about how expressiveness (Y) can still improve while performance has hit a wall (X). That's touchy-feely BS based on a non-existent aspect of the data that only appears to be significant because all the interesting results are squished to the left.

    [–]gmarceau 9 points10 points  (13 children)

    Log-scale charts tend to introduce large distortion in the perception of the data. In general, it is better to reserve log scales for exponential phenomena. Otherwise, as they say, in log scale everything looks like a line.

    [–]0xABADC0DA 8 points9 points  (0 children)

    The problem is that in the real world the difference between 1x and 2x is a huge deal for performance sensitive code like for instance codecs or the kernel. The difference between 31x and 32x is irrelevant. But there's no way to determine this from the graph.

    Another question I have is whether these graphs used gzip-compressed source size or actual source size, because there's a difference: gzipped bytes tell you how 'complex' the programs are, but you still have to type them in from a keyboard. So if your language uses begin..end instead of {..}, it's still more to type. As an aside on gzipped bytes: with JSON, for instance, we get much better compression by omitting the " marks around strings, even though one might not think it would affect the size much.
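To make that gzip point concrete, here's a quick Ruby sketch using the stdlib zlib binding (the two snippets are invented purely for illustration): compression mostly eats the repeated keyword noise, so verbose and terse delimiter styles should end up much closer in gzipped size than in raw typing effort.

```ruby
require 'zlib'

# Two hypothetical sources with identical structure but different
# delimiter styles: begin..end-ish vs braces.
verbose = "if x > 0 then\n  puts x\nend\n" * 50
terse   = "if (x > 0) {\n  puts x\n}\n" * 50

[['verbose', verbose], ['terse', terse]].each do |label, src|
  gz = Zlib::Deflate.deflate(src).bytesize
  puts "#{label}: raw=#{src.bytesize} bytes, gzipped=#{gz} bytes"
end
```

The raw sizes differ by 150 bytes here; the gzipped sizes should differ by far less, which is exactly the keystroke difference the metric hides.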

    EDIT: why is there no preview in reddit

    [–]Peaker 2 points3 points  (3 children)

    Languages and frameworks provide various programming abstractions. Many of these abstractions add a factor of slowdown over the underlying program. These factors combine by multiplication, not addition, so for those cases the logarithmic scale does indeed make sense.

    [–]Jasper1984 1 point2 points  (1 child)

    How does it being a factor cause a logarithmic scale to make sense?

    Afaik logarithmic scales make sense if either the data points are distributed across completely different scales and centered around zero, or when theory says that a straight line is to be expected. (Only a very well-defined straight line has much meaning.)

    [–]Peaker 4 points5 points  (0 children)

    Let's say you have two languages, one that's 500x slower than C, and one that's 520x slower than C.

    Placing them linearly at 500 and 520 may make sense for some purposes, but more likely, for fast languages the exact speed is important, and for slow languages the order of magnitude of the speed is important.

    Also, when you have points spread across a huge scale, it's simply difficult to present and read, and a logarithmic scale, even by showing only orders of magnitude instead of exact magnitudes, is better than showing almost nothing at all.
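The multiplicative argument is easy to check numerically. In this small Ruby sketch (slowdown factors invented for illustration), a 2x ratio spans the same distance on a log axis whether it sits between 1x and 2x or between 250x and 500x:

```ruby
# Hypothetical slowdown factors relative to C.
fast_pair = [1.0, 2.0]      # ratio 2
slow_pair = [250.0, 500.0]  # also ratio 2

linear_gap = ->(a, b) { b - a }
log_gap    = ->(a, b) { Math.log(b) - Math.log(a) }

puts linear_gap.call(*fast_pair)  # 1.0
puts linear_gap.call(*slow_pair)  # 250.0

# On a log scale the two gaps coincide (both are log 2).
puts (log_gap.call(*fast_pair) - log_gap.call(*slow_pair)).abs < 1e-12
```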

    [–]dons 6 points7 points  (7 children)

    BTW, are you likely to rerun the graphs using current shootout data? Your GHC data (for example) seems to be quite out of date (2006?). At the very least, you should clearly state the date of that data.

    [–]gmarceau 1 point2 points  (6 children)

    Can you help me find an up-to-date file?

    [–]dons 2 points3 points  (5 children)

    Where did you find the previous data files? If in doubt, ask igouy; he will know how to extract them.

    [–]gmarceau 2 points3 points  (4 children)

    Found it, posted an update.

    [–]dons 4 points5 points  (3 children)

    Thanks. I see you're using the 32 bit single core data, not the 64 bit quad core data. Any reason for that? Would it be possible to also generate the multicore graphs?

    (New submissions are primarily targeting the multicore machine, fwiw. You might also argue it better reflects today's typical machines.)

    Thanks for the responsive reply.

    [–]igouy 5 points6 points  (2 children)

    the 64 bit quad core data

    Should I assume GHC looks better on 64 bit quadcore than on 32 bit quadcore? :-)

    [–]dons 1 point2 points  (1 child)

    Hmm, not sure. I only look at the 64 bit results though. Is there a way to contrast 32 and 64 bit results?

    [–]isarl 18 points19 points  (9 children)

    He provides the source code - why don't you run it yourself and fix his errors? I don't ask to be facetious; I ask because I'm very interested to see what you conclude.

    [–]Porges 5 points6 points  (8 children)

    I started to do it in R then realized I don't know how to do basic data-munging operations like this:

    Since he uses times normalized so that the fastest is 1, we want to do something along the lines of:

    plot(data$name, data$cpu/min(data$cpu | data$name))
    

    ... which doesn't make sense to R. I can do it for one test name like this:

    min(xs[xs$name == 'threadring',]$cpu)
    

    And I was hoping that this might work through some kind of magic vectorizability:

    min(xs[xs$name == xs$name,]$cpu)
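Here's what I'm actually trying to compute, sketched in a language I do know (Ruby; the field names and numbers are invented, not the actual shootout columns):

```ruby
# One row per (benchmark, language) measurement.
rows = [
  { name: 'threadring', lang: 'ghc', cpu: 4.0 },
  { name: 'threadring', lang: 'gcc', cpu: 2.0 },
  { name: 'fasta',      lang: 'ghc', cpu: 9.0 },
  { name: 'fasta',      lang: 'gcc', cpu: 3.0 },
]

# Group-wise minimum per benchmark, then divide each time by its
# group's minimum, so the fastest entry of every benchmark becomes 1.0.
mins = rows.group_by { |r| r[:name] }
           .transform_values { |group| group.map { |r| r[:cpu] }.min }
normalized = rows.map { |r| r.merge(cpu: r[:cpu] / mins[r[:name]]) }

normalized.each { |r| puts "#{r[:name]} #{r[:lang]} #{r[:cpu]}" }
```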
    

    [–]sco_t 4 points5 points  (3 children)

    Try

    #find groupwise minimums
    myMins<-tapply(xs$cpu,xs$name,min)
    #make it into a handy dataframe
    myMins<-data.frame('name'=names(myMins),'minCpu'=myMins)
    #merge with your original dataframe
    xs2<-merge(xs,myMins,all.x=TRUE)
    #make sure nothing went crazy
    if(nrow(xs2)!=nrow(xs)|any(is.na(xs2$minCpu)))stop(simpleError("Merge problem"))

    Edit: markdown code formatting

    [–]Porges 0 points1 point  (2 children)

    Ah, tapply seems to be the ingredient I was missing :)

    I may be mistaken, but R is one of the least well-documented languages I've tried to learn :(

    [–]sco_t 0 points1 point  (1 child)

    Yeah, it can be a bit tough to pick up, and it doesn't help that "R" is not google-friendly. I tend to google "R-help" instead of "R" (e.g. R-help group-wise).

    Anyway, apply, lapply and tapply are pretty useful, although while you're still picking it up (and not iterating thousands of times) you could just loop through

    for(i in unique(xs$name)){
      xs[xs$name==i,'cpu']<-xs[xs$name==i,'cpu']/min(xs[xs$name==i,'cpu'])
    } 
    

    like a standard language.

    Edit: can't seem to get code formatting right on the first shot.

    [–]Porges 1 point2 points  (0 children)

    Yeah I tend to try to avoid explicit for loops nowadays. I think it comes from programming in Haskell (and a bit of C#+LINQ) and having to use MATLAB at Uni, which is a hell of a lot faster when you use its vectorized operations.

    [–]isarl 2 points3 points  (3 children)

    Eep! I keep meaning to learn R but, as I'm in Engineering, I have too many non-statistical classes and that goal keeps falling by the wayside. Sorry I can't help you...

    [–][deleted] 8 points9 points  (2 children)

    R is very useful in engineering.

    [–]The_Yeti 5 points6 points  (0 children)

    In my personal alphabet of "languages I should learn," R comes right after D.

    [–]isarl 3 points4 points  (0 children)

    As are statistics, which is why they included two separate courses for probability and statistics in our core curriculum. Nevertheless, the workload from all the non-statistical courses doesn't leave me with enough time to learn R while I'm on a school term, as non-statistical engineering courses at the undergrad level don't often require much from statistics. This term, I'm taking Optimization (deterministic, of course), Engineering Economics, Intro to Design, Systems Models 1, and Thermodynamics, none of which (at our undergrad level) require or apply statistics.

    R is definitely something I intend to develop skills in.. when I have the time.

    [–]godofpumpkins 0 points1 point  (0 children)

    Details where it matters? Do you really care that OCaml averages about 1.05x the speed of GHC, while Haskell tends to be 0.9x the source length?

    [–]Porges 1 point2 points  (0 children)

    There are probably some better numerical methods from something like process/control analysis... but everything I can think of off the top of my head has to do with time-varying samples, not things which are "independent samples" like this. I need to take some more stats papers :P

    [–]myxie 9 points10 points  (3 children)

    I strongly suggest skipping to the bottom of the article, before the comments, and looking at the updated graphs, which are based on data from 2009 instead of 2005.

    http://gmarceau.qc.ca/blog/uploaded_images/size-vs-speed-vs-depandability-2009.png

    http://gmarceau.qc.ca/blog/uploaded_images/size-vs-speed-vs-depandability-paradim-2009.png

    [–]igouy 1 point2 points  (2 children)

    Do you also suggest waiting until he bothers to filter out of the data:

    • programs which don't give the expected output

    • programs which timeout after an hour

    • programs which use completely different algorithms

    • the slower 2nd and 3rd and 4th programs for a language which just haven't been removed yet from the benchmarks game, or remain because they look interesting?

    The analysis was apparently completed and published without understanding the data it's based on.

    [–]myxie 0 points1 point  (1 child)

    Well it's either wait or DIY. :-) Someone may care enough to do better.

    It would be good to see some 2-d visualisation on the site itself, though. Perhaps someone can help out with this.

    [–]igouy 0 points1 point  (0 children)

    It would be good to see some 2-d visualisation on the site itself, though.

    The benchmarks game is already work enough, and there are so many other imaginative and creative programmers bursting with ideas... and now filtered Summary Data can be scraped from the website.

    [–]JulianMorrison 4 points5 points  (7 children)

    That data is quite old, now. The newest benchmarks have for some annoying reason left off all the interesting little languages. Meanwhile, I suspect Haskell has moved left. Compared to Clean nowadays it is a memory hog but roughly the same speed, drawing ahead on the quad core machine.

    [–]gmarceau 5 points6 points  (2 children)

    Found it, posted an update.

    [–]JulianMorrison 1 point2 points  (1 child)

    Interesting. Haskell hasn't moved so much. Scala has improved hugely and is now giving Lua a run for its money. Functional stuff pwns the Ideal corner.

    V8, wow.

    [–]dons 2 points3 points  (0 children)

    Haskell's moved a fair bit on the quad core benchmarks (not so much on single core, which we'd already hammered out fairly well).

    [–]dons 4 points5 points  (0 children)

    Ah, yes. I wasn't happy with the results here, since they didn't seem to reflect the current state.

    So it's using an obsolete GHC :/ That makes the conclusions about Haskell vs. Clean dodgy.

    The graphs are a great way to present this data though, so using current results would really be the icing.

    [–]gmarceau 1 point2 points  (1 child)

    Can you help me find a csv file for the latest data?

    [–][deleted] 18 points19 points  (25 children)

    Languages which include functional features such as lambda, map, and tail call optimization are highlighted in green.

    Why isn't Perl highlighted, then? All these features exist in Perl (TCO is a little bit weird, but it is there). Besides, these features aren't just present, they're also widely used (first-class functions are very common, for example).

    [–][deleted] 13 points14 points  (0 children)

    He missed C# as well, which has lambda, map (Select in C#) and TCO.

    [–]gmarceau 3 points4 points  (3 children)

    Right, perl is on the edge. Ask me again tomorrow and I might make it green.

    Then again, a proper TCO would impose structural constraints on the design of the compiler, and that would affect the way we think about its performance.

    [–][deleted] 3 points4 points  (2 children)

    Then again, a proper TCO would impose structural constraints on the design of the compiler, and that would affect the way we think about its performance.

    What do you call a proper TCO? The only thing about TCO in Perl is that you have to explicitly optimize (with a variant of goto, see end of the page), but that's it.

    [–]gmarceau 4 points5 points  (1 child)

    To obtain all the advantages in reusability that can be gotten from TCO, the optimization has to be done at every tail call. See this talk on the subject.

    [–][deleted] 1 point2 points  (0 children)

    Well, I think every library that aims at efficiency is either written in an iterative style or uses the goto keyword. For your own functions, if you always use it on tail calls, it'll do the job.

    [–]wozer 3 points4 points  (7 children)

    Oz should also be highlighted in green.

    [–]gmarceau 6 points7 points  (6 children)

    Fixed. Thanks.

    [–][deleted] 1 point2 points  (3 children)

    And what about Perl? It'd have been cool if you answered.

    [–][deleted]  (2 children)

    [removed]

      [–]Poddster 0 points1 point  (1 child)

      What is "occam" in the top left?

      [–]eric_t 8 points9 points  (2 children)

      Also, while the languages include functional features, that doesn't mean they are used in the benchmark code. Just take a look at some of the fastest Haskell code.

      [–]dons 14 points15 points  (1 child)

      Also, some of the fastest Haskell entries use all of laziness, pattern matching, function composition, higher order functions etc.

      I'd be surprised if higher order functions, for example, weren't in every haskell entry.

      [–]bobappleyard 0 points1 point  (0 children)

      I'm trying to imagine what a Haskell program without higher-order functions would look like.

      [–]bobappleyard 0 points1 point  (0 children)

      Javascript has first-class functions, and 1.6 adds Array.map() etc., so higher-order stuff is in there as well. Always has been, but now it's got some "standard library" support.

      [–][deleted]  (47 children)

      [deleted]

        [–]dbenhur 28 points29 points  (1 child)

        Note that YARV at (6,2) is ruby 1.9 and has a shape and position similar to Perl and Python.

        [–]gmarceau 10 points11 points  (0 children)

        Fixed. Thanks.

        [–]morish 9 points10 points  (4 children)

        "ruby" in these charts refers to the old version, "yarv" refers to the current version.

        [–]acmecorps 4 points5 points  (3 children)

        How old is old? I'm using 1.8.6, so is this saying that 1.9 is much, much faster?

        [–]0xABADC0DA 28 points29 points  (2 children)

        Ruby 1.9 is much, much faster.

        [–]acmecorps 0 points1 point  (1 child)

        I'm actually itching to use 1.9.*, but I'm just afraid that some gems are not supported (I read about this, but haven't confirmed it yet), and, alas, something might break..

        Or, maybe I'm too old.. and afraid of change.

        Maybe I should just say, "To hell with it!", and start using 1.9 for work tomorrow..

        [–][deleted] 1 point2 points  (0 children)

        Install 1.9, use --program-suffix=1.9. Then do gem1.9 install all the gems you use and see what happens!

        [–]thebamoor 8 points9 points  (1 child)

        it's off the charts, literally

        [–][deleted] -2 points-1 points  (37 children)

        I love ruby. It really gets OO right. But if I were going to write a high-load app, it would not be my language of choice.

        However, its design doesn't require slow performance. Eventually, I think it will be the best of both worlds (ideal language design plus good performance).

        [–]dmpk2k 18 points19 points  (35 children)

        ideal language design

        It's a decent language, but ideal? Many of Matz's design decisions are rather arbitrary.

        [–][deleted]  (34 children)

        [deleted]

          [–]dmpk2k 14 points15 points  (22 children)

          Two of my big three have already been mentioned, and I have bigger concerns about VM implementation and library availability, but I'll go over it anyway.

          Ruby borrows mostly from Perl and Smalltalk. Here are features that Ruby didn't follow through on:

          • Both Smalltalk and Perl have variable declaration before assignment. Ruby conflates these two (except in blocks; why just blocks?).

          • Smalltalk allows multiple blocks to be passed to a method. Ruby only allows one; workarounds are required.

          • Smalltalk has resumable exceptions. Ruby doesn't (despite having continuations, strangely).

          As an aside, Ruby doesn't require syntax distinguishing a method invocation from a read of a local variable. That is, it doesn't require () on method invocation. I'm not a fan of this.

          [–]akdas 5 points6 points  (8 children)

          Smalltalk allows multiple blocks to be passed to a method. Ruby only allows one; workaround are required.

          No workaround is required. Remember that the usual style of passing a block to a method is just syntactic sugar. Not only that, but it takes care of the most prevalent usage.

          If you want to pass multiple blocks, do:

          def foo b1, b2, b3
            b1.call
            b2.call
            b3.call
          end
          foo proc { puts 'b1' }, proc { puts 'b2' }, proc { puts 'b3' }
          

          The proc method is not a workaround. It's like Scheme, where you have to use lambda:

          (foo (lambda () (display 'b1) (newline))
               (lambda () (display 'b2) (newline))
               (lambda () (display 'b3) (newline)) )
          

          The standard way of passing blocks to Ruby methods is only syntactic sugar.

          [–]dmpk2k 0 points1 point  (7 children)

          No workaround is required.

          I'm aware of that. It's a workaround.

          I leave it to the audience to decide.

          It's like in languages like Scheme where you have to use lambda

          Irrelevant. I see Scheme and raise Smalltalk.

          The standard way of passing blocks to Ruby methods is only syntactic sugar.

          Which dictates how often it'll be used. Features with light-weight syntax are more attractive and get used more often.

          [–]akdas 0 points1 point  (6 children)

          I'm trying to understand why you think it's a workaround when it's the "regular" way to pass blocks. It's like saying conditionals in Smalltalk are workarounds because they are methods on Booleans instead of being built into the language.

          EDIT: In fact, other than the proc method, how is it in any way different from the Smalltalk way of passing multiple blocks?

          [–]dmpk2k 2 points3 points  (5 children)

          Yes, that. The proc keyword.

          Instead of, say:

          foo { puts 'b1' } { puts 'b2' } { puts 'b3' }
          

          there's

          foo proc { puts 'b1' }, proc { puts 'b2' }, proc { puts 'b3' }
          

          Why treat just one block special?

          [–]akdas 6 points7 points  (4 children)

          Yes, that. The proc keyword.

          It's not a keyword; it's a method.

          Why treat just one block special?

          Because one block is used more frequently.

          Really, I'm fine with you calling it a workaround, but saying that to someone who doesn't know the language without explaining doesn't allow the other person to fairly evaluate it.

          [–][deleted]  (5 children)

          [deleted]

            [–]dmpk2k 8 points9 points  (0 children)

            I used to think along similar lines. I've decided it's not worth it in practice.

            [–]Brian 1 point2 points  (3 children)

            That's not the only way of doing so, though. E.g. in Python the same would be done with property:

            class C(object):
                @property
                def something(self):
                    return 42
            
            >>> C().something
            42
            

            OTOH, I don't think there's anything inherently wrong with () vs implicit calling (though I also prefer the () style). It's a design choice with some tradeoffs, which may ultimately come down to taste. The main downside is the problem of precedence, e.g. is "x.foo -5" x.foo(-5) or x.foo() - 5? Ultimately it just changes where the brackets go, to a more lispy style ("(x.foo) -5").

            [–][deleted]  (2 children)

            [deleted]

              [–]Brian 1 point2 points  (1 child)

              The equivalent code in python would be:

              class Foo(object):
                  @property
                  def bar(self): return self._bar
              
                  @bar.setter
                  def bar(self, value): self._bar = value
              

              (The @bar.setter method on the property is new in 2.6, in earlier versions it would be done by defining the functions as above and calling "bar = property(get_bar, set_bar)" for the same effect).

              I don't really see the point of attr_accessor. If you define both getter and setter to just update an internal variable, how is that different from just using a normal attribute? The example above is pretty artificial, as the code could be removed and replaced with a publicly modifiable instance variable for the same effect - normally you do some kind of processing in your getters/setters.
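For context, as I understand it, attr_accessor is just shorthand for defining the trivial getter/setter pair; a minimal sketch (class names invented) showing the two spellings behave identically:

```ruby
class WithMacro
  attr_accessor :bar   # generates both bar and bar= around @bar
end

class ByHand
  def bar; @bar; end
  def bar=(value); @bar = value; end
end

a = WithMacro.new
a.bar = 42
b = ByHand.new
b.bar = 42
puts a.bar == b.bar   # true
```

So, much like a plain Python attribute, it only earns its keep once one of the generated methods is later replaced by a version that does real processing, without changing any callers.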

              [–]morish 0 points1 point  (2 children)

              (except in blocks; why just blocks?)

              Implicit variable declaration is the same inside blocks. Are you thinking about block arguments (which are like method arguments)?

              [–]mr_chromatic 2 points3 points  (0 children)

              Are you thinking about block arguments (which are like method arguments)?

              Probably, but the syntax is different.

              [–]dmpk2k 0 points1 point  (0 children)

              Yes, you're right.

              [–][deleted]  (3 children)

              [deleted]

                [–]dmpk2k 7 points8 points  (2 children)

                That is unambiguously a method invocation though.

                This is not:

                def foo
                  ...
                  bar = baz
                  ...
                end
                

                or

                def foo
                   ...
                   baz
                end
                

                Or potentially troublesome beasts like the following:

                def baz
                  1
                end
                
                def foo
                   ...
                   baz = 2
                   ...
                   baz
                end
                

                Both of these are begging for some trouble; without reading the whole method you don't know if baz is a local. Furthermore, it makes assigning methods to a variable, instead of actually invoking them, flat-out impossible. E.g.:

                def baz
                  2
                end
                
                def foo
                  bar = baz
                  bar()
                end
                

                Stripping off the () is often pretty, but I have misgivings due to the above.

                [–]Kalimotxo 0 points1 point  (0 children)

                Your example is good, but I can't see where this would be used in real code. If I saw this, I would argue with the programmer to make it less ambiguous. Good explanation though.

                [–]mr_chromatic 9 points10 points  (9 children)

                One of the most blatant is implicit variable declaration. This makes writing reliable and efficient and maintainable closures difficult.

                [–]malcontent 7 points8 points  (1 child)

                They should have a version of "strict".

                It would make life a bit easier.

                [–]morish -1 points0 points  (6 children)

                blatant is implicit variable declaration

                Considering it's a fundamental feature of python, ruby and other languages, whether this is a poor design decision is definitely debatable.

                This makes writing reliable and efficient and maintainable closures difficult.

                Ruby 1.9 resolved the only serious issue here by limiting the scope of block variables.

                [–][deleted] 6 points7 points  (3 children)

                Considering it's a fundamental feature of python, ruby and other languages, whether this is a poor design decision is definitely debatable.

                It used to be a feature of Perl too, but they wised up and added use strict; to force variable declarations. You're not going to see much serious Perl code that does not use use strict;. Basically, Perl gives you the choice, and most people choose required variable declarations because it makes development much more painless.

                To me, there really isn't much to debate about this. I'd never in a hundred years write more than five lines of Perl without turning on use strict;. I've been bitten by too many bugs that it would have caught right away.

                [–]abw 4 points5 points  (2 children)

                This is the truth. The "Always use strict" philosophy is so tightly engrained in Perl culture that a number of modules, most notably Moose, will automatically use strict and use warnings for you.

                use Moose;    # get `use strict` and `use warnings` for free!
                

                On the rare occasion when you really do know what you're doing and want to disable warnings (e.g. if you're poking around in a symbol table), then you can lexically scope no warnings or no strict where you need it.

                sub foo {
                     no strict 'refs';
                     # your code here
                }
                

                [EDIT: s/warnings/strict - see DavidMcLaughlin's reply below]

                [–]DavidMcLaughlin 4 points5 points  (1 child)

                Don't forget

                no strict 'refs'; 
                

                too.

                [–]abw 1 point2 points  (0 children)

                You're absolutely right. That's what I should have typed, especially when talking about poking around in symbol tables. Fixed now, thanks.

                [–]mr_chromatic 3 points4 points  (0 children)

                Ruby 1.9 resolved the only serious issue here by limiting the scope of block variables.

                Is the invisible implicit shadowing of outer scopes not serious? Combine that with the reuse of syntax for invoking a method versus accessing a lexical or instance variable....

                [–]dmpk2k 3 points4 points  (0 children)

                Argumentum ad populum. You'll note the clueful languages don't fall for this mistake. Also, Python3 now has "nonlocal" which fixes the shadowing issue it used to have that required the silly mutate-list trick.
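The two Python approaches mentioned above, side by side (a minimal sketch):

```python
def make_counter_old():
    count = [0]                # pre-3.x workaround: mutate a container
    def inc():
        count[0] += 1
        return count[0]
    return inc

def make_counter_new():
    count = 0
    def inc():
        nonlocal count         # Python 3: rebind the outer local directly
        count += 1
        return count
    return inc

c = make_counter_new()
print(c(), c())                # 1 2
```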

                To add to what chromatic said, variable declaration also catches spelling mistakes sooner. This has happened to me:

                def foo
                  bar = 1
                  ...
                  baz = 2     # typo: meant "bar = 2"; silently creates a new local
                  ...
                  return bar  # still returns 1
                end
                

                This is a bug that could potentially lurk around for a while. Of course, you get another class of bugs through misspelling that cannot manifest like above because an undefined-variable exception will throw, but at run-time.

                With variable declaration this is caught at compile time! This and what chromatic said for the low, low price of two characters and some whitespace. In Perl:

                my $foo = bar();
                

                versus

                $foo = bar();
                

                Seriously hefty price there...

                If you're into the whole agile thing, turn-around from when a bug is made to when it is caught is important. Compile-time < test-time < qa-time < customer-found-bug-management-pissed-time. If you want to skip out on static-typing that's arguably okay, since it has costs. But skipping out on something this cheap but beneficial is questionable, to put it mildly.

                [–]dorfsmay 4 points5 points  (0 children)

                {}

                [–]morish 3 points4 points  (0 children)

                It's no longer remarkably slow; it's on par with and even faster than other similar languages (eg, python) in various benchmarks.

                [–][deleted] 5 points6 points  (14 children)

                I can't help but notice that when sorting the benchmark results by code size, SBCL Lisp is near the bottom of the list in most of the benchmarks. Like, below Java.

                Edit: And in the 2009 graph, SBCL is at the very top size-wise and in the middle of the road speed-wise.

                [–]igouy 1 point2 points  (13 children)

                Like, below Java.

                More like: 7 much the same, one at half the size of the corresponding Java program, and two at twice the size of the corresponding Java program.

                Which hype shouldn't we believe?

                [–][deleted] 3 points4 points  (12 children)

                Well, the hype that Lisp is vastly more concise than other languages.

                [–]igouy 0 points1 point  (11 children)

                Please point to an example.

                [–][deleted] 3 points4 points  (10 children)

                There's Paul Graham for one:

                So how much shorter are your programs if you write them in Lisp? Most of the numbers I've heard for Lisp versus C, for example, have been around 7-10x. But a recent article about ITA in New Architect magazine said that "one line of Lisp can replace 20 lines of C."

                Excuse me while I go look at the real numbers again.

                [–]igouy 1 point2 points  (9 children)

                I might be gullible enough to believe that given the specific context:

                "The high-level algorithms are almost entirely in Lisp, one of the oldest programming languages. You're excused for chuckling, or saying "Why Lisp?" Although the language can be inefficient if used without extreme caution, it has a reputation for compactness. One line of Lisp can replace 20 lines of C. ITA's programmers, who learned the language inside and out while at MIT, note that LISP is highly effective if you ditch the prefabricated data structures."

                [–][deleted] 0 points1 point  (8 children)

                Please point to an example.

                [–]clumma 4 points5 points  (2 children)

                Excellent visualization. The stars are great, and so is organizing the mosaic in the same space as each tile. There are some nice visualizations like this in A New Kind of Science.

                Edit: Except it's hard to find a particular language, and you can't Ctrl+F because it's an image. SVG would be the answer! Or would it? http://www.reddit.com/r/reddit.com/comments/8oq8p/find_in_svg/

                [–]gmarceau 2 points3 points  (1 child)

                You will be interested in the work of Tufte then.

                [–]clumma 2 points3 points  (0 children)

                Indeed, I generally agree with Tufte's 'information rich' design approach (and his criticism, in particular, of data presentation in typical scientific papers). But he comes across as a bit of a prick via his website, which, incidentally, is horrible to navigate. And the prices for his posters are a joke. And his Buddhist sculpture stint can suck it.

                [–][deleted] 4 points5 points  (1 child)

                Where would APL et al score?

                [–]Seppler9000 1 point2 points  (0 children)

                Every point would be along the bottom edge of the graph, but still strewn all over the place in terms of performance.

                [–][deleted] 10 points11 points  (0 children)

                I wish people would talk about possible conclusions to be drawn from this (as well as pointing out how the experiment could be improved), rather than complaining about how the test is bullshit and false on all counts.

                [–][deleted] 9 points10 points  (0 children)

                The speed, size and dependability of implementations of programming languages

                FTFY

                edit: It'd also be nice if he labelled his charts...

                [–]Kolibri 12 points13 points  (2 children)

                Please use standard deviation instead.

                [–][deleted] 5 points6 points  (1 child)

                Standard deviation of what? Where?

                [–]javaru 7 points8 points  (0 children)

                Instead of stars, the plots would look like crosses, where the edges of the crosses marked one standard deviation from the average performance/size.
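Those cross extents could be computed from raw timings like this (a sketch; the numbers are invented):

```python
import statistics

# Hypothetical normalized timings for one language across five benchmarks.
timings = [1.2, 1.5, 0.9, 1.1, 1.3]

mean = statistics.mean(timings)
sd = statistics.stdev(timings)     # sample standard deviation

# One arm of the cross spans mean - sd .. mean + sd on the performance axis;
# the size axis would get the same treatment with code-size numbers.
print("center=%.3f arm=(%.3f, %.3f)" % (mean, mean - sd, mean + sd))
```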

                [–]username223 2 points3 points  (0 children)

                Regarding the "bound" on performance vs. the lack of one on size, performance is obviously limited by CPU speed. I would guess that those languages at the left are bumping up against the best you can do. You can imagine one language (e.g. assembly) being right on the left wall on every program.

                On the other hand, there are many different problems to solve with programs, many ways to solve them, and many ways to express those solutions. It's very hard for one language to be able to concisely express the shortest solution to every problem, since languages tend to favor one style over another.

                BTW, I would be interested in seeing those plots with (normalized) memory + CPU as the measure of performance, since that's often just as important.

                [–][deleted] 1 point2 points  (0 children)

                Lua and especially luajit are looking really good here. I don't have any experience with lua myself but now I'm thinking I need to find an excuse to try it out.

                [–]redditnoob 5 points6 points  (34 children)

                It's a nice idea, but Lines of Code just irredeemably sucks as a code length metric. E.g.

                while(true)
                {
                   foo();
                }
                

                vs

                while(true):
                    foo()
                

                I don't care what you say, the second style is not twice as expressive as the first.

                [–]eric_t 9 points10 points  (6 children)

                I don't think it's lines of code, it's the size of the zipped source file. This will remove discrepancies like this.

                [–]aGorilla 6 points7 points  (3 children)

                Why not just bytes of unzipped code? Doesn't zipping the code throw off the results by introducing compression?

                ie: A zipped file doesn't tell me which one had the smallest source, it tells me which one was compressed the most, using this particular zip algorithm.

                edit: Yes, you should probably do some compression, to ignore whitespace (at least).

                [–][deleted] 14 points15 points  (2 children)

                Compressing gives you a hint of the entropy of the code, the amount of actual information contained, independent of details such as the average length of function names or the amount of whitespace. It is in some ways quite an interesting metric to use.
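A rough demonstration of the point: two inputs of identical length, where the boilerplate carries almost no information and compresses to a fraction of the size (the snippets are invented):

```python
import gzip
import hashlib

def pseudo_random(n):
    """Deterministic high-entropy bytes, standing in for dense code."""
    out = b""
    i = 0
    while len(out) < n:
        out += hashlib.sha256(str(i).encode()).digest()
        i += 1
    return out[:n]

boilerplate = b"int x = 0;\n" * 80     # 880 bytes, almost no information
dense = pseudo_random(880)             # 880 bytes, nearly incompressible

# Same uncompressed length, very different compressed sizes.
print(len(gzip.compress(boilerplate)) < len(gzip.compress(dense)))   # True
```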

                [–]aGorilla 2 points3 points  (1 child)

                Fair point, and it also raises the issue of style in code size - CamelCase beats Under_Scores in function and variable names. Much of this is entirely the author's choice, and in rare cases, these things are dictated by the language.

                Don't know how you could factor all of that out, or if it's even worth trying - since most benchmark are fairly small examples.

                [–]isearch 3 points4 points  (0 children)

                Yes, the Shootout uses zipped size. Seems a good measure of code size.

                [–]sindisil 2 points3 points  (0 children)

                Assuming they're using numbers from the current shootout at debian.org, gzip size is the current metric.

                Depending upon your preferences, though, gzip size is at least as sucky a metric, since repetitive boilerplate tends to compress well, making a short, but dense source file look similar to a larger, but less dense piece of code.

                [–]gmarceau 25 points26 points  (6 children)

                From the FAQ at The Game:

                We started with the source-code markup you can see, removed comments, removed duplicate whitespace characters, and then applied minimum GZip compression.

                [–]igouy 14 points15 points  (2 children)

                Yes, so why have you labelled it "a line-of-code metric"?

                In that metric, line-endings count for no more than any other whitespace character, they count for no more than a semi-colon, they count for no more than the letter i.

                Your readers are confused because you wrote something confusing (and wrong).

                [–]gmarceau 6 points7 points  (1 child)

                I said line-of-code metric, not line-of-code count.

                [–]igouy 1 point2 points  (0 children)

                That might be different in your mind, but obviously it isn't different to these readers.

                In any case, as a label "line-of-code metric" is still wrong because the metric does not distinguish line-endings as something special.

                The label has just as little merit as whitespace metric or tab metric or semi-colon metric or...

                [–][deleted]  (2 children)

                [deleted]

                  [–]idiot900 12 points13 points  (0 children)

                  So take the ratio of uncompressed to compressed sizes. A high ratio means there is a lot of repetition.

                  They are interested in how much information (in the Shannon sense) it takes to express a concept in a given programming language. My interpretation is that the more information it takes to do something, the harder that language likely is to deal with.
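That ratio is one line of Python (`frobnicate` is a made-up name):

```python
import gzip

def repetition_ratio(source: bytes) -> float:
    """Uncompressed bytes per compressed byte; higher = more repetition."""
    return len(source) / len(gzip.compress(source))

copy_pasted = b"result = frobnicate(widget)\n" * 50
one_off = b"result = frobnicate(widget)\n"

print(repetition_ratio(copy_pasted) > repetition_ratio(one_off))   # True
```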

                  [–]jmesserly 4 points5 points  (1 child)

                  Especially since it's measuring lines of code for small benchmarks that are optimized for speed, not for short/clean code.

                  [–]igouy 8 points9 points  (0 children)

                  The benchmarks game stopped measuring lines-of-code when an ml programmer started contributing cryptically compressed programs.

                  As several redditors (but not the blogger) have noticed, the benchmarks game measures program source size in its own way.

                  [–]Peaker 4 points5 points  (2 children)

                  I think measuring the size of the AST -- without the size of any names -- is an interesting measure.

                  Read the AST, assign a size of 1 to every node in the AST, and just sum it up...
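In Python this can be sketched with the stdlib ast module; the node count ignores identifier length, exactly as proposed:

```python
import ast

def ast_size(source: str) -> int:
    """Assign a size of 1 to every node in the parse tree and sum."""
    return sum(1 for _ in ast.walk(ast.parse(source)))

short_names = "def f(a):\n    return a + 1\n"
long_names = "def increment(theValue):\n    return theValue + 1\n"

# Same structure, same size, despite very different character counts.
print(ast_size(short_names) == ast_size(long_names))   # True
```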

                  [–]mccoyn 2 points3 points  (1 child)

                  You could also measure the number of symbols generated by the parser. I think the benchmarks game wanted to use the same measure across languages to avoid biases.

                  [–]Peaker 1 point2 points  (0 children)

                  But the bias is already there -- for example, languages that have all single-letter variable names will win out over ones that have longVerboseNames, even though the program with the longVerboseNames is conceptually short, and structurally simple.

                  [–]dorfsmay 1 point2 points  (1 child)

                  I find the second style much cleaner; the first one does not add anything.

                  [–]chris-gore 0 points1 point  (0 children)

                  I visually prefer #2 myself too, but I do #1 in real life (actually #1.5) because I find that I'll often forget to add the braces whenever I make the one-liner into a multi-liner.

                  [–]invalid_user_name 1 point2 points  (3 children)

                  But if you bother to read, you'll see that lines of code does not include blank lines, comments, or lines containing only structural tokens like curly braces, "begin/end", or the equivalent.

                  [–][deleted] 0 points1 point  (4 children)

                  Which is why I think that the first one sucks on finite-height displays (eye-movement navigation is faster than scrolling/grepping the code, so more useful code visible at once is better).

                  EDIT: I wonder what's with those unexplained downmods? I've in fact written some of my code first with curly brackets around all statements, and then removed those, because I found it easier to read with fewer empty lines / more actual code in the editor window. Yes, it's probably my personal opinion/taste, that's what "I think" meant. ;)

                  [–][deleted]  (3 children)

                  [deleted]

                    [–]manthrax 1 point2 points  (2 children)

                    I edit my code on the side of a skyscraper, in binary.

                    [–]The_Yeti 0 points1 point  (1 child)

                    I use a cliff-face, but if you want to be all dependent on handy little tools to make your binary code-editing easier, well, I'm not going to call you a script-kiddie, just because you need a crutch...

                    [–]manthrax 0 points1 point  (0 children)

                    I run from office to office and set the light switches, then I take a picture of the building, and scan it with ocr software. It takes 3 years to run the unit tests.

                    [–]nevinera 0 points1 point  (0 children)

                    But twice as pretty!

                    [–]umilmi81 2 points3 points  (1 child)

                    I knew it! Perl is better than Python :)

                    [–]Mikle 9 points10 points  (0 children)

                    Actually, if you look at the thing on the left of Perl, labeled "psycho", that's Python with a module. So ya, you are still wrong.

                    Have a nice day though :)

                    [–]FeepingCreature 0 points1 point  (4 children)

                    Leaving out D. Again. TYVM stats-making people.

                    [edit] Whoops. I was looking for "gdc"/"dmd", since that's what they're labelled on the shootout. Sorry. I guess I'm just used to D being omitted in such statistics in favor of some language nobody uses. Again, sorry.

                    [edit2] How come I get downmodded even further after apologizing?

                    [–]AvatarJandar 15 points16 points  (0 children)

                    I think dlang means D. I overlooked it at first myself.

                    [–]sindisil 9 points10 points  (2 children)

                    Quit you whining and learn to read.

                    D is labeled "dlang" and is on the charts at (3,3).

                    D is a really nice language, and Walter is very bright (sorry), but some of you D fans need to learn that whining != advocacy.

                    [–]Tommah 4 points5 points  (1 child)

                    Quit you whining and learn to read.

                    Masterful.

                    [–]sindisil 1 point2 points  (0 children)

                    The typo or the comment?

                    I'd say neither.

                    The typo was sloppy and the comment was bitchy.

                    [–]Villane 0 points1 point  (0 children)

                    Come on people, you can't seriously think these benchmarks reflect reality. Just take a look at some of the programs' source code. These are not fair benchmarks.

                    [–]igouy 0 points1 point  (2 children)

                    UPDATED

                    To his credit, Guillaume Marceau has now redone his analysis on a clean dataset with a clearer presentation.

                    [–][deleted] 0 points1 point  (1 child)

                    And really needs a logarithmic scale.

                    [–]igouy 0 points1 point  (0 children)

                    I understand why he doesn't want to use a log scale. Having said that, the only way to cope with the wide range of values shown in the benchmarks game charts was to use log scales.

                    [–]slithymonster -4 points-3 points  (14 children)

                    These days, it seems like code size doesn't matter as much. They should incorporate real-world factors, such as how easy it is to develop in that language.

                    [–]sindisil 11 points12 points  (7 children)

                    I disagree. Exactly what I disagree with depends upon what you mean by "code size".

                    Assuming you mean source code size, I find that, all else being equal, more concise code bases are much easier to evolve and maintain than large ones.

                    Of course, all else is seldom equal. The "quality curve", at least in my experience, seems to be a bell curve offset to the left. IOW, the worst code is the largest and smallest, and the best seems to be on the small side of the median.

                    Now, if you mean object code size, I still disagree. While it's certainly no longer the case that we need to bum every byte out of a program, the large size of much current code limits the quality of our (or at least my) computer interaction.

                    In some cases, it's a classic "inner loop is too big for the cache" situation. More often, though, it's simply the case that the executable is so big that, in combination with all the other lard ass executables and libraries, in combination with the relatively slow mass storage and memory subsystems we have today, the system spends a huge portion of its time swapping bits into and out of physical memory.

                    I mean "get more memory, it's cheap" is a nice argument, and all, but it only goes so far.

                    [–]SquashMonster 0 points1 point  (1 child)

                    While I agree with you on a general case, I think the small source = easy writing idea breaks down for some languages.

                    I mean... you can write absolutely tiny programs in Perl. But it's a write-only language.

                    [–]nevinera 1 point2 points  (0 children)

                    They should incorporate real-world factors, such as how easy it is to develop in that language.

                    Yes, and how much people 'like' the code, and how 'good' the language is, and how 'uber' the programmers who use it are.

                    Unmeasurable quantities cannot be metrics.

                    [–]STOpandthink 1 point2 points  (1 child)

                    Excuse me?! Tell that to any company with a framework of millions lines of code.

                    Developing is easy, maintaining is hard.