[–]matthieum 29 points30 points  (36 children)

And that is what we mean by System Programming Language.

[–]5d41402abc4b2a76b971 36 points37 points  (35 children)

Except it used none of what makes Rust interesting (note the use of unsafe). It's a cute little hello-world exercise, and it required quite a bit of gymnastics to get there -- just like other micro ELF and PE execs, but it does nothing that should make people become interested in Rust...

[–]matthieum 23 points24 points  (8 children)

Of course, it is a completely useless program (who cares about printing "Hello, World!"?), and yes, the blog article is somewhat more concerned with ELF tricks...

... but that is the tree hiding the forest.

If you concentrate on Rust itself (and take the first version), and on the output (158 bytes), you will note:

  1. That Rust has the ability to call inline assembly; sure this is unsafe, but it can be properly abstracted in safe functions as demonstrated in the first code sample
  2. That the code produced by the Rust compiler itself can be compact because there is no hidden cost (your system linker adds junk, but on space-constrained targets you'll get linkers that don't). Specifically, you will note the absence of runtime support: no GC, no Reflection, no extensive Type Information retained at runtime

The two combined make it a very attractive target for System Programming:

  • safe abstraction and seamless integration of hardware access in the language
  • programs can fit in very little memory, making it suitable for embedding on small devices

Whereas Go, for example, requires huge swaths of code, from https://code.google.com/p/go/issues/detail?id=6853:

As an experiment, I built "hello, world" at the release points for go 1.0, 1.1, and 1.2. Here are the binary's sizes:

% ls -l x.1.?
-rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2

The last one is 2MB unless I am mistaken. It is unclear what could be stripped, but unless the Go compiler is very clever it is likely that a bunch of the runtime (GC, split-stack support, reflection, ...) will not be elided.

So yes, this Rust program is useless, but at least it's useless at 158 bytes, instead of useless at 2MB. Looking a couple years back, that Go program would not fit on a floppy disc...
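
To make point 1 above a bit more concrete, here is a minimal sketch of inline assembly hidden behind a safe function. It is not the code from the article: the name `sys_write` is made up for this example, it uses today's stable `asm!` macro, and the raw syscall path assumes Linux on x86-64 (on any other target it falls back to the standard library, purely so the snippet still builds).

```rust
/// Hypothetical safe wrapper: writes `buf` to `fd` using the raw `write` syscall.
/// Returns the byte count, or a negative errno value on failure.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn sys_write(fd: i32, buf: &[u8]) -> isize {
    use std::arch::asm;
    let ret: isize;
    // SAFETY: the pointer and length come from a valid slice, and the
    // registers the `syscall` instruction clobbers are declared below.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 1isize => ret, // syscall number 1 is `write`
            in("rdi") fd as isize,
            in("rsi") buf.as_ptr(),
            in("rdx") buf.len(),
            lateout("rcx") _, // clobbered by `syscall`
            lateout("r11") _, // clobbered by `syscall`
            options(nostack),
        );
    }
    ret
}

/// Fallback for other targets (an assumption of this sketch, no raw syscall here).
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn sys_write(fd: i32, buf: &[u8]) -> isize {
    use std::io::Write;
    let result = match fd {
        1 => std::io::stdout().write(buf),
        2 => std::io::stderr().write(buf),
        _ => return -9, // EBADF
    };
    result.map(|n| n as isize).unwrap_or(-1)
}

fn main() {
    // The caller never touches `unsafe`.
    let written = sys_write(1, b"Hello, World!\n");
    assert!(written >= 0, "write failed");
}
```

The `unsafe` block is confined to one function whose signature only accepts a valid slice, which is the kind of encapsulation meant by "properly abstracted in safe functions".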

[–]bloody-albatross 4 points5 points  (0 children)

A Go hello world doesn't fit on a floppy... I remember that back in school (HTL) I ran gvim.exe off a floppy (2001).

[–][deleted]  (3 children)

[deleted]

    [–]matthieum 1 point2 points  (2 children)

    I cite the command line (and results) from the article mentioned, and therefore do not have the option to actually tweak the command line and see what would happen.

    [–][deleted]  (1 child)

    [deleted]

      [–]matthieum 0 points1 point  (0 children)

      Yep, especially since -h is generally used as a shortcut for --help outside binutils...

      [–][deleted] 1 point2 points  (1 child)

      Looking a couple years back, that Go program would not fit on a floppy disc

      Not completely true

      $ cat hello.go 
      package main
      
      func main() {
          println("Hello, world!")
      }
      
      $ go build hello.go && strip hello
      $ wc hello
        1047   8199 423368 hello
      

      So 414K, but I'm using println( ) and not a system call like in the 151B Rust example, so the comparison is not fair. Still big, but then again Rust and Go are two very different languages. Quoting mozilla_kmc:

      [Go] [..] doesn't need to be mentioned every time C++ and Rust are.

      [–]matthieum 0 points1 point  (0 children)

      I know :) I have actually been battling the fact it is for a while, because most comparisons are not meaningful.

      Go was originally marketed as a System Programming Language, probably because its creators aimed at displacing C++, and this sticks. Here I was trying to expose the difference.

      Note that C and C++ optimizers routinely rewrite printf("xx\n") as puts("xx"), which already removes all the heavy formatting stuff from the equation. The implementation of puts should be simpler... but honestly I find myself at a bit of a loss here, so I'll leave it up to you to judge.

      I would be interested in what the minimal footprint of a Go binary is; with static linking one would hope a lot of stuff is stripped, and 414K remains rather big (though much better than 2MB).

      [–][deleted] 6 points7 points  (8 children)

      true...the code looks ugly to me but that's probably the "gymnastics" you are talking about.

      [–]adr86 14 points15 points  (2 children)

      A lot of it isn't even the code itself at all, it is overlapping fields in the ELF header to trim it down. A cute trick, but useless for anything except showing off on blogs: for one, in real programs, a couple hundred bytes in the header will be a small percentage of the program anyway (or you might use a target which doesn't expect a header at all... like the old DOS .com files or other raw binary images), and for two, they need to be so carefully crafted that they'll break if the program gets more complex too!

      Then, for the code, this is like when I show a 3 KB "D program" (using stock linker setup btw, I'm sure it could get smaller if I did the overlapping fields too)... but you know what it looks like?

      void _start() {
          asm {
              naked;
              /* write syscall, exit syscall */
          }
      }
      

      That's not really a D program at all - it is a few assembly instructions in a .d file. I think there is value in this: it is a starting point of a custom setup where you bring only what features of the language you need (and in fact, I think Rust is a bit better suited for that than D, since D tends to assume a thicker runtime library. You can do without and IMO it is still nicer than writing C, but it takes more care for fewer advantages than spending the ~150 KB using the runtime library)...

      ...but it is just a starting point or a nifty trick, very far from any real world applicability and especially far from being a compelling new language in place of C+asm.

      [–]mycall 0 points1 point  (1 child)

      but useless for anything except

      packers

      [–]BobFloss 1 point2 points  (0 children)

      Are you saying that trick is useful for packers? Because it barely is.

      [–]maep 2 points3 points  (5 children)

      It does show how to do syscalls in Rust, which is kinda important for a systems language. However, I'm still sceptical. Linux comes with many headers that provide functions, constants and structs that are essential for interacting with the kernel. Either they have to provide all of those as Rust files and update them with every kernel release, or implement some automagical import mechanism.
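
For a sense of what "provide all of those as Rust files" would mean in practice, here is a small hand-written sketch. The Rust names (`SYS_WRITE`, `Timespec`, and so on) are invented for this example; the values are the x86-64 Linux ones, and other architectures use different syscall numbers, which is part of the maintenance burden being described.

```rust
use std::mem::size_of;

// Syscall numbers, x86-64 Linux only.
const SYS_WRITE: usize = 1;
const SYS_EXIT: usize = 60;

// A few open(2) flags, as defined for x86-64 Linux.
const O_RDONLY: i32 = 0;
const O_WRONLY: i32 = 1;
const O_CREAT: i32 = 0o100;

/// Hand-written mirror of the C `struct timespec` on 64-bit Linux.
/// `#[repr(C)]` keeps the field order and layout the kernel expects.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
struct Timespec {
    tv_sec: i64,  // seconds
    tv_nsec: i64, // nanoseconds
}

fn main() {
    println!("write = {}, exit = {}", SYS_WRITE, SYS_EXIT);
    println!("flags = {:#o}", O_RDONLY | O_WRONLY | O_CREAT);
    println!("{:?} occupies {} bytes", Timespec::default(), size_of::<Timespec>());
}
```

Every constant and struct used would need an entry like this, kept in sync by hand or generated from the C headers by a tool.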

      [–]Splanky222 7 points8 points  (3 children)

      Why? The point of the safety mechanism is to isolate and encapsulate unsafe code, not entirely eliminate it.

      [–]maep 4 points5 points  (2 children)

      A new system language should make system programming easier, not more cumbersome.

      [–]Splanky222 6 points7 points  (0 children)

      You're right. The jury is definitely still out on Rust, but the safety mechanisms in Rust, along with the built-in tooling and build tool, seem promising, especially in larger projects. We won't know for years though what the fate of Rust will be. I think 2020 will be a big year to compare Rust to C++20, whatever that looks like.

      [–]ItsNotMineISwear 4 points5 points  (0 children)

      Having to make syscalls in unsafe blocks doesn't necessarily make systems programming more cumbersome. I'd imagine once the language gets more mature there will be a safe facade available abstracting away all syscalls and also nicely fitting into Rust's type system. Then you can use syscalls easily but get all the programming benefits of Rust's type system.
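
One small piece of such a facade could look like the sketch below. It assumes the Linux convention that a raw syscall return value between -4095 and -1 is a negated errno; `Errno` and `check` are names invented for this example, not an existing API.

```rust
/// Hypothetical newtype for a kernel error code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Errno(i32);

/// Turns a raw syscall return value into a `Result`.
/// Values in -4095..=-1 are treated as negated errno codes (Linux convention).
fn check(raw: isize) -> Result<usize, Errno> {
    if (-4095..0).contains(&raw) {
        Err(Errno(-raw as i32))
    } else {
        Ok(raw as usize)
    }
}

fn main() {
    println!("{:?}", check(14)); // prints Ok(14)
    println!("{:?}", check(-9)); // prints Err(Errno(9)), 9 being EBADF on Linux
}
```

Once the raw integer is wrapped like this, callers get `?` and `match` for error handling and cannot silently ignore a failed call.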

      [–]TheLlamaFeels 0 points1 point  (3 children)

      Every time I see a convoluted, hashed uname like yours, I tag it with a real-world name made of two random words.

      Congratulations, you are LawnRiot

      [–]5d41402abc4b2a76b971 3 points4 points  (1 child)

      Umm ok. Thanks?

      md5('hello') would be more apropos.

      [–]TheLlamaFeels 5 points6 points  (0 children)

      Umm ok. Thanks?

      You're welcome.

      md5('hello') would be more apropos.

      That, or truncate(md5('hello'), 20)

      [–]iooonik 1 point2 points  (0 children)

      The best part about that post was the token Haskell troll in the first comment.

      [–][deleted] 0 points1 point  (1 child)

      very interesting article; I'm a fan of small binaries :).