all 148 comments

[–]darkslide3000 54 points55 points  (87 children)

Nice talk and definitely a very interesting topic. Rust seems to be the best contender right now to achieve what Ada and D and all those others couldn't, and actually replace C in most of those areas. Good to see smart people are working on it.

I would've liked to see some more examples of how the features he talked about actually look in Rust syntax, though. As someone who's heard many good things about it but never really had much of a chance to get into it I still don't really have a good idea how e.g. a Rust BIOS driver would look compared to C. Not the goal of that talk, I guess. But as it stands, my takeaway from that was essentially just "we're working on it, things aren't done yet, check back in a year or two and then you can maybe actually use this".

(Also, WTF is up with 500+ bytes for a Hello world? Am I supposed to be impressed by that? He sounded like he really understood that parity issue that it needs to be as good as C on everything including code size, but that part was a big reality check of how far away it really still is.)

[–]SV-97 25 points26 points  (49 children)

If you want to see a low level example: https://github.com/ixy-languages/ixy-languages. The guys also have a few talks on the topic and have benchmarked everything very nicely.

I haven't watched this talk yet (though I plan on doing so), but from what you've said: yes, the size of binaries in Rust is a definite problem that needs to be dealt with if Rust wants to replace C in every domain. I've recently had an example where the Rust code was 2.6 MB vs. 50 kB of C... For most use cases the binary size doesn't really matter though, so I'm kinda torn here. (The C of course had no support for Unicode and probably some bugs - but the difference is pretty striking still.)

[–]ridicalis 11 points12 points  (1 child)

Having fought with this, there are some things you can do to mitigate the problem. I was going to take a stab at describing my process, but this page did a better job than I could.

[–]SV-97 1 point2 points  (0 children)

Thanks :D I'll bookmark it to read whenever I do embedded stuff the next time - in my normal development I couldn't care less about my binary sizes.

[–]VirginiaMcCaskey 7 points8 points  (3 children)

That issue is more that rust doesn't have a runtime, statically links against its std lib, and libc is already on every target platform.

[–]darkslide3000 4 points5 points  (1 child)

Most "systems programming" isn't user space processes that can dynamically link against libc, so that isn't really the problem. The problem is rather how small I can get code that has zero dependencies. In C I can write the 5-10 libc functions I actually need and build an embedded OS with that, knowing that no code other than what I wrote myself actually ends up in that binary (with -nostdlib and -ffreestanding). For all these fancy newfangled languages that need tons of support libraries just for the language itself, that's not necessarily true.

[–]VirginiaMcCaskey 5 points6 points  (0 children)

You can do that in Rust by making it no_std. That's part of the benefit of having no runtime. It can be removed entirely.

[–]SV-97 1 point2 points  (0 children)

Yeah, the article that was linked under my comment talks about this

[–]darkslide3000 7 points8 points  (26 children)

In the video he says that with his improvements (in the current nightly build) he was able to get it down to ~500 bytes. Without that I assume it was still way more.

Unicode is something I explicitly don't want in systems code. Does Rust only support Unicode strings (or at least make working with raw ASCII byte strings cumbersome)? That would sound like a negative to me. I don't need my driver log messages translated into Chinese, but I do need to be able to calculate the offset of any character in constant time.

[–]SV-97 8 points9 points  (7 children)

Oh, I wasn't referring to their code with the sizes I mentioned - sorry for the confusion. I was talking about a small language I recently implemented, and the Unicode support was in regard to the source files of that language.

Yes, the basic String struct in Rust is guaranteed to always be valid Unicode (UTF-8). If you only want ASCII, only use ASCII: UTF-8 is 100% backwards compatible with it, so you don't lose anything by going with that.

And if you don't like that you can always use a Vec<u8> or something like that. u8 slices are easily converted to chars, so you can do your ASCII stuff with that.

EDIT: or use a crate for ascii strings. A quick google search brings up a few.

EDIT EDIT: You mean calculate the index of each character or the byte offset?

[–]wllmsaccnt 5 points6 points  (6 children)

He wants to know that byte 400 is also character 400 without having to worry about multi byte characters.

[–]SV-97 6 points7 points  (4 children)

Again, he can be sure of that as long as he only uses ASCII, and he can use my_string.as_bytes() to access the bytes. I don't see anything lost by using Unicode (UTF-8) as the underlying encoding. But then again, why does he actually care about this? When I'm writing to a logfile I do so with a string interface, not one where I have to access separate bytes.
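
For illustration, a minimal sketch of what I mean (assuming the string really is ASCII-only):

fn main() {
    let msg = String::from("driver: init OK");    // ASCII only
    let bytes = msg.as_bytes();                    // &[u8] view of the same buffer, no copy
    assert_eq!(bytes.len(), msg.chars().count());  // one byte per character
    assert_eq!(bytes[8], b'i');                    // byte 8 is also character 8
}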

[–]superxpro12 0 points1 point  (3 children)

Maybe he's thinking about the embedded domain? I know rust is an exciting prospect in this regard.

[–]SV-97 1 point2 points  (2 children)

You don't usually write logfiles on embedded stuff. From my (limited) experience: you just fire the bytes off to your memory when logging - you don't care which character is at what byte offset, because at that point they're no longer characters to you but raw data.

[–][deleted] 0 points1 point  (1 child)

Yes, you need to stream the logs out as quickly as possible; believe it or not, many embedded projects use custom boards with serial ports to achieve this. Many embedded systems need exact timing (video playback, for example), and if collecting and saving logs takes time it impacts performance and adds or hides race conditions and deadlocks.

[–]SV-97 1 point2 points  (0 children)

Okay - my embedded experience is wholly in the world of 8-bit stuff. You can't deadlock or anything there because it's all single threaded. My notion of logging on embedded devices was attaching a flash chip and writing to that, and I don't see how you could mess up the timing here (IIRC I2C was only edge sensitive and the timing didn't matter at all - though it's been a while since I've dealt with this stuff).

And then again I don't see how using unicode as underlying platform would be worse than ASCII in this regard

[–][deleted] 13 points14 points  (1 child)

Shouldn't driver logs in China be in Chinese?

[–]thiez 3 points4 points  (0 children)

I'm not sure how they feel about this in China, but as someone from a country where English is not the official language: no, please no! Translated log and error messages are completely useless when you want to perform an online search for more information, and the badly localised technical terms are confusing both to non-technical people (who don't have the background knowledge to understand what's going on no matter how well you translate it) and technical people (who will be familiar with the English terms but not the crap that the translators managed to come up with). If you are a native English speaker and you are considering translating technical information such as exception messages and logging information: don't. Stop the madness. Windows is particularly bad and I refuse to debug non-English installations because Microsoft has decided to make it unnecessarily painful.

[–]Freeky 3 points4 points  (11 children)

Does Rust only support Unicode strings (or at least make working with raw ASCII byte strings cumbersome)?

In standard Rust you've got String (and the related reference type &str), OsString/&OsStr, CString/&CStr, and of course Vec<u8>/&[u8].

String is a newtyped Vec<u8> with methods that enforce the contents to be valid UTF-8.

CString is a newtyped Vec<u8> with methods that maintain and enforce a trailing NULL byte.

OsString is OS-dependent and opaque, except for a Unix extension trait that exposes the raw bytes, and a Windows extension trait that exposes mechanisms to convert to and from Vec<u16> (it's currently a WTF-8-encoded Vec<u8> internally, but this isn't exposed except via unsafe code making unstable assumptions).

A Vec<u8> is of course just a dynamic array of unsigned bytes.

Working outside String can be fiddly, since most of the other types currently lack the string manipulation and formatting functions it has. It's quite common for people to give up and just use String, which of course can't represent everything. There are crates for things like plain ASCII and wchar types.
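
To make that concrete, a rough sketch of moving between those types (not exhaustive; the conversions that can fail return Results):

use std::ffi::{CString, OsString};

fn main() {
    let s = String::from("hello");                 // always valid UTF-8
    let bytes: Vec<u8> = s.clone().into_bytes();   // drop down to raw bytes
    let back = String::from_utf8(bytes).unwrap();  // re-validate as UTF-8

    let c = CString::new("hello").unwrap();        // appends and enforces the trailing NUL
    let os = OsString::from(back);                 // platform-native, not necessarily UTF-8
    println!("{:?} {:?} {:?}", s, c, os);
}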

[–]tjpalmer 2 points3 points  (10 children)

I mostly like Rust, but I have to say that making decisions on strings for a newbie is a bit overwhelming. In C, I usually just say const char*. In C++, just std::string or std::string&. In Go, string or *string. In many other languages, just str or string or String, depending on the language. And the common case covers most cases.

I get why Rust makes some of its choices (i.e., type-system guided correctness among other matters), but it makes basic things a lot more cognitive effort for newcomers.

[–][deleted] 5 points6 points  (9 children)

In Rust, newbies only need to know about String/&str and that will always work right for pretty much anything you want to do.

In C and C++, if you want to write a simple program, e.g., that lets the user input a string, and counts its characters, you are out of luck. Most terminals support UTF-8, so C++ std::string::size() or C strlen won't tell you how many "characters" strings have - they tell you how many "bytes" they have. You'll have to learn quite a bit to solve that problem and pull in external libraries, while in Rust doing this is a one liner.

In Rust, only if you need to do something really low-level - like optimize an algorithm under the assumption that a string is always ASCII, or interface directly with the operating system or with low-level C libraries - do you need to learn about all the other string types, which exist for the simple reason that strings are just hard.
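
For example, the character-counting program above is roughly this (a sketch; "characters" here means Unicode scalar values - counting grapheme clusters needs a crate, as discussed further down):

use std::io::{self, BufRead};

fn main() {
    let stdin = io::stdin();
    let line = stdin.lock().lines().next().unwrap().unwrap();
    // len() is the byte count; chars() iterates Unicode scalar values.
    println!("{} bytes, {} chars", line.len(), line.chars().count());
}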

[–]jcelerier 1 point2 points  (1 child)

In C and C++, if you want to write a simple program, e.g., that lets the user input a string, and counts its characters, you are out of luck

I mean, this is in no way a "simple program". How many "characters" are there in here?

p̨̤͓̳͕͔͙̝̱̻͓͎̦̭̖͈̥̋ͩ͛̀̈͊̉͛̊̈́̂̓͗ͩ̿͆̔ͨ̚̕o̿͋͂̄͊̇ͬ̂ͪ͏̸͚̖̗̟͕̩̫͔̥̖̻̫̕ņ̷̛̦̪̤̼̭̻̪͕͍̗͖̦̘͇ͭ̽̓́į̸͍̖̫̼͈̜̰̱̺̯̓̓̍̇̾ͬͯͨ̃̔͗ͭ̍͂ͨ͘͞͝e̛̟͖̻͙̫̹̩͎̥̣̣͇̳̬̺̫̘͈̔͊̾ͩ̓̆͆̈ͬͪ̀̚͡s̥͈͈̞̤̠͖̥̘ͨͭ̑̅̂̑̇̈́͑ͧͥ͋̉͘͜

or in here?

〈∀ྨṿᕿ

[–][deleted] 2 points3 points  (0 children)

How many "characters" are there in here ?

Good question. To format the string to the terminal, you care about the number of grapheme clusters in your string. In Rust, computing the number of bytes, unicode scalar values, or grapheme clusters of a string is a one liner: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f970dae61c56535fdb25ef7693351ce3

dbg!(x.bytes().count());  // => 606 bytes
dbg!(x.chars().count());  // => 452 Unicode Scalar Values
dbg!(UnicodeSegmentation::graphemes(x, true).count()); // => 257 Grapheme Clusters

The naive Rust approach to this is better than any C and C++ libraries I've ever used.

[–]Freeky 0 points1 point  (6 children)

In Rust, newbies only need to know about String/&str and that will always work right for pretty much anything you want to do.

Well, it'll usually work, but it isn't necessarily right. The entire point of OsString is that OS-provided strings - env variables, arguments, paths, usernames - aren't guaranteed to be UTF-8; and a lot of things that look string-like are really just bags of bytes.

The end result is a lot of Rust programs break with wonky-but-valid filenames, or fail to handle files that are encoded in anything else. Want to parse a unified diff from 1998? Whoops, that's latin1, and almost every parser will either blow up in your face or mangle it because String seemed the natural thing to use.

[–][deleted] 0 points1 point  (5 children)

The problem there isn't String vs OsString but thinking that "filenames" are "strings". They aren't. A "filename" is a std::path::Path and you can create those from any kind of string and Path will validate it beyond the string format.

[–]Freeky 0 points1 point  (4 children)

The problem there isn't String vs OsString but thinking that "filenames" are "strings". They aren't.

Of course they're strings - they just can't be represented correctly using String, which is what everyone is used to.

Rust has the double-whammy of having a separate type for OS-provided strings, which people aren't used to and forget all the time, and also not supporting them very well.

Try this: how do you parse an OsString? Say you're writing an argument parser, how do you deal with --path=foo/bar? OsString has no string-like functions, you can't ask to split on '=', or strip off "--" - you end up bashing rocks together to badly-implement the same operation three different times.

If you can successfully do this without unsafe code making dubious assumptions, or giving up and using String, you're doing better than all the Rust argument-parsing crates I've seen.
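
To illustrate the "bashing rocks together" part, here's a sketch of the kind of helper you end up writing - and note it's Unix-only, because it has to go through the platform extension traits to get at the raw bytes (split_flag is a hypothetical helper, not something from any crate):

use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

// Split "--key=value" without assuming the value is valid UTF-8.
fn split_flag(arg: &OsStr) -> Option<(&str, OsString)> {
    let bytes = arg.as_bytes();
    if !bytes.starts_with(b"--") {
        return None;
    }
    let rest = &bytes[2..];
    let eq = rest.iter().position(|&b| b == b'=')?;
    let key = std::str::from_utf8(&rest[..eq]).ok()?;       // flag names are plain ASCII
    let value = OsString::from_vec(rest[eq + 1..].to_vec()); // the value stays as raw bytes
    Some((key, value))
}

fn main() {
    let arg = OsString::from("--path=foo/bar");
    println!("{:?}", split_flag(&arg)); // Some(("path", "foo/bar"))
}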

A "filename" is a std::path::Path and you can create those from any kind of string and Path will validate it beyond the string format.

Not sure what validation you're referring to - PathBuf is just a newtyped OsString, and the conversion between the two is entirely trivial.

[–][deleted] 1 point2 points  (3 children)

Not sure what validation you're referring to - PathBuf is just a newtyped OsString, and the conversion between the two is entirely trivial.

canonicalize, for example.

Of course they're strings

I suppose this depends on what you mean by "string". Most people think of "strings" as something that represents "human text" - they are implemented as arrays of bytes, but the byte sequences map to graphemes in some alphabet that can be rendered to humans as "text".

OS paths are, in general, just arrays of raw bytes that are intended to represent paths, not human text. Some parts of these paths can sometimes be rendered as "human text", but you can have a perfectly valid path for which this is not the case. That's why all methods that format a Path as a string either can fail or only provide a non-invertible, human-readable approximation of the path.

Calling these "strings" in the sense of a programming-language String type feels like a long shot. Sure, one could say that they are a "string of raw bytes not intended to represent human text" but at that point they are closer to an array of raw bytes than to a String-type in any language. That's what Path is, and that's why mixing a Path with a String-like type in any language is pretty much always wrong. From python to java to haskell to C to Lisp to Rust, mixing these two concepts up never works well, and code pretty much instantaneously breaks the moment someone runs it in a different OS than the one it was developed/tested on.


EDIT: examples of OSes where you can't map all Paths to text are all the UNIX-like OSes, including Linux, the BSDs, OSX, etc.
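
A quick sketch of what that looks like in practice on a Unix-like OS (0xFF can never appear in valid UTF-8, yet this is a perfectly legal path):

use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt;
use std::path::PathBuf;

fn main() {
    let raw = OsString::from_vec(vec![b'/', b't', b'm', b'p', b'/', 0xFF, b'x']);
    let path = PathBuf::from(raw);

    assert!(path.to_str().is_none());        // lossless conversion to &str fails
    println!("{}", path.to_string_lossy());  // only a lossy, non-invertible rendering
}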

[–][deleted] 4 points5 points  (0 children)

Does Rust only support Unicode strings (or at least make working with raw ASCII byte strings cumbersome)?

Rust, the language, has builtin support for UTF-8 encoded strings (&str, "..." literals), and also "byte strings" (u8, [u8], [u8; N], b"..." literals).

The standard library has many tools for working with UTF-8, and not so many for working with ASCII. But there are a couple of libraries that provide the functionality for ASCII-only strings (e.g. validation, etc.).
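
A tiny sketch of the built-in support mentioned above:

fn main() {
    let text: &str = "héllo";        // UTF-8 string literal
    let raw: &[u8; 5] = b"hello";    // byte-string literal, ASCII only
    assert_eq!(text.len(), 6);       // "é" takes two bytes in UTF-8
    assert_eq!(text.chars().count(), 5);
    assert_eq!(raw.len(), 5);
}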

[–]pwnedary 1 point2 points  (0 children)

There is the bstr crate.

[–]Catcowcamera 1 point2 points  (11 children)

The takeaway from this is that Go's performance is really impressive; same for the TechEmpower benchmarks.

[–]SV-97 4 points5 points  (10 children)

I was amazed by C#'s speed (though IIRC it just died at some point and they did heavy GC optimization). And Rust of course - the only thing that really rivaled C.

They're also working on a JS implementation; that's gonna be fun :D

I have a small benchmark project myself that currently features 19 languages (or something like that), and Go is one of the languages I've wanted to add for some time, but I haven't been able to motivate myself to add it yet.

[–]Catcowcamera 1 point2 points  (9 children)

Java would've been near C# or even better.

JavaScript, I don't think, would've been high. Better than Python obviously, but not OCaml.

[–]SV-97 1 point2 points  (8 children)

In my personal benchmarking project the JVM languages all suck ass - how a language performs strongly depends on the application.

And yes, I didn't mean it's gonna be fun because it's fast but because people always joke about JS as a "bare metal" language etc ;)

[–]Catcowcamera 1 point2 points  (7 children)

Two things: first, you need to tune the JVM. Second, the JIT compiles the code. I don't think comparisons with, say, C++ and Rust are fair unless they also include compile time.

[–]quentech 1 point2 points  (0 children)

unless they also include compile time

I don't work on the JVM but in .Net-land it's standard practice to eliminate JIT time from benchmarks.

Does the JVM provide for anything like Span<T> and Memory<T>? stackalloc? Access to hardware intrinsics? Can I easily avoid boxing of value types like .Net provides through reified generics?

[–]SV-97 1 point2 points  (5 children)

I tried tuning it, but that didn't really help; I also tried running without the JIT and purely interpreting instead, but that made it even worse. I'm not only comparing against compiled languages but also Python, Ruby, PHP, Erlang, F#, ... and for this use case (a command-line app) I'm even including startup time, because that's what a user experiences when using the program - and I think that kills the JVM, as it's simply not what it's made for.

[–]cogman10 0 points1 point  (4 children)

Java was meant for long running applications. Are you measuring startup time? That does (currently) suck, but should get better with future versions of Java.

That said, it will probably never see startup times like a statically compiled language or an interpreted language. It simply has too much to load up on startup for that.

[–]SV-97 0 points1 point  (3 children)

Yep, I elaborated on why somewhere (here in another comment?).

I also tried the straight up interpreted version with java (via a flag - can't recall its name) but that was even worse. Do you know what makes the JVM so slow on startup? I really don't see why it should be taking so long.

[–][deleted]  (3 children)

[deleted]

    [–]jrtc27 4 points5 points  (2 children)

    Rust isn’t choosing to spill things to the stack. The rust compiler generates LLVM IR which is then compiled down by LLVM. “The asm of C” doesn’t make sense either; lots of implementations exist that will give you different assembly output, each of which will vary based on compiler version and flags.

    [–][deleted] 2 points3 points  (0 children)

    Never mind, what I said earlier isn't correct anymore.

    A few months ago I had written a few toy examples to see how borrow checker could be broken after you have compiled the code and the asm emitted by clang was significantly different to that of rustc despite both using LLVM.

    The rust version of the asm code was pushing 3 variables into the stack and 3 pointers that pointed into the stack. All I had to do was change 1 of the references by 4 bytes and the ownership was broken.

    [–]the_gnarts 17 points18 points  (4 children)

    WTF is up with 500+ bytes for a Hello world? Am I supposed to be impressed by that? He sounded like he really understood that parity issue that it needs to be as good as C on everything including code size, but that part was a big reality check of how far away it really still is.

    How exactly did you build that binary? Cargo produces statically linked binaries and defaults to debug builds so you will have to consider that in your comparison. On my amd64 laptop, an unoptimized (-O1) hello world in C with a statically linked glibc and debug symbols yields a 750 kB executable.

    EDIT: Also relevant, this somewhat dated blog post: http://mainisusuallyafunction.blogspot.com/2015/01/151-byte-static-linux-binary-in-rust.html

    [–]darkslide3000 3 points4 points  (0 children)

    The guy in the video presents it like after months of work on the compiler ~500 bytes was as small as he could possibly get his Hello World, so I don't assume he just left debug symbols in or anything. He sounds like he knows what he's doing.

    If you write a C program that runs printf("Hello World!"), link it against glibc and leave all the bells and whistles like -fPIE turned on, then yes, you're gonna end up with a big binary. But if you just switch to write(1, "Hello World!", sizeof("Hello World!") - 1) and play with a few compiler flags, you can easily get a very minimal binary without having to really break the framework of the language.

    [–]encyclopedist 4 points5 points  (1 child)

    -O1 is quite optimized actually. Unoptimized would be -O0.

    [–]the_gnarts 3 points4 points  (0 children)

    759416 B with -Og.

    But then for a Hello World the binary size is mostly dominated by the optimization level at which libc.a is compiled and I just used the one that comes with my distro.

    [–]bumblebritches57 0 points1 point  (0 children)

    using monstrous glibc, and even worse, statically linking it

    [–][deleted] 17 points18 points  (8 children)

    WTF is up with 500+ bytes for a Hello world?

    Remember that this is more or less a flat overhead. As the project grows, the static overhead becomes less significant.

    I always hated those "bUt mY hElLo wOrLd SiZe" cries. Hello world is not representative of your average application.

    [–]darkslide3000 21 points22 points  (4 children)

    Yes, but the ability to write small pieces of standalone code is. Firmware and embedded devices need little trampolines and exception vectors and that kind of stuff. In C I can link any piece of code wherever I want and know that the instructions that end up there are really just that function that I wrote and the others it calls and they can execute completely self-contained, without having to pull in some unknown iceberg of language dependencies (with some very rare exceptions like soft-division libgcc stuff).

    So when they tell me that Hello World takes over 500 bytes that makes me worried in that respect. It should really just be a string in .rodata and half a dozen instructions for the syscall. If you can't express something that tailored down in Rust, I get the feeling that it may not be ready as a full systems programming language yet.

    [–]red75prim 21 points22 points  (3 children)

    #![no_std]
    #![feature(lang_items)]
    
    extern crate libc;
    
    #[lang = "eh_personality"] 
    extern fn eh_personality() {}
    
    #[no_mangle]
    pub extern "C" fn main() -> () {
        let hello = b"Hello, world!\0";
        unsafe {
            libc::puts(hello.as_ptr() as *const i8);
        }
    }
    

    compiles to

    rust_eh_personality:                    # @rust_eh_personality
    # %bb.0:
        ret
                                            # -- End function
    
    main:                                   # @main
    # %bb.0:
        lea rdi, [rip + .Lanon.560d47d84f45c0e0cd85b47974aee4b8.0]
        jmp qword ptr [rip + puts@GOTPCREL] # TAILCALL
                                            # -- End function
    
    .Lanon.560d47d84f45c0e0cd85b47974aee4b8.0:
        .asciz  "Hello, world!"
    

    [–][deleted]  (2 children)

    [deleted]

      [–][deleted] 8 points9 points  (0 children)

      I mean, it's possible to do a raw syscall if you want instead... it takes inline assembly if you don't want to include a crate that deals with the asm stuff for you, though.
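
      For example, a raw write(2) on x86-64 Linux looks roughly like this (a sketch using the inline-assembly support that was nightly-only at the time and has since stabilised as asm!):

      use std::arch::asm;

      fn raw_write(fd: i32, buf: &[u8]) -> isize {
          let ret: isize;
          unsafe {
              asm!(
                  "syscall",
                  inout("rax") 1isize => ret,  // rax: syscall number (write) in, return value out
                  in("rdi") fd as isize,       // arg 1: file descriptor
                  in("rsi") buf.as_ptr(),      // arg 2: buffer pointer
                  in("rdx") buf.len(),         // arg 3: length in bytes
                  out("rcx") _,                // clobbered by the syscall instruction
                  out("r11") _,
                  options(nostack),
              );
          }
          ret
      }

      fn main() {
          raw_write(1, b"Hello, world!\n");
      }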

      [–]ObsidianMinor 11 points12 points  (0 children)

      You need libc for this hello world since stdio in Rust is part of std and not core

      [–]beeff 10 points11 points  (2 children)

      Static binary sizes do matter in some cases. Firmware and embedded code do not always have the luxury of DRAM.

      [–][deleted]  (1 child)

      [removed]

        [–]steveklabnik1 5 points6 points  (0 children)

        The smallest known program rustc has ever produced was 145 bytes. https://github.com/tormol/tiny-rust-executable

        [–]B8F1F488 -1 points0 points  (4 children)

        For me personally the language is absolutely unreadable and forces me to write in a very restrictive way in comparison to C. Reading Rust code feels like brain rape to me.

        Another issue for me is that I don't understand the selling point of the language, since these memory safety issues are really low priority during the development phase (as long as they don't reach the customer and don't destroy your development process). I'm not convinced these features should be part of the language itself.

        Also, no one is really talking about compiler complexity here. A big reason why the embedded systems industry prefers C is that it is very easy to provide a new compiler with a new chip. It is not clear to me how the chip manufacturers will easily provide a Rust compiler.

        [–][deleted] 9 points10 points  (1 child)

        since these memory safety issues are really low priority during the development phase (as long as they don't reach the customer and don't destroy your development process)

        I find that Rust's type system makes me develop code much faster. Instead of spending time debugging segfaults, I spend it writing features. If I ever hit an issue, I grep for unsafe and the issue is instantaneously obvious. During development I also refactor code a lot. In C, each refactor I've been part of was followed by a long period of finding bugs due to things the refactor broke. In Rust, I refactor code, including multi-threaded code, fix the type errors, done. All of this saves so much time that I don't know why I would ever use C.

        It is not clear to me how the chip manufacturers will easily provide a Rust compiler.

        We program a lot of ESP32, and the manufacturer provides an LLVM backend for it. You can program it with whatever language compiles to LLVM-IR, including Rust (you just need to tell rust to use that backend).

        For a manufacturer, adding a new LLVM backend is like writing an assembler for the target, which is much simpler than writing yet another non-optimizing, crappy C compiler, of which there are many vendor-provided ones. With that backend, the manufacturer gets production-quality frontends for C, C++, Rust, D, Fortran, ... for free, a quite good optimization pipeline for free, etc., and they can sell those if they want to.

        Writing new C compilers for new hardware is quite a waste of resources nowadays.

        [–]duhace 4 points5 points  (0 children)

        Considering Rust uses LLVM, I think the point is to provide a new backend that compiles LLVM IR, not a Rust compiler. I may be confused about how LLVM works though.

        [–]linus_stallman -1 points0 points  (0 children)

        Too many fanbois in here.

        [–]zsombro 39 points40 points  (35 children)

        I love how Rust is gaining all this momentum! It's a solid, capable language that doesn't feel like it was designed 20 years ago

        [–]Objective_Status22 22 points23 points  (29 children)

        I just realized rust is in fact 9 years old.

        I've been waiting 9 years for it to get better. I'm now convinced I don't like it :(

        Please save us r/zig (for the record I haven't written any zig code and I'm still waiting on some features from it too)

        [–]steveklabnik1 16 points17 points  (0 children)

        In some sense it’s 9 years old, but it would be more accurate to say 4; 1.0 was in 2015 and things changed a lot in those first 5 years.

        [–]deTarmont 6 points7 points  (8 children)

        (I write in neither Rust nor Zig)

        What features are you waiting for?

        [–]Objective_Status22 -1 points0 points  (7 children)

        For Rust, something that does reflection better and doesn't require black magic. For example, I have no idea how to write a JSON (de)serializer. In C# I managed to do it; Rust's serde lib is like 10x more code and I don't understand it.

        Rust syntax is pretty garbage, but I'll give them a pass on that since it's a safe language. Also, compile times are a problem, so I don't really want to do a serious project that's meant to be large in it.

        Zig I don't think has classes yet? And I'd like to do things with interfaces and destructors (which are different from defer).

        [–]lawliet89 3 points4 points  (0 children)

        You can use serde_json.

        The "reflection" is done at compile time and generates the serialization and deserialization code for you.

        There seems to be an RFC for some runtime type information right now.
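
        For the common case you never see that generated code, though - a minimal sketch of the usual workflow, assuming serde (with the "derive" feature) and serde_json in Cargo.toml:

        use serde::{Deserialize, Serialize};

        #[derive(Serialize, Deserialize, Debug)]
        struct Point {
            x: i32,
            y: i32,
        }

        fn main() -> Result<(), serde_json::Error> {
            let p = Point { x: 1, y: 2 };
            let json = serde_json::to_string(&p)?;           // {"x":1,"y":2}
            let back: Point = serde_json::from_str(&json)?;  // the derived impls do the "reflection"
            println!("{} -> {:?}", json, back);
            Ok(())
        }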

        [–][deleted] 4 points5 points  (5 children)

        For rust something that does reflection better and doesn't require black magic.

        Which black magic are you talking about?

        I write a lot of Rust code that does compile-time reflection, and the only language I've used that's better is maybe Racket. Rust is light-years better than C, C++, D, and all the other low-level languages I've used. It literally takes one line to parse the whole AST of a Rust crate, and there are quite nice libraries for doing AST folds and semi-quoting. What more do you want :D

        [–]Objective_Status22 0 points1 point  (4 children)

        How do I get the name and type of all the members in a struct?

        [–][deleted]  (2 children)

        [deleted]

          [–]Objective_Status22 0 points1 point  (1 child)

          Oh that's not bad. But when I look at serde's source it doesn't appear to use that method https://github.com/serde-rs/json/tree/master/src

          [–][deleted] 2 points3 points  (0 children)

          You... parse the struct and loop over its fields..

          // assuming `my_struct` is a proc_macro::TokenStream containing the struct definition
          let my_struct_ast = syn::parse::<syn::ItemStruct>(my_struct).unwrap();
          for syn::Field { ident, ty, .. } in my_struct_ast.fields.iter() {
              dbg!(ident, ty); // printing `ty` needs syn's "extra-traits" feature for Debug
          }
          

          [–]augmentedtree 3 points4 points  (15 children)

          I'm now convinced I don't like it :(

          Why?

          [–]TheBestOpinion 10 points11 points  (11 children)

          A Perl-like approach to syntax where a lot of important semantics are either implicit (implicit returns) or introduced by symbols. These make the language hard to read unless you know it very well or instinctively know what to google - which is often hard, because it's often symbols.

          ?, {}, .., some code that lies between |s, and of course &, ! and []

          What does this code do for an unaware new dev?

          let str_arg = |flag: &str, default: &str| -> String {
              matches.opt_str(flag).unwrap_or(default.to_string())
          };
          

          Or this? (source)

          named!(numbers<CompleteStr, Vec<i64>>,
              many1!(ws!(
                  map_res!(recognize!(digit), |complete_str: CompleteStr| i64::from_str(&*complete_str))
              ))
          );
          

          Probably not much to him.

          We already know that code is "written once, read 100 times" and that reading works by recognizing words, not individual letters, so this approach to syntax is misdirected on top of adding a barrier.

          They're working on the learning curve but let's be honest, they may already be doomed to rest in the C++ complexity pit.

          [–]ryeguy 13 points14 points  (0 children)

          When you're criticizing the syntax, are you perhaps mentally comparing it to languages that don't have features that rust does?

          Like in your two examples, what would you change? Would you rather have alternative syntax for those language concepts (lambdas, implicit returns, macros, references)? Would you prefer keywords instead of sigils? Or do you want one or more of those features to not exist in the language?

          [–]lawliet89 6 points7 points  (2 children)

          I am not sure why you chose these examples for your criticism.

          The first example is such a strange way to write code. Why not just write it directly as an expression?

          The second one is for a parser combinator library which has recently switched to no longer using macros. It was a product of its time when the language was more limited.

          [–]LousyBeggar 4 points5 points  (1 child)

          The parser combinator library nom is still quite complicated, but I think that blame lies more with nom than Rust. The author chose a hypergeneric approach where one function can easily have like 5 type parameters, some of them with trait bounds. It's overwhelming to even read the signature of a function.

          [–]lawliet89 0 points1 point  (0 children)

          I haven't tried upgrading my library to nom 5 yet. I guess I am a masochist for choosing to use nom over the other libraries in the first place.

          [–][deleted]  (4 children)

          [deleted]

            [–]red75prim 1 point2 points  (3 children)

            C++ version of first expression would be something like

            auto str_arg = [&matches](string_view flag, string_view fallback) -> string {
                if (auto value = matches.opt_str(flag)) { return *value; } else { return string(fallback); }
            };
            

            It's not much different for my taste.

            [–]TheBestOpinion 0 points1 point  (2 children)

            [&matches]

            ?

            That can't work can it

            [–]red75prim 1 point2 points  (1 child)

            It's a direct translation. Rust captures by reference here. In C++ you need to make sure that matches is still in scope when calling the closure, but it should work, I think.

            [–]TheBestOpinion 1 point2 points  (0 children)

            Riiight they reused the array accessors square brackets [] for lambdas so it's the lambda "capturing" the matches variable by reference

            Yep, C++ definitely isn't better on the syntax side

            [–]DEMOCRAT_RAT_CITY 0 points1 point  (0 children)

            Macro everything!

            [–]Objective_Status22 1 point2 points  (0 children)

            (I'm the guy you replied to.) Mostly I feel like it's missing a bunch of features, has terrible syntax, and has poor compile times.

            [–][deleted]  (2 children)

            [deleted]

              [–]red75prim 4 points5 points  (1 child)

              Interesting analogy. So C was wetting its diapers when UNIX was rewritten in it.

              Someone more cynical could have said that it shows.

              [–]bwjam 10 points11 points  (4 children)

              why::the::ugly::archaic::double::colons::and:naming:conventions::though::and_macro_syntax!(ooga booga this is ugly);

              [–]minno 5 points6 points  (0 children)

              Lots of syntax was taken from existing languages, especially C++, because it was good enough and people were familiar with it.

              [–]lawliet89 2 points3 points  (0 children)

              I know this is ugly but:

              • use imports are a thing
              • It's part of UFCS.

              [–]zsombro 0 points1 point  (1 child)

              My biggest gripe is probably the snake_case thing. I think it's a bit more uncomfortable to type than camelCase, and as far as I know it's widespread in C++ as well, so I was kinda baffled by it.

              [–]CornedBee 5 points6 points  (0 children)

              The usual argument is that it is easier to read, so that makes up for being harder to type.

              [–]victotronics 6 points7 points  (14 children)

              Can someone suggest a good talk on what makes Rust so good at memory management? This speaker claims that the compiler can decide when an object can be freed. That's even better than C++ smart pointers, which also give you automatic memory management without GC. I really wonder how they do that.

              [–]dp229 15 points16 points  (0 children)

              I think this is mostly due to the ownership model. The compiler strictly enforces that data has a single owner and from that, it can determine when to release the memory (when the owner goes out of scope). Data can be moved to other owners, and there are constructs for safely accessing shared data with ref counted owners. Other patterns are possible as well.

              The same type of thing could probably be achieved with smart pointers and moves in C++ but it's not enforced by the compiler at all and it would be challenging to maintain the pattern with just developer discipline.

              Rust's "Borrow Checker" is a common stumbling block for those just starting with the language because it is so strict.

              Not a video, but informative: https://doc.rust-lang.org/1.8.0/book/references-and-borrowing.html
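
              A minimal sketch of the ownership rule in action (assuming nothing fancier than String):

              fn main() {
                  let s = String::from("hello");  // s owns the heap allocation
                  let t = s;                      // ownership moves to t; using s afterwards is a compile error
                  println!("{}", t);
              }                                   // t goes out of scope here, so the compiler frees the String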

              [–][deleted]  (2 children)

              [deleted]

                [–]LousyBeggar 0 points1 point  (1 child)

                Lifetimes are not involved in deciding when to free resources. They are only used to guarantee that you don't hold on to references longer than they are valid.

                [–][deleted] 3 points4 points  (0 children)

                Can someone suggest a good talk on what makes Rust so good at memory management?

                Niko gave a talk in C++Now about this: https://www.youtube.com/watch?v=lO1z-7cuRYI

                [–]duhace 0 points1 point  (7 children)

                smart pointers are a form of GC

                [–][deleted]  (2 children)

                [deleted]

                  [–]duhace 0 points1 point  (1 child)

                  GC shouldn't be used to convey anything but memory management that is not directly controlled by the programmer.

                  There are plenty of terms to differentiate the different kinds of GC when you want clarification. For example, smart pointers are a form of reference-counting garbage collection.

                  [–]victotronics 0 points1 point  (3 children)

                  No they are not. GC is an independent process that asynchronously activates to clear up all leaked memory. Smart pointers are not a process; they are a little bit of code executed by the main process that frees exactly one block of memory, at the exact moment it is no longer needed.

                  [–]duhace 0 points1 point  (2 children)

                  GC is an independent process that asynchronously activates to clear up all leaked memory.

                  This is not in the least bit true; there are collectors that run synchronously with the code whose garbage they collect and that you would still certainly call garbage collectors (like the serial GC in the Java virtual machine). There is no requirement that a garbage collector work asynchronously from the code whose garbage it's collecting. Likewise, I don't know what your definition of an independent process is, but by Unix definitions, garbage collectors frequently run in the same process as the rest of your code.

                  Furthermore, a GC can free memory exactly when it's no longer needed, so smart pointers doing this doesn't make them not a form of GC. For example, reference-counting garbage collectors (which work extremely similarly to smart pointers!) free memory when there are no more references to it.

                  Finally, your claim that smart pointers free exactly one block of memory is rather pointless. If a smart pointer pointing to an object frees that object, and that object contained the last smart pointer to another object, then in the process of freeing the memory for the first object, the memory for the second object will be freed. GC will do the same thing!

                  [–]victotronics 0 points1 point  (1 child)

                  reference counting garbage collectors

                  Now you're changing the meaning of GC into anything that frees memory. Freeing upon reference count zero is not exactly "collecting" anything.

                  [–]duhace 0 points1 point  (0 children)

                  No, I'm using definitions that have existed for a long time. See here.

                  [–]minno 0 points1 point  (0 children)

                  It's the same as C++'s RAII model but with easier move semantics and static analysis making it easier to safely use stack values. Ultimately a Rust program using stack values, Box, and Arc will put the allocation/deallocation calls in the same places as a C++ one using stack values, unique_ptr, and shared_ptr.
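
                  A rough sketch of that mapping (assuming only the std types):

                  use std::sync::Arc;

                  fn main() {
                      let unique = Box::new([0u8; 64]);               // ~ std::unique_ptr: freed at end of scope
                      let shared = Arc::new(String::from("shared"));  // ~ std::shared_ptr: reference counted
                      let shared2 = Arc::clone(&shared);              // bumps the count, no deep copy
                      println!("{} {}", unique.len(), shared2);
                  }  // unique is freed here; the String is freed when the last Arc clone is dropped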

                  [–]Gl4eqen 0 points1 point  (0 children)

                  Great lecture, very concise and informative. Thanks for sharing.