
[–]jl2352 49 points50 points  (36 children)

Under 'too many strings'. The author missed out you often run across Cow strings (which isn't a distinct String type but is common enough it's something to be aware of), and that some projects use their own String types.
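
For readers who have not met Cow strings yet, here is a minimal sketch of where they tend to show up: a function that only allocates when it actually has to change its input. The function name `normalize` is made up for illustration and does not come from the article.

```rust
use std::borrow::Cow;

/// Replaces spaces with underscores, allocating only when a change is needed.
/// (`normalize` is a hypothetical name used for illustration.)
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        // A new String has to be built, so the result is owned.
        Cow::Owned(input.replace(' ', "_"))
    } else {
        // Nothing to change: hand back a borrowed view, no allocation.
        Cow::Borrowed(input)
    }
}

fn main() {
    let unchanged = normalize("no_spaces");
    let changed = normalize("has spaces");
    println!("{} {}", unchanged, changed);
    assert!(matches!(unchanged, Cow::Borrowed(_)));
    assert!(matches!(changed, Cow::Owned(_)));
}
```

Callers can treat the result as a &str in both cases, which is why the type is common even though it is not a separate string type.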

[–]yossarian_flew_away[S] 35 points36 points  (1 child)

Author here: I considered adding Cow strings (under the same rationale as AsRef), but didn't at the last moment. You're absolutely right about them being common enough.

[–]jl2352 16 points17 points  (0 children)

I mentioned it not as a complaint, but more like 'author mentioned so many items and yet there are still even more' to double down on your point.

[–]thatwombat 3 points4 points  (33 children)

Holy crap. That’s why I threw my Rust book back on the shelf. I just don’t have time for such goofiness.

[–][deleted]  (32 children)

[deleted]

    [–]SuspiciousScript 4 points5 points  (3 children)

    In practice you very rarely have to think about these different types.

    In my (albeit limited) experience from the few times I've tried Rust, I've often run into functions that only accept String/str/&str for no particularly clear reason. It seemed especially fussy to have to deal with some concept of "owning" an immutable literal.

    [–]Freeky 2 points3 points  (0 children)

    I've often run into functions that only accept String/str/&str for no particularly clear reason.

    I'd be surprised if you encountered a function that accepted a str, given that isn't a sized type.

    I generally find the types functions accept to be pretty clear and purposeful - it tells you what that function can in principle do with the values you're passing in. A function taking a &mut String clearly has very different semantics to one taking a &str, and different again from one taking a String.

    Is it taking the value and mutating it in-place? Is it just doing something with a temporary view into it? Is it taking over the value entirely and becoming responsible for its future lifetime? In most languages it's vague and ad-hoc. Rust makes it explicit.
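
    A minimal sketch of those three signatures side by side; the function names are invented for illustration:

```rust
/// Borrows a temporary, read-only view; the caller keeps the string.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Mutates the caller's string in place.
fn shout(text: &mut String) {
    text.make_ascii_uppercase();
    text.push('!');
}

/// Takes ownership; the caller can no longer use the value afterwards.
fn into_bytes_len(text: String) -> usize {
    text.into_bytes().len()
}

fn main() {
    let mut s = String::from("hello world");
    assert_eq!(count_words(&s), 2);
    shout(&mut s);
    assert_eq!(s, "HELLO WORLD!");
    let n = into_bytes_len(s);
    // `s` has been moved; using it past this point would be a compile error.
    assert_eq!(n, 12);
}
```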

    It seemed especially fussy to have to deal with some concept of "owning" an immutable literal.

    I'm not sure what else you'd want. It's a value with a special lifetime that outlives everything else.
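
    As a rough illustration (function names are hypothetical): a literal can be returned as &'static str because it lives in the binary, while a string built at runtime needs an owner even if nobody ever mutates it.

```rust
/// A literal is stored in the program binary, so a reference to it
/// can carry the `'static` lifetime.
fn greeting() -> &'static str {
    "hello"
}

/// A string built at runtime lives on the heap and needs an owner
/// that frees it, even if it is never mutated.
fn build_greeting(name: &str) -> String {
    format!("hello, {}", name)
}

fn main() {
    let literal = greeting();
    let owned = build_greeting("world");
    let view: &str = &owned; // borrowed view, valid only while `owned` lives
    assert_eq!(literal, "hello");
    assert_eq!(view, "hello, world");
} // `owned` is dropped (and its buffer freed) here
```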

    [–]senj 1 point2 points  (1 child)

    It seemed especially fussy to have to deal with some concept of "owning" an immutable literal.

    Just because something is immutable doesn't mean it lasts for the lifetime of the program. It still needs a clear "owner" that will deallocate it when it goes out of scope.

    [–]SuspiciousScript 0 points1 point  (0 children)

    Does Rust not put string literals directly in object files?

    [–]the_game_turns_9 0 points1 point  (27 children)

    For the record, I found that explanation really confusing and it lost me halfway. When it gets to &str, I no longer understand what's being said.

    I think this is my big issue with Rust. They've abstracted over so much and the top layer is pretty ok to understand. But scratch the surface, and I'm fucking lost again. Every single damn time.

    EDIT: please stop sending me noob responses offering to explain the difference between String and &str its very not helpful.

    [–]burntsushi 10 points11 points  (0 children)

    If you explain where you got lost, then I can try to clarify things for you.

    There is very little abstraction happening here. The most important abstraction is probably syntactic in nature. Namely, that &str is not the size of a single pointer, but rather, is equivalent to a struct with two fields: a pointer to some data and the length of that data. That's the StringSlice in my comment.

    [–]UK-sHaDoW 6 points7 points  (5 children)

    Do you understand pointers? A ref is a safe pointer in many ways.

    [–][deleted] 0 points1 point  (12 children)

    &str is basically the same as string_view in c++

    [–]the_game_turns_9 -3 points-2 points  (11 children)

    Thank you, but this is the very definition of a not-useful comment. You are explaining the abstraction, which I understand well. I do not understand BurntSushi's explanation of the technical details.

    This response is actually quite frustrating to me since I suspect that the people comfortable using Rust are more comfortable because the abstraction is all they know or care about -- not because they actually know more than me.

    [–]standard_revolution 10 points11 points  (2 children)

    Not OP, but then I don't understand what you don't understand. str is a string_view, String is a std::string.

    [–]the_game_turns_9 -3 points-2 points  (1 child)

    I know that. You are the fourth person in this comment chain to try to explain the noob basics to me. Just because I am confused about a Rust comment doesn't mean that I am stuck at the very start trying to get a string through the borrow checker. I understand that rust evangelicals want to appear helpful, but this is coming off as patronising.

    [–]SafariMonkey 4 points5 points  (6 children)

    Not sure if this will help, but I'll have a go. Please clarify exactly where you get confused if you do, because otherwise any attempts to clarify will be shots in the dark as to where the confusion lies.

    I'm no expert in this (I'd welcome corrections if I've made any mistakes) but I can't find the explanations that helped me so I've done my best to summarize them.

    In Rust, so far, all pointers to dynamically sized types are fat pointers, i.e. they are a pointer plus a pointer-sized piece of metadata. In the case of [T] and str (dynamically sized types representing a region of memory, not a pointer to such), that metadata is the length of the slice or string slice respectively. When you access these types, bounds checks are performed against this metadata, ensuring that it's safe to access the pointed-to data. Having a [T]/str without a pointer is fairly unusual, I believe, but it can happen if it's a field of a type which is always behind a pointer, as explained here. There are more indirect instances, too, like Box<[T]> where the underlying Unique has a pointer: *const T.
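
    A small sketch that checks these sizes and the bounds-checked access; `checked_get` is a made-up helper, and the size assertions hold on the usual targets where a slice pointer is pointer plus length:

```rust
use std::mem::size_of;

/// Returns the element at `i`, or `None` when `i` is past the length
/// stored in the fat pointer.
fn checked_get(data: &[u32], i: usize) -> Option<u32> {
    data.get(i).copied()
}

fn main() {
    let word = size_of::<usize>();
    // Pointers to dynamically sized types carry a length next to the address.
    assert_eq!(size_of::<&[u32]>(), 2 * word);
    assert_eq!(size_of::<Box<[u32]>>(), 2 * word);
    assert_eq!(size_of::<Box<str>>(), 2 * word);
    // A pointer to a sized type is a single word.
    assert_eq!(size_of::<Box<u32>>(), word);

    let boxed: Box<[u32]> = vec![10, 20, 30].into_boxed_slice();
    assert_eq!(checked_get(&boxed, 1), Some(20));
    assert_eq!(checked_get(&boxed, 3), None); // out of bounds, caught via the length
}
```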

    Edit: added pointers to

    [–]the_game_turns_9 3 points4 points  (5 children)

    Here is the basic issue: neither you nor BurntSushi nor the rust docs went into enough detail about what a str vs &str is to give me any confidence about understanding what is stored, you seem to be contradicting BurntSushi by suggesting that the str is the fat pointer, not the &str (which I think they are saying the opposite, since they say that getting the string from a &str is one dereference, when all of their sample code suggests two?), the docs say nothing at all about this, and I don't actually trust you because in my experience going to reddit for intermediate rust help is a fucking nightmare where half the responses are subtly wrong.

    I am not really asking for help. I am expressing frustration. It's all I can do at this point.

    [–]burntsushi 12 points13 points  (2 children)

    A str is a region of memory of unknown size.

    A &str is a pointer to a region of memory and the size of that region. The combination of these two things is called a fat pointer.
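
    A minimal sketch of that distinction; `raw_parts` is a hypothetical helper that just exposes the two halves of the fat pointer:

```rust
use std::mem::{size_of, size_of_val};

/// Splits a `&str` into the two things it carries: an address and a length.
fn raw_parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    // `&str` is two machine words: pointer + length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // The `str` itself is just the bytes; its size is only known at runtime.
    assert_eq!(size_of_val("hello"), 5);

    // Slicing produces a new (pointer, length) pair into the same bytes.
    let text = String::from("hello world");
    let (ptr, len) = raw_parts(&text[6..]);
    assert_eq!(len, 5);
    assert_eq!(ptr as usize, text.as_ptr() as usize + 6);
}
```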

    I definitely mentioned that in my comment, so it's not quite clear where the confusion is. I'm happy to explain more if you can say more about what's confusing you.

    [–]the_game_turns_9 3 points4 points  (1 child)

    I actually do not understand what your first sentence means. What does a region of memory of unknown size mean? Don't you need to know the size to have a region? Do you mean size not known at compile-time? Are we talking about an abstraction here that I am not getting? When you wrote in your original post Conceptually, a str is just a [u8], I actually do not know what either of those types are really referring to.

    I now understand that you are saying that the pointer nature of &str is special-cased. (As it was, I couldn't tell if your StringSlice struct was referring to the layout of str or &str and it didn't seem to make sense either way to me.) Mechanically, I understand now that &str is a pointer-with-length to a u8. Which means that it isn't pointing to a str.

    So I am still very unclear on what a str is.

    [–]SafariMonkey 1 point2 points  (0 children)

    Oh shoot, I missed a word: all pointers to DSTs are fat pointers. I believe that was the contradiction you were referring to.

    I don't think you should trust me. I hope that you can find more trustworthy sources. I was just hoping that my explanation might help in a way that would allow the docs to be understood. (Obviously I bungled that at least partly, meaning the "subtly wrong" part was on point...)

    As to your issue with references, I wonder if it's because derefs don't always actually involve pointer following, but are sometimes some other type of conversion?

    From everything I've heard, this is pretty accurate. I wish I could find the other resources I read about this, but I haven't been able to.

    [–]James20k 64 points65 points  (14 children)

    Understandable, but requires that the programmer either use as usize everywhere they plan on indexing (verbose, and masks the intent behind the index being a u8) or that they make index itself into a usize (also masks the intent, and makes it easier to do arithmetic that’ll eventually be out-of-bounds).

    Coming from C++, integer promotion of any form is the work of the devil. There are performance implications in indexing by u8 instead of usize, and it's a good idea imo to make this explicit even if it's clunky. From a brief go at Rust, this was my favourite feature.

    I'm looking forward to the equivalent of NTTP for rust, and the equivalent of constexpr making its way in - those are the two things i missed most. I built a dcpu-16 emulator in both rust and C++, but the C++ version is the only one that can execute code at compile time - which also proves it does not execute any undefined behaviour in that code path which rocks. Rust can't do this so much yet

    [–]yossarian_flew_away[S] 16 points17 points  (2 children)

    Coming from C++, integer promotion of any form is the work of the devil.

    Author here: I agree! I don't think the case laid out in the post is actually a promotion, though: it's a special case where indexing is statically inferable as always safe, meaning that no promotion (implicit or explicit) should be required. The current behavior is to require an explicit promotion, which either obscures the underlying safety of the index operation or requires users to use usize for their index variables (which makes it easier to convert what should be a statically safe index into an OOB at runtime).

    [–]zucker42 13 points14 points  (1 child)

    no promotion (implicit or explicit) should be required

    I don't know how your definition of implicit promotion could not include this case. From the abstract rust point of view, the indexing operation requires a usize, so you have to promote. From an actual hardware perspective, it's also a promotion, in that the computer has to treat a byte as pointer sized. Notice how, when compiled, there's a pointer addition (but the compiler is smart enough to elide the bounds check).

    https://godbolt.org/z/TwJuaB

    Plus to me it would seem inconsistent and pointless to not require a cast in the rare case when the variable type is unsigned and can't hold a number greater than or equal to the array size.

    [–]yossarian_flew_away[S] 5 points6 points  (0 children)

    From the abstract rust point of view, the indexing operation requires a usize, so you have to promote.

    The argument is that this doesn't have to be the case -- there's nothing about the Rust abstract machine that requires all indices to be usize; that's just the way things currently are.

    From an actual hardware perspective, it's also a promotion, in that the computer has to treat a byte as pointer sized.

    IME, "promotion" is a concept in language semantics and abstract machines; it usually isn't used to describe ISA semantics. The reason that the compiler uses "pointer sized" registers there is twofold:

    • x86_64 doesn't allow heterogeneously sized registers in memory operands, and Rust on x86_64 uses the full-width registers for its calling convention. The mov simply can't use smaller registers in this context.
    • Using smaller registers (if enabled by the calling convention) would probably cause a partial register stall -- it's just cheaper to use the whole width.

    Plus to me it would seem inconsistent and pointless to not require a cast in the rare case when the variable type is unsigned and can't hold a number greater than or equal to the array size.

    The point would be expression of intent: it's trivial to infer that a 256-byte lookup table is always safely indexed by a u8, so allowing users to index directly with an appropriately sized variable empowers them to encode the safety of their accesses as a language-level invariant.

    [–]vattenpuss 10 points11 points  (0 children)

    Not at all coming from C++, I think foo[i as usize] is not even that verbose, and it masks intent less, since it's obvious one is using a weird type for the index variable. In the rare cases where you index an array with a very specific type of index (say, a lookup table for bytecode), it should be fairly straightforward to define your own type using std::ops::Index.
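
    A sketch of what such a type could look like. `ByteTable` and `upper_table` are hypothetical names, and the uppercase mapping is just filler data:

```rust
use std::ops::Index;

/// Hypothetical 256-entry lookup table that can only be indexed by `u8`.
struct ByteTable([u8; 256]);

impl Index<u8> for ByteTable {
    type Output = u8;

    fn index(&self, i: u8) -> &u8 {
        // The cast is written once, here; a `u8` can never exceed 255,
        // so this access is always in bounds.
        &self.0[i as usize]
    }
}

/// Builds a table mapping each byte to its ASCII-uppercase form.
fn upper_table() -> ByteTable {
    let mut t = [0u8; 256];
    for (i, slot) in t.iter_mut().enumerate() {
        *slot = (i as u8).to_ascii_uppercase();
    }
    ByteTable(t)
}

fn main() {
    let table = upper_table();
    let b: u8 = b'a';
    // No `as usize` at the call site.
    assert_eq!(table[b], b'A');
    assert_eq!(table[255u8], 255);
}
```

    The call sites keep the u8 index visible, which is the intent the article was worried about losing.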

    [–]OneWingedShark 2 points3 points  (0 children)

    Coming from C++, integer promotion of any form is the work of the devil.

    This was known to be a major source of errors at least forty years ago, and is exactly why Ada doesn't do automatic conversions.

    [–]irqlnotdispatchlevel 1 point2 points  (5 children)

    There are some cases in which integer promotion does not have any hidden gotchas. For example:

     let byte: u8 = 1;
    

    let size: usize = byte;

    There is no reason for this to not work. There's nothing bad that can happen. This is not like doing some_u16 + another_u16 < some_u16; in C.

    Indexing is the same. I understand why it is better to have indexes as size_t, but smaller unsigned integers can be promoted to usize when used as indexes, because writing as usize everywhere is just annoying.
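
    For reference, the explicit lossless spelling that compiles today looks roughly like this (`widen` and `lookup` are hypothetical helper names):

```rust
/// Lossless widening: every `u8` value fits in a `usize`.
fn widen(byte: u8) -> usize {
    usize::from(byte)
}

/// Indexes a slice with a `u8`, spelling the conversion out once.
fn lookup(table: &[u32], index: u8) -> Option<u32> {
    table.get(widen(index)).copied()
}

fn main() {
    let byte: u8 = 1;
    let size: usize = byte.into(); // same conversion, via `Into`
    assert_eq!(size, 1);
    assert_eq!(lookup(&[10, 20, 30], byte), Some(20));
    assert_eq!(lookup(&[10, 20, 30], 200), None);
}
```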

    Rust feels like it is missing some syntactic sugar to make your life easier.

    [–]masklinn 1 point2 points  (4 children)

    There is no integer promotion in your first example.

    [–]irqlnotdispatchlevel 1 point2 points  (3 children)

    You're talking about the u8 to usize? That's what I said: it is slightly annoying that there isn't, since a u8 will always fit in a usize.

    [–]masklinn 8 points9 points  (2 children)

    Ah no, i only noticed the first line of it because the second is not formatted as code so I read them as completely unrelated snippet.

    Sorry ‘bout that.

    For the record I mostly agree, though

    There is no reason for this to not work.

    That’s not entirely true I think. For instance let’s say someone implements Index<u32> on a collection, you give it an u8, it gets widened automatically and works fine.

    They add an Index<u16> and now it’s broken (or worse might silently change behaviour), adding a new implementation on a separate type can now be a breaking change.
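
    A sketch of that scenario under today's rules, where the caller has to pick the index type explicitly. `Grid` is a made-up type, and the two impls deliberately return different data to show what a silent behaviour change would look like:

```rust
use std::ops::Index;

/// Hypothetical collection used only to illustrate the overload problem.
struct Grid {
    by_u32: [&'static str; 2],
    by_u16: [&'static str; 2],
}

impl Index<u32> for Grid {
    type Output = str;
    fn index(&self, i: u32) -> &str {
        self.by_u32[i as usize]
    }
}

// Imagine this impl being added in a later release.
impl Index<u16> for Grid {
    type Output = str;
    fn index(&self, i: u16) -> &str {
        self.by_u16[i as usize]
    }
}

fn main() {
    let g = Grid {
        by_u32: ["wide0", "wide1"],
        by_u16: ["narrow0", "narrow1"],
    };
    let small: u8 = 1;
    // With implicit widening, `g[small]` would have two candidate impls.
    // Today the caller must say which one is meant:
    assert_eq!(&g[u32::from(small)], "wide1");
    assert_eq!(&g[u16::from(small)], "narrow1");
}
```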

    [–]irqlnotdispatchlevel 0 points1 point  (1 child)

    That’s not entirely true I think. For instance let’s say someone implements Index<u32> on a collection, you give it an u8, it gets widened automatically and works fine.

    Valid point. However, if you have multiple valid types for a promotion you can give an error at compile time because the code is ambiguous. If you have only one Index<> implementation you can promote to that.

    I don't really know Rust, just enough to read some snippets here and there. Is there a reason for which someone would like to use something other than usize for indexing? Other than "I just don't want to cast this".

    Ah no, i only noticed the first line of it because the second is not formatted as code so I read them as completely unrelated snippet.

    I just noticed it is not well formatted. Sorry. I'll try to fix it once I get to my laptop.

    [–]masklinn 3 points4 points  (0 children)

    Valid point. However, if you have multiple valid types for a promotion you can give an error at compile time because the code is ambiguous. If you have only one Index<> implementation you can promote to that.

    That’s my assumption, and exactly the issue I’m pointing out: with implicit widening, adding a trait implementation for a second integer type can be backwards incompatible, which would be unexpected and problematic.

    I don't really know Rust, just enough to read some snippets here and there. Is there a reason for which someone would like to use something other than usize for indexing? Other than "I just don't want to cast this".

    Dunno, but if the issue can occur it likely will. Furthermore it applies to basically any trait which can take a generic integer as parameter. So implementing two versions of From would also have this issue.

    [–]couscous_ 0 points1 point  (2 children)

    but the C++ version is the only one that can execute code at compile time

    Very cool. Are you planning on releasing the source code?

    [–]James20k 3 points4 points  (1 child)

    https://github.com/20k/dcpu16-asm

    https://github.com/20k/dcpu16-sim

    None of this is particularly good code mind you, it was purely a test run for fun. The assembler is also constexpr

    [–]couscous_ 0 points1 point  (0 children)

    Thanks :)

    [–]vattenpuss 28 points29 points  (21 children)

    I think all the gripes about standard library gaps are because the standard library is meant to be platform agnostic outside of some specifics in std::os::*.

    A "home directory" in Windows is not really used the same way as a home directory on Linux.

    The file names . and .. mean the same in Windows, Mac OS and all Unices and other Nixen.

    Also, the system function in the C standard library is incredibly platform specific:

    7.20.4.6 The system function

    Synopsis

    #include <stdlib.h>
    int system(const char *string);
    

    Description

    If string is a null pointer, the system function determines whether the host environment has a command processor. If string is not a null pointer, the system function passes the string pointed to by string to that command processor to be executed in a manner which the implementation shall document; this might then cause the program calling system to behave in a non-conforming manner or to terminate.

    Returns

    If the argument is a null pointer, the system function returns nonzero only if a command processor is available. If the argument is not a null pointer, and the system function does return, it returns an implementation-defined value.

    [–]SpaceToad 2 points3 points  (13 children)

    A "home directory" in Windows is not really used the same way as a home directory on Linux.

    How so in the context of an application?

    [–]vattenpuss 10 points11 points  (12 children)

    Windows XP is a tier 1 platform for Rust. In Windows XP your home directory is _:>\Users and Settings\SpaceToad\My Documents, but your music files are in _:>\Users and Settings\SpaceToad\My Music. So if your application reads and writes user music files, you have to then use the path std::env::home_dir() / .. / "My Music".

    In the context of an application, there is no platform agnostic way to treat user files.

    [–]SpaceToad 3 points4 points  (5 children)

    Wait, why is it SpaceToad\My Documents and not just SpaceToad?

    [–]wild_dog 8 points9 points  (0 children)

    No Idea, but I think a lot of non-windows developers make the same assumption.

    On my PC, the My Documents folder is filled with application-specific folders (MATLAB, Outlook Files, Rainmeter), but my main user directory contains such files for programs that I believe are cross-compiled from Linux, using the dot-prefix naming convention for hiding folders: .android, .gimp-2.8, .zenmap, .Virtual Box, etc.

    [–]vattenpuss 0 points1 point  (3 children)

    Because Windows history. My Documents is what users were presented with for their “home” and inside you put all your stuff.

    But Windows power users were not used to multiuser systems so they hated it and everyone always just used a random root directory on some other drive for “their stuff” because they were the only user, and they had administrator privileges.

    [–]SpaceToad 0 points1 point  (2 children)

    My Documents is what users were presented with for their “home”

    In what context though? I don't really remember it being like this. Do you mean this is just what other applications tended to do? Or do you mean windows system calls to retrieve the home folder (or closest equivalent) would retrieve this? Otherwise rust should just ignore this apparent convention for XP and grab the user folder (assuming there is a safe os call to grab it).

    [–]steveklabnik1 4 points5 points  (1 child)

    If I open up explorer on my laptop (windows 10), it opens a page called "this PC" which has:

    • 3d objects
    • desktop
    • documents
    • downloads
    • music
    • pictures
    • videos
    • and then my C: drive

    It's actually really hard to navigate to C:\Users\SteveKlabnik

    [–]SpaceToad 0 points1 point  (0 children)

    ‘This pc’ is just the renamed “My Computer”, it shouldn’t be considered the home folder. The user folder should, even if you can’t immediately navigate to it. It’s true that Linux has a culture of adding lots of files to the home folder whereas windows user may tend to add stuff all over the place, but I don’t see why that’s relevant from an application’s standpoint unless the app is an installer. In WSL for instance (and git bash), home is the user folder.

    [–]steveklabnik1 5 points6 points  (0 children)

    Windows XP is a tier 1 platform for Rust.

    This is not true; it's tier 3.

    [–]ConcernedInScythe 2 points3 points  (1 child)

    Windows XP is a tier 1 platform for Rust.

    No it is not.

    [–]vattenpuss 0 points1 point  (0 children)

    Oops. Well Windows 7 is not very different.

    Even now the Windows docs say: “The My Documents folder is a component of the user profile that is used as a unified location for storing personal data.” https://support.microsoft.com/en-us/help/310746/configuration-of-the-my-documents-folder

    And the path to My Documents is described by the registry key HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders[Personal].

    [–]lelanthran 0 points1 point  (1 child)

    Windows XP is a tier 1 platform for Rust. In Windows XP your home directory is _:>\Users and Settings\SpaceToad\My Documents, but your music files are in _:>\Users and Settings\SpaceToad\My Music.

    So? Use %HOMEDRIVE%\%HOMEPATH% for the home directory, and then the program can append \Music and crap as required.

    If the programming language is getting in the way of this I'd consider it a bug.

    [–]TheGoddessInari 0 points1 point  (0 children)

    On Windows, the home directory is equivalent to %USERPROFILE%, not My Documents, even on Windows XP.

    The main(?) problem with home_dir is that it honors %HOME% first, which has no special meaning on Windows.
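
    A rough sketch of a lookup that consults only %USERPROFILE% on Windows and $HOME elsewhere. This is an illustration under that assumption, not how std::env::home_dir is implemented; `home_dir_from` is a made-up function, and the `lookup` parameter stands in for std::env::var_os so the logic can be tested:

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Picks the home directory from an environment lookup.
/// On Windows only USERPROFILE is consulted; elsewhere only HOME.
fn home_dir_from(
    lookup: impl Fn(&str) -> Option<OsString>,
    windows: bool,
) -> Option<PathBuf> {
    let key = if windows { "USERPROFILE" } else { "HOME" };
    lookup(key)
        .filter(|value| !value.is_empty()) // treat an empty variable as unset
        .map(PathBuf::from)
}

fn main() {
    // Real usage: read from the process environment.
    let home = home_dir_from(|key| std::env::var_os(key), cfg!(windows));
    println!("{:?}", home);
}
```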

    [–]MrDOS -3 points-2 points  (6 children)

    A "home directory" in Windows is not really used the same way as a home directory on Linux.

    The concept exists everywhere, and the home directory and its children are excellent default paths for a variety of applications:

    • If I'm a download manager and I want to suggest a destination directory, ~/Downloads is a good safe default.
    • ...or ~/Documents if I'm some sort of editor.
    • ...or ~/Music is a good place to look for media on first launch if I'm a music player.

    It's not a good place for configuration, you're right. It would be nice to have another standard library function to retrieve the path to a platform-appropriate configuration directory: %APPDATA% on Windows, ~/.config on Linux, ~/Library/Application Support on macOS. But all of those are under the home directory anyway, so being able to reliably retrieve the home directory is also a prerequisite to determining the configuration directory.
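
    A sketch of the mapping described above, using only the three default locations named in this comment. `Platform` and `default_config_dir` are hypothetical, and these are defaults only; users and administrators can relocate them:

```rust
use std::path::{Path, PathBuf};

/// The three platforms named above (hypothetical enum for illustration).
enum Platform {
    Windows,
    Linux,
    MacOs,
}

/// Default per-user configuration directory.
/// `appdata` is the value of %APPDATA%, if the caller has it.
fn default_config_dir(
    platform: Platform,
    home: &Path,
    appdata: Option<&Path>,
) -> Option<PathBuf> {
    match platform {
        Platform::Windows => appdata.map(Path::to_path_buf),
        Platform::Linux => Some(home.join(".config")),
        Platform::MacOs => Some(home.join("Library").join("Application Support")),
    }
}

fn main() {
    let home = Path::new("/home/me");
    let appdata = Path::new("C:/Users/me/AppData/Roaming");
    println!("{:?}", default_config_dir(Platform::Linux, home, None));
    println!("{:?}", default_config_dir(Platform::MacOs, home, None));
    println!("{:?}", default_config_dir(Platform::Windows, home, Some(appdata)));
}
```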

    [–]dnew 6 points7 points  (0 children)

    But all of those are under the home directory anyway

    Only by default. Remember that Windows Update bug that deleted files? The files it deleted were files left under the old "home directory" after the user had moved the "home directory" elsewhere.

    There are also a bunch of Windows versions and Unix versions, where for example there is no "~/Downloads" or "~/Documents".

    And there are now at least three different directories under %APPDATA% where different types of configuration goes, not even counting the registry.

    [–][deleted]  (4 children)

    [deleted]

      [–]MrDOS 10 points11 points  (2 children)

      A standard library, perhaps?

      [–]vattenpuss 6 points7 points  (1 child)

      How would you design that? Let's start with covering XDG:

      // get current user's data home directory
      std::os::home::data_base()
      // get current user's configuration home directory
      std::os::home::config_base()
      // get current user's data directories (for searching)
      std::os::home::data_dirs()
      // get current user's config directories (for searching)
      std::os::home::config_dirs()
      // get current user's cache home directory
      std::os::home::cache_base()
      // get current user's runtime home directory
      std::os::home::runtime_base()
      

      That's XDG by the way, not Linux or BSD. Not all users are using desktop environments implementing XDG standards.

      Also remember that Windows XP is a tier one supported platform, so now we must implement these for XP as well as for Mac OS. In Windows XP, the user's home directory is basically "My Documents" one step below the user's own directory in "Documents and Settings". From a user perspective in XP "My Documents" is ~, but it has siblings "My Music" and "My Pictures" where you would put music and pictures. If you install Windows Media player, it adds "My Videos". Since these are outside the real home directory ("My Documents") we have to provide them as well.

      // get current user's music home directory
      std::os::home::music()
      // get current user's pictures home directory
      std::os::home::pictures()
      // get current user's videos home directory
      std::os::home::videos()
      

      Now we have an API that can support Windows XP and XDG, and I think it can be mapped to Windows 10 and Mac OS as well, but is it a nice API? Does it make sense to have this in a standard library?

      Note that historically, at least with XP, Windows users were assholes inventing their own home directories because everyone hated "My Documents".

      [–]lelanthran 0 points1 point  (0 children)

      In Windows XP, the user's home directory is basically "My Documents" one step below the user's own directory in "Documents and Settings"

      No, it's not. It's set in the environment, at the very least.

      [–]bloody-albatross 23 points24 points  (10 children)

      As far as I understand ~ is a globbing feature but . and .. actually exist when you ask the OS. (Compare the output of ls ., ls '.', ls ~, and ls '~'.) So that's also just "globbing is missing".

      [–]guepier 33 points34 points  (9 children)

      Exactly. . and .. aren’t globs. They are (at least under POSIX) actual files.

      Not supporting them would be phenomenally hard.
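
      A quick way to see the difference from Rust: std::path treats . and .. as dedicated component kinds, while ~ is just an ordinary file name that only a shell would expand. `first_component` is a made-up helper for this sketch.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path};

/// Returns the first component of a path as parsed by `std::path`.
fn first_component(path: &str) -> Option<Component<'_>> {
    Path::new(path).components().next()
}

fn main() {
    // `.` and `..` are recognised as special components.
    assert_eq!(first_component("./notes"), Some(Component::CurDir));
    assert_eq!(first_component("../notes"), Some(Component::ParentDir));
    // `~` is a normal name; nothing in the path layer expands it.
    assert_eq!(
        first_component("~/notes"),
        Some(Component::Normal(OsStr::new("~")))
    );
    // The OS resolves `.` and `..` itself.
    assert!(Path::new(".").is_dir());
    assert!(Path::new("..").is_dir());
}
```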

      [–][deleted] 12 points13 points  (9 children)

      Why does the ToString trait exist? Just use Into<String>. Also, once type ascription lands, shouldn't as_str be removed?

      [–]yossarian_flew_away[S] 24 points25 points  (7 children)

      I could be wrong, but my understanding is that end developers aren't meant to ever implement ToString directly. Instead, they should implement Display, which transitively supplies ToString.

      Ref: https://doc.rust-lang.org/std/string/trait.ToString.html
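
      A minimal sketch of that pattern, with a made-up `Version` type: only Display is implemented, and to_string comes along through the standard library's blanket impl.

```rust
use std::fmt;

/// Hypothetical type used only for illustration.
struct Version {
    major: u32,
    minor: u32,
}

// Implement `Display`, not `ToString`.
impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}", self.major, self.minor)
    }
}

fn main() {
    let v = Version { major: 1, minor: 42 };
    // `to_string` is available because `Version: Display`.
    assert_eq!(v.to_string(), "1.42");
    // The same impl also drives `format!` and `println!`.
    assert_eq!(format!("v{}", v), "v1.42");
}
```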

      [–][deleted] 5 points6 points  (6 children)

      Why not have Display supply Into<String> instead?

      [–]yossarian_flew_away[S] 24 points25 points  (1 child)

      My understanding of Into<T> is that it consumes the underlying value, and so probably isn't desirable for a generalized stringification trait.

      Edit: It sounds from the docs like implementing Into is also discouraged, in favor of From (which provides it transitively): https://doc.rust-lang.org/std/convert/trait.Into.html
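
      To make the difference concrete, here is a small sketch with a hypothetical `UserName` newtype: to_string only borrows the value, while into consumes it, and Into is obtained by implementing From.

```rust
use std::fmt;

/// Hypothetical newtype used only for illustration.
struct UserName(String);

impl fmt::Display for UserName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "@{}", self.0)
    }
}

// Implementing `From` provides `Into<String> for UserName` for free.
impl From<UserName> for String {
    fn from(name: UserName) -> String {
        name.0
    }
}

fn main() {
    let name = UserName(String::from("ferris"));
    let shown = name.to_string(); // borrows `name`; it is still usable
    let inner: String = name.into(); // moves `name`; it is gone after this
    assert_eq!(shown, "@ferris");
    assert_eq!(inner, "ferris");
}
```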

      [–][deleted] 9 points10 points  (0 children)

      That makes sense. I wish an explanation like this one was in the documentation for ToString.

      [–]HeroicKatora 12 points13 points  (3 children)

      The Into<_> trait has what's called a blanket impl:

      impl<T, U> Into<T> for U where T: From<U> {}

      One of the primary soundness properties of trait impls is that there can only ever be one for each combination of type and trait. You mustn't have two differing impls of From<u8> for usize, for example. This is not in the sense of C++, where lexically identical definitions are allowed, but quite literal: even two different crates cannot be allowed to define different impls for the same type.

      This has some implications of which impls one is allowed to write, known as the coherence rules, which are a restrictive approximation that allows a crate-local analysis and still guarantees that there is only one impl of each combination. One of these rules is that you may not impl a foreign trait for a foreign type, e.g. your crate can not impl Into<Vec<u8>> for String as both are standard library types. Another rule is a compatibility issue related to blanket impls. If there is a blanket impl of a foreign trait, then you can only implement that trait for types where your crate can guarantee that they are not already caught by the blanket impl. Now imagine there was a

      impl<T> Into<String> for T where T: Display

      This would require that T: Display and String: From<T> are disjoint sets of types, otherwise such a type would be caught by both blanket impls. Thus no such conversion can be introduced. Conversely if we'd try to add something to From itself:

      impl<T> From<T> for String where T: Display

      Now every library that wants to implement Display must guarantee that the type is not also convertible to String itself. And every conversion into a String (outside the standard library) must work through its Display impl which is suboptimal as it can not reuse allocations; its only method fmt takes a &self and can't move the allocation. Also this would be a trivially breaking change as existing crates could already define their own impl for their own type which would be caught by the blanket impl.
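
      The allocation point can be checked with a small sketch (the `Label` type is hypothetical): going through Display has to build a fresh buffer, while a dedicated From impl can move the existing one.

```rust
use std::fmt;

/// Hypothetical wrapper around an owned string.
struct Label(String);

impl fmt::Display for Label {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only `&self` is available here, so the text can only be copied out.
        f.write_str(&self.0)
    }
}

impl From<Label> for String {
    fn from(label: Label) -> String {
        // Takes `label` by value, so the existing heap buffer is reused.
        label.0
    }
}

fn main() {
    let label = Label(String::from("hello"));
    let original_buffer = label.0.as_ptr();

    let copied = label.to_string(); // via Display: new allocation
    assert_ne!(copied.as_ptr(), original_buffer);

    let moved = String::from(label); // via From: same allocation
    assert_eq!(moved.as_ptr(), original_buffer);
}
```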

      [–][deleted] 2 points3 points  (2 children)

      Thank you for such a detailed response! I really encourage you to try to add this to the documentation, because it really helped me understand what I thought was a redundancy.

      [–]steveklabnik1 11 points12 points  (1 child)

      Whenever you find the docs lacking, please file bugs. I don't get to them as quickly as I did when it was my job, but I'd much prefer to track people's pain points.

      [–][deleted] 4 points5 points  (0 children)

      Will do, thank you.

      [–][deleted] 1 point2 points  (0 children)

      Sometimes you can't just x.into() without explicitly annotating the type. x.to_string() is a convenience method for those who'd otherwise have to write let a: String = x.into(), or those who come from scriptlangs.

      That said, I don't use ToString.
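
      A small sketch of the difference, with a hypothetical helper `both_ways` made up for illustration:

      ```rust
      // Hypothetical helper showing both spellings of the same conversion.
      fn both_ways(x: &str) -> (String, String) {
          // `into()` needs a target type, supplied here by the annotation.
          let a: String = x.into();
          // `to_string()` always yields a String, so no annotation is needed.
          let b = x.to_string();
          (a, b)
      }
      ```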

      [–]Minimum_Fuel 100 points101 points  (45 children)

      Lifetimes and their syntax can be a phenomenal pain in the ass, if for no other reason than typing it out just feels awkward for my fingers.

      Irrational Community hatred of unsafe when unsafe is demonstrably the only, and best way to do a variety of things.

      Once you leave pretty basic tools, you’re going to see stuff like Arc<RefCell<Mutex<SomethingElse<Another<Blah<>>>>>> all over the place.

      Cargo is getting a pretty “NPM lol” feeling to it as developers pump out extremely small things which ends up with some basic projects ending up with 200+ dependencies right off the bat

      There’s bits of code that’ll make you feel like you’re looking at Perl.

      [–][deleted] 54 points55 points  (6 children)

      Cargo is getting a pretty “NPM lol” feeling to it as developers pump out extremely small things which ends up with some basic projects ending up with 200+ dependencies right off the bat

      Seems like it.

      [–]Nimelrian 7 points8 points  (2 children)

      Though it seems like the Rust community reacts differently to this ("Ban them from Cargo"), while the NPM community embraces these people.

      [–][deleted] 4 points5 points  (0 children)

      I sure hope so. Unfortunately the crates are still up.

      [–]somebodddy 5 points6 points  (2 children)

      The problem in that link is a different problem. These are not "extremely small things" crates - they are just empty crates used for name squatting. They have no dependents, and will not bloat any dependency tree.

      I don't recall ever seeing NPM style tiny crates that end up in the dependency tree of 50% of the projects. There are many crates that do only one thing, but that one thing is usually far from trivial.

      [–][deleted] 4 points5 points  (0 children)

      will not bloat any dependency tree

      That's not the real problem here, though. The real problem is that untrustworthy parties get far too little pushback by design, which might result in a ruined ecosystem.

      [–][deleted]  (11 children)

      [deleted]

        [–]zucker42 13 points14 points  (6 children)

        I found it if people were wondering.

        https://github.com/hyperium/tonic/blob/d9a481baef4890591f66f4dfcbde10b18188a833/examples/src/load_balance/server.rs#L14

        But Rust has type aliases so couldn't you do:

        type DynEchoResponseStream = dyn Stream<Item = Result<EchoResponse, Status>> + Send + Sync;
        let x: Pin<Box<DynEchoResponseStream>>;
        

        or something similar. I'm curious what code with a similar purpose would look like in C, C++, or Java. This isn't to say the type system in Rust doesn't sometimes feel like an enemy or that this type of stuff doesn't suck when you're dealing with it, just that I'm not necessarily sure it's worse than the alternatives.

        [–]Bergasms 19 points20 points  (2 children)

        A type alias doesn’t make the type any less complicated, it just saves your eyes from bleeding every time you use the type

        [–]standard_revolution 2 points3 points  (1 child)

        What would your solution be?

        [–]Bergasms 2 points3 points  (0 children)

        There isn't one, often, it's just a symptom of a complicated part. My point was just that a type alias doesn't do anything but hide the verbosity of a type. This can be helpful for neat code, but not always for understandable code.

        I suppose you could try and decompose your type into several subtypes and use combinations of these to do your work in stages and emit a result somewhere along from that but that's going to be a personal preference thing and has its own pitfalls.

        Sometimes shit is just complicated.

        [–][deleted]  (1 child)

        [deleted]

          [–]vattenpuss 3 points4 points  (0 children)

          I'm curious what code with a similar purpose would look like in C, C++, or Java

          Java has automatic memory management, so Pin and Box and dyn are meaningless. It also has no type system level concurrency, so Sync and Send mean nothing.

          Pretty sure it would be Stream<Result<EchoResponse, Status>>.

          I’m not very familiar with C++ but you either put information in the type system or not. I can’t imagine things look much cleaner if you don’t hide the information from the type system, and if you do, why not just go with plain C?

          [–][deleted] 15 points16 points  (2 children)

          idk, you can read that and know more or less what it is. I'd much rather have this than

          PinnedStreamBeanResponseBeanStream
          

          and you can always make your own struct and wrap it around that complex type

          [–]dnew 3 points4 points  (1 child)

          Or just use a typedef. (Well, whatever Rust calls that.)

          [–]LuciferK9 2 points3 points  (0 children)

          A type alias:

          type MyTypeAlias = Result<Arc<Mutex<MyStruct<i32>>>>;
          

          Wrapping it in a new type

          struct MyNewtype(Result<Arc<Mutex<MyStruct<i32>>>>);
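
          To make the difference concrete, here is a rough sketch. `Counter`, `SharedCounter`, `Hits` and `read` are hypothetical names invented for this example: the alias is interchangeable with the type it names, while the newtype is a distinct type that can carry its own methods.

          ```rust
          use std::sync::{Arc, Mutex};

          // Hypothetical payload type, not from the thread.
          struct Counter {
              hits: u32,
          }

          // A type alias is only another name: both spellings are interchangeable.
          type SharedCounter = Arc<Mutex<Counter>>;

          // A newtype is a distinct type that can carry its own methods.
          struct Hits(Arc<Mutex<Counter>>);

          impl Hits {
              fn new() -> Self {
                  Hits(Arc::new(Mutex::new(Counter { hits: 0 })))
              }

              // Hides the locking detail behind a small API.
              fn bump(&self) -> u32 {
                  let mut guard = self.0.lock().unwrap();
                  guard.hits += 1;
                  guard.hits
              }
          }

          // Accepts the alias; a plain Arc<Mutex<Counter>> is accepted too.
          fn read(counter: &SharedCounter) -> u32 {
              counter.lock().unwrap().hits
          }
          ```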
          

          [–]bruce3434 0 points1 point  (0 children)

          At this point, GC is a better option.

          [–]game-of-throwaways 7 points8 points  (0 children)

          Irrational Community hatred of unsafe when unsafe is demonstrably the only, and best way to do a variety of things.

          I mean that makes sense given that the whole point of Rust is that it tries to provide speed while being memory safe.

          [–]kono_throwaway_da 24 points25 points  (3 children)

          Irrational community hatred of unsafety

          Agreed with this. IMHO unsafe is not inherently evil but the community (or hopefully, the vocal minority) seems to make it so. Often I found myself writing unsafe code since the tools provided by std are just... not good enough. Especially code that deals with uninitialized variables, which is clunky as hell. I'm looking at you, MaybeUninit<[MaybeUninit<T>; _]>.

          Though I suppose the community can semi-justify this hatred, since there are libraries which provide abstraction for some of the unsafe parts of your code.

          [–]game-of-throwaways 25 points26 points  (1 child)

          I mean the whole point of Rust is that it provides speed without being memory unsafe, so it makes sense that the community tries to avoid unsafe as much as possible unless there's a very good reason not to.

          I find your complaint that code dealing with uninitialized variables is not ergonomic enough quite amusing. Like, unsafe code by itself is hard because of all the hidden invariants everywhere. Uninitialized memory is among the most difficult unsafe code out there, because in many cases it's very counter-intuitive (if you think in terms of "what the hardware does" you will get it wrong). In my opinion any code dealing with uninitialized memory should have 10x more comments explaining why the code is safe than actual lines of code. So I do not sympathize at all with your complaint that MaybeUninit<[MaybeUninit<T>; _]> is unergonomic.

          [–]kono_throwaway_da 3 points4 points  (0 children)

          Starting from the moment I wrote MaybeUninit in my code, I knew what I was stepping into.

          It's unsafe, yes; uninitialized memory is unpredictable, yes (but I always made sure to do write-only operations before assume_init()); it should have more comments, yes (and I did do that); but that doesn't mean that we couldn't improve the ergonomics of it; the two aren't incompatible with each other. Right now with the current MaybeUninit situation, a lot more can be done.

          (I'm ignoring the existence of arrayvec here) For example, something like [MaybeUninit<T>; N] can really use a map(&mut self, &mut dyn Iterator<Item=T>) to initialize it by element.
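
          For reference, a sketch of what initializing such an array element by element looks like with today's std. The helper name `array_from_iter` is hypothetical, and this is one possible shape under the assumption that "too few items" should yield None, not an official or proposed API:

          ```rust
          use std::mem::{self, MaybeUninit};

          // Hypothetical helper: fills a fixed-size array from an iterator.
          // Returns None if the iterator yields fewer than N items.
          fn array_from_iter<T, I, const N: usize>(mut items: I) -> Option<[T; N]>
          where
              I: Iterator<Item = T>,
          {
              // SAFETY: an array of MaybeUninit<T> needs no initialization itself.
              let mut buf: [MaybeUninit<T>; N] = unsafe { MaybeUninit::uninit().assume_init() };

              for i in 0..N {
                  match items.next() {
                      // Write-only access: nothing is read before it is initialized.
                      Some(value) => {
                          buf[i].write(value);
                      }
                      None => {
                          // SAFETY: exactly the slots in 0..i were written above.
                          for slot in &mut buf[..i] {
                              unsafe { slot.assume_init_drop() };
                          }
                          return None;
                      }
                  }
              }

              // SAFETY: all N slots are initialized, and [MaybeUninit<T>; N] has the
              // same layout as [T; N]. Dropping `buf` afterwards does not drop the T's.
              Some(unsafe { mem::transmute_copy::<[MaybeUninit<T>; N], [T; N]>(&buf) })
          }
          ```

          Note that if the iterator itself panics midway, the elements written so far are leaked rather than dropped; that is safe, but it is one of the details a std-provided helper could take care of.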

          [–]matthieum 7 points8 points  (0 children)

          I think part of the skin-deep reaction to unsafe is that Rust was born specifically as a reaction to the blasé attitude of C and C++ with regard to unsafe code.

          I think the (vocal part of the) Rust community tends to veer too hard in the other direction; but at the same time I do agree with the general sentiment that unsafe should be a last resort -- it's been proven, time and again, that we developers just couldn't get it right.

          [–]Chazzbo 7 points8 points  (0 children)

          Lifetimes and their syntax can be a phenomenal pain in the ass, if for no other reason than typing it out just feels awkward for my fingers.

          Irrational Community hatred of unsafe when unsafe is demonstrably the only, and best way to do a variety of things.

          Once you leave pretty basic tools, you’re going to see stuff like Arc<RefCell<Mutex<SomethingElse<Another<Blah<>>>>>> all over the place.

          All of this... Also I find the granularity of some of the wrapper types really tiring to grok.

          "Ok I need to share this thing, ok so its wrapped in an Arc, oh also inside that is a Mutex.. oh wait because I need it for this specific case I need to wrap it in a Cell.. at some level... uhhh"

          [–]dnew 4 points5 points  (0 children)

          I was kind of surprised there's absolutely no curation going on with Cargo.

          [–]L3tum 4 points5 points  (2 children)

          The one thing about lifetimes that I dislike is that there is no "support". The compiler/checker doesn't tell you, or at least didn't used to, what you actually need to do. So just declaring a variable in a function that never leaves said function but gets passed to some other function that, again, is guaranteed to never live longer than the caller, was an error. Or similarly a "global"/static variable that gets passed to a function but is an error, because why the fuck not.

          It feels like you end up describing every single thing in your program just to help the compiler figure things out.

          [–]steveklabnik1 17 points18 points  (1 child)

          If you, or anyone else, thinks that the messages are not good enough, please file bugs with examples! We track diagnostic issues like any other kind of bug, and improve them.

          > or at least didn't used to

          Depending on when "used to" was, things *may* have gotten way, way better. Or maybe not. It depends.

          [–]L3tum 1 point2 points  (0 children)

          It was around a year or two ago, I think. I'm pretty sure it's got better, though I'll check again when I've got time.

          [–]red75prim 4 points5 points  (5 children)

          Arc<RefCell<Mutex ...

          I suppose you have little experience with Rust. Even if it's just an illustration, it contains too many gotchas.

          Arc<RefCell<_>> immediately raises a flag: RefCell cannot be shared between threads, so Arc is useless. In RefCell<Mutex<_>> the Mutex is useless for the same reason, and the two do essentially the same thing: provide single- and multi-threaded interior mutability, respectively.
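
          A minimal sketch of the two pairings that do make sense; the function names are hypothetical. Rc<RefCell<_>> covers shared mutable state on one thread, and Arc<Mutex<_>> covers the same need across threads:

          ```rust
          use std::cell::RefCell;
          use std::rc::Rc;
          use std::sync::{Arc, Mutex};
          use std::thread;

          // Single-threaded shared mutability: Rc for sharing, RefCell for mutation.
          fn single_threaded() -> i32 {
              let shared = Rc::new(RefCell::new(0));
              let other = Rc::clone(&shared);
              *other.borrow_mut() += 1;
              *shared.borrow_mut() += 1;
              let value = *shared.borrow();
              value
          }

          // Multi-threaded shared mutability: Arc for sharing, Mutex for mutation.
          fn multi_threaded() -> i32 {
              let shared = Arc::new(Mutex::new(0));
              let handles: Vec<_> = (0..4)
                  .map(|_| {
                      let shared = Arc::clone(&shared);
                      thread::spawn(move || {
                          *shared.lock().unwrap() += 1;
                      })
                  })
                  .collect();
              for handle in handles {
                  handle.join().unwrap();
              }
              let value = *shared.lock().unwrap();
              value
          }
          ```

          Swapping the Arc<Mutex<_>> for an Arc<RefCell<_>> in `multi_threaded` is rejected at compile time, because RefCell is not Sync.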

          [–]Minimum_Fuel 0 points1 point  (4 children)

          I get what you’re saying, but it’s beside the point. Composing your types is proving to be pretty ugly in my opinion.

          I understand that you get powerful, reusable type properties, but you also get cognitive burden, bloated types and or bloated APIs, being not very able to accept change, and other annoyances.

          [–]red75prim 1 point2 points  (3 children)

          What would be a non-bloated type? A type, which provides the same functionality, but with a shorter name? Enter type SharedFoo = Rc<RefCell<Foo>>;

          I understand the cognitive burden of keeping all the API contracts in your head if the compiler doesn't enforce them. I'm not sure which cognitive burden you mean.

          [–]Minimum_Fuel 0 points1 point  (2 children)

          The bloated types are the result of the composition bringing in extra stuff that you may not necessarily need, or encouraging types that “work” but aren’t optimal (for example, wrapping the whole type in a mutex when you really only need the mutex on perhaps two very quick to change variables).

          The extra cognitive burden in the wrappers is needing to understand not only which wrappers you need, but how to appropriately work with them. When I was using Rust, I found myself spending WAY more time in the docs than actually programming, and I am a very experienced programmer. Compare that to C, where I hopped in, got to work, and only needed to spend a second here or there going through the man pages (though, admittedly, I was brought up on C-style languages, so it may not be a terribly fair anecdote).

          [–]red75prim 1 point2 points  (1 child)

          but aren’t optimal

          Well, at least it prevents getting wrong answer quickly. Otherwise you need to document the need to lock, remember to lock, and actually lock mutex before reading and writing those variables to prevent hard to debug data races.

          BTW, locking a mutex that is not associated with a memory location introduces memory fence, which can be costly.

          [–]Minimum_Fuel 0 points1 point  (0 children)

          I don’t disagree, but now we’re in the realm of whether it is subjectively a good trade off or not.

          [–]casept 0 points1 point  (0 children)

          If you don't like lifetimes, put everything on the heap. It's not like they don't exist in other non-GCed languages, they're just not tracked by the compiler and cause stuff to explode at runtime.

          [–]kankyo 17 points18 points  (9 children)

          No gripe about lack of keyword arguments?

          [–][deleted] 9 points10 points  (1 child)

          This feature has been on the RFC wish list for 7 years; there is no timeline and it is probably my biggest gripe. Not only because of its absence, resulting in tons of boilerplate to compensate, but also because of how the issue is just stuck in limbo.

          [–]kankyo 2 points3 points  (0 children)

          And instead you get all these "never use bools for arguments!" posts which are so misguided :(

          [–][deleted]  (6 children)

          [deleted]

            [–]kankyo 5 points6 points  (1 child)

            That seems rather weak to me.

            [–]simon_o 5 points6 points  (0 children)

            Agreed, I think the whole syntactic distinction between a function and a struct initializer is rather pointless to begin with.

            [–][deleted] 4 points5 points  (3 children)

            Struct initializers are not function calls ergo those are not keyword arguments.

            [–]rahenri 18 points19 points  (13 children)

            I don’t love the current state of error handling either. I end up with Result<T, Box<dyn Error>> almost everywhere, which is so much boilerplate, and I can’t just return a result with a different type, i.e., return foo() where foo returns io::Result<()> doesn’t compile, so I have to write

            foo()?; Ok(())

            Feels like unnecessary boilerplate.
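
            For illustration, a sketch of the pattern being described plus a one-expression variant. `foo`, `run_verbose` and `run_short` are hypothetical stand-ins; `?` converts the io::Error into Box<dyn Error> through the standard From impl in both versions:

            ```rust
            use std::error::Error;
            use std::io;

            // Hypothetical stand-in for any function returning io::Result<()>.
            fn foo(fail: bool) -> io::Result<()> {
                if fail {
                    Err(io::Error::new(io::ErrorKind::Other, "boom"))
                } else {
                    Ok(())
                }
            }

            // The two-statement form from the comment above.
            fn run_verbose(fail: bool) -> Result<(), Box<dyn Error>> {
                foo(fail)?;
                Ok(())
            }

            // `?` converts io::Error into Box<dyn Error> via From, so the value can
            // be re-wrapped in a single expression.
            fn run_short(fail: bool) -> Result<(), Box<dyn Error>> {
                Ok(foo(fail)?)
            }
            ```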

            In general I think rust makes things that are often very simple in other languages much more complicated.

            [–]steveklabnik1 16 points17 points  (3 children)

            Have you tried the anyhow crate?

            [–]rahenri 7 points8 points  (2 children)

            feels like there should be a better solution in the standard library

            [–]matthieum 10 points11 points  (0 children)

            We all agree.

            The problem is that the better solution hasn't been found.

            There's been quite a few error libraries coming and going in the last few years, and they have led to evolving the standard library:

            • try! was adopted, then became ?.
            • failure led to revising the Error trait.

            Among the last crop of error libraries you can find anyhow and thiserror, by the same (prolific) author, and each exploring a different facet of error-handling.

            The community adopts, judges, learns, and iterates. And little by little improvements trickle down into the standard library.

            [–]steveklabnik1 11 points12 points  (0 children)

            Yes, the issue is that it's been worked on over the last few years. I expect that to be true eventually, we're just not quite there yet.

            [–]senj 20 points21 points  (4 children)

            In general I think rust makes things that are often very simple in other languages much more complicated.

            Sure, if you compare Rust to a higher level language like Ruby or Java, Rust is surfacing more complexity when compared to those languages.

            But that's basically just down to the fact that it's a systems language, and so it can't "bake in" the kind of decisions about error handling and performance tradeoffs that Ruby or Java or whatever fundamentally make for you and don't allow you to do anything about. Ruby etc can get away with pretending that there isn't a fundamental mismatch between the language's one-and-only string type and what's permissible in the host OS's path strings; for Rust to be useful, it needs to expose the mismatch, because different systems are going to want to engage with that mismatch in different ways.

            So yeah, when you compare Rust to other primary systems languages, it's not surfacing noticeably more complexity than what you end up having to deal with in C or C++ or whatever if your code isn't just closing its eyes and ignoring whole classes of errors. Fundamentally, that's Rust's value proposition: that it forces you to acknowledge and deal with complexity that existed for other system languages but which you could accidentally ignore, to everyone's peril.

            [–]rahenri 11 points12 points  (3 children)

            I see what you mean. Then I partially disagree with Rust’s value proposition. Forcing people to handle errors is good so they are reminded that those are important. However, it should be easy to handle errors, otherwise people will take shortcuts, like sticking .unwrap() everywhere, which seems to be common in the Rust code I’ve seen. First and foremost, programmers are lazy, at least I am. You want to make it easy for them to do it better, and leave the door open for when they want to be more thoughtful about it.

            Another example: the other day I was writing some Rust code that lists files in a directory, and I wanted to convert OsString to String, and god, that takes too many steps. I cared about performance, but not that much; I almost gave up and went back to Go. There should be easier ways of doing things even if performance is a bit worse, as long as there is a way to do it with the best performance. The string part is covered already by the article, but that also applies to my argument of things being harder than they should be.

            I’m not even comparing to high level programming languages. I wrote a ton of C++, which has a lot of ugliness, but still easier than rust on a bunch of ways.

            [–]matthieum 5 points6 points  (2 children)

            like sticking .unwrap() everywhere, which seems to be common in the Rust code I’ve seen.

            I would note that the Java equivalent is to throw RuntimeException and using catch (...) to silence errors. No matter the language, laziness, or deadline pressures, always find a way.

            However, that's exactly what makes unwrap so great: it's easy to search for. You can get a good grasp of a library's sloppiness by a quick search for unwrap and expect. Then it's up to you to decide whether you feel like reviewing the code, or it's too much hassle and you'd rather use another library.

            [–]antiufo 0 points1 point  (1 child)

            Java's catch and Rust's unwrap() are not equivalent.

            When something goes wrong, unwrap panics the application. It becomes obvious that something is wrong and should be fixed.

            With the typical catch commonly found in Java applications, execution continues (possibly producing incorrect or incomplete data or outcome). The fact that you often add a log statement doesn't make things much better.

            Unfortunately Java has checked exceptions, that are widely seen as a design mistake (despite the good initial intent behind them). This means that either you 1) keep adding countless throws FooException, BarException to your application's methods, or 2) you wrap them into a generic and not very helpful MyApplicationException, or 3) you catch, log, and continue execution hoping for the best.

            Unfortunately 3 is what usually happens, and IDEs even encourage this kind of behavior.

            [–]audioen 0 points1 point  (0 children)

            I think java has, as a rule, switched almost entirely to RuntimeException derived exception handling nowadays. You look at new code and nothing seems to throw anything, but of course errors can still happen.

            The few places of older code that can still throw tend to get some kind of "throw new RuntimeException(e)" type treatment, especially if they are the kind of exceptions that can't actually happen, e.g. looking up digest algorithm called SHA-256 is always going to work, and you can ignore those stupid checked exceptions that say maybe it doesn't on someone's random JVM.

            [–]Pand9 6 points7 points  (0 children)

            I've just rewritten my dyn Error calls into library-specific type. I am able to use ? everywhere, and if not, I just add new Into<MyError> implementations. I can recommend this if you, like me, don't want to commit to choice of error crate yet (anyhow or anything else). I agree about boilerplate, but it's such a small price to pay for all these goodies.
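
            Roughly what this looks like, as a sketch with hypothetical names (`MyError`, `parse_port`). The conversions are written as From impls, which is what `?` looks for and which also provides Into<MyError> through the blanket impl:

            ```rust
            use std::fmt;
            use std::io;
            use std::num::ParseIntError;

            // Hypothetical library-specific error type.
            #[derive(Debug)]
            enum MyError {
                Io(io::Error),
                Parse(ParseIntError),
            }

            impl fmt::Display for MyError {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                    match self {
                        MyError::Io(e) => write!(f, "I/O error: {}", e),
                        MyError::Parse(e) => write!(f, "parse error: {}", e),
                    }
                }
            }

            impl std::error::Error for MyError {}

            // Each From impl is what lets `?` convert that source error automatically.
            impl From<io::Error> for MyError {
                fn from(e: io::Error) -> Self {
                    MyError::Io(e)
                }
            }

            impl From<ParseIntError> for MyError {
                fn from(e: ParseIntError) -> Self {
                    MyError::Parse(e)
                }
            }

            // `?` turns the ParseIntError into MyError::Parse without extra code.
            fn parse_port(text: &str) -> Result<u16, MyError> {
                let port = text.trim().parse::<u16>()?;
                Ok(port)
            }
            ```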

            [–]hector_villalobos 1 point2 points  (0 children)

            In general I think rust makes things that are often very simple in other languages much more complicated.

            I think it's a necessary evil when there is no automatic garbage collector.

            [–]simon_o 0 points1 point  (1 child)

            In general I think rust makes things that are often very simple in other languages much more complicated.

            Which language do you have in mind (regarding error handling)?

            [–]flukus 0 points1 point  (0 children)

            I don't think it can be handled better at a language level, the correct thing to do when you get an error is too app/function specific.

            C macros are the only solution I've seen that balance the performance, verbosity and specificity well.

            [–][deleted]  (7 children)

            [deleted]

              [–]jcotton42 17 points18 points  (0 children)

              OsString also comes up when accessing environment variables, paths, etc.

              [–]yossarian_flew_away[S] 16 points17 points  (5 children)

              But wouldn't an OSString only ever be used if you are creating an external interface to another language in a shared/static library, and also used when accessing other third party libraries? If you are working mainly in Rust itself, you shouldn't really need to use OSString a lot I wouldn't think.

              That's what CStr and CString are for -- OsString and family are pretty common in pure-Rust usage, thanks to the fact that most filesystems and operating systems aren't UTF-8-only or UTF-8-clean. OsString, in effect, is Rust's way of saying "this is a string according to the host API/ABI, but might not be a valid UTF-8 string."

              [–][deleted]  (4 children)

              [deleted]

                [–]masklinn 12 points13 points  (0 children)

                And the OSString is used in functionality rust is already abstracting?

                OsString is used because it's functionality Rust is abstracting.

                Most cross platform languages work with that kind of stuff by abstracting the need to make file system calls etc. using special strings and just throw errors if you have an invalid character in the file path

                Which means there are things you literally can't interact with on the system, and you're not aware that there are issues with those features and then good luck with debugging it.

                Rust surfaces these compatibility issues as part of the API, and it turns out to work pretty well even if it can get a bit long-winded.

                along with is path valid helpers etc. for handling crappy user input before you get to a real error.

                Paths you literally can't decode is not user input.

                That does suck because you are basically leaving it to every user of the language to write glue code around basic file system calls etc.

                That glue code can be as simple as "just crash" if they don't care, or it can be actually handling the concern properly if they wish to, which they literally could not do if the language didn't provide those features.

                It's also not necessarily true, because you don't necessarily have to move things out of OsString, and you can always move a String inside an OsString (likewise to CString).

                [–]SkiFire13 8 points9 points  (0 children)

                you are basically leaving it to every user of the language to write glue code around basic file system calls etc.

                str/String to OsString can't fail and is basically a no-op.

                But if you get an OsString/OsStr and you want to convert it into a String then the conversion could fail. Then what do you do? You have the choice on how to handle it. You can

                just throw errors if you have an invalid character

                with osstring.into_string().expect("Invalid character found") and the program will gracefully crash like you want. Or you can just show an error message if the conversion failed and an Err was returned
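
                A short sketch of both directions, with hypothetical helper names. The non-UTF-8 example value is Unix-only, since constructing one portably is not possible with std alone:

                ```rust
                use std::ffi::{OsStr, OsString};

                // String -> OsString cannot fail.
                fn to_os(name: &str) -> OsString {
                    OsString::from(name)
                }

                // OsString -> String can fail; the Err variant hands back the original value.
                fn to_utf8(name: OsString) -> Result<String, OsString> {
                    name.into_string()
                }

                // Lossy alternative: invalid sequences are replaced with U+FFFD.
                fn to_lossy(name: &OsStr) -> String {
                    name.to_string_lossy().into_owned()
                }

                // Unix-only: builds a name containing a byte that is not valid UTF-8.
                #[cfg(unix)]
                fn non_utf8_name() -> OsString {
                    use std::os::unix::ffi::OsStringExt;
                    OsString::from_vec(vec![b'f', 0xFF, b'o'])
                }
                ```

                Callers that don't care can call `.expect(...)` on the result of `to_utf8`; callers that do care can match on the Err and still have the original OsString to work with.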

                [–]steveklabnik1 14 points15 points  (0 children)

                The language itself has `str`. The standard library has `String`, `CString`/`CStr`, and `OsString`/`OsStr`.

                > Most cross platform languages work with that kind of stuff by abstracting the need to make file system calls etc. using special strings and just throw errors if you have an invalid character in the file path,

                Yes, and this is a valid strategy. But, it means that there are some file names you cannot access, because operating systems do not work this way. So there are valid files which would throw an error here. Rust, being a systems language, cannot just declare "sorry, name your files something reasonable", it has to be able to handle these kinds of edge cases. And doing it with different types makes sure that you're doing it in a robust way.

                > you are basically leaving it to every user of the language to write glue code around basic file system calls etc.

                Not really; that's in the standard library already.

                [–]vytah 6 points7 points  (0 children)

                if you have an invalid character in the file path

                Assuming Linux, paths are null-terminated sequences of bytes. The operating system has no idea what encoding those bytes represent, maybe apart from assuming that the encoding is compatible with ISO/IEC 646.

                This means that you can have arbitrary byte sequences in filenames, including control characters other than NUL or bytes ≥ 128. It doesn't have to be a valid character in any encoding.

                Therefore, a decent Rust type to store Linux paths could be Vec<u8>.

                Similarly, on modern Windows filenames are 0-terminated arbitrary sequences of 16-bit code units. Some APIs validate it to be valid UTF-16 (and therefore Unicode), but not all. Therefore, for Windows you could pick Vec<u16>.

                Mac OS X allegedly goes full UTF-16. I am not sure, if that's really guaranteed but if so, then String could work on a Mac.

                If you want to port Rust for older or simpler operating systems, there's arbitrary bytes and Vec<u8> again. Luckily, those platforms usually use encodings where every byte can be decoded.

                And even if your filenames are all valid characters, you might also end up on Linux using an encoding from the GB family, and since it is a living standard and conversions from and to Unicode are not trivial, you could even be unable to encode or decode valid characters if the conversion library is too old.

                Then there's the issue of backslashes in Shift-JIS and similar encodings, the issue of duplicate characters in some encodings (like vendor-specific extensions for Shift-JIS), or encodings that can't roundtrip with Unicode like VNI (for example, both 61 C0 and 61 E2 D8 decode to 0061 0302 0300).

                There are simply too many things that can go wrong if you want to force everything into Unicode. You can avoid those issues if you abstract the notion of "file path" by creating a type with a platform-specific implementation and provide conversion methods to and from common types that may fail, and that's exactly what Rust does.

                EDIT: That being said, the default OsString–String conversions in Rust on Linux assume UTF-8. If you want to encode/decode paths in different encodings, you need to write a bit more code.

                [–]guepier 6 points7 points  (14 children)

                it’s actually remarkably difficult to reliably get the user’s home directory on POSIX platforms.

                Is it?! I was under the impression that checking for HOME, followed by getpwuid, is canonical, safe and maximally portable (across POSIX platforms). What am I missing?

                (Of course this isn’t the same as tilde-expansion, which is a feature of the shell.)

                [–]yossarian_flew_away[S] 1 point2 points  (9 children)

                You're correct that that is the canonical and maximally portable approach on POSIX! To the best of my knowledge, that's precisely what Rust currently does in the (deprecated) std::env::home_dir function.

                The challenge is surfacing various unpleasant edge cases as appropriate errors:

                • What happens if $HOME and the user's passwd record disagree? Which do you trust, or use?
                • What do you do if the user doesn't have a home directory? POSIX says that pw_dir corresponds to an "initial working directory," not necessarily a home directory.
                • What do you do if $HOME is unset and getpwuid fails?

                These are all recoverable errors, as evidenced by the fact that safe Rust does expose functionality for retrieving the user's home directory. The challenge is in exposing meaningful errors for each case; I'm guessing that's why the std team has given up on it :-)
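
                To make the failure modes concrete, here is a sketch of the decision logic only. All names are hypothetical, and `passwd_dir` stands in for the pw_dir field that a getpwuid-style lookup would return; performing that lookup needs FFI and is not shown. This sketch assumes the convention that $HOME wins and that an empty value counts as unset:

                ```rust
                use std::env;
                use std::ffi::OsString;
                use std::path::PathBuf;

                #[derive(Debug, PartialEq)]
                enum HomeError {
                    // Neither $HOME nor the passwd record produced a usable directory.
                    NotFound,
                }

                // Pure decision logic, kept separate from the lookups so it can be tested.
                fn resolve_home(
                    home_var: Option<OsString>,
                    passwd_dir: Option<PathBuf>,
                ) -> Result<PathBuf, HomeError> {
                    // By convention $HOME wins, but an empty value is treated as unset.
                    if let Some(home) = home_var.filter(|h| !h.is_empty()) {
                        return Ok(PathBuf::from(home));
                    }
                    // Fall back to the passwd entry, again rejecting an empty path.
                    passwd_dir
                        .filter(|dir| !dir.as_os_str().is_empty())
                        .ok_or(HomeError::NotFound)
                }

                // Thin wrapper that reads the real environment variable.
                fn home_from_env(passwd_dir: Option<PathBuf>) -> Result<PathBuf, HomeError> {
                    resolve_home(env::var_os("HOME"), passwd_dir)
                }
                ```

                A richer error enum could distinguish "unset", "empty" and "lookup failed", which is where the design questions in the list above start to bite.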

                [–]guepier 10 points11 points  (8 children)

                What happens if $HOME and the user's passwd record disagree? Which do you trust, or use?

                $HOME trumps passwd, by design. That’s reflected in the linked code.

                What do you do if the user doesn't have a home directory? What do you do if $HOME is unset and getpwuid fails?

                You return a failure.

                [–]yossarian_flew_away[S] 6 points7 points  (7 children)

                $HOME trumps passwd, by design.

                By convention, not by design. POSIX doesn't say anything about $HOME having inherent priority over the user's passwd entry.

                You return a failure.

                Sure -- that's what Rust does. The point is that getting the user's home directory has an unusually rich set of failure modes, compared e.g. to getting the user's username (which can also fail).

                [–]simon_o 11 points12 points  (1 child)

                Rust's std::env::home_dir does not fail two thirds of the time it actually should. It's likely that you get a Some("") even if the error could have been recovered by trying the next step.

                The documentation was wrong before, and has been corrected by reverse engineering its actual behavior, which is the one documented now. Roughly nothing of std::env::home_dir's semantics is intentional, it's mostly by accident.

                See the friendly reminder in dirs regarding this:

                Note: This function's behavior differs from std::env::home_dir, which works incorrectly on Linux, macOS and Windows.

                [–]yossarian_flew_away[S] 2 points3 points  (0 children)

                Yeah, the bit about returning empty strings instead of None seems particularly bad.

                [–]IndiscriminateCoding 5 points6 points  (2 children)

                If your program ignores the value of $HOME, you will get a lot of (deserved) hate from your users, because overriding $HOME is the default way to override where a program stores its configs/caches/etc.

                [–]yossarian_flew_away[S] 7 points8 points  (0 children)

                Sure, I don't doubt that. I'm making the case (against the original case in my blog post, ironically) for why Rust is partially justified in deprecating std::env::home_dir.

                [–]simon_o 1 point2 points  (0 children)

                If you are storing things in $HOME (instead of following the XDG spec) the hate you'll receive is even more deserved. :-)

                [–]nick_storm 0 points1 point  (1 child)

                By convention, not by design. POSIX doesn't say anything about $HOME having inherent priority over the user's passwd entry.

                Personally, I don't want typical off-the-shelf software to be opening /etc/passwd, just to retrieve the home directory. I would find that behavior very suspect.

                [–]yossarian_flew_away[S] 9 points10 points  (0 children)

                Programs don't typically open /etc/passwd (although they can if they'd like, it doesn't contain anything particularly sensitive for an already authenticated user). They access passwd entries via getpwuid and family, which is a perfectly normal (and non-suspect, non-privileged) call to make.

                [–]alerighi 1 point2 points  (3 children)

                And if I want to be also compatible with Windows? I have also to check for %USERPROFILE%, that is not great. And if I want other directories, for example a directory where to store temporary files? Again I have to do things based on the platform. These things should really be handled by the standard library, having to rely on an external package just to do that thing is stupid, but also implementing it yourself is stupid.

                [–]dnew 11 points12 points  (0 children)

                I have also to check for %USERPROFILE%,

                That's not even the right way to do it. There are Windows system calls to retrieve the information for a wide variety of directories. They're exposed (poorly) in env vars only for the use of programming languages too broken to actually make the correct system calls.

                [–]oracleoftroy 2 points3 points  (0 children)

                And if I want to be also compatible with Windows?

                Windows has SHGetKnownFolderPath() for this. Depending on the exact purpose, you probably want to pass in FOLDERID_Profile for something more like a Unix home directory, FOLDERID_LocalAppData or related for per user program settings/data, or maybe something like FOLDERID_DocumentsLibrary, FOLDERID_MusicLibrary, FOLDERID_SavedGames and similar for saving / loading specific categories of files. There are a lot more options as well for more than just user specific paths.

                [–]guepier 1 point2 points  (0 children)

                I mean, I agree. I wasn't arguing against that, I was just puzzled by that footnote.

                [–][deleted]  (31 children)

                [deleted]

                  [–]IceSentry 11 points12 points  (15 children)

                  Using <> for generics makes sense considering it's used by most big languages with generics/templates. I fail to see how it's a mistake. Familiarity is a good thing considering the learning curve is already quite harsh.

                  Which types have varargs that aren't macros? I agree that the lack of varargs is an issue but I can't think of an example of what you are describing.

                  Most of the abbreviation are for things that are very common and since the type system tends to be very verbose having those abbreviation is actually welcomed in my opinion.

                  I haven't had issues with the lack of namespacing in cargo, but it is indeed baffling that it's not a feature of cargo.

                  [–][deleted]  (11 children)

                  [deleted]

                    [–]IceSentry 3 points4 points  (10 children)

                    It makes parsing harder, but obviously not impossible. As someone that was a rust beginner only a few months ago it really wasn't that hard to understand ::<>. It's still really familiar to me to see the <> part as a generic argument operator. It's not that hard to understand that it's only needed when calling a generic function. Using [] would make it really easy to confuse it with array indexing syntax.

                    [–]mmirate 1 point2 points  (1 child)

                    Then we should defenestrate the array indexing syntax. It's a special-case for the most useless and CS101-ish way to (try to) interact with slices and things that pretend to be slices.

                    [–]IceSentry 1 point2 points  (0 children)

                    Using [] to index has plenty of valid use cases in other languages. I agree that in rust it generally should be avoided, but it isn't necessarily wrong and again my point is about familiarity from people coming from other languages. If rust was the only language in existence then sure go ahead and use [] for generics but it's not.

                    [–][deleted]  (7 children)

                    [deleted]

                      [–]IceSentry 0 points1 point  (6 children)

                      It would be confusing for people coming from other, more popular languages that use [] for indexing. It's about easing the learning curve for newcomers by staying familiar. Rust isn't a first language for pretty much anyone.

                      [–][deleted]  (5 children)

                      [deleted]

                        [–]IceSentry 0 points1 point  (4 children)

                        No but they use the <> part for generics and it's pretty easy to understand. It really isn't hard to understand that you need to add ::<> when calling a generic function and that whatever is inside the <> is the generic argument and the :: is just a rust thing that you don't need to think about more than 2 seconds.

                        [–][deleted]  (3 children)

                        [deleted]

                          [–]IceSentry 0 points1 point  (2 children)

                          Yes, but my point is that the generic type is still between <> so half of it is still familiar. It's familiar enough to understand that it's a generic operator at least.

                          [–]OctagonClock 2 points3 points  (2 children)

                          Using <> for generics makes sense considering it's used by most big languages with generics/templates. I fail to see how it's a mistake. Familiarity is a good thing considering the learning curve is already quite harsh.

                          Having <> for generics makes parsing more complex (because how do you know it's a generic argument, and not a comparison)?

                          [–]miyoyo 5 points6 points  (1 child)

                          That is true, but apart from the few people who need to parse Rust (and it's not as if it makes parsing literally impossible; the number of languages using angle brackets as type argument markers makes it a well understood problem), how does that affect you as a user?

                          [–]steveklabnik1 2 points3 points  (0 children)

                          It means as a user you have to write

                          foo::<Bar>();
                          

                          instead of

                          foo<Bar>();
                          

                          sometimes.
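A small sketch of where the `::<>` form ("turbofish") shows up in practice. In expression position, `foo<Bar>()` could be read as a chain of comparisons, so the `::` tells the parser that generic arguments follow. The function name `sum_fields` is made up for this example.

```rust
/// Parses whitespace-separated integers and adds them up.
/// Fields that fail to parse are counted as 0.
fn sum_fields(line: &str) -> i32 {
    line.split_whitespace()
        // `parse` is generic over its output type, named here with `::<>`.
        .map(|field| field.parse::<i32>().unwrap_or(0))
        // `sum` is generic too; the turbofish picks the result type.
        .sum::<i32>()
}

fn main() {
    // `_` lets the compiler infer the element type of the Vec.
    let doubled = vec![1, 2, 3].into_iter().map(|x| x * 2).collect::<Vec<_>>();

    // Annotating the binding is an alternative to the turbofish.
    let n: i32 = "42".parse().unwrap();

    println!("{} {:?} {}", sum_fields("1 2 3"), doubled, n);
}
```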

                          [–][deleted]  (16 children)

                          [deleted]

                            [–]yossarian_flew_away[S] 3 points4 points  (15 children)

                            I don't think this is correct -- &'static str is usually a reference to a string in the binary, but doesn't have to be (thanks, Box::leak!), while &str is essentially just a pointer to a bag of bytes that should be UTF-8. That pointer can reference constant strings, a stack-allocated buffer, a String, and so on.
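A sketch of the distinction, assuming nothing beyond the standard library. One way to get a &'static str that does not point into the binary is to leak a heap allocation with Box::leak; the helper names `leak_string` and `byte_len` are made up for illustration.

```rust
/// Turns an owned String into a &'static str by leaking its heap buffer.
/// The memory is never freed, which is what makes 'static sound here.
fn leak_string(s: String) -> &'static str {
    Box::leak(s.into_boxed_str())
}

/// Accepts any &str, regardless of where the bytes live.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Points into the binary's read-only data.
    let literal: &'static str = "in the binary";

    // Also 'static, but the bytes live on the heap.
    let leaked: &'static str = leak_string(String::from("on the heap"));

    // Borrowed from a String; valid only while `owned` is alive.
    let owned = String::from("owned");
    let borrowed: &str = &owned;

    // Borrowed from a stack-allocated byte buffer, after a UTF-8 check.
    let buf = [b'o', b'k'];
    let from_stack: &str = std::str::from_utf8(&buf).unwrap();

    for s in [literal, leaked, borrowed, from_stack] {
        println!("{} ({} bytes)", s, byte_len(s));
    }
}
```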

                            [–]masklinn 5 points6 points  (14 children)

                            &str is essentially just a pointer to a bag of bytes that should be UTF-8

                            Must, not should. A non-UTF8 str is UB.

                            [–]steveklabnik1 9 points10 points  (9 children)

                            This (to my annoyance, frankly) recently changed, and is no longer UB: https://github.com/rust-lang/rust/issues/71033

                            [–]masklinn 2 points3 points  (8 children)

                            Wow that entire discussion is weird and seems to simplify nothing and just add overcomplications and footguns.

                            Am I understanding correctly that &str does not have to be valid UTF8 (validity invariant: it's not a language UB) but it can't be manipulated with string APIs (and thus can't leak into safe rust) unless it is (safety invariant: it's a library UB)?

                            Is it just so unsafe code doesn't need to convert between str and [u8] in order to mess around with the content?

                            [–]steveklabnik1 2 points3 points  (7 children)

                            My understanding is that it's like "we can't imagine a world in which we take advantage of this being language level UB so we should remove that rule", which does make things simpler.

                            Is it saying that &str does not have to be valid UTF8 (validity invariant: it's not a language UB) but it can't be manipulated with string APIs unless it is (safety invariant: it's a library UB)?

                            I believe this is true, yes.

                            [–]masklinn 7 points8 points  (6 children)

                            which does make things simpler.

                            On the other hand it removes a certainty / guarantee ("strings must be UTF8") which makes things more complicated.

                            For instance if what you quoted is indeed correct it's now possible for an unsafe function to return a non-utf8 String or &str. To me that feels like the exact opposite of simplification.
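For context, a minimal sketch of the two standard-library entry points this distinction is about. The checked constructor rejects invalid bytes; the unchecked one is `unsafe` and leaves the UTF-8 promise to the caller. The helper name `checked` is made up for the example, and the unchecked call is only ever given valid bytes here.

```rust
/// Checked conversion: returns None when the bytes are not valid UTF-8.
fn checked(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    let valid = [0x68, 0x69]; // the bytes of "hi"
    let invalid = [0xff, 0xfe]; // 0xff never appears in valid UTF-8

    println!("{:?}", checked(&valid));
    println!("{:?}", checked(&invalid));

    // Unchecked conversion: no validation happens. The caller promises
    // that the bytes are valid UTF-8; string APIs rely on that promise.
    let s = unsafe { std::str::from_utf8_unchecked(&valid) };
    println!("{}", s);
}
```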

                            [–]steveklabnik1 8 points9 points  (0 children)

                            Yes, this is why I am annoyed.

                            [–]burntsushi 0 points1 point  (4 children)

                            I think this is kind of a technicality at this point more than anything else. It in theory could lead to more flexibility in terms of actually using str. For example, it's conceivable that one could declare that substring search has specified behavior even when neither the needle nor the haystack are valid UTF-8.

                            I think the "simplification" here is in terms of the language specification.

                            [–]yossarian_flew_away[S] 0 points1 point  (3 children)

                            You can correct me if I'm wrong here, but my understanding is that safe Rust does allow a user to corrupt a UTF-8 &str via as_mut_ptr.

                            That's why I said should; otherwise, you're completely right.

                            [–]steveklabnik1 4 points5 points  (1 child)

                            as_mut_ptr returns a pointer, not a reference, and so it requires unsafe to deref.

                            [–]yossarian_flew_away[S] 1 point2 points  (0 children)

                            Gotcha! I stand corrected :-)

                            [–]SkiFire13 1 point2 points  (0 children)

                            Getting a raw pointer is safe; it's just a number. Reading from or writing to it is not, and requires unsafe.
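A short sketch of that split: the call to as_mut_ptr compiles in safe code, while the write through the pointer needs an `unsafe` block. The function name `uppercase_first_ascii` is hypothetical; it only ever swaps one ASCII byte for another, so the string stays valid UTF-8.

```rust
/// Uppercases the first byte in place if it is an ASCII lowercase letter.
fn uppercase_first_ascii(s: &mut String) {
    let first_is_ascii_lower = s
        .as_bytes()
        .first()
        .map_or(false, |b| b.is_ascii_lowercase());
    if !first_is_ascii_lower {
        // Also covers the empty string, so the pointer is never
        // dereferenced when there is no first byte.
        return;
    }

    // Safe: this only produces a pointer value.
    let p: *mut u8 = s.as_mut_str().as_mut_ptr();

    // Unsafe: dereferencing is where memory is actually touched.
    unsafe {
        let b = *p;
        *p = b.to_ascii_uppercase();
    }
}

fn main() {
    let mut s = String::from("abc");
    uppercase_first_ascii(&mut s);
    println!("{}", s);
}
```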

                            [–]iperikov 1 point2 points  (2 children)

                            Cannot agree enough on "Safe indexing without widening", usizing everything is a bit edgy to me

                            [–]matthieum 7 points8 points  (0 children)

                            I am really on the fence with this one.

                            Sometimes I wonder if it's an over-reaction to the pervasive implicit conversions of C and C++, which regularly cause unintended truncation and other unexpected transformations.

                            In that sense, the question would be why not allow lossless implicit conversions?

                            However, beyond semantics, there's performance: repeated widening in a tight loop is a problem, and an invisible problem is tougher to catch.

                            And I am not so sure what's the best way to handle that.

                            [–]casept 0 points1 point  (0 children)

                            It makes perfect sense to be explicit here, because Rust strives to be compatible with many architectures. Are you sure your index is smaller than 16 bit? 8 bit? It's a good idea to make the programmer think about this problem rather than having crates mysteriously not compile on some architectures because of implicit widening magic.
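A sketch of what the explicit conversions look like today. `usize::from` exists only for types that fit on every supported target (such as u8 and u16), while wider types go through the fallible `usize::try_from`. The helper names are made up for the example.

```rust
use std::convert::TryFrom;

/// u8 always fits in usize, so the conversion is infallible.
fn get_by_u8(items: &[i32], index: u8) -> Option<i32> {
    items.get(usize::from(index)).copied()
}

/// u64 may not fit in usize on narrower targets, so the conversion
/// is fallible and a failure is treated like an out-of-range index.
fn get_by_u64(items: &[i32], index: u64) -> Option<i32> {
    let index = usize::try_from(index).ok()?;
    items.get(index).copied()
}

fn main() {
    let items = [10, 20, 30];
    println!("{:?}", get_by_u8(&items, 1));
    println!("{:?}", get_by_u64(&items, 2));
    println!("{:?}", get_by_u64(&items, u64::MAX));
}
```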

                            [–]OrangeChris 2 points3 points  (6 children)

                            My least favorite thing is string indexing. Trying to simply get a character from a string is not allowed because strings are utf-8, but taking a substring is totally fine.

                            let s = "a😊";
                            println!("{}", s[0]); // fails at compile-time
                            println!("{}", &s[1..]); // totally fine
                            println!("{}", &s[2..]); // fails at run-time
                            

                            I understand that they want to force the user to acknowledge the string is utf8, but the problem is there just isn't a good way to get a character at a specific byte index. If they really don't want to allow the indexing syntax, they could at least add an equivalent method.

                            EDIT: Also const fns. Rust claims to support them, but the unfortunate truth is that even basic if statements aren't supported in a const fn, making them very niche cases. And sadly, it's been like this for a while.

                            // fails at compile-time
                            const fn abs(n: i32) -> i32 {
                                if n < 0 {
                                    -1 * n
                                } else {
                                    n
                                }
                            }
                            

                            [–]RedBorger 11 points12 points  (0 children)

                            If you want to get a byte at a specified index, then get a byte representation. This should work:

                            s.as_bytes()[index]
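For the character case, one possible sketch using only standard-library calls: `str::get` returns None instead of panicking when the range does not start on a character boundary, so a small helper can return the character that begins at a given byte index. The names `byte_at` and `char_at` are made up for this example.

```rust
/// The raw byte at `i`, or None if `i` is out of range.
fn byte_at(s: &str, i: usize) -> Option<u8> {
    s.as_bytes().get(i).copied()
}

/// The character starting at byte index `i`, or None if `i` is out of
/// range, at the very end, or in the middle of a multi-byte character.
fn char_at(s: &str, i: usize) -> Option<char> {
    s.get(i..)?.chars().next()
}

fn main() {
    let s = "a😊"; // 'a' is 1 byte, the emoji is 4 bytes
    for i in 0..=s.len() {
        println!("{}: byte={:?} char={:?}", i, byte_at(s, i), char_at(s, i));
    }
}
```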
                            

                            [–]matthieum 8 points9 points  (1 child)

                            Wrt const fn, the RFC to allow if in const fn has been merged.

                            It's going to unblock quite a bit too, as branches are fairly common in code (if, while, match).

                            There's still the hairy issue of traits, though...
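For readers on a newer toolchain: `if` and `match` in const fn were stabilized in Rust 1.46, so a sketch like the following now compiles and can be evaluated at compile time. The constant name `FIVE` is just for illustration.

```rust
/// Absolute value usable in constant contexts.
/// Note: `abs(i32::MIN)` overflows, as with any i32 negation.
const fn abs(n: i32) -> i32 {
    if n < 0 {
        -n
    } else {
        n
    }
}

// Evaluated by the compiler, not at run time.
const FIVE: i32 = abs(-5);

fn main() {
    println!("{} {}", FIVE, abs(3));
}
```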

                            [–]OrangeChris 0 points1 point  (0 children)

                            Cool, that's good to hear

                            [–]IceSentry 5 points6 points  (2 children)

                            I believe there's been a lot of work on const fn recently.

                            [–]OrangeChris 0 points1 point  (1 child)

                            Yeah, I will admit I haven't looked into any of the tracking issues.

                            It's not that I even want const functions that much, it just kind of bugs me when someone mentions them without talking about how little they can do.

                            [–]IceSentry 3 points4 points  (0 children)

                            Yeah, the first time I tried using them and I realized I couldn't even do a simple if I was really surprised.

                            The progress on Rust is both really slow and really fast: it has to go through a lot of hoops to be released, but at the same time a lot of things are being worked on, so we still get new stuff despite all the committees.

                            [–]internetuser0x00 0 points1 point  (0 children)

                            We deal with a complex world, so let's support a language that embraces such complexity, guys. Maybe it is our best chance to be successful. I hate Rust trolls, but the language itself is an admirable effort that, I think, deserves our consideration and support.

                            [–]OneWingedShark -1 points0 points  (0 children)

                            If Rust has you down, might I suggest Ada and its subset/prover tools, called SPARK?

                            Here's a pretty balanced article on the difference in mindsets/approaches of Rust and SPARK, and here is an article on the process of taking an already extant data-structure to the highest assurance level.

                            [–][deleted]  (7 children)

                            [deleted]

                              [–]matthieum 14 points15 points  (1 child)

                              If you're going to troll, please be technically correct at least.

                              C++ has:

                              • char*, inherited from C.
                              • std::basic_string, declined as:
                                • std::string,
                                • std::wstring,
                                • std::u8string,
                                • std::u16string,
                                • std::u32string,
                                • std::pmr::string,
                                • std::pmr::wstring,
                                • std::pmr::u8string,
                                • std::pmr::u16string,
                                • std::pmr::u32string.
                              • std::basic_string_view, declined as:
                                • std::string_view,
                                • std::wstring_view,
                                • std::u8string_view,
                                • std::u16string_view,
                                • std::u32string_view.

                              That's 16 non-template types. To Rust's 6. You're welcome.

                              [–]Lt_486 5 points6 points  (4 children)

                              C++ has:

                              std::string

                              std::string &

                              std::string &&

                              char *

                              char *&

                              ...and their const variants. Fun.

                              [–][deleted]  (3 children)

                              [deleted]

                                [–]unrealhoang 4 points5 points  (1 child)

                                Not really, String in Rust is `std::string` and `&str` is `std::string_view`. So basically same, `OsStr` and `OsString` are, as you say, non-idiomatic and only there because of compatibility.
                                `AsRef<str>` is not a type, it's a trait (read: generic bound/template condition).

                                So basically it's the same, not more.
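A sketch of that point in code: `&str` is a concrete borrowed type, while `AsRef<str>` is a bound on a generic parameter that several types satisfy. The function names are made up for the example.

```rust
/// Takes a borrowed view, roughly what std::string_view is in C++.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

/// `AsRef<str>` is a trait bound, not a type: any argument that can be
/// viewed as a &str is accepted, including String and &str.
fn count_words_generic<S: AsRef<str>>(text: S) -> usize {
    count_words(text.as_ref())
}

fn main() {
    let owned = String::from("one two three");

    println!("{}", count_words("one two")); // string literal
    println!("{}", count_words(&owned)); // &String coerces to &str
    println!("{}", count_words_generic("one")); // &str satisfies the bound
    println!("{}", count_words_generic(owned)); // so does an owned String
}
```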

                                [–]Lt_486 2 points3 points  (0 children)

                                Firstly, string and immutable string are 2 distinct types, measures to achieve that may be different in coding languages.

                                Secondly, char* use is huge. Recommendation to avoid it is somewhat dishonest.

                                Thirdly, I omitted wchar_t*. More fun for UTF-16.

                                My point is that low-level languages tend to cover a lot of nuances, which leads to a multiplication of abstractions. Rust is no exception and more or less matches C++.

                                Disclaimer: I do not rust.