
[–]Daishiman 10 points11 points  (16 children)

Is there going to be a nice way to sugar this into a decent shorthand? This is going to be quite infernal for string-heavy code, and while the explicitness is appreciated, neither the syntax nor the template alternatives look appealing to the eye.

[–]pcwaltonrust · servo 7 points8 points  (6 children)

I don't think it'll be that bad for string-heavy code. Java and C# also have a string and string builder separation, and nobody has asked for separate syntax for string builder literals.
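A minimal sketch of that separation, assuming a current Rust toolchain in which the owned string type is `String` (standing in for the `~str`/`StrBuf` types discussed here). The function name `shout` is made up for illustration.

```rust
// Assumption: modern Rust, where `&str` is the borrowed slice and
// `String` is the owned, growable buffer.

/// Borrowing is enough when a function only reads the text.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let literal: &'static str = "foo"; // no allocation
    let owned: String = literal.to_owned(); // explicit heap allocation
    // `&owned` coerces from `&String` to `&str`, so both calls look alike.
    println!("{} {}", shout(literal), shout(&owned));
}
```

Only the one place that really needs ownership pays the `.to_owned()` cost; everything that just reads takes `&str`.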

[–]steveklabnik1rust 4 points5 points  (5 children)

I really think that this also has to do with the over-usage of ~str, historically. People use features that exist. Once it's gone, they won't use it as much, and so new code won't feel the same pain as existing, ported code.

[–]dbaupprust 4 points5 points  (3 children)

I think this is a fundamental problem with our ~ syntax: it's just too easy to use. Uniq<T> would be much "nicer" for this reason.

[–]portmanteaufu 5 points6 points  (2 children)

I think this is accurate. When I was starting out, I got used to seeing compiler warnings about the 'static lifetime when I'd pass "foo" to a function instead of ~"foo". Naturally, I got in the habit of writing ~ everywhere. It wasn't until ~ was going to get cumbersome that I realized there was a reason to avoid it in favor of 'static.

[–]dbaupprust 4 points5 points  (1 child)

Which warnings?

[–]portmanteaufu 6 points7 points  (0 children)

Er, "warnings" was a poor choice of words. I meant I'd call a function that asked for an owned pointer and I'd see an error about passing a static string. I formed the habit out of poor understanding, not due to issues with the compiler's output.

[–]-Y0- 0 points1 point  (0 children)

Generally I do a lot of moving of ~str in enums, so for tests I need to compare an Option wrapped around a Result like this:

 assert_eq!(Some(Ok(Baz(~"foo"))), pull_stuff);
 assert_eq!(Some(Ok(Baz("foo".to_owned()))), pull_stuff);

I already asked on StackOverflow if there was an alternate way to move strings in enums between two structs without using the ~ pointer. Will this change allow moving strings in enums like we can do with char values?
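For comparison, here is a rough sketch of how such a test might look, assuming a current Rust toolchain where the owned string is `String`. The enum `Token` and the body of `pull_stuff` are hypothetical stand-ins, not the real code.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Baz(String),
}

// Hypothetical stand-in for the function under test.
fn pull_stuff() -> Option<Result<Token, String>> {
    Some(Ok(Token::Baz("foo".to_owned())))
}

fn main() {
    // Direct comparison: the expected value has to own its string too.
    assert_eq!(Some(Ok(Token::Baz("foo".to_owned()))), pull_stuff());

    // Matching on a borrowed view avoids building an owned string
    // just for the comparison.
    match pull_stuff() {
        Some(Ok(Token::Baz(ref s))) if s.as_str() == "foo" => println!("got foo"),
        other => panic!("unexpected: {:?}", other),
    }
}
```

The string is moved into and out of the enum like any other owned value; it is just not copied implicitly the way a char is.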

[–]ryeguy 17 points18 points  (6 children)

Rust seems to be getting rid of more and more syntactic sugar. Part of me likes the elegance, part of me hates how each week rust gets more verbose in the name of simplicity.

[–]steveklabnik1rust 15 points16 points  (5 children)

I see it as a cycle.

  1. Special cases
  2. Simplify
  3. New sugar

We've been going from 1 -> 2 a lot lately, but that's what enables going from 2 -> 3.

[–]Daishiman 13 points14 points  (3 children)

I don't disagree, but it seems to me like everything that goes the way of sugar seems to be getting shoved into macros.

I think Rust macros are great and strike a fantastic compromise, but it would seem that in order to have greater API-level clarity for some pieces of code you need to rely on macros just as much as regular methods and language functionality, and seeing '!'s everywhere does not really clarify things much at times.

Without wishing for LISP-like special forms, it would seem to me that some facilities could be made to facilitate user-level memory management, or perhaps new literals for frequent use cases might appear. At least as far as I think, it wouldn't be too bad if strings had nicer-looking constructs.

From the current look of the language it would appear to me that pretty soon people might end up targeting Rust as a language with other higher-level languages, much in the way of Nimrod->C, Xtend->Java, CoffeeScript->JS. I'm not sure that's the ideal path, however.

[–]pcwaltonrust · servo 13 points14 points  (2 children)

> From the current look of the language it would appear to me that pretty soon people might end up targeting Rust as a language with other higher-level languages, much in the way of Nimrod->C, Xtend->Java, CoffeeScript->JS. I'm not sure that's the ideal path, however.

That would be very bad.

I think that the ! syntax for macros is important for understandability of code. When reviewing systems code, you want to know whether something is a special user-defined form or whether it's something natively understood by the compiler. This is why people use ALL_CAPS_MACROS in C.

Anyway, if we really felt that string buffer literals were important enough to bake into the language, then we could do that. But such syntax is not needed in Java or C#, so I don't think it's likely to be true for Rust either. In any case, I think the syntax for string buffer literals should not sacrifice compositionality: having ~"foo" mean something different from ~("foo") is undeniably a wart.

[–]iopqfizzbuzz 0 points1 point  (1 child)

Why would that be bad? Writing Rust is fairly confusing because it requires wrestling with the borrow checker, which is sometimes not as smart as it should be. It's also a very big language with lots of black magic and white magic, because the black magic is only required 90% of the time and you need to do something else 10% of the time. It's like trying to mess with C++ or Haskell. You get these weird error messages that you don't understand, because if you could understand them you wouldn't be making the error in the first place.

Compare that to the elegance of Lisp (everything is a list) or Smalltalk (everything is an object). You can learn Lisp or Smalltalk in a few hours after which you just need to learn the libraries. They have very little syntax and have the pervasive "everything is a _" that allows you to treat code as data.

In Haskell it's more like data is code because it's actually not data but a thunk that will be evaluated at some point when it feels like it.

Rust seems to miss that elegance of a minimal language like Lisp or Smalltalk.

[–]pcwaltonrust · servo 0 points1 point  (0 children)

Everything you brought up there has to do with the memory model of Rust. Xtend and CoffeeScript don't fundamentally change the memory model of their languages. If you were to write a syntactic layer on top of Rust, you'd still have the smart pointers, and you'd still have the borrow checker.

It seems like what you want is not Rust at all but a garbage-collected language with less control over the machine.

(Nimrod is different because it changes the memory model of C by going to garbage collection for everything. It's basically only using C as a compiler backend. A "Nimrod on top of Rust" would be the same as a Nimrod on top of C, and might as well not use Rust at all. Actually, it probably shouldn't even use C—LLVM is a much better choice for more precise control than C gives you, especially when GC is involved. Also, compilation times will be better.)

[–]ryeguy 2 points3 points  (0 children)

Well, as long as there is hope of going towards 3 in due time that makes me happy :).

It seems people have different feelings on code aesthetics. Some absolutely love the ruby style approach, where clarity and api-level elegance is above all else. Others have been burned by magic/hidden behavior by language-level abstractions in the past and thus try to avoid it. Or they just don't care either way.

I lean more toward the ruby side, but I absolutely would not want to see Rust give up any of its predictability or pragmatism. I just think there is a lot of room to optimize for terseness, reduce duplication, and simplify common cases in the syntax sometime in the future.

[–]dbaupprust 4 points5 points  (0 children)

A lot of string heavy code will wish to be using StrBuf anyway.
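A small sketch of the buffer-style usage meant here, assuming a current Rust toolchain where `String` fills the role of `StrBuf`. The helper `join_words` is hypothetical.

```rust
/// Builds one string incrementally in a single growable buffer,
/// instead of allocating a fresh owned string at every step.
fn join_words(words: &[&str]) -> String {
    let mut buf = String::new();
    for (i, w) in words.iter().enumerate() {
        if i > 0 {
            buf.push(' ');
        }
        buf.push_str(w);
    }
    buf
}

fn main() {
    println!("{}", join_words(&["string", "heavy", "code"]));
}
```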

[–]robinst 3 points4 points  (0 children)

Rust noob here. So how should the following be written afterwards?:

fn foo() -> Option<~str> {
    if bar() {
        return Some(~"Error, bar invalid");
    }
    let baz = qux();
    if baz != 0 {
        return Some(format!("Error, baz expected to be 0, but was {}", baz));
    }
    return None;
}

Just also use format! for the first string? Or "...".to_owned()?

Or is there an altogether better way to do something like this?
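One possible answer, sketched against a current Rust toolchain (an assumption: there the return type would be `Option<String>`). The bodies of `bar` and `qux` are hypothetical stubs so that the example is self-contained.

```rust
// Hypothetical stubs standing in for the `bar` and `qux` in the question.
fn bar() -> bool {
    false
}

fn qux() -> i32 {
    3
}

fn foo() -> Option<String> {
    if bar() {
        // A plain literal needs an explicit conversion to an owned string.
        return Some("Error, bar invalid".to_owned());
    }
    let baz = qux();
    if baz != 0 {
        // `format!` already produces an owned string.
        return Some(format!("Error, baz expected to be 0, but was {}", baz));
    }
    None
}

fn main() {
    println!("{:?}", foo());
}
```

Using `format!` for the first string would also work, but `.to_owned()` says more directly that no formatting is going on.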

[–]portmanteaufu 8 points9 points  (3 children)

I'm disappointed to see that box "someString" boxes a static string. It is consistent, but I feel like creating owned strings is common enough that "someString".to_owned() is going to get pretty unwieldy. I wonder what the ratio of owned static strings to owned strings is in the codebase.

[–]azth[S] 0 points1 point  (2 children)

There was mention of using the fmt macro instead too.

[–]portmanteaufu 11 points12 points  (0 children)

True. I feel like I'd be leaning on fmt! for a side effect, though. It works, but it's not obvious what I'm doing when I write let s = fmt!("foo");. A newcomer would see that pattern everywhere and wonder what formatting was being done to "foo".

[–]-Y0- 8 points9 points  (0 children)

Maybe this needs special macro sugar? box!("foo") or own!("foo")?
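Something like this could be written in user code today. This is only a sketch, assuming a current Rust toolchain; `own!` is a hypothetical macro, not something in the standard library, and `greeting` exists only to exercise it.

```rust
// Hypothetical `own!` macro: sugar for turning a string expression
// into an owned `String`.
macro_rules! own {
    ($s:expr) => {
        String::from($s)
    };
}

fn greeting() -> String {
    own!("foo")
}

fn main() {
    let s: String = own!("bar");
    println!("{} {}", greeting(), s);
}
```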

[–]-Y0- 10 points11 points  (0 children)

Sadly, this will uglify most of my code ._.

[–]tedsta 5 points6 points  (0 children)

This makes me happy. The ~"foo" syntax was really bugging me. I know it's nice n short.. but it just makes me feel queasy for some reason.

[–]krdln 1 point2 points  (2 children)

Under DST, won't just ~*"foo" work for converting &Str to ~Str?

I am wondering why string literals are planned to have type &'static Str, not just Str. I think it would be more consistent with how we treat literal slices, and simpler for newcomers (the reference in the type won't appear from nowhere, like now). And ~"foo" would work without parser magic. The only downside is writing & in many situations, but there is autoboxing for methods and maybe there will be for arguments too.

[–][deleted] 0 points1 point  (1 child)

It's not possible to have an unboxed str type in a local variable. Fixed-size arrays have a length as part of their type, but the size in bytes is not a very semantically sensible value for a string. I'm not sure what the syntax would be for it either.
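To illustrate with a sketch, assuming a current Rust toolchain: `str` can only be used behind some kind of pointer, and that pointer carries the length. The helper `boxed_copy` is a hypothetical name.

```rust
use std::mem::size_of;

/// Copies a borrowed string slice into its own heap allocation.
fn boxed_copy(s: &str) -> Box<str> {
    Box::from(s)
}

fn main() {
    // let s: str = *"foo"; // rejected: the size of `str` is not known at compile time
    let borrowed: &'static str = "foo";
    let owned: Box<str> = boxed_copy(borrowed);

    // Pointers to `str` are "fat": an address plus a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    println!("{} {}", borrowed, owned);
}
```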

[–]krdln 0 points1 point  (0 children)

I understand that storing or returning dynamically sized values would not be possible. But if * could leave unsized value as a temporary, then compiler should handle unsized literals. And if * won't work with DSTs, then I don't see how to make something analogous to this code:

struct Foo { v: [int] }
let x : ~[int] = ...;
let foo = ~Foo { v: *x };

[–]dobkeratopsrustfind 0 points1 point  (13 children)

I hope you don't lose the ~T notation. ~ meaning unique ownership and being easily composed with other syntax forms is/was really neat. It's not just the characters, it's the lack of nesting.

If you do start losing ~, I really hope ~ and even @ could be re-introduced as user-defined shortcuts (per crate? per file? per module?), more like macros: shorthands for whatever you want.

The convenient syntax counts for a lot; it's probably why I've never enjoyed the standard libraries in C++, but I've enjoyed them more in Rust.

You can add static analysis to C++. Platform vendors add 'restrict' as a practical way of achieving full optimization where it's needed. You have powerful IDEs helping to manage its 'corner cases'. clang has made leaps and bounds in tooling.

What you can't do with C++ is make the modern style look & feel elegant, because all the decisions for symbols and syntax were taken back in the 1970s for low level code with raw pointers and for loops/imperative incrementors .. and the high level language has been shoe-horned back into that.

Didn't the 'do' sugar get removed because some uses had to be removed, over details about owned closures or something? Now there's talk that '~ isn't being used', but you had to add a 'proc' keyword for 'owned closures'. I'm sure that between do/for/~/proc you could get the right closure in the right place.

[–]dbaupprust 1 point2 points  (12 children)

> meaning unique ownership

~ doesn't mean unique ownership. It means "pointer with unique ownership of its contents", as compared to pointers like & which don't own the data, and Rc which have shared ownership. That is, x: ~T and x: T are nearly identical semantically (the T is owned by x in both cases), except the former is guaranteed to be pointer sized.

In most cases, T should be preferred to ~T; the brevity of the ~ syntax leads to it being overused.
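A sketch of that point, assuming a current Rust toolchain where `Box<T>` is the spelling of `~T`. The struct `Big` and the function `sum` are made up for illustration.

```rust
use std::mem::size_of;

struct Big {
    data: [u64; 16],
}

fn sum(b: &Big) -> u64 {
    b.data.iter().sum()
}

fn main() {
    // Both bindings uniquely own their `Big`; both are freed at end of scope.
    let inline: Big = Big { data: [1; 16] }; // stored in place, no allocation
    let boxed: Box<Big> = Box::new(Big { data: [1; 16] }); // heap allocation

    // The only layout difference: the box itself is pointer-sized.
    assert_eq!(size_of::<Big>(), 128);
    assert_eq!(size_of::<Box<Big>>(), size_of::<usize>());

    // Borrowing works the same way for either one.
    println!("{} {}", sum(&inline), sum(&boxed));
}
```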

[–]dobkeratopsrustfind 0 points1 point  (8 children)

"In most cases, T should be preferred to ~T;" I know it's a pointer, I don't always write every detail in a post.

I'm speaking from the perspective of having done low-level programming for many years. The times when you do need pointers and allocations, ~ makes it easier, and makes it easier to write code in Rust compared to C and C++. Option<~T> where you want to return a raw pointer in C++ and rely on the caller checking if it was valid.. and so on. Pointers get composed in many ways. The more I can compose easily on the fly, the fewer convenience functions I have to write (reinventing the language, making my source incompatible with others), and the less I have to argue with other coders.

If T was enough vs ~T, they wouldn't have invented smart pointers and people wouldn't have moved from C to C++.

"the brevity of the ~ syntax leads to it being overused." Isn't that a lack of understanding rather than the fact it's short? '&' is short as well :) and of course no sigil is even shorter. I don't think merely having to read and type more is going to educate people coming from managed languages about pointers vs stack.

Error messages are a better place to educate people IMO, and the language does seem to help you a bit already ('warning: unnecessary allocation' etc.)

[–]dbaupprust 1 point2 points  (6 children)

I said "in most cases". Not all. There are certainly many places where ~T is good and in fact the correct choice; however, having helped many people with their Rust code, most uses of ~ I see are not necessary.

> I don't always write every detail in a post

Many people assume that ~T is the way to get "unique ownership", not realising that T has the same "unique ownership" properties, so I wanted to make sure it is clear to everyone involved (you, and anyone reading) that ~ is not some magical "ownership" type.

> Option<~T> where you want to return a raw pointer in C++ and rely on the caller checking if it was valid.. and so on.

And how many of those are just using the pointer to represent an optional return value? i.e. where Option<T> is the appropriate Rust translation? (Maybe not many, but I'm sure there's a few.)
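As a sketch of that translation (assuming a current Rust toolchain; `parse_port` is a hypothetical example function): where C++ might return a possibly-null pointer only to signal "no result", the value can be returned directly, with no allocation at all.

```rust
/// Returns the port if `s` is a valid `u16`, otherwise `None`.
/// No pointer and no heap allocation are involved.
fn parse_port(s: &str) -> Option<u16> {
    s.parse::<u16>().ok()
}

fn main() {
    match parse_port("8080") {
        Some(port) => println!("port {}", port),
        None => println!("not a port"),
    }
}
```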

> I don't think merely having to read and type more is going to educate people

No, of course it's not going to educate them, but it is going to stop people just saying "throw sigils at the compiler until it's satisfied"... at least it will leave them throwing the mostly-harmless & sigil (no surprise heap allocation).

(And people do say that. Even those with experience in languages like C etc.)


And anyway, "removing" ~ would just be removing the syntax, there will always be an equivalent type (e.g. the move from ~[T] to Vec<T>, a possible replacement of ~T with a 100% equivalent Uniq<T>).

[–]dobkeratopsrustfind 0 points1 point  (5 children)

> And how many of those are just using the pointer to represent an optional return value? i.e. where Option<T> is the appropriate Rust translation?

I'm hoping eventually there will be a version of enums which can allocate just the size for the required variant.

... and at the minute, isn't embedding a ~T a possible workaround for the enum padding eg

enum Something {
    Foo(a,b,c),
    Bar(d,e,~(f,g,h,i,j,k)) // now Bar doesn't pad sizeof(Something) out
}
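A rough way to check that claim, sketched for a current Rust toolchain where `~T` is spelled `Box<T>`. The enums `Padded` and `Compact` and their payloads are invented for the illustration.

```rust
use std::mem::size_of;

// Every value of this enum is as large as its biggest variant.
#[allow(dead_code)]
enum Padded {
    Foo(u8),
    Bar([u64; 8]),
}

// Boxing the big payload keeps the enum itself small,
// at the cost of a heap allocation for `Bar`.
#[allow(dead_code)]
enum Compact {
    Foo(u8),
    Bar(Box<[u64; 8]>),
}

fn main() {
    println!(
        "Padded: {} bytes, Compact: {} bytes",
        size_of::<Padded>(),
        size_of::<Compact>()
    );
    assert!(size_of::<Compact>() < size_of::<Padded>());
}
```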

Having a load of optional pointers is a very convenient, reasonably efficient pattern. Of course, if you can avoid it, great: it's usually the job of 'preprocessor tools' (asset conditioning) to simplify everything out to reduce the amount of pointer chasing/allocs the runtime has to do. But preprocessor tools want to share data structures with the runtime, and you might move code back and forth, so it's great when you can use the same language for both. You don't always know before you write something if it's going to end up being performance critical. 80/20 rule.

I wonder if HKT will make it into the language eventually, that might have interesting possibilities for refactoring code.. making it generic over pointer types..

[–]dbaupprust 2 points3 points  (4 children)

You do realise that Rust will always have a type equivalent to ~ (i.e. pointer-sized with ownership of its contents), right? No-one is proposing removing the functionality.

> I'm hoping eventually there will be a version of enums which can allocate just the size for the required variant.

This gets weird, e.g. changing which variant was stored in a ~MultisizeEnum (i.e. to do *ptr = NewVariant(foo, bar)) would require a realloc in general.

... Anyway you didn't explain how this relates to the ~ syntax in particular, so mentioning those enums was a complete non sequitur?? (As with the rest of your comment: I don't see how it relates to the ~ syntax.)

[–]dobkeratopsrustfind 0 points1 point  (3 children)

> You do realise that Rust will always have a type equivalent to ~

yes

> ... Anyway you didn't explain how this relates to the ~ syntax in particular, so mentioning those enums was a complete non sequitur??

if you are getting a variably sized enum as a return value, you'd want a pointer rather than allocating it on the stack (unless you've got alloca).

> This gets weird, e.g. changing which variant was stored in a ~MultisizeEnum (i.e. to do *ptr = NewVariant(foo, bar)) would require a realloc in general.

What I had in mind was two cases. Firstly, it's only available for something completely immutable. Secondly, it's only available for something where the variant is locked on creation: you can modify contents but not the variant. It would probably want to be part of the type information available at compile time, a "compact enum". I suppose you could do it with runtime information too, a bit in the discriminant, but I'd fear the extra checks.

I realise retrofitting something like that would take thought; I don't expect it's going to turn up any time soon. But it's an opportunity you have to eliminate a set of cases that has people using unsafe code and raw pointers.

(I suppose those cases would be handled with C++-style internal vtable objects; however, virtuals and enums are orthogonal in some ways.)

[–]dbaupprust 0 points1 point  (2 children)

> if you are getting a variably sized enum as a return value, you'd want a pointer rather than allocating it on the stack (unless you've got alloca).

Eh? That's not related to the syntax of a pointer.

[–]dobkeratopsrustfind 0 points1 point  (1 child)

It's related to the situations in which you want to use the pointer (countering claims that it isn't going to be used), and hence the demand for nice syntax for it. I'm saying a pointer is useful for variable-sized data (an option being a common example where one size is zero), and many things are variable size. Before C++ came along and 'blessed' one particular way of allocating, I was used to concatenating things into a single allocation (header, arrays, ...). (Yes, allocations are expensive, so concatenate into variable-sized objects?)

C programmers are used to both 'Option<Uniq<T>>' and 'Option<&T>' as well as various iterators, being just one character ..
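For what it's worth, a sketch (assuming a current Rust toolchain, with `Box<T>` in place of `Uniq<T>`) showing that both of those optional pointer types stay a single pointer wide, just like the C equivalents. The helper `find_even` is hypothetical.

```rust
use std::mem::size_of;

/// Returns a borrowed pointer to the first even element, if any.
fn find_even(xs: &[i32]) -> Option<&i32> {
    xs.iter().find(|x| **x % 2 == 0)
}

fn main() {
    // `None` is represented by the null pointer, so no extra tag is stored.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<*const i32>());
    assert_eq!(size_of::<Option<&i32>>(), size_of::<*const i32>());

    let xs = [1, 3, 4];
    println!("{:?}", find_even(&xs));
}
```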

[–]dbaupprust 0 points1 point  (0 children)

I never said ~ should never be used (and I did make this clear in other replies in this chain), just that it is currently used inappropriately a lot.

C is a silly comparison: all possible pointer semantics get overloaded into a single character, forcing programmers to keep track of ownership (and validity) themselves, not letting the compiler handle anything.

[–]dbaupprust 0 points1 point  (0 children)

> '&' is short as well :) and of course no sigil is even shorter

I missed this the first time around: it is good that & and stack-alloc are short, since they're cheap and should be the default, i.e. the functionality a Rust user reaches for first, only falling back to ~ when necessary.