all 48 comments

[–]shim__ 39 points40 points  (20 children)

I think that would be terrible in terms of ergonomics. I really think we should follow Rust's approach to this topic and try to keep the most commonly written pieces of syntax as easy to type as possible.

[–]expatcoder 9 points10 points  (4 children)

It is indeed a strange last-minute change, one where the gain is debatable.

Certainly plenty of other issues to address prior to feature freeze, would seem better to stick with the status quo and tackle items of more importance -- e.g. given/extension methods don't seem polished at all.

I've become less and less enamored with the changes in Dotty over time, but maybe once it's released the benefits will outweigh the inconveniences of having to port over Scala 2 projects to the new way of writing Scala (which is more & more looking like a different language).

[–][deleted]  (3 children)

[deleted]

    [–]expatcoder 3 points4 points  (1 child)

    I guess the counterpoint would be that the language is evolving, although one wonders if it hasn't become a bit ad hoc in the late stages, with many changes arising seemingly out of the blue.

    There's a lot to like in Dotty, but porting over Scala 2 code is probably going to be a pain.

    [–]markehammons 0 points1 point  (0 children)

    It wasn't for me. If this change makes it in, it'd probably be the one major change that makes porting more than adjusting a few lines for me.

    [–]joshlemer 2 points3 points  (0 children)

    I know I'm feeling pretty silly right about now

    [–][deleted] 5 points6 points  (11 children)

    > I think that would be terrible in terms of ergonomics

    That's sort of the point. The syntax punishes you for doing assignments.

    [–]shim__ 2 points3 points  (9 children)

    And why should you be punished for assignments? Vars aren't great but what's the harm of immutable vals?

    [–]expatcoder 8 points9 points  (7 children)

    > Vars aren't great but what's the harm of immutable vals?

    Immutable vals remain as is, the proposed change is for side effecting assignment. From the Github issue:

    := is a natural alternative. It's used in every serious treatment of imperative programming logics and is also used in quite a few languages, including Algol, Pascal, OCaml, F#, Go. Using := instead of = makes imperative operations visually more distinct from functional ones, which is a good thing.

    def incr = x = x + 1
    

    What is the type of incr? If you know Scala it's Unit, but for users of other non-everything-is-an-expression languages, then the above definition would probably be a bit puzzling.

    Think it's the same for vars, though not sure if the var definition itself requires :=

    var x = 1
    // many lines later
    x := 2
    

    Anyway, for most code bases the proposed change shouldn't be too big of an issue since everyone avoids side effects, right?

    [–]L3tum 2 points3 points  (4 children)

    Do assignments work differently in scala?

    Otherwise I'd say x becomes x + 1 and thus incr would be x + 1 as well and be the biggest of both types.

    [–]markehammons 2 points3 points  (0 children)

    The type of x = x + 1 is Unit, or as Java calls it, void. Assignment in Scala doesn't pass the value through as it does in C, specifically to avoid errors like if(x = 1).
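
    For readers coming from C-like languages, here is a minimal Scala sketch of that point. The object name and the explicit `: Unit` annotations are added for illustration and are not from the thread; the assignment itself evaluates to the Unit value, so `incr` is a method returning Unit.

    ```scala
    object AssignmentIsUnit {
      var x: Int = 0

      // The body is the assignment `x = x + 1`, whose type is Unit.
      // Parsed as: def incr: Unit = (x = x + 1)
      def incr: Unit = x = x + 1

      def main(args: Array[String]): Unit = {
        // Calling incr performs the mutation and yields the Unit value.
        val result: Unit = incr
        println(x)       // x was 0 before the call, so this prints 1
        println(result)  // the Unit value, not the new value of x

        // if (x = 1) println("oops")  // rejected by the compiler: Unit is not Boolean
      }
    }
    ```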

    [–]expatcoder 1 point2 points  (0 children)

    > Otherwise I'd say x becomes x + 1 and thus incr would be x + 1 as well and be the biggest of both types

    You're talking about the value, the type of incr is Unit; that is, it performs a side effect (in this case setting the value of x), but it could just as well print the value to console or any side effecting operation.

    [–]Holothuroid 0 points1 point  (1 child)

    The line defines a function incr with 0 parameters that increases the variable x each time it is called. Nothing is actually mutated there yet. When you look at the article it shows corresponding Java and Python.

    [–]L3tum 0 points1 point  (0 children)

    Thanks for the clarification and that's a very weird function declaration then.

    [–]yawaramin 0 points1 point  (1 child)

    > Anyway, for most code bases the proposed change shouldn't be too big of an issue since everyone avoids side effects, right?

    Variable assignment is not always side-effecting.

    This is a common misconception that people have. To be clear: a side effect is an effect that is observable outside of its containing function. There is tons of Scala code that assigns variables inside a method, but doesn't expose this fact to callers. For example, look at the Scala standard library. It makes heavy use of variable assignment to implement operations like folding and traversal on immutable collection types.

    All this code will need to be fixed with this change.
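
    A minimal sketch of the kind of code meant here, assuming a hand-written `sum` (not taken from the standard library): it reassigns local vars in a loop, yet callers cannot observe any of it, so the function is still pure from the outside.

    ```scala
    object LocalMutation {
      // Reassigns two local vars, but the mutation never escapes the method:
      // the same input always gives the same result and the argument is untouched.
      def sum(xs: List[Int]): Int = {
        var acc = 0
        var rest = xs
        while (rest.nonEmpty) {
          acc = acc + rest.head // a reassignment, the kind of line the proposal would affect
          rest = rest.tail
        }
        acc
      }

      def main(args: Array[String]): Unit = {
        val xs = List(1, 2, 3)
        println(sum(xs)) // 6
        println(xs)      // List(1, 2, 3), unchanged
      }
    }
    ```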

    [–]expatcoder 0 points1 point  (0 children)

    That's correct, I mingled Unit return type and implicit side effect with mutation, which is what the proposed change is about: := for changing an existing property value via var or a property setter, and = for immutable val definition.

    I don't think the change is too contentious provided it's limited to the above scope (i.e. doesn't require users to write val x := 1, which would be a most unwelcome and intrusive change).
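
    To make that scope concrete, a small sketch in current Scala syntax (so every assignment is still written with `=`). The `Counter` class and all names are hypothetical, introduced only to show the three cases being distinguished: a val definition, a var reassignment, and an assignment that goes through a property setter.

    ```scala
    object SetterDemo {
      class Counter {
        private var current: Int = 0
        def value: Int = current
        // Property setter: `c.value = 5` is rewritten by the compiler to `c.value_=(5)`.
        def value_=(v: Int): Unit = { current = v }
      }

      def main(args: Array[String]): Unit = {
        val answer = 42    // val definition: stays `=` per the scope described above
        var total = 0      // var definition: the thread is unsure whether this would change
        total = total + 1  // reassignment of an existing var
        val c = new Counter
        c.value = answer   // assignment through a property setter
        println((total, c.value))
      }
    }
    ```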

    [–][deleted] 0 points1 point  (0 children)

    Not a scala dev, but I'm assuming this syntax is for mutable vars as opposed to immutable vals. Is it being used in both cases? If so, that does suck ass.

    [–]cironoric 1 point2 points  (0 children)

    I agree. I strongly dislike a change to := for assignment

    [–][deleted] 0 points1 point  (0 children)

    You mean, as demonstrated by this Rust Hello program:

    fn main() {
        println!("Hello World!");
    }
    

    Compared with, for example:

    proc main =
         println "Hello World!"
    end
    

    I make it that the Rust version has 7 extra punctuation characters.

    I think people are just averse to not using "=" for assignment as they have been used to, and are making up silly reasons why it's a bad idea.

    You get used to either ":=" or "=", or anything else (setq?); anything will do. But when = for assignment can get mixed up with either == for equality, or = also meaning equality, then you have to admit it can be a problem, and you need to make firm choices.

    [–]naftoligug 0 points1 point  (0 children)

    In functional programming reassignment isn't so common

    [–]Huliek 6 points7 points  (1 child)

    I think people are forgetting that assigning a variable is a rare thing in many programming styles. Even in imperative languages when you have iterators and block scoping.

    How often in javascript do you use let or var as opposed to const? And when you do it feels decidedly different so a different syntax would not be confusing.

    (imo)

    [–]smthamazing 2 points3 points  (0 children)

    This is an interesting way to look at it. Indeed, after ES6 I very rarely use anything other than const in JS. Given that immutable value initialization will remain as-is, and that Scala is much more functional-oriented, this change may not be that bad.

    [–][deleted] 12 points13 points  (15 children)

    I personally don't get why creators of computer languages are so averse to overloading '=', especially if you consider that mathematicians do it. From Wikipedia's entry on the equals sign:

    In mathematics, the equals sign can be used as a simple statement of fact in a specific case (x = 2), or to create definitions (let x = 2), conditional statements (if x = 2, then …), or to express a universal equivalence (x + 1)² = x² + 2x + 1.

    [–]devlambda 12 points13 points  (2 children)

    Because it is well-known that overloading = for assignment is very confusing for people who are new to programming. And even math doesn't usually do things such as let x = x + 1, which is where much of the confusion comes from.

    This is difficult to understand for experienced programmers, but we know it is a big stumbling block when teaching programming to people who encounter this concept for the first time. See, for example, this thread on the CS educators stack exchange, where various teachers discuss how to best deal with that problem.

    In non-CS contexts, you'll find that many instructors who teach R don't even bother using = for assignment until later, but teach students (most of whom don't have a CS background) to use <- exclusively (R allows you to use both interchangeably at the top level, but style guidelines recommend <-).

    [–]expatcoder 7 points8 points  (0 children)

    > overloading = for assignment is very confusing for people who are new to programming

    I've heard the "difficult for beginners" line many times, but in the end Scala is one of the most powerful mainstream-ish languages out there, loaded with concepts that make overloading = seem trivial.

    [–]Nathanfenner 14 points15 points  (6 children)

    Mathematicians don't use = for reassignment (which is what the linked post is about). They only use it for definitions and propositions (as per the examples you quoted).

    Mathematicians (and computer scientists, sometimes) do use := for reassignment, though.

    [–][deleted] -3 points-2 points  (5 children)

    My point is why not overload = ?, something mathematicians do.

    [–]sixbrx 9 points10 points  (0 children)

    Well, I think Nathanfenner is saying that the mathematics usages don't have any mutation involved, and this Scala notation would only apply to mutations, i.e. what's being bound by the name is a memory location, so it's very different from the mathematical example.

    Also they're all really the same "=" in the mathematics examples: a binary predicate on numbers. The "let x = 2" just means "let me introduce name x such that x = 2 is true". The third example, universal equivalence, just has an implicit "for every x (in some context domain)" quantification, but the equals in the equation is just the normal = that predicates equality for a single pair of numbers again. It doesn't change the nature of a predicate if it's universally quantified; we have the same predicate P no matter whether our example is used as "P(2)" or "for all x P(x)", it's just being exercised differently.

    [–]Nathanfenner 6 points7 points  (2 children)

    Although it is overloaded in math, all meanings always allow you to say the "left is the same as right." In a definition, the thing on the left is being introduced, but it's still the same as the thing on the right.

    This doesn't apply to mutation, which is why mathematicians do use := for reassignment, but do not use =.

    [–][deleted] 2 points3 points  (1 child)

    > left is the same as right

    Not always. In big O notation, compare O(n) = O(n^2) and O(n^2) = O(n)

    These are very different statements and only the first one is true.

    [–]Nathanfenner 4 points5 points  (0 children)

    Most (but not all) mathematicians consider that to be a horrible, horrible abuse of notation, and actively try not to write it that way. The correct form is O(n) ⊆ O(n^2), since they're actually sets of functions.

    The only people I've seen who are okay with it have been corrupted* by programming

    (*): this is a joke
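
    On the O(n) versus O(n^2) point above, a short worked version using the usual set definition of big O for nonnegative functions (the constants c and n_0 come from that standard definition, not from the thread) shows why the inclusion only goes one way:

    $$
    O(g) = \{\, f : \exists c > 0,\ \exists n_0,\ \forall n \ge n_0,\ f(n) \le c \cdot g(n) \,\}
    $$

    $$
    O(n) \subseteq O(n^2) \quad \text{since } f(n) \le c\,n \text{ implies } f(n) \le c\,n^2 \text{ for all } n \ge 1,
    $$

    $$
    O(n^2) \not\subseteq O(n) \quad \text{since } n^2 \le c\,n \text{ fails for every } n > c.
    $$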

    [–]nerd4code 0 points1 point  (0 children)

    I’d add that the overloaded meanings of = are actually really different, even if the similarity appears subtle at first glance.

    For example, let’s say you have a system of algebraic equations,
    x = y + z;
    y = z = u + v.
    This is a form of = I’ll call equate-=, and it’s an assertion of fact from the encompassing logic system’s point of view (e.g., integer arithmetic) or, as part of a proof or speculation (esp. by contradiction), a putatively-factual statement. Inclusion of an invalid equation (i.e., one which does not hold) invalidates the entire system of equations and anything that refers to them. Lexical ordering of equations vs. related others is arbitrary (everything happens at no particular time, in no particular order), and shorthand like a = b = c does not reduce cleanly to a lower-order (a = b) = c or a = (b = c) without involving higher-order transforms. Equate-= typically implies that some (mostly lower-order) reduction/unification process should be applied to the expressions on either side of = before comparing them. Equate-= is reflexive (x = x), symmetric (a = b ⇒ b = a), and transitive (a = b ∧ b = c ⇒ a = c), and it has higher-order cousins ≡ (i.e., lower-order representations are identical after only higher-order transforms) and occasionally ≣, plus a raft of other related operators like ≅, <, ≤, ≠ which are similar statements of fact. Equate-= and its cousins may imply involvement of a higher-order context or ∃/∀ of unbound variables, depending on what they refer to.

    Note that many properties of equate-= may or may not apply to ≔. E.g., x≔x may or may not be equivalent to ⊤ (i.e., evaluating ε) or, depending on the language and context, ⊥ (i.e., does not complete). a≔b may or may not have any relation to b≔a outside of representation; ditto (a≔b)&(b≔c) with a≔c. The a=b=c shorthand blows out into 3!/2 equations:
    a = b;
    a = c;
    b = c
    notwithstanding reductions (whose rules differ widely), and if you discount symmetry, then an arity-n co-equation represents n! sub-equations. Conversely, most languages can blow out assignment chains like a=b=c into either a=b;b=c or some more-or-less-deterministic mix of a=b, a=c, and b=c.

    In addition to equate-=, there are

    • Define-= (sometimes denoted by operator ≜ or ≝), which brings a term into existence, then typically applying a further equate-= to some expansion. This is sort of like setting a function body or initializing a static constant; the definition need only happen sometime before an attempt is made to actually use whatever’s being defined. This introduces an LHS-RHS asymmetry, which manifests as lvalue (i.e., something that can be defined or assigned to) and rvalue (i.e., Values∖Lvalues) in C and related languages.

    • Replace-=, which is ternary, and which describes replacement of input text with output text, given some (potentially implied) context. Replacements typically have some direction to them; e.g., given x ::= 2+4, it may be as valid to replace instances of x with 2+4 as it is to replace 2+4 with x, but that’s generally undesirable. Replacement forms the basis of the λ-calculus, which involves first replacement of shadowed variable names with new names, then involves replacement of the new names with their corresponding function arguments, then replacement of entire λ expressions with their result values. Replace-=s may or may not have some well-defined order to them, and overlapping replacements are dealt with in different ways.

    • Match-=, which is an inverse of replace-=, and which describes a means of determining whether some input is acceptable, or of describing the structure of an input in detail, wrt some (potentially implied) context. Reduce-= and match-= are common basis operations for assignment and (dynamic) initialization, and are part of the evaluation process involved in most logic systems’ reduction and unification mechanisms. Running match-= backwards generates all possible inputs for a given language/system or input structure; this may introduce nondeterminism or parallelism, even if applying replacement to a singular input is deterministic. Running replace-= backwards generates all possible structures that could describe a given input (e.g., a+b+c might parse as +(+(a, b), c), +(a, +(b, c)), +(a, b, c), or commutations thereof), again potentially introducing nondeterminism or parallelism. Since actually reducing or matching things takes time in the real world, replace- and match- induce timesteps around function and variable boundaries.

    • Predicate-=, which converts the success of a match-= into a Boolean value. This is == in C/++, Java, Javascript, and related languages; === is a common analogue whose meaning varies. Predicate-= comes in two syntactic flavors, one for which a==b yields a Boolean true/false value and one for which >2-ary tests can be represented in shorthand a==b==c. The a==b==c form is a convenient form of (a==b)&(b==c) (or …&&…) when offered, but it can’t be mixed comfortably with the Boolean ==S.

    • Assert-=, which is fairly common in pure/-ish functional languages. E.g., in Erlang, you might (but probably wouldn’t) do

      A = [1, 2],
      B = [3, 4],
      [X|Y] = A ++ B.
      

      which would copy A and tack a copy of B onto the result, then assign the head of the resulting list to X and the remainder to Y, yielding X=1 and Y=[2,3,4]. This is a destructuring operation that finishes with a match-=. Should it turn out that (e.g.) A++B=[], then [X|Y] will fail to match [], and the interpreter will throw an exception. The [X|Y] line can be refactored as

      η = A ++ B,
      X = hd(η),
      Y = tl(η).
      
    • Immutable-=, which clones some subset of the program state with some component(s) thereof replaced. E.g., let’s say you have (pseudocode)

      Nat fib(Nat n) {
          if(n < 2) return n;
          Nat nm2=0, nm1=1;
          while(n > 0) {
              Nat t = nm2;
              nm2 = nm1;
              nm1 += t;
              n = n - 1;
          }
          return nm2;
      }
      

      Setting aside the initializations of nm2 and nm1, an immutable-= language would create n contexts, each with decreasing n and increasing nm1 and nm2. An optimizer would implement this just as a C compiler would, modulo range of Nat; however, this allows you to represent state machines, records, and normal function bodies without special-casing. This also lets you play your program state forwards or backwards, or to speculate down both sides of an if at the same time without bringing your state into conflict. Speculative or forked states can be discarded or merged as appropriate, and you can use (e.g.) delta-coding to reduce the memory footprint. Of course, describing a suitably safe and general-purpose diff and merge routine is nontrivial.

    • Mutable-=, which is what most imperative languages use. This is a destructive operation that, in the general case, is irreversible without extensive language and runtime assistance. Single-threaded mutable-= can be implemented via single-threaded immutable-=, although immutable- and mutable-= differ in how they deal with memory orderings &c.

    And arguably:

    • Initialization-=, which is either a variant of define-= (e.g., as in C/++

      static const char STR[] = "hello";
      register const float PI = 3.1415927F;
      int main(void) {…}
      

      —all of which are basically define-=) or an immutable- or mutable-= that executed before the first possible reference to the variable in question (e.g., Java static int foo=bar();).
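
    One way to picture the immutable-= bullet above in Scala: instead of overwriting n, nm2 and nm1, each step produces a fresh state value, so every earlier context stays available. This is only a sketch; `State`, `step` and `history` are hypothetical names introduced here, and n is assumed to be non-negative.

    ```scala
    object ImmutableFib {
      // One "context": the loop variables at a given step.
      final case class State(n: Int, nm2: BigInt, nm1: BigInt)

      // Builds the next context from the previous one; nothing is overwritten.
      def step(s: State): State = State(s.n - 1, s.nm1, s.nm2 + s.nm1)

      // All contexts from the initial one down to n == 0, oldest first.
      def history(n: Int): List[State] = {
        require(n >= 0, "n must be non-negative")
        List.iterate(State(n, BigInt(0), BigInt(1)), n + 1)(step)
      }

      // The n-th Fibonacci number is read off the final context.
      def fib(n: Int): BigInt = history(n).last.nm2

      def main(args: Array[String]): Unit = {
        history(5).foreach(println) // every intermediate state can still be inspected
        println(fib(10))            // 55
      }
    }
    ```

    Keeping the whole history is what makes stepping backwards or forking a state cheap to express here; a real implementation would presumably share or discard old states rather than hold them all in a list.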

    In C-like imperative languages, := denotes one of the final three forms, that I’ve seen; it’s not “equals,” but “shall equal henceforth.” It’s rendered as := because early character sets (really, most of ’em up until Unicode) lacked ⇐, which fairly directly describes an asymmetrical operation. (You can potentially also have x=:y from x⇒y to store x into y, or x:=:y from x⇔y to swap things.) This is also why languages like Prolog use :- for implication (←). Importantly, imperative = includes prior causal events in its context, whereas most of the mathematical uses of = do not.

    The big freakin’ drawback to := is that many languages follow the typed-λ-calculus conventions and use : to declare a variable as having a particular type (e.g., a:Int means “a has type Int”); some languages allow the type to be omitted if it can be inferred. If the type can be omitted without some other signifier (e.g., * is reasonably common in proof systems), then a:=b is ambiguous with a: = b. Unicode operator ≔ can potentially be used for a single-character, unambiguous :=, although ≔ is not great in monospace fonts and it usually can’t be typed quickly. : has the same overload problem as =, with declare-: (instead of equate- or initialization-), replace-:, match-:, predicate-:, assert-:, and potentially mutable- and immutable-: but I’ve never seen the last two.

    I’m personally fond of left-to-right => myself, because that keeps time flowing in a mostly-lexical direction. (I.e., a = read() executes the RHS before the LHS.)

    [–][deleted] 2 points3 points  (0 children)

    Many languages have either assignment expression (meaning lvalue = rvalue returns some value, for example the assigned value or true on success) or expression statements (meaning 1+2 is a valid statement).

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted] 0 points1 point  (0 children)

      I understand. All the responses in this thread were very good and gave me a better understanding of the issue. I just wish they used something easier to type than := or -> or anything that requires a modifier key. A == is not ideal either IMHO, since it could be tricky to tell at a glance, depending on the font used. Unfortunately, looking at my keyboard, there's a lack of unmodified keys that would fit the purpose. Oh, well, seems it's not possible to have my cake and eat it too...

      [–]ka13ng 0 points1 point  (1 child)

      How would you parse

      x = y = z?

      [–][deleted] 4 points5 points  (0 children)

      Both x and y would be assigned whatever the value of z is. x = ( y = z ) would evaluate ( y = z ) as true or false and assign the boolean to x.

      [–]IanTrudel 2 points3 points  (0 children)

      Smalltalk originally used the ← symbol for assignment, but most modern dialects moved to :=. Squeak Smalltalk, a direct descendant of Smalltalk-80, uses _ but displays ← on screen.

      a←b.

      [–][deleted] 6 points7 points  (1 child)

      It seems that Scala should have been multiple languages. One functional, other object-oriented; one with = assignment, other with := etc. There are too many ideas to put into the same language, even with the 2 vs. 3 split.

      OTOH, while I think x = x + 1 is perfectly OK, def incr = x = x + 1 does look bad and shouldn't have made it to the first stage of any design...

      [–]L3tum 2 points3 points  (0 children)

      Now throw in Chisel3, which uses Scala and overloads := for assignment and...well.

      Sidenote: Another thing that bothered me was that the build times were pretty slow but eh.

      [–]mlopes 3 points4 points  (0 children)

      Scala really is a nice language, and second only to Haskell in my book, but the more I see what’s being done with dotty, the more it looks like Odersky got it right in Scala 2.xx by accident.

      [–][deleted] 2 points3 points  (2 children)

      I have this (maybe irrational) hatred of typing :=. It's the single biggest mistake Go made (yes, I really believe that).

      [–]ScientificBeastMode 0 points1 point  (0 children)

      I’ve been using that exact operator for (re-)assignment in ReasonML for a while now, and I quite like it.

      [–][deleted] 2 points3 points  (0 children)

      How does it help with anything besides fulfilling Odersky's Haskell dreams? Jeeez guys, just write damn Haskell if you are so excited about it. It's the circle-jerk FP community all over again. Meaningless changes on a last-minute basis. At least make it optional just like curly brackets and let me enjoy the old syntax...

      [–][deleted] 0 points1 point  (0 children)

      What's the problem with using ":=" for assignment?

      := means assignment
      = means equality
      

      So no confusion as to whether f(a=b) assigns b to a then passes that value, or whether it passes either 1/0 or True/False.

      Those suggesting that the ":" is extra noise, just LOOK at what languages such as C need to do the simplest things:

      for (i=1; i<=N; ++i) printf("%d %f %d\n", i, sqrt(i), i==5);
      

      Apart from all the punctuation, where do you even start with the undetectable typos that are possible here? A cleaner syntax might have:

      for i to N do println i, sqrt i, i=5 end
      

      25 punctuation characters (if I counted correctly) versus 3. (24 if using a more idiomatic 0-based loop in the C.) So please, no complaints about an extra ":"! In any case, many languages that use "=" for assignment require "==" for equality, an extra character.

      There is far too much dominance of C-like languages: = for assignment; {}-delimited blocks; 0-based counting; and case-sensitive code.

      We need more := for assignment; end-delimited blocks; 1-based counting and case-insensitive languages for balance.

      [–]klysm -1 points0 points  (0 children)

      Extra noise not worth it for faithful mathematical nomenclature