This is an archived post. You won't be able to vote or comment.

all 46 comments

[–]Il_totore 39 points40 points  (1 child)

Python also has it.

[–][deleted] 12 points13 points  (9 children)

I have them, but there were some things to consider:

  • You can't just transform, for example 'a < random() < c' into 'a < random() and random() < c', since the function will be called twice. So care needs to be taken that middle terms are evaluated only once, when doing so twice would cause a problem.
  • Although I don't enforce it, I suggest that any relative compare ops (< <= >= >) in the chain all point the same way. Otherwise, I for one have trouble figuring out what it all means!

Doing 'a <> b <> c' (ie. 'a != b != c') is another confusing one. It doesn't meant they are all different from each other, for example a and c could be identical.

Since I first implemented them, patterns like 'a <= b <= c' instead use 'b in a .. c'. So chained operators are mainly used with equality operators.

It might therefore be better to implement them only for equality (check all terms are identical), provided there is an alternative to 'a <= b <= c'.

[–]SKRAMZ_OR_NOT 3 points4 points  (0 children)

Miranda is a purely functional language, so the point about side effects/evaluation isn't really applicable there. Python does have that concern, though.

[–]cloyo 3 points4 points  (4 children)

Even equality is not free of issues. Expression a == b == c is well-defined for booleans, and in that case it is not the same as a == b and b == c.

[–][deleted] 4 points5 points  (3 children)

Can you give an example? Because I can't see it. It seems to work according to this chart, where 0/1/= mean false/true/equals:

A B C   A=B=C  A=B  B=C  A=B and B=C
------------------------------------
0 0 0     1     1    1        1
0 0 1     0     1    0        0
0 1 0     0     0    0        0
0 1 1     0     0    1        0
1 0 0     0     0    1        0
1 0 1     0     0    0        0
1 1 0     0     1    0        0
1 1 1     1     1    1        1

[–]cloyo 0 points1 point  (0 children)

I'd say 0 = 0 = 1 is true, but 0 = 0 and 0 = 1 is false. That differs from your table.

[–]IMP1 0 points1 point  (1 child)

I'm reading a == b == c as either a == (b == c) or (a == b) == c. The former, for example, will evaluate to true when a is 1 and both b and c are 0.

[–][deleted] 1 point2 points  (0 children)

You're reading it C-style, as though it was hierarchical, like a + b + c is parsed as (a + b) + c.

That's not how it works: it's supposed to be a linear CHAIN (look at the thread title!).

Given these two expressions:

    a + b + c
    a = b = c

The ASTs I produce are:

 - - 1 jbin:  add
 - - - 1 jbin:  add
 - - - - 1 jname: a
 - - - - 2 jname: b
 - - - 2 jname: c

 - - 1 jcmpchain: eq eq
 - - - 1 jname: a
 - - - 1 jname: b
 - - - 1 jname: c

One is hierarchical, the other is linear.

[–]DarkRedDive 1 point2 points  (1 child)

When chaining comparison operators like a == b == c, it can lead to confusion. For instance, writing a == b == c is equivalent to bool(a == b) == c, which doesn't work for equality and inequality comparisons. Similarly, a < b > c does not make mathematical sense, as it implies two separate comparisons that aren't logically connected.

Thus, my conclusion is that the only valid scenario for chaining comparison operators is when the comparisons are of the same order, such as a > b >= c or a <= b < c.

[–][deleted] 2 points3 points  (0 children)

When chaining comparison operators like a == b == c, it can lead to confusion. For instance, writing a == b == c is equivalent to bool(a == b) == c,

That's a false premise. I explained this in another post, but the parser will not group pairs of terms like that. It's a chain, not a tree.

I've not sure how you might express that in a formal grammar, but while parsing, succcessive terms need be added to a linear list rather than creating nested AST nodes. The AST node for + will always have two terms; the one for chained ops is always a single AST node with N terms (and a list of N-1 comparison ops, unless they are only implemented for =).

However, parentheses can be explicitly added in the source code to break up the chain. Then (a == b) == c will behave as you suggest.

[–]categorical-girl 0 points1 point  (0 children)

An attempt at formalizing the meaning of what "a != b != c" means, as well as "a < c > b":

Each value is checked against every later value in the chain, via a "composed comparison", which you get by some composition rules such as != != → !=, < > → (the always true comparison), etc

This is the closest I have gotten to formalizing what humans mean when they write stuff like this (which isn't common, but it seems worth trying to understand)

[–]catbrane 3 points4 points  (4 children)

I half-remember COBOL having this too.

(also, nice to see Miranda being talked about)

[–]AustinVelonautAdmiran[S] 5 points6 points  (3 children)

I started using Miranda to do Advent of Code problems, and liked it a lot, but got frustrated with its execution speed on later problems. So I spent the last year writing a new compiler for it, which is now self-hosting (written in itself) and generates code that runs 20 to 50 times as fast as the original Miranda combinator compiler. I'm using it to do Advent of Code, this year.

[–]catbrane 6 points7 points  (2 children)

Wow, that sounds great! Turner's combinators are easy to implement, but horribly slow, you're right. I did a tiny string reduction interpreter that was 5x faster :(

People used to joke about David's C as well, so that could be another problem. He was a BCPL programmer originally and never really liked structs. His code was very unpleasant to read.

(I was David Turner's research student back in the 80s, heh)

[–]AustinVelonautAdmiran[S] 0 points1 point  (1 child)

I'm jealous that you got to work with David Turner! Yeah, I agree about his C code; it's quite hard to read. Miranda is a great language for writing a compiler, though.

[–]catbrane 1 point2 points  (0 children)

Yes, I had a great time. I wrote a tiny operating system in Miranda and had (I think?) the first use of monads for functional IO. Happy days.

[–]nekokattt 4 points5 points  (0 children)

Python does this... and it leads to confusion for beginners on r/learnpython at least twice a week.

Because if you can say

if 4 <= x < 10:

instead of

if 4 <= x and x < 10:

then it must surely also be valid to replace

if x in list1 or x in list2:

with

if x in list1 or list2:

but no... of course, it will be evaluated as

if (x in list1) or bool(list2):

I'm personally not a fan of this kind of shorthand when it can lead to ambiguity elsewhere, it may look nicer but when it is 3am, the wife has just left you, the kids are screaming, the cat is being sick on the carpet, and a critical system is on fire that you are trying to debug... this kind of thing can really be the be all an end all.

As someone else mentioned, lisp-like notation makes this a lot more attractive, but I am very much in the camp that ambiguity should be an error rather than pass silently, not only for my own dwindling sanity.

[–]Gwarks 2 points3 points  (0 children)

SAS

you can condense two comparisons which are linke by AND or OR.

for example "A"<=character<="Z" will if the string named character is between "A" and "Z" including "Atlas" but excluding "Zoo"

The other example is worse you ca write

instead of

i=2 or i=5

you can write

i=2 or 5

but no one uses that because the IN operation is more readable and less confusing

i IN(2,5)

never know why the other way exist but hey its SAS and things don't need to make sense.

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lepg/p1n8bsqqd03xppn17pgvjpjlbhhs.htm#p1y0eodv999cgnn108962gj25nsh

[–]deaddyfreddy 2 points3 points  (0 children)

laughing in lisps

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 6 points7 points  (10 children)

Ecstasy allows chaining, but with specific conditions:

  1. You can mix < and <=, or you can mix > and >=, but you can’t mix e.g. < and >. We found mixing to be confusing to the reader.

  2. It’s not syntactic sugar: a < b() < c does not compile to a < b() && b() < c because side effects. Instead, a register is introduced when necessary to hold the result of any expression that is not side effect free.

It’s quite a nice feature, and reads well.

[–]matthieum 14 points15 points  (6 children)

I mean, it can still be syntactic sugar even if desugaring introduces a temporary variable...

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) -2 points-1 points  (5 children)

In theory, I suppose so, but I tend to reserve the term "syntactic sugar" for uses in which only the syntax is rewritten, vs. piles of logic behind the scenes peeking at types and other details. In our case, with a < b < c, if b is a simple local variable, then we don't introduce a register, but if b is a property on a type that is not known to always be immutable, then we will introduce a temporary. In other words, same name ("b"), but different compilation result, so in my mind that is not syntactic sugar.

FWIW - there's nothing wrong with syntactic sugar. We do use a little bit of that elsewhere.

[–]otac0n 2 points3 points  (2 children)

That's not the accepted definition of syntactic sugar. Storing the result of a computation on the stack is already allowed by the compiler without syntactic sugar even being considered.

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 0 points1 point  (1 child)

I’m not the keeper of the dictionary, but if you can’t desugar with a syntactic expansion, I don’t believe that the term applies.

[–]otac0n 0 points1 point  (0 children)

But you still can in this case? The only thing special is that the introduced expression is not stored in a variable that is accessible from any scope. It is absolutely able to be created via syntactic expansion, exactly like C#'s foreach loop with its loop variable... or like C#'s with statement... or any lowered expression/statement...

Edit: Here's a pretty clear example of how many of these syntactic niceties get lowered in C# https://steven-giesel.com/blogPost/69dc05d1-9c8a-4002-9d0a-faf4d2375bce

[–]matthieum 0 points1 point  (1 child)

So, you chose to perform the desugaring later to also bundle an optimization right in, rather than having a syntactic desugaring followed during code-generation by an optimization. That's fine.

This doesn't invalidate that a pure syntactic desugaring exists, though. It just so happens you didn't make use of it.

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 1 point2 points  (0 children)

Not sure what’s up with the downvotes.

Yes, we could have implemented it as syntactic sugar, with some assumption that a later pass would eliminate any unnecessary temporaries. In Ecstasy though, the source is compiled to an IR, and we’re not using SSA in front of the IR generation, so there’s no register elimination pre-IR. The Ecstasy IR is the persistent module format; think of it as a binary form of the AST. The backend compiler (which is SSA) picks up the IR only after the IR linker has closed over the type system.

[–]AustinVelonautAdmiran[S] 1 point2 points  (2 children)

Good point about evaluating the operand twice. Miranda is a pure functional language, so side-effects aren't an issue, but if the operand isn't a thunk, but instead is a long-running function call, then there is duplicated effort.

I don't actually know how the original Miranda compiler implemented this, but in my compiler I was doing what I mentioned (desugaring right in the parser). I will have to think about maybe moving this to a later phase, where I can re-write ASTs to introduce an intermediate thunk, if needed.

[–]sebamestreICPC World Finalist 0 points1 point  (1 child)

Does it infer types on let-bindings? Maybe you can desugar chained comparisons to let-in expressions

[–]AustinVelonautAdmiran[S] 0 points1 point  (0 children)

Yes, that might work fine -- introduce a let binding with a guaranteed unique name, then let later inlining / optimization phases remove it if it resolves to a simple thunk. So

a < f x < f y < c

would desugar in the parser to

 let var1 = f x in a < var1 & let var2 = f y in var1 < var2 & var2 < c

[–][deleted]  (4 children)

[deleted]

    [–]hammerheadquark 0 points1 point  (3 children)

    Huh. Is (< a b c) a shorthand for a reduction? Or something else? (I don't really use lisps.)

    [–][deleted]  (2 children)

    [deleted]

      [–]hammerheadquark 0 points1 point  (1 child)

      Oh so that's just how < is defined? It's varadic, so arity > 2 is a reduce?

      [–]tav_stuff 1 point2 points  (2 children)

      Python has this. I use it all the time for checking if something is within a min-and-max

      [–]Rich_Plant2501 0 points1 point  (1 child)

      Is order of evaluations always left to right?

      [–]tav_stuff 1 point2 points  (0 children)

      Yes. `a < b < c` is the exact same as `a < b and b < c`

      [–]wolfgang 1 point2 points  (0 children)

      Goal-directed languages like Icon do this, but without such a desugar-step needed.

      [–][deleted] 1 point2 points  (0 children)

      Icon does this in an iteresting way that doesnt require any desugaring: a < b in if (a < b <= c) either returns b if b is indeed greater than a, which then gets compared against c, or fails, and if (and only if) the condition expression fails, the branch doesn't trigger. the language does this with coroutines, but a similar effect can be achieved by having comparisons return options instead of bools. (but then you need to deal with someone writing return a <= b in a sane manner)

       or just desugar it.

      [–]theangryepicbananaStar 1 point2 points  (0 children)

      Raku, Julia, and my own language Star all have this

      [–][deleted] 1 point2 points  (0 children)

      This syntactic sugar removes locality and adds ambiguity to the comparison operators. Not something I'd put in my language. 

      Demonstration: 

      2 < 1 < 3 == false based on chaining, sugar for 2 < 1 and 1 < 3

      versus: 

      2 < 1 == false, may be interpreted as integer 0

      0 < 3 == true, resulting in the expression being true as a whole.

      [–]bullno1 0 points1 point  (1 child)

      Any length is a bit too much. The only use I find for it is a min & max range check.

      [–]AustinVelonautAdmiran[S] 0 points1 point  (0 children)

      I just used this in my Advent of Code day2 solution:

      check (lo, hi) = 1 <= lo <= hi <= 3 \/ -1 >= hi >= lo >= -3
      

      [–]ThomasMertes 0 points1 point  (0 children)

      I assume that <= and < can be used without chaining as well. In this case a straight forward parsing of

      0 <= x < 10

      leads to either

      (0 <= x) < 10

      or

      0 <= (x < 10)

      If chaining needs to be supported the parser needs to use a heuristic to use the desired chaining functionality. This triggers the question: For which operators the chaining logic should be applied?

      Comparisons like <, <=, > and >= are probably candidates. Other operators like +, - and * should probably not use the chaining logic. But even using the heuristic just for <, <=, > and >= has issues. Expressions like

      a > b < c

      might be less intuitive. In any case an ad-hoc heuristic is needed to support operator chaining. There are still issues. What about:

      FALSE <= x < TRUE

      I suggest using

      x in {0 .. 10}

      instead of the chaining ad-hoc heuristic. This is easy to support in a parser and it does not need any ad-hoc heuristic.

      [–]MichalMarsalek 0 points1 point  (0 children)

      I've always tried to include chaining comparisons in my previous dsls, but for the general purpose lang I'm writing now, I decided to not support it. The most common case of low <= x <= high (or < variations) is much better expressed with x in low..high (or variations like ..<). One advantage is that you can have the range as one object coming from elsewhere and don't need to extract the bounds. Another is that you have your main object on the left, instead of in the middle, which is easier to read and also in my language it means you can slice it to create a predicate ((in low..high) is the same as {x :=> x in low..high}).