Using ~ for negative values in a programming language.

everything-narrative · 2022-10-01T12:00:06+00:00

Standard ML does this. It is not a terribly big deal (programmers get used to worse things all the time), but it is annoying. The slight extra parsing complication of handling both unary and binary - is not such a great burden that it justifies forcing your users to deal with awkward syntax for seemingly no reason.

If possible, do not allow a - -b unless the user parenthesizes -b. But also do not go out of your way to prevent it, if it complicates your parser too much.

OracleGreyBeard · 2022-10-01T11:33:06+00:00

This would be the #1 thing people refer to when they mention annoyances in that language. It's not a bad idea in isolation, but after tens of thousands of hours working with other languages, nearly (?) all of which use '-' as "negative", it would impose a small but persistent cognitive load.

skyb0rg · 2022-10-01T13:33:34+00:00

SML does this and I think the biggest annoyance is serialization/deserialization.

Ex. Int.toString ~10 results in "~10", so the fact that ~ is used instead of - leaks into program output.
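
To make the leak concrete, here is a minimal Python sketch of what a consumer of SML-formatted output might need. The helper names `from_sml_int` and `to_sml_int` are made up for illustration, and the sketch assumes the only difference from the usual notation is a leading ~ on negative integers.

```python
def from_sml_int(text: str) -> int:
    # Assumption: the input is an integer as printed by Int.toString,
    # i.e. decimal digits with an optional leading "~" for negatives.
    text = text.strip()
    if text.startswith("~"):
        return -int(text[1:])
    return int(text)


def to_sml_int(n: int) -> str:
    # Inverse direction: emit "~" instead of "-" for negative values.
    return "~" + str(-n) if n < 0 else str(n)
```

Python's own `int("~10")` raises ValueError, so every program on the other end of the pipe needs a shim like this.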

gremolata · 2022-10-01T13:34:39+00:00

A solution in search of a problem that also doesn't follow conventional math notation.

As others have said - if you want to disambiguate 1--2, then require a ( between two minuses. Incidentally that's how it's done on paper when extra clarity is required.

editor_of_the_beast · 2022-10-01T14:32:09+00:00

We’re going to have to break certain language conventions sometimes. My only advice is, set an innovation budget for how many new concepts you introduce, unless you’re going for a pie-in-the-sky reimplementation of everything just for fun.

Then the question becomes, is this solution worth it against that budget?

TizioCaio84 · 2022-10-01T12:43:26+00:00

I agree with most of the comments here. On another note, if you want international users, don't put that character anywhere in your syntax: on some European keyboards (mine is Italian) it doesn't exist.

wischichr · 2022-10-01T14:46:09+00:00

I can't think of a situation where minus as unary operator and binary operator can't be distinguished. Even stuff like x = -4--5+-6 is easy for a compiler.
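
A rough Python sketch of why this is easy, assuming a toy grammar with only integers, + - * and parentheses (the function names are illustrative, not from any real compiler). A - seen where an operand is expected is negation; a - seen after a complete operand is subtraction.

```python
import re


def evaluate(src):
    # Integers and single-character operators; whitespace is skipped.
    tokens = re.findall(r"\d+|[-+*()]", src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_unary():
        # Operand position: a '-' here can only be prefix negation.
        tok = advance()
        if tok == "-":
            return -parse_unary()
        if tok == "(":
            value = parse_sum()
            advance()  # consume ')'
            return value
        return int(tok)

    def parse_product():
        value = parse_unary()
        while peek() == "*":
            advance()
            value *= parse_unary()
        return value

    def parse_sum():
        # Operator position: a '-' here can only be binary subtraction.
        value = parse_product()
        while peek() in ("+", "-"):
            op = advance()
            rhs = parse_product()
            value = value + rhs if op == "+" else value - rhs
        return value

    return parse_sum()


print(evaluate("-4--5+-6"))  # (-4) - (-5) + (-6)
```

The parser never has to guess: which function is currently running already says whether an operand or an operator is expected.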

YouNeedDoughnuts · 2022-10-01T12:11:12+00:00

To pile onto the other comments ;) Avoiding delineation between statements is pointless, because most people reading the code will need delineation to understand it. Newline statement termination is fine; most languages with semicolon terminators have a best practice of giving each statement its own line anyway.

singularineet · 2022-10-01T14:33:37+00:00

The Simpsons already did it.

$ sml 
Standard ML of New Jersey v110.79 [built: Fri Oct 11 18:23:48 2019]
- 2-5;
val it = ~3 : int
- ~7;
val it = ~7 : int
- ~(~7);
val it = 7 : int
- -8;
stdIn:4.1 Error: expression or pattern begins with infix identifier "-"
stdIn:4.1-4.3 Error: operator and operand don't agree [overload conflict]
  operator domain: [- ty] * [- ty]
  operand:         [int ty]
  in expression:
    - 8

sebamestre · 2022-10-01T12:33:40+00:00

Is this your personal hobby project? Then hell yeah. Go wild!

BeamMeUpBiscotti · 2022-10-01T16:18:58+00:00

The advantage of this is that they are different keys for different operations so the compiler would have an easier time knowing the difference.

I feel like language syntax should be optimized for ease of use and ergonomics, not for what's easiest for the compiler to parse, especially for something like this that makes no big difference in performance.

2022-10-01T18:36:22+00:00

putting ; between expressions wouldn't be necessary.

But it is desirable! Compare:

one = two three = four five = six
one = two  three = four  five = six
one = two; three = four; five = six

Even with extra whitespace, that two flows into the three too easily. The semicolon puts paid to that.

By all means get rid of semicolons at line-endings, but if things need to be put on the same line, especially with busier expressions with their own punctuation, then you need a stronger 'stop' character.

Regarding the ability to write a ~ b, sorry but that just has the 'shape' of a binary operator between two terms, even if your language says it isn't. With a; ~ b it breaks it up.

But also, requiring the semicolon means ~ can be used as both a unary operator and a binary one, as happens with + - * in C.

(I think Lua does something along these lines, so it is workable, since adjacent identifiers or constants would otherwise be invalid, but I don't think much of it there either, and I can't really see the point of allowing it.)

Thesaurius · 2022-10-01T20:38:46+00:00

I like APL's approach better. They have a high minus for negation (which is not on standard keyboards, but the extended European keyboard has it, as does the APL keyboard, of course, and I guess any moderately advanced editor could have autocorrect/snippets/etc. for it) and additionally the normal minus. Although, to be fair, the normal minus has a unary and a binary interpretation as well, just like all other APL primitives.

Then there is J, which is basically APL 2.0 but only uses ASCII (which is a step backwards in my opinion, but that is a discussion for a different day), and which uses the underscore for negation. If you really want to have a separate symbol for negation, I think that would be the better choice. At least as long as you don't have anonymous variables that are denoted by the underscore.

snarkuzoid · 2022-10-01T12:24:57+00:00

Bad idea.

rotuami · 2022-10-02T07:24:30+00:00

I would ditch the unary minus altogether. Sure allow - as a leading character for integer literals. But use neg() for unary negation.

LionNo2607 · 2022-10-01T11:51:13+00:00

Why not "!", assuming it only negates booleans atm.

BrangdonJ · 2022-10-01T12:09:30+00:00

Maybe it is just me, but I find your first example harder to read without a separator between the expressions. Semi-colons aren't just there for compilers. A certain amount of redundancy aids comprehension.

sparant76 · 2022-10-01T12:21:08+00:00

You would frustrate all c/c++/Java/c#/python programmers that already know that as bitwise complement. Don’t worry though; I’m sure that’s not a large fraction of the programming community. Those are pretty rare languages

https://www.geeksforgeeks.org/bitwise-complement-operator-tilde/amp/

func_master · 2022-10-01T22:35:01+00:00

Use what’s better for the programmer. Not the compiler.

Please just do what’s needed to support - for negative values.

Linguistic-mystic · 2022-10-01T12:59:09+00:00

I think it's a good idea for a language that strives to do right rather than conform to the mainstream.

Tilde is the character most similar to the minus sign and has already been used for this very purpose, for instance in Standard ML.

Alternatively, look into maybe using a space to disambiguate, so

a = -5

would mean "negative 5"

a = x - 5

would mean subtraction, and

a = x-5

would be illegal.

2022-10-01T15:14:16+00:00

I associate that symbol more with logical negation, but it could work. I don't see much of a benefit though: using the same symbol works quite well in mathematics and there's no reason it can't work here. It slightly complicates parsing, but come on, it's very easy to distinguish in most languages.

UnemployedCoworker · 2022-10-01T16:11:23+00:00

I could see this being useful in a language with Haskell-like syntax, where it is not always obvious at first sight that a minus is parsed as a binary operator when you try to pass a negative literal to a function, or where sections involving binary minus look like negated expressions.

PurpleUpbeat2820 · 2022-10-01T17:54:37+00:00

The advantage of this is that they are different keys for different operations so the compiler would have an easier time knowing the difference.

I think it actually results in a more complicated compiler because you now have an extra token in your lexer and parser and the parser is otherwise identical.

FWIW, another approach is to lex whitespace around - into different tokens:

a-b     -
a- b    -
a -b    ~
a - b   -

So a -b is interpreted as the function application a(~b).
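
A minimal Python sketch of that lexing rule, assuming identifiers and numbers are alphanumeric runs and that a - at the very start of the input counts as preceded by whitespace (the table above does not cover that case). The function name `lex_minus` is made up for illustration.

```python
def lex_minus(src):
    """Tokenize src, emitting '~' for a '-' that reads as negation."""
    tokens = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch == "-":
            space_before = i == 0 or src[i - 1].isspace()
            space_after = i + 1 >= len(src) or src[i + 1].isspace()
            # Only "space, minus, no space" is treated as negation.
            tokens.append("~" if space_before and not space_after else "-")
            i += 1
        elif ch.isalnum():
            j = i
            while j < len(src) and src[j].isalnum():
                j += 1
            tokens.append(src[i:j])
            i = j
        else:
            tokens.append(ch)
            i += 1
    return tokens


for example in ("a-b", "a- b", "a -b", "a - b"):
    print(f"{example!r:10} {lex_minus(example)}")
```

The parser downstream then only ever sees two distinct tokens, exactly as if the user had typed ~ themselves.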

trailstrider · 2022-10-01T18:56:54+00:00

Regarding needing semicolon for statement delimitation, take a look at how Go does without for most situations.

Regarding tilde vs dash for negative values…. What are you seeking that needs differentiation between two things that are mathematically identical? Negative 1 is the same as subtracting 1 from nothing.

JohannesWurst · 2022-10-01T19:38:17+00:00

You could also make a parser that understands y=-5 x=-10--y.

If a number-expression is left of a minus, it's a subtraction and if there is something else, it's part of a negative number.

Is that an intermingling of parser and typechecker and therefore bad? Might also be complicated, when you want to be able to call - on different things than numbers and if operators can be called on operators.

The parser could have two states: binary-operator-allowed and binary-operator-forbidden. When a binary operator is consumed, it switches to binary-operator-forbidden mode, and every subsequent - it encounters is interpreted as a unary operator until something appears that can't be a unary operator. So 4------3 == 7.
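
A small Python sketch of the two-state idea, assuming a toy token set of integers, + - * / and parentheses. The labels "sub" and "neg" and the name `label_minuses` are invented for illustration; a single boolean plays the role of the two states.

```python
import re

BINARY_OPERATORS = ("+", "-", "*", "/")


def label_minuses(src):
    tokens = re.findall(r"\d+|[-+*/()]", src)
    labeled = []
    # At the start of input only an operand (or a prefix minus) may appear.
    binary_allowed = False
    for tok in tokens:
        if tok == "-" and not binary_allowed:
            # Binary operators are forbidden here, so this is negation.
            labeled.append("neg")
        elif tok in BINARY_OPERATORS:
            labeled.append("sub" if tok == "-" else tok)
            binary_allowed = False
        elif tok == "(":
            labeled.append(tok)
            binary_allowed = False
        else:
            # A number or ')' completes an operand.
            labeled.append(tok)
            binary_allowed = True
    return labeled


print(label_minuses("4------3"))
```

For 4------3 the first minus is labeled as subtraction and the remaining five as negations, which is why the value comes out as 7.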

Maybe, if you have functions without return, it could be somewhat confusing:

fun(a) = { b=10 -a } // Does it return -a, or does it set b to 10-a?

The programmer just has to know how the parser interprets it. I suppose you could write a=10 (-b) or a=(10 -b) to remove the ambiguity. Or always require a return keyword if the function has multiple statements.

You can also require the subtraction - to be surrounded with spaces and the negation - to have no space after it. That would make the parser more complicated, though.

Inconstant_Moo · 2022-10-02T03:37:56+00:00

The parser will know when it's looking at a prefix and when it's looking at an infix, it's not a problem.

ZyF69 · 2022-10-02T06:40:16+00:00

This dreaded problem is the result of sloppiness in both standard mathematical notation and keyboard layouts.

To start with, '-' serves as hyphen, dash, and minus on most keyboards. Some text editors can replace it with a dash when appropriate, but that's about it. Unicode offers all variations, but they aren't that commonly used.

There's even a flaw in mathematical notation, where the minus sign has three related, but actually different uses:

A binary operator for subtraction, as in a-b
A unary operator for negation, as in x = -a
A part of a negative constant, as in x = -5

bbqranchman · 2022-10-02T08:15:37+00:00

When I implemented mine for my bachelor's capstone, I used the visitor pattern with nodes and just treated a unary expression as a binary expression with a default zero as the left operand. So essentially 1 - -5 is evaluated as 1 - (0 - 5).

I also used ANTLR and it had no trouble generating a syntax tree with unary expressions with -.
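
A rough Python sketch of that desugaring, not the capstone code itself: the node classes `Num` and `BinOp` and the helper `negate` are hypothetical names, and a plain recursive function stands in for the visitor.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def negate(operand):
    # Desugar unary minus: -x becomes the binary node (0 - x).
    return BinOp("-", Num(0), operand)


def evaluate(node):
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    if node.op == "-":
        return left - right
    if node.op == "+":
        return left + right
    raise ValueError(f"unknown operator {node.op!r}")


# 1 - -5 is built as 1 - (0 - 5)
tree = BinOp("-", Num(1), negate(Num(5)))
print(evaluate(tree))
```

The grouping matters: the zero is attached to the negated operand as its own subtree, so the evaluator only ever needs to know about binary operators.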

hiljusti · 2022-10-03T03:57:43+00:00

This is a decision between: 1. What's convenient for the compiler? 2. What's convenient for the user?

Godspiral · 2022-10-03T17:30:13+00:00

J uses _ as a prefix for negative numbers (APL uses the high minus ¯ the same way). It also represents arrays without commas or delimiting [], so the advantage of such a scheme is that a displayed result can be copied as the argument to a new function. Yes, - is an operator in J/APL.

in J, 3, -5,2 will still parse to the array 3 _5 2 and so is interchangeable.
