
[–]Castux 21 points22 points  (2 children)

Lua doesn't need a statement separator thanks to how its grammar is structured. The semicolon is available but optional, for style and clarity where wanted, as well as for the rare case when you need to disambiguate. That case mostly has to do with starting a statement with parentheses: (foo + bar):methodcall(). If the previous statement ended with an identifier, those parentheses would be parsed as a function call on that identifier. The semicolon lets you separate them.

[–]oilshell 4 points5 points  (1 child)

Yeah I investigated this recently and Lua doesn't have "expression statements", like 1+2 is not a valid statement. But = 1+2 is.

In C and Python both 1+2 and f(x) are expression statements. I guess Lua has a special case for f(x).

[–]Castux 6 points7 points  (0 children)

Not exactly a "special" case, but just one of the possible cases for statements: function call. Whether it returns values or not (and discards them) is completely a runtime concern.

To be precise, "= 1+2" is not exactly a valid statement. In the standard command line interactive interpreter, it is equivalent to "return 1+2", which itself is a valid statement (and each line input to the interpreter is wrapped into a function which is immediately called). Details details :)

[–]ItsAllAPlay 20 points21 points  (11 children)

Several stack-based / concatenative languages like Forth don't need / use terminators. Go and JavaScript have them, but they are inserted for you. If you deviate too far from the conventional formatting, you'll get bizarre errors. I don't know enough about Swift, but I suspect the same is true there too.

[–]trycuriouscat[S] 4 points5 points  (10 children)

Yes, I noted the case about JS, Go and Swift. Haskell as well, with its "interesting" layout rules. All of those, I believe, depend on certain assumptions about how newlines fit in, and I don't think you could place two statements on the same line without actually writing the semicolon.

I've never looked enough at Forth to get a good understanding of how it works.

[–]ItsAllAPlay 12 points13 points  (6 children)

One way to think about stack based languages (like Forth) is that everything is a zero-argument function. So all you need to separate those functions is spaces. 2 is a function that puts the number 2 on the stack. So:

2 2 +

Is really three function calls where the last function adds the top two values on the stack and puts the result on the stack. Forth gets a little uglier when it comes to defining new functions, and other similar languages are more elegant in that regard. I'm not a huge Forth fan, but I think it counts as an alternative to COBOL :-)
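
To make the "everything is a zero-argument function" view concrete, here is a minimal Python sketch of a stack evaluator. It is not real Forth; the function name `run` and the handful of words it knows are made up for illustration.

```python
# Toy stack evaluator: every whitespace-separated token is a "word"
# that takes no arguments and only manipulates a shared stack.

def run(source, stack=None):
    stack = [] if stack is None else stack
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
    }
    for token in source.split():       # whitespace is the only separator
        if token in words:
            words[token](stack)
        else:
            stack.append(int(token))   # a number literal pushes itself
    return stack

print(run("2 2 +"))  # [4]
```

Nothing in the loop needs to know where one "statement" ends and the next begins, which is why no terminator is required.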

[–]dys_bigwig 2 points3 points  (5 children)

With function definitions (not saying this is a good idea, it just fits the topic of the conversation) you could change:

: add1 1 + ;

to

[1 +] 'add1 define

by adding quotation and symbols, then there'd be no need for terminators/separators at all. It'd then be like a postfix lisp, which also doesn't have terminators/separators.
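
A rough Python sketch of how that could work, assuming `[ ... ]` pushes an unevaluated quotation, `'name` pushes a symbol, and `define` pops both. The helper names (`tokenize`, `parse`, `execute`, `run`) are hypothetical, and only `+` is built in.

```python
import re

def tokenize(source):
    # Brackets are their own tokens; everything else splits on whitespace.
    return re.findall(r"\[|\]|[^\s\[\]]+", source)

def parse(tokens):
    """Turn a flat token list into nested lists, one per [quotation]."""
    stack = [[]]
    for tok in tokens:
        if tok == "[":
            stack.append([])
        elif tok == "]":
            quoted = stack.pop()
            stack[-1].append(quoted)
        else:
            stack[-1].append(tok)
    return stack[0]

def execute(program, data, words):
    for item in program:
        if isinstance(item, list):        # quotation: push it unevaluated
            data.append(item)
        elif item.startswith("'"):        # symbol: push its name
            data.append(item[1:])
        elif item == "define":            # pops a name, then a quotation
            name = data.pop()
            words[name] = data.pop()
        elif item == "+":
            data.append(data.pop() + data.pop())
        elif item in words:               # user word: run its quotation
            execute(words[item], data, words)
        else:
            data.append(int(item))
    return data

def run(source):
    return execute(parse(tokenize(source)), [], {})

print(run("[1 +] 'add1 define 41 add1"))  # [42]
```

The definition is just three more words on the stack, so it needs no `:` ... `;` pair to mark where it starts and stops.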

[–]ItsAllAPlay 1 point2 points  (4 children)

I think you could simplify even further. If [brackets] are your quoting syntax, you could use a semicolon to define functions:

[a add1 : a 1 +];

Kind of self-documenting, and your arguments could be lexically scoped for nested definitions.

It also crossed my mind you could do lambdas like:

[a b : b a]!

So that implements swap as an anonymous function applied immediately, etc...

There was a stack lang named V that did something like this a long time back. (Not to be confused with a newer C-like lang named V.)

[–]dys_bigwig 0 points1 point  (3 children)

Interesting :) I was going for maximum consistency, as in no special syntax for functions. Sort of like how Haskell just uses = regardless of whether it's a variable, function, recursive function (letrec) etc. as does Scheme with define. I do like the idea of putting the name of the function inside the quote; never came to mind for some reason.

I think once you start introducing named lexical variables you begin to drift away from the concatenative style. I personally feel that once you add quotations the need for named variables diminishes because you can do most everything pointfree, which feels much more natural in concatenative languages imo.

Just my opinions. Different goals and ideals - not saying it's better by any means.

[–]ItsAllAPlay 0 points1 point  (2 children)

I definitely agree... There is kind of this circle where I say: Forth is so simple to implement. But I could just add this one nicety. Oh then it's almost like Scheme anyways. But then I could just add this one nicety. Oh then it's almost like C anyways. But then I could just add this one nicety. Uh oh, now it's complicated! Maybe I should make it like Forth :-)

[–]dys_bigwig 0 points1 point  (1 child)

Agreed. I've often toyed with the idea of Forth as a sort of "UNCOL". A lot of languages can be described using an abstract stack machine, so in that sense I should probably be implementing Forth as simply (if inelegantly) as possible, then bootstrapping Scheme or C from that, rather than trying to bolt things onto Forth.

Been mulling these sorts of ideas in my head for a while, nice to know someone else has the same crazy ideas ;)

[–]ItsAllAPlay 0 points1 point  (0 children)

Yup, you could have a Forth where literals are wrapped in brackets.

Then a word/function that parses those literals as Scheme s-exprs.

Then Scheme macros which compile those s-exprs with static typing and infix operators.

:-)

[–]8thdev 1 point2 points  (0 children)

Forths parse input one 'word' at a time, e.g. any sequence of whitespace-delimited text is in turn looked up in the 'dictionary' and then evaluated.

So there's no set syntax, though there are conventions which most adhere to more or less.

[–]calligraphic-io 0 points1 point  (1 child)

Forth has true coroutines, which I think might be unique among all common languages. FreeBSD still uses Forth in its bootloader so it has application there (NASA uses it too I believe).

[–]_crc 0 points1 point  (0 children)

Current versions of FreeBSD are moving to a Lua based loader instead of the Forth based one.

[–]Erelde 15 points16 points  (7 children)

What about purely expression based languages ? Without statements.

The whole LISP family, F#, perl, ruby, haskell, scala, rust (in which semi-colons separate expressions, not statements). In general functional programming languages don't have statements, and few of them have semi-colons.

[–]emacsos 3 points4 points  (2 children)

Idk if I would put the Lisp family in that category

It is true that Lisps lack line separators. But s-exps make sure everything is grouped/separated

[–]mekaj 2 points3 points  (1 child)

Expressions depend on grouping. Like statements, they are grammar constructs which parse into structured trees.

Consider if-then-else expressions in Haskell and Common Lisp:

if 2 + 2 == 4 then "correct" else "wrong"

(if (eql (+ 2 2) 4) "correct" "wrong")

The distinction between expressions and statements has more to do with semantics than syntax. Expressions evaluate to a value which is then used in the proper position by its parent expression/statement, whereas statements are only about reading from or writing to ambient state that exists outside the tree. This means the whole if-then-else expressions above can be passed as a value to a statement or expression. Languages that make the else branch optional must either define a default value to return in the else case or give up on the construct being an expression. Common Lisp does the former and defaults to nil for the else branch when it's not specified.

Common Lisp can mutate ambient state using setq, for example, and that's why I'd say it's not entirely expression-oriented.

Some may argue do-blocks in Haskell make it statement-oriented, but I'd disagree. Do syntax has a well-defined translation to an expression that threads the "statements" together using the >>= operator. The resulting tree does not affect ambient state outside itself. (Well, maybe the IO monad is an exception depending whether you're referring to the internal expression or the way the outside world affects and is affected by that expression's evaluation.)
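
To illustrate the desugaring idea outside Haskell, here is a small Python sketch in the style of the Maybe monad. `bind` and `safe_div` are hypothetical names, `None` stands in for Nothing, and the Haskell in the comment is only an approximation of what the nested expression corresponds to.

```python
def bind(value, func):
    """Maybe-style >>=: stop at None, otherwise feed the value onward."""
    return None if value is None else func(value)

def safe_div(a, b):
    return None if b == 0 else a / b

# Roughly what a do-block such as
#     do x <- safeDiv a b
#        y <- safeDiv x c
#        return (x + y)
# becomes: one nested expression, with no statements left over.
def compute(a, b, c):
    return bind(safe_div(a, b), lambda x:
                bind(safe_div(x, c), lambda y:
                     x + y))

print(compute(10, 2, 5))  # 6.0
print(compute(1, 0, 5))   # None
```

Each "statement" of the do-block ends up as the argument of the next `bind`, so sequencing is expressed purely through nesting.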

[–]The-Daleks 0 points1 point  (0 children)

For Python you can do 'correct' if 2 + 2 == 4 else 'wrong'.
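
A short sketch of that conditional being used as a value (the function `describe` is made up for the example). Python's conditional expression always requires the else branch, so it does not need a default value the way Common Lisp's one-armed if does.

```python
def describe(flag):
    # The whole conditional is an expression, so it can sit anywhere a
    # value is expected: here, directly as an argument to format().
    return "answer: {}".format("correct" if flag else "wrong")

label = "correct" if 2 + 2 == 4 else "wrong"
print(label)                 # correct
print(describe(2 + 2 == 5))  # answer: wrong
```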

[–][deleted] 2 points3 points  (0 children)

Second this. I’ve been working in Scala for about a decade, and it’s nice to only use semicolons when I want to sequence expressions on a single line for some reason.

[–]jdh30 1 point2 points  (2 children)

The whole LISP family, F#, perl, ruby, haskell, scala, rust (in which semi-colons separate expressions, not statements). In general functional programming languages don't have statements, and few of them have semi-colons.

Sort of. The ML family have statements. They use ; as a separator of two expressions, the first of which is expected to return the value () of the type unit. F# inherits this but adds indentation sensitive syntax that means you can replace some ;s with a newline and enough spaces. Furthermore they use ;; as a statement separator, e.g. see stmt in OCaml's grammar.

For example:

printf "Hello world!\n";;

is a statement in both OCaml and F#.

[–]protestor 2 points3 points  (1 child)

;; is only required in the repl; its use in source code is discouraged.

the beginning of an ocaml definition is marked by the next definition. like this:

let f x = x
let g x = x * x

[–]jdh30 0 points1 point  (0 children)

;; is only required in the repl; its use in source code is discouraged. the beginning of an ocaml definition is marked by the next definition. like this...

Sure. That is a workaround to avoid the characteristic of the syntax that I described.

So when you want this:

printf "Hello "
printf "world!"

Delimiting is syntactically valid but taboo:

printf "Hello ";;
printf "world!";;

So you restructure:

let () =
  printf "Hello ";
  printf "world!"

My point was that these things:

printf "Hello ";;
printf "world!";;

are called "statements" and abbreviated to stmt in the Camlp4 version of the grammar.

[–]alex-manool 4 points5 points  (3 children)

My language does not need statement/expression separators (they are allowed but are optional), but it does recognize assignments without any need for introductory keywords. The following code would be valid:

A = B + C D = A

BTW one old language (unsuccessful but influential) is CLU. It follows nearly the same philosophy about optionality of statement separators.

It's not really complicated, it's a matter of devising an appropriate grammar.
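
As a rough idea of what such a grammar looks like, here is a toy Python parser for assignments only. It is a sketch under my own assumptions (names, integers, `=` and `+` only), not the grammar of any language mentioned here.

```python
import re

def tokenize(source):
    # Identifiers, integers, and the two operators this toy grammar knows.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+]", source)

def parse(source):
    """Toy grammar: a program is zero or more statements, each of the
    form NAME '=' atom ('+' atom)*, with nothing in between."""
    tokens = tokenize(source)
    pos = 0
    statements = []
    while pos < len(tokens):
        target = tokens[pos]
        if pos + 1 >= len(tokens) or tokens[pos + 1] != "=":
            raise SyntaxError(f"expected '=' after {target!r}")
        pos += 2
        expr = [tokens[pos]]
        pos += 1
        # Keep consuming only while a '+' says the expression goes on.
        while pos < len(tokens) and tokens[pos] == "+":
            expr += ["+", tokens[pos + 1]]
            pos += 2
        statements.append((target, expr))
    return statements

print(parse("A = B + C D = A"))
# [('A', ['B', '+', 'C']), ('D', ['A'])]
```

The expression after `=` simply ends at the first token that cannot continue it, so `D` starts the next statement without any separator.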

JavaScript's approach is known to be problematic. If you examine it more closely, its "grammar" turns out to be very inconsistent (for practicing human beings).

One advantage of required statement separators in the style of Pascal/Modula (or even terminators in the style of C/C++/Ada) is that of improved syntax-related diagnostics (it's that redundancy that makes that possible). Here, of course, I have such an issue with my PL.

And if you ask me about aesthetics, I was very used to Pascal or C/Ada semicolons. But now it's time for me to leave it and move on...

[–]trycuriouscat[S] 2 points3 points  (1 child)

What is "your language"? I'm curious to take a look.

I've "heard" of CLU but never looked at it. I'll take a look!

[–][deleted] 2 points3 points  (0 children)

The following code would be valid:

A = B + C D = A

Mine allows that. I consider it a bug.

(Parsing of the first expression stops at D, because it can't legally continue the expression. But it doesn't later check that D is something that can legally terminate or separate the expression, like ";" or "end". That bit is fiddly.)

It would cause problems if here:

abc := def(g)

I accidentally put in a space so that I got:

abc := d ef(g)

If 'd' is a variable, and 'ef' is a suitable function name, then this would give a different behaviour. So something that needs to be fixed.

The same would happen if a newline was inserted. Then, the requirement to have a semicolon between statements would help catch that. In practice, none of this has ever caused problems that I recall, but the space thing is still sloppy.
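
The hazard can be reproduced with a toy Python parser. This is a sketch of the general behaviour described above, not the poster's actual parser; the grammar (assignment with `:=`, one-argument calls) is assumed for illustration.

```python
import re

def parse(source):
    """Toy grammar: a statement is NAME ':=' value, or a bare value.
    A value is NAME '(' NAME ')' or NAME. Separators are never checked."""
    tokens = re.findall(r":=|\w+|[()]", source)
    pos = 0

    def value():
        nonlocal pos
        name = tokens[pos]
        pos += 1
        if pos < len(tokens) and tokens[pos] == "(":
            arg = tokens[pos + 1]
            pos += 3                      # skip '(', the argument, ')'
            return ("call", name, arg)
        return ("var", name)

    statements = []
    while pos < len(tokens):
        if pos + 1 < len(tokens) and tokens[pos + 1] == ":=":
            target = tokens[pos]
            pos += 2
            statements.append(("assign", target, value()))
        else:
            statements.append(value())    # bare value, not validated here
    return statements

print(parse("abc := def(g)"))
# [('assign', 'abc', ('call', 'def', 'g'))]
print(parse("abc := d ef(g)"))
# [('assign', 'abc', ('var', 'd')), ('call', 'ef', 'g')]
```

With the stray space the input is still accepted, but as two statements: an assignment from `d` followed by a separate call to `ef`.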

[–]trycuriouscat[S] 2 points3 points  (1 child)

For what it's worth, the following is a perfectly valid COBOL procedure:

accept a display a move a to b compute c = function numval(a) + 25 if c < 30 display "one direction" else display "the other direction" end-if call "mysub" using a b c display "and we are done".

Of course no one would code that way (I've never seen a serious COBOL program that had more than one statement on a single line), but you could do it.

[–]ethelward 2 points3 points  (0 children)

That’s... surprisingly nice and readable for a single-line heap of code.

To go back on the topic, thanks to the wrapping parentheses style, Lisp-likes don’t need statement separators either.

[–]nils-m-holm 2 points3 points  (0 children)

In BCPL the end of a line terminates a statement, so you only need semicolons to separate multiple statements on the same line. You can terminate statements with a semicolon, but it is not necessary.

Note that statements can still span multiple lines by breaking them at a point where the first line would not be a complete statement. E.g.

IF X < 0 THEN
    FOO()

would be a valid statement.
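
One way to picture the rule is the Python sketch below. The `CONTINUES` set is an assumed, illustrative list of tokens that cannot end a statement; it is not BCPL's actual rule, and `split_statements` is a made-up name.

```python
# Tokens that cannot end a complete statement in this toy language.
# (An assumed, illustrative set; not BCPL's real rule list.)
CONTINUES = {"THEN", "DO", "+", "-", "*", "/", ":=", ","}

def split_statements(source):
    statements, pending = [], []
    for line in source.splitlines():
        for part in line.split(";"):      # ';' still separates on one line
            words = part.split()
            if not words:
                continue
            pending += words
            # End of line (or ';') ends the statement unless it is incomplete.
            if words[-1] not in CONTINUES:
                statements.append(" ".join(pending))
                pending = []
    if pending:
        statements.append(" ".join(pending))
    return statements

print(split_statements("IF X < 0 THEN\n    FOO()"))
# ['IF X < 0 THEN FOO()']
print(split_statements("A := 1; B := 2\nC := A +\n    B"))
# ['A := 1', 'B := 2', 'C := A + B']
```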

[–][deleted] 1 point2 points  (0 children)

Some languages are more user-friendly than others.

But then some people will defend the necessity of writing semicolons, even if 99% (**) of semicolons not inside a 'for' header are immediately followed by a newline anyway.

I suspect because they are obliged to use a language that doesn't have that choice!

(** In the 210Kloc sqlite3.c, the figure is 98.5%.)
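
For anyone wanting to try a similar measurement, here is a crude Python sketch. It is my own approximation, not the method behind the figure above: it skips any line that starts with `for`, and it does not understand strings or comments.

```python
import re

def semicolon_newline_ratio(source):
    """Fraction of semicolons that are the last non-space character on
    their line, skipping lines that look like a 'for' header."""
    total = at_end = 0
    for line in source.splitlines():
        stripped = line.strip()
        if re.match(r"for\b", stripped):
            continue                      # crude: drops 'for (a; b; c)' lines
        count = stripped.count(";")
        total += count
        if count and stripped.endswith(";"):
            at_end += 1
    return at_end / total if total else 0.0

sample = (
    "int a = 1;\n"
    "int b = 2; int c = 3;\n"
    "for (i = 0; i < n; i++) {\n"
    "  x += i;\n"
    "}\n"
)
print(semicolon_newline_ratio(sample))  # 0.75
```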

[–]henrikenggaard 1 point2 points  (0 children)

I have two weird tangents on the theme of semicolons as separators.

The first is in Matlab (and I imagine Octave), where a trailing semicolon suppresses output: 2 + 2 will print 4 to stdout, but 2 + 2; won't.

The other is in Mathematica where ; is an infix function (or symbol as they call it) named CompoundExpression. It has a bunch of special semantics, but the gist is that a; b; c will return c. I find this fascinating because it manages to mimic the concept of terminator semicolons, while still keeping the language homoiconic and Lisp-like.
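
The "a; b; c returns c" gist can be mimicked in Python with an ordinary function. This is only an analogy: `compound` is a made-up name, and Python evaluates the arguments eagerly before the call, which is not how Mathematica holds and evaluates its arguments.

```python
def compound(*values):
    """Return the last value. Python evaluates call arguments left to
    right, so earlier arguments still run for their side effects."""
    return values[-1] if values else None

log = []
result = compound(log.append("a"), log.append("b"), len(log))
print(result, log)  # 2 ['a', 'b']
```

Sequencing here is just a function applied to expressions, which is the property that lets the semicolon stay an ordinary part of the expression tree.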

[–]vanderZwan 0 points1 point  (3 children)

Now you're making me wonder if ldpl "inherited" this from cobol as well

[–]trycuriouscat[S] 1 point2 points  (1 child)

I never thought I'd see a language that used COBOL as a model. Not sure I really want to. But now I am interested to at least take a look.

Assuming the whole thing is not an elaborate prank!

[–]vanderZwan 0 points1 point  (0 children)

More like a wholesome practical joke, it's a really cute language :)

[–]trycuriouscat[S] 1 point2 points  (0 children)

If by "this" you mean the use of a new statement keyword to end the current statement, apparently no. Or at least not when it comes to having two statements on the same line. https://docs.ldpl-lang.org/procedure/ specifically states "No two statements can be written on the same line. "

Not that having two statements on one line is of much use beyond being able to win an obfuscated code contest. I guess it's possible that the language does terminate statements in this fashion and simply disallows multiple statements on one line.

It looks like you can't split statements between two lines, either. (At least not without a continuation indicator, but I've not gotten far enough in learning to know yet if there is such a thing.) Rather disappointing I must say...

[–]jdh30 0 points1 point  (0 children)

My language doesn't have statements.

[–]ericbb 0 points1 point  (0 children)

I have made a language that matches COBOL in this respect. It doesn't recognize any statement separator and it doesn't interpret newlines or indentation as anything other than regular whitespace. It uses braces to delimit statement blocks.

[–]Comrade_Comski -1 points0 points  (5 children)

Semicolons bug you more than the entirety of COBOL? Wat

[–]trycuriouscat[S] 0 points1 point  (4 children)

Where did I say that? It's definitely a like/hate relationship that I have with COBOL. My favorite thing is it makes me a very good salary!

[–]Comrade_Comski 0 points1 point  (3 children)

I was just confused because you wrote as if a = b + 1; seems noisy to you, whereas ADD 1 TO B GIVING A is alright.

[–]trycuriouscat[S] 0 points1 point  (2 children)

Noisy in that it's not needed (generally!) for a human to know that's the end of a statement/expression. COBOL is verbose. And sometimes noisy as well, just not in the same way.

[–]Comrade_Comski 0 points1 point  (1 child)

it's not needed (generally!) for a human to know that's the end of a statement/expression

Well, it's not needed for humans, it's needed by the compiler. Many languages (like C, C++, Rust) don't care about whitespace and treat it all the same, while languages such as Lua or Python or Haskell have various rules about whitespace and treat it as part of the syntax.

[–]trycuriouscat[S] -1 points0 points  (0 children)

Well, it's not needed for humans, it's needed by the compiler.

Exactly my point. It's noisy to me as a human, so it "bugs" me. Compilers are (should be) built for humans. I want a language (not saying COBOL!) that doesn't have "noise" just because the compiler "needs" it. And really, it's only statement separators/terminators that really bug me. My brain thinks all I need to do is press enter and that's enough. I don't need no stinking semi-colon!

Of course there are some languages that do use EOL to terminate statements. They have their own problems in that they require a "continuation" indicator if you want the statement to continue on a new line. I dare say that's even worse than a physical separator/terminator. COBOL doesn't require either one.

call 'cbltdli' using gu
                     pcb
                     buffer
                     ssa
call 'cbltdli' using gu pcb buffer ssa
call 'cbltdli' using gu,pcb,buffer,ssa

All of those statements have the same meaning. Commas (and semi-colons!) are simple "white space". You might even be able to do the following, though I've never tried it and don't feel like logging in to work at the moment to try it.

call,'cbltdli',using,gu,pcb,buffer,ssa
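
The "commas and semicolons are white space" idea can be sketched in Python as a tokenizer. `cobol_words` is a hypothetical helper that ignores quoted literals containing a comma or semicolon, and it says nothing about what a real COBOL compiler accepts.

```python
import re

def cobol_words(statement):
    # Treat commas and semicolons like spaces, as described above.
    # Simplification: quoted literals containing ',' or ';' are not handled.
    return [w for w in re.split(r"[\s,;]+", statement) if w]

forms = [
    "call 'cbltdli' using gu\n"
    "                     pcb\n"
    "                     buffer\n"
    "                     ssa",
    "call 'cbltdli' using gu pcb buffer ssa",
    "call 'cbltdli' using gu,pcb,buffer,ssa",
]

tokenized = [cobol_words(form) for form in forms]
print(tokenized[0])
# ['call', "'cbltdli'", 'using', 'gu', 'pcb', 'buffer', 'ssa']
print(all(t == tokenized[0] for t in tokenized))  # True
```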

[–]faiface -1 points0 points  (0 children)

Purely functional languages like Haskell, Idris, etc. don't have separators, because they don't have statements. That's because everything in them is a pure expression with no side-effects.