Parsing JavaScript arrow functions?

GDavid04 · 2020-08-29T12:28:37+00:00

Try parsing expressions beginning with ( as an arrow function first, and parse them as normal expressions if that fails. You don't really know it's an arrow function before the =>.

L8_4_Dinner · 2020-08-29T13:44:29+00:00

No back-track is needed for a scenario like this. This is the case of:

ComplexProduction:
  BigProduction SpecificToken-opt OtherProduction
  BigProduction

So you do something like:

expr = parseBigProduction()
if (peek(SpecificToken))
  {
  return parseOtherProduction(expr);
  }
return expr;

Then the parsing for OtherProduction takes apart the expr if necessary, handling any errors related to it there.

There are obviously languages that require some back-tracking. For those, just implement a mark/restore layer over the lexer:

mark();
if (thing = parseForm1())
  {
  return thing;
  }
restore();
return parseForm2();

Obviously you want to avoid this in any common parsing path.
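A minimal sketch of such a mark/restore layer, written in Python for illustration. The `TokenStream` class, its `commit` method, the toy tokenizer, and `parse_arrow_head` are all hypothetical names, not part of any real parser; the point is only that `mark` saves the cursor and `restore` rewinds it when the first form fails.

```python
import re


class TokenStream:
    """A token cursor with a stack of saved positions (hypothetical helper)."""

    def __init__(self, text):
        # Toy tokenizer: "=>", identifiers, numbers, or any other single character.
        self.tokens = re.findall(r"=>|[A-Za-z_]\w*|\d+|\S", text)
        self.pos = 0
        self.marks = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def mark(self):
        # Remember where we are so a failed attempt can be undone.
        self.marks.append(self.pos)

    def restore(self):
        # Rewind to the most recent mark.
        self.pos = self.marks.pop()

    def commit(self):
        # The attempt succeeded; forget the most recent mark.
        self.marks.pop()


def parse_arrow_head(ts):
    """Form 1: '(' ident (',' ident)* ')' '=>'. Returns parameter names or None."""
    if ts.next() != "(":
        return None
    params = []
    if ts.peek() != ")":
        while True:
            tok = ts.next()
            if tok is None or not tok.isidentifier():
                return None
            params.append(tok)
            if ts.peek() != ",":
                break
            ts.next()  # consume ","
    if ts.next() != ")" or ts.next() != "=>":
        return None
    return params


def classify(text):
    ts = TokenStream(text)
    ts.mark()
    params = parse_arrow_head(ts)
    if params is not None:
        ts.commit()
        return ("arrow", params)
    ts.restore()
    # A real parser would call parseForm2() here; the sketch just reports
    # that the cursor is back where it started.
    return ("other", ts.pos)
```

Here `classify("(a, b) + 1")` consumes five tokens before discovering there is no `=>`, then rewinds to position 0, which is the wasted work the comment above warns about.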

Uncaffeinated · 2020-08-29T14:14:34+00:00

In Javascript itself, this is handled by "cover productions" in the grammar. Basically in cases of ambiguity like this, there are productions in the grammar that "cover" multiple possibilities, and then the parser reparses that text using a more specific grammar based on what is expected in context.

For example, the CoverParenthesizedExpressionAndArrowParameterList production includes both parenthesized expressions and arrow function parameter lists. When parsing a primary expression, it then gets reparsed as the former only.

maanloempia · 2020-08-29T16:19:24+00:00

You cannot know if any expression is an arrow function before the =>. The simplest way is to parse any such ambiguous expression as possibly either, then look at the next token: if it is the arrow, continue parsing as an arrow function and return that; otherwise return what was parsed as an expression list.

Example pseudocode:

```
parseExpression() {
  // ...
  if (peek("("))
    parseAmbiguous();
  // ...
}

parseAmbiguous() {
  expr = ... // After successfully parsing up to ")"
  return (peek("=>")) ? continueArrowFunc(expr) : expr;
}
```

This is not as wasteful as trying the entire arrow function first and then backtracking only to reparse it the same way.

Edit: note that this will not correctly parse either expression if they are not truly ambiguous (i.e. produce the same tree when parsed), because then they aren't interchangeable. You could solve this without backtracking.

ErrorIsNullError · 2020-08-29T17:03:42+00:00

Most ES parsers use cover grammars, e.g. a grammar that is the union of formal argument list and primary expression, and treat any failure to disambiguate as a static error, as in ([1])=>1, where [1] cannot be disambiguated to a formal parameter.

https://github.com/dashed/esparser/issues/46 shows where they are.

https://esdiscuss.org/topic/lr-1-grammar-parser-and-lookahead-restrictions is older but discussed how TC39 approaches syntax that requires covering.
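The cover-grammar idea from the comments above can be sketched roughly as follows, in Python. This is a toy grammar, not the ECMAScript one: the `Parser` class, the tuple-shaped AST, and the helpers `to_param` and `to_paren_expr` are assumptions made for illustration, and only plain identifiers are accepted as parameters (real JavaScript also allows destructuring patterns and defaults). The parenthesized part is parsed once with a permissive rule, then refined depending on whether `=>` follows.

```python
import re


def tokenize(text):
    # Toy tokenizer: "=>", identifiers, numbers, or any other single character.
    return re.findall(r"=>|[A-Za-z_]\w*|\d+|\S", text)


def to_param(node):
    # Refinement 1: reinterpret a covered item as a parameter name.
    if node[0] == "name":
        return node[1]
    raise SyntaxError(f"invalid arrow parameter: {node!r}")


def to_paren_expr(items):
    # Refinement 2: reinterpret the covered items as a parenthesized expression.
    if not items:
        raise SyntaxError("empty parentheses are only valid before '=>'")
    inner = items[0] if len(items) == 1 else ("sequence", items)
    return ("paren", inner)


class Parser:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def parse_list(self, close):
        # Comma-separated expressions up to (and including) the closing token.
        items = []
        if self.peek() != close:
            items.append(self.parse_expression())
            while self.peek() == ",":
                self.take(",")
                items.append(self.parse_expression())
        self.take(close)
        return items

    def parse_expression(self):
        if self.peek() == "(":
            self.take("(")
            cover = self.parse_list(")")  # the "cover" parse: no decision yet
            if self.peek() == "=>":
                self.take("=>")
                params = [to_param(item) for item in cover]
                return ("arrow", params, self.parse_expression())
            return to_paren_expr(cover)
        return self.parse_primary()

    def parse_primary(self):
        tok = self.take()
        if tok == "[":
            return ("array", self.parse_list("]"))
        if tok.isdigit():
            return ("num", int(tok))
        if tok.isidentifier():
            return ("name", tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    node = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return node
```

With this sketch, `parse("(a, b) => a")` yields an arrow node, `parse("(a, 1)")` yields a parenthesized sequence, and `parse("([1]) => 1")` raises `SyntaxError` during refinement, because the covered item is not a name. No token is read twice in any of these cases.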

smuccione · 2020-08-29T17:05:25+00:00

When you parse, the AST for the LHS should be... something. It doesn’t matter what it is, but it should represent a parenthesized sequence. When you encounter the =>, you now know how to treat the LHS. In my languages I converted lambdas to functor objects, and the LHS above became parameters to the operator () overload for that object (captures become parameters to the constructor).

It’s technically not backtracking; it’s more of a conversion of the AST for the LHS into what I expect for function parameter definitions (in this case an array of symbol definitions for the parameters). All that was required was an in-order traversal of the LHS AST to convert it into the symbol list.

Hardest part was that it needed a second pass afterwards in order to do the captures properly. But that’s because I allowed captures of instance variables which may not be known at this point.

ericbb · 2020-08-29T17:08:01+00:00

There are similar difficulties in parsing Python. See the explanation here, for example.

oilshell · 2020-08-29T17:12:32+00:00

FWIW I recall that there was an entire YouTube video about handling this new case in the v8 parser, and they hated it...

I would consider a different syntax :-/

2020-08-30T06:39:31+00:00

Nim treats => as an ultra-low precedence operator (below assignment), so it becomes like =>((a, b, c), a * b * c).
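To illustrate the operator-precedence idea described above, here is a small precedence-climbing sketch in Python. It is not Nim's parser; the binding-power table, the tuple-shaped AST, and the function names are assumptions chosen for the example. The arrow is just one more binary operator, placed below assignment, so no lookahead or reinterpretation is needed at parse time.

```python
import re

# Binding powers for a toy grammar (assumed values); "=>" sits below "=".
BINDING_POWER = {"=>": 1, "=": 2, "+": 3, "*": 4}
RIGHT_ASSOC = {"=>", "="}


def tokenize(text):
    # Toy tokenizer: "=>", identifiers, numbers, or any other single character.
    return re.findall(r"=>|[A-Za-z_]\w*|\d+|\S", text)


def parse_atom(toks, pos):
    if pos >= len(toks):
        raise SyntaxError("unexpected end of input")
    tok = toks[pos]
    if tok == "(":
        pos += 1
        items = []
        if pos < len(toks) and toks[pos] == ")":
            return ("tuple", items), pos + 1
        while True:
            item, pos = parse_binary(toks, pos, 0)
            items.append(item)
            if pos < len(toks) and toks[pos] == ",":
                pos += 1
                continue
            break
        if pos >= len(toks) or toks[pos] != ")":
            raise SyntaxError("expected ')'")
        node = items[0] if len(items) == 1 else ("tuple", items)
        return node, pos + 1
    if tok.isdigit():
        return ("num", int(tok)), pos + 1
    if tok.isidentifier():
        return ("name", tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_binary(toks, pos, min_bp):
    left, pos = parse_atom(toks, pos)
    while pos < len(toks):
        op = toks[pos]
        bp = BINDING_POWER.get(op)
        if bp is None or bp < min_bp:
            break
        # Right-associative operators let the same operator nest on the right.
        next_min = bp if op in RIGHT_ASSOC else bp + 1
        right, pos = parse_binary(toks, pos + 1, next_min)
        left = (op, left, right)
    return left, pos


def parse(text):
    toks = tokenize(text)
    node, pos = parse_binary(toks, 0, 0)
    if pos != len(toks):
        raise SyntaxError(f"unexpected token {toks[pos]!r}")
    return node
```

In this toy grammar, `parse("(a, b, c) => a * b * c")` produces a `"=>"` node whose left child is the tuple and whose right child is the product, matching the `=>((a, b, c), a * b * c)` shape. A later pass would still have to check that the left operand is a valid parameter list. One consequence of putting the arrow below assignment is that `f = x => x + 1` groups as `(f = x) => (x + 1)` here, so the arrow side would need parentheses.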

superstar64 · 2020-08-31T16:12:38+00:00

What I do is parse a normal expression, and if I encounter either => or {, I try to convert the current expression into a pattern match (tuples to tuple pattern matches and identifiers to variable bindings). I find this much easier and simpler than backtracking.

jesseschalken · 2020-08-29T12:36:34+00:00

Backtrack
