jsparagus/js_parser: Generating a parser for JavaScript

In this directory:

  • esgrammar.pgen - A grammar for the mini-language the ECMAScript standard uses to describe ES grammar.

  • es.esgrammar - The actual grammar for ECMAScript, in emu-grammar format, extracted automatically from the spec.

  • extract_es_grammar.py - The script that creates es.esgrammar.

  • es-simplified.esgrammar - A hacked version of es.esgrammar that jsparagus can actually handle.

  • generate_js_parser_tables.py - A script to generate a JS parser based on es-simplified.esgrammar. Read on for instructions.

How to run it

To generate a parser, follow these steps:

$ cd ..
$ make init
$ make all

Note: The last step currently takes about 35 seconds to run on my laptop. jsparagus is slow.

Once you're done, to see your parser run, try this:

$ cd crates/driver
$ cargo run --release

The build also produces a copy of the JS parser in Python. After make all, you can use make jsdemo to run that.

How simplified is "es-simplified"?

Here are the differences between es.esgrammar, the actual ES grammar, and es-simplified.esgrammar, the simplified version that jsparagus can actually handle:

  • The four productions with [~Yield] and [~Await] conditions are dropped. This means that yield and await do not match IdentifierReference or LabelIdentifier. I think it's better to do that in the lexer.

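  • Truncated lookahead. Lookahead restrictions longer than one token are cut down, because jsparagus otherwise rejects them with an error like this: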

    ValueError: unsupported: lookahead > 1 token, [['{'], ['function'], ['async', ('no-LineTerminator-here',), 'function'], ['class'], ['let', '[']]

  • Delete a rule that uses "but not", since that feature is not implemented.

    Identifier :
      IdentifierName but not ReservedWord
    

    Making sense of this rule in the context of an LR parser is an interesting task; see issue #28.

  • Ban loops of the form for (async of EXPR) STMT by adjusting a lookahead assertion, since the unmodified grammar is not LR(1).
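
The first and third items above both push a decision out of the grammar and into the lexer. As a rough illustration of what that could look like, here is a minimal Python sketch. It is not jsparagus code: the function name, the in_generator and in_async flags, and the reserved-word set are all hypothetical, and the set is deliberately a small subset of the real ReservedWord list. The sketch also ignores strict mode and module code, where yield and await are treated differently.

```python
# Hypothetical sketch, not part of jsparagus.
# A deliberately incomplete subset of ReservedWord, for illustration only.
RESERVED_WORDS = frozenset(
    {"break", "class", "for", "function", "if", "return", "var", "while"}
)


def classify_identifier_name(name, *, in_generator=False, in_async=False):
    """Decide how the lexer should report a scanned IdentifierName.

    Returns "Keyword" when the name must not be treated as an Identifier,
    and "Identifier" otherwise. The two flags are assumed to be supplied
    by whoever drives the lexer; they stand in for the [Yield] and [Await]
    grammar parameters.
    """
    # Identifier : IdentifierName but not ReservedWord
    if name in RESERVED_WORDS:
        return "Keyword"
    # yield is only unavailable as an identifier inside a generator.
    if name == "yield" and in_generator:
        return "Keyword"
    # await is only unavailable as an identifier inside an async function.
    if name == "await" and in_async:
        return "Keyword"
    return "Identifier"
```

With a check like this, the grammar itself no longer needs the [~Yield] and [~Await] productions or the "but not" rule; the parser simply never sees an Identifier token for a name the context forbids.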