Pratt Parsing Index and Updates

2017-03-31 (Last updated 2023-07-03)

I've noticed that my November posts on expression parsing with Pratt's algorithm are popular and still drawing readers.

So here's an index of them, including yesterday's update, with related links.

Table of Contents
Original Series
Pratt Parsing and Precedence Climbing Are the Same Algorithm
Review of Pratt/TDOP Parsing Tutorials (JS, Java, Python)
Pratt Parsing Without Prototypal Inheritance, Global Variables, Virtual Dispatch, or Java (Python)
Pratt Parsers Can Be Statically Typed
2017 Updates
Precedence Climbing is Widely Used
Code for the Shunting Yard Algorithm, and More
2019 - TypeScript
2020 - Rust
2023 - Elm

Original Series

Pratt Parsing and Precedence Climbing Are the Same Algorithm

After parsing arithmetic in OSH, I noticed that the precedence climbing algorithm is a special case of the earlier Pratt parsing algorithm. But see the update below.

Review of Pratt/TDOP Parsing Tutorials (JS, Java, Python)

I reviewed four articles, in order to motivate a different code structure.

Pratt Parsing Without Prototypal Inheritance, Global Variables, Virtual Dispatch, or Java (Python)

Crockford deserves credit for reviving Pratt's algorithm, but I think his idiosyncratic style leaked into almost every future exposition of it.

In this article, I explain the code in my pratt-parsing-demo repo.

  1. I prefer to use a pair of lookup tables for precedence, rather than token objects with dynamic dispatch.
  2. I prefer to enclose the entire parser in a class, with the current token as a member variable, rather than as a global variable.

Why? I tend to use classes somewhat formally, either for dependency inversion or for maintaining invariants on state. If everything's a class, then nothing is.
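To make those two preferences concrete, here is a minimal sketch in Python. It is not the code from pratt-parsing-demo: the table names `PREFIX_BP` and `INFIX_BP`, the binding power values, and the `tokenize()` and `parse()` helpers are all assumptions made up for illustration. It only shows the general shape: precedence lives in a pair of plain dictionaries, and the current token is a member of a single parser class.

```python
import re

# Hypothetical precedence tables for this sketch (values are arbitrary).
# Prefix operators: token -> binding power applied to the operand.
PREFIX_BP = {'-': 30, '+': 30}
# Infix operators: token -> (left binding power, right binding power).
INFIX_BP = {
    '+': (10, 11), '-': (10, 11),  # left associative
    '*': (20, 21), '/': (20, 21),  # left associative
    '^': (41, 40),                 # right associative
}

TOKEN_RE = re.compile(r'\s*(\d+|[-+*/^()])')


def tokenize(s):
    """Split a string into number and operator tokens, ending with 'EOF'."""
    s = s.rstrip()
    tokens, pos = [], 0
    while pos < len(s):
        m = TOKEN_RE.match(s, pos)
        if not m:
            raise SyntaxError('Unexpected character at position %d' % pos)
        tokens.append(m.group(1))
        pos = m.end()
    tokens.append('EOF')
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.token = tokens[0]  # current token is a member, not a global

    def next(self):
        # Advance the cursor, but never move past the trailing 'EOF'.
        if self.pos < len(self.tokens) - 1:
            self.pos += 1
        self.token = self.tokens[self.pos]

    def parse_until(self, min_bp):
        tok = self.token
        self.next()
        # Token in prefix position: plain branches instead of token objects.
        if tok.isdigit():
            left = int(tok)
        elif tok == '(':
            left = self.parse_until(0)
            if self.token != ')':
                raise SyntaxError('Expected )')
            self.next()
        elif tok in PREFIX_BP:
            left = (tok, self.parse_until(PREFIX_BP[tok]))
        else:
            raise SyntaxError('Unexpected token %r' % tok)

        # Tokens in infix position: loop while the operator binds tightly enough.
        while self.token in INFIX_BP:
            lbp, rbp = INFIX_BP[self.token]
            if lbp < min_bp:
                break
            op = self.token
            self.next()
            left = (op, left, self.parse_until(rbp))
        return left


def parse(s):
    p = Parser(tokenize(s))
    tree = p.parse_until(0)
    if p.token != 'EOF':
        raise SyntaxError('Unexpected token %r' % p.token)
    return tree
```

With these assumed tables, `parse('1 + 2 * 3')` yields the nested tuple `('+', 1, ('*', 2, 3))`. Adding an operator means adding a table entry rather than defining a new token class.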

Pratt Parsers Can Be Statically Typed

A minor update to correct a misleading notion.

2017 Updates

Precedence Climbing is Widely Used

In November, I recommended dropping the "precedence climbing" name to avoid confusion. After looking at real code, I've changed my view.

Code for the Shunting Yard Algorithm, and More

An encyclopedic update from Jean-Marc Bourguet.

2019 - TypeScript

The post How Desmos Uses Pratt Parsers may be useful if you prefer to read code in TypeScript. See desmosinc/pratt-parser-blog-code.

They also compare Pratt Parsing to using Jison, an LR(1) parser generator like Yacc or GNU Bison.

2020 - Rust

Simple But Powerful Pratt Parsing. An introduction to the algorithm, with code in Rust. (reddit comments)

The post says that the algorithm is both recursive and iterative, which is a good point.

Here's a related observation that may also help. You can divide recursive algorithms into two categories:

  1. Algorithms where the recursive function is also pure. For example, counting the nodes in a tree is a one-line function that has no variables:
     function count(node) {
       return node ? 1 + count(node.left) + count(node.right) : 0
     }
  2. Algorithms where state is mutated while recursing. Recursive descent parsing and Pratt Parsing both fall in this category, and that's one thing that makes them harder to understand (and debug).

    In addition to returning or evaluating subexpressions, you advance through the token stream while recursing. This is done in the Next() call on line 222 of tdop.py. Note that ParseUntil() and the set of nud() / led() functions are mutually recursive.
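The two categories can be contrasted with a small Python sketch. It is an illustration only and is not taken from tdop.py: `Node`, `count()`, and `ListParser` are hypothetical names, and the nested-list grammar is invented for this example. The point is that `count()` depends only on its argument, while `ListParser.parse()` also moves a shared cursor that every caller up the stack relies on.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def count(node):
    # Category 1: pure recursion. The result depends only on the argument.
    return 0 if node is None else 1 + count(node.left) + count(node.right)


class ListParser:
    """Parses a token list such as ['(', 'a', '(', 'b', ')', ')'] into nested lists.

    Assumes balanced parentheses; unbalanced input raises IndexError.
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0  # Category 2: this cursor is mutated while recursing

    def parse(self):
        tok = self.tokens[self.pos]
        self.pos += 1  # side effect: advance through the token stream
        if tok != '(':
            return tok
        items = []
        while self.tokens[self.pos] != ')':
            # The recursive call both returns a subtree and moves self.pos,
            # so the loop condition sees a different token afterwards.
            items.append(self.parse())
        self.pos += 1  # consume the closing ')'
        return items
```

When debugging the second kind, the return value is not enough: you also need to know where the cursor was before and after each recursive call.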

2023 - Elm

Nice diagrams in this article: