Literate SQL


Literate programming is an approach to programming introduced by Donald Knuth that, among other ideas, aims to arrange source code in order of human logic so that it can be understood by reading it like a novel: from the beginning to the end.

SQL works in the opposite way: due to the nesting of subqueries we must find and understand the innermost query and then start working from inside out. Not even well-chosen names can improve the recursive reading process notably because names follow the subqueries in SQL.

Even the order of clauses in the simplest query has a forward dependency: the meaning of the select clause depends on the from clause. That actually poses an unsolvable problem for development tools: when typing select, it is not yet know which tables are accessed and it is therefore also unknown which columns are visible. Without from no sensible code completion for select.

SQL:1999 introduced an alternative to the nesting at least: the with clause. In context of literate SQL, the with clause has two important properties: (1) names come first; (2) subqueries can be unnested.

Names first

The importance of meaningful names for software elements cannot be overstated. But even the best, intention revealing name, is useless, if it cannot be seen when needed:

SELECT ...
  FROM (SELECT ...
          FROM ...
       ) intention_revealing_name
   ...

The longer the inner query gets, the more useful a good name becomes, and the further it is pushed out of sight. How tragic.

The with clause puts the name before the code—very much like function declarations work in other programming languages:

WITH intention_revealing_name AS (
     SELECT ...
       FROM ...
     )
SELECT ...
  FROM intention_revealing_name irn
   ...

The topping is that it is still possible to assign a short alias in the from clause (like irn in the above example).

Support My Work

I offer SQL training, tuning and consulting. Buying my book “SQL Performance Explained” (from €9.95) also supports my work on this website.

Order of human logic

The second aspect of with is its ability to turn nesting into chaining. This is the way to make complex queries top-down readable.

A with clause can define many queries. Each of them can refer to other queries previously defined in the same with clause. This is exactly what is needed for literate SQL.

The following example drafts a literate query. The first two query definitions are not even shown—their name alone reveals their purpose:

WITH frequently_bought_together (ordered, suggested, times) AS (
     SELECT ...
     )
   , frequently_bought_shortly_after (ordered, suggested, times) AS (
     SELECT ...
     )
   , product_recommendation (ordered, suggested, times) AS (
     SELECT ordered, suggested, SUM(times)
       FROM (SELECT * FROM frequently_bought_together
              UNION ALL
             SELECT * FROM frequently_bought_shortly_after
            ) frequently_needed_together
      GROUP BY ordered, suggested
      ORDER BY SUM(times) DESC
      FETCH FIRST 1 ROW ONLY 
     )
SELECT *
  FROM product_recommendation

The ease of reading is not caused by the top-down structure alone. With meaningless names like x, y, and z the query would still be hard to understand. It is the combination of good names and good structure that brings the literate breakthrough.

As often, it is possible to do too much of it. Small subqueries like frequently_needed_together in the example might not hinder readability. It is important to divide the code into manageable, logically connected units.

Then again it sometimes makes sense to use with clause inside a with query. You are not limited to a global spaghetti with clause: syntactically, with is a prefix for the select keyword. Use it where it helps improving readability.

By now it should be clear why this technique is called “literate SQL”: writing easily readable SQL is an art, not science.

Performance Considerations

For some database—most notable PostgreSQL— the with clause has a performance impact. See with Clause: Performance Impacts.

Compatibility

BigQueryaDb2 (LUW)MariaDBaMySQLOracle DBPostgreSQLSQL ServerSQLitewith on top-levelwith in subqueries
  1. Seems like CTE in subquery cannot see global CTEs

You can’t catch up on 20 years of SQL evolution in one day. Subscribe the newsletter via E-Mail, Twitter or RSS to gradually catch up and to keep modern-⁠sql.com on your radar.

About the Author

Photo of Markus Winand

Markus Winand provides insights into SQL and shows how different systems support it at modern-sql.com. Previously he made use-the-index-luke.com, which is still actively maintained. Markus can be hired as trainer, speaker and consultant via winand.at.

Buy the Book

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy now!
(paperback and/or PDF)

Paperback also available at Amazon.com.

Hire Markus

Markus offers SQL training and consulting for developers working at companies of all sizes.
Learn more »

Connect with Markus Winand

Subscribe mailinglistsSubscribe the RSS feedMarkus Winand on LinkedInMarkus Winand on XINGMarkus Winand on TwitterMarkus Winand on Bluesky
Copyright 2015-2024 Markus Winand. All righs reserved.
Legal | Contact | NO WARRANTY | Trademarks | Privacy and GDPR