One Giant Leap For SQL: MySQL 8.0 Released


“Still using SQL-92?” is the opening question of my “Modern SQL” presentation. When I ask this question, an astonishingly large portion of the audience openly admits to using 25 year old technology. If I ask who is still using Windows 3.1, which was also released in 1992, only a few raise their hand…but they’re joking, of course.

Clearly this comparison is not entirely fair. It nevertheless demonstrates that the know-how surrounding newer SQL standards is pretty lacking. There were actually five updates since SQL-92—many developers have never heard of them. The latest version is SQL:2016.

As a consequence, many developers don’t know that SQL hasn’t been limited to the relational algebra or the relational model since 1999. SQL:1999 introduced operations that don’t exist in relational algebra (with recursive, lateral) and types (arrays!) that break the traditional interpretation of the first normal form.0

Since then, so for 19 years, whether or not a SQL feature fits the relational idea isn’t important anymore. What is important is that a feature has well-defined semantics and solves a real problem. The academic approach has given way to a pragmatic one. Today, the SQL standard has a practical solution for almost every data processing problem. Some of them stay within the relational domain, while others do not.

Resolution

Don’t say relational database when referring to SQL databases. SQL is really more than just relational.

It’s really too bad that many developers still use SQL in the same way it was being used 25 years ago. I believe the main reasons are a lack of knowledge and interest1 among developers along with poor support for modern SQL in database products.

Let’s have a look at this argument in the context of MySQL. Considering its market share, I think that MySQL’s lack of modern SQL has contributed more than its fair share to this unfortunate situation. I once touched on that argument in my 2013 blog post “MySQL is as Bad for SQL as MongoDB is to NoSQL”. The key message was that “MongoDB is a popular, yet poor representative of its species—just like MySQL is”. Joe Celko has expressed his opinion about MySQL differently: “MySQL is not SQL, it merely borrows the keywords from SQL”.

You can see some examples of the questionable interpretation of SQL in the MySQL WAT talk on YouTube.2 Note that this video is from 2012 and uses MySQL 5.5 (the current GA version at that time). Since then, MySQL 5.6 and 5.7 came out, which improved the situation substantially. The default settings on a fresh installation are much better now.3

It is particularly nice that they were really thinking about how to mitigate the effects of changing defaults. When they enabled ONLY_FULL_GROUP_BY by default, for example, they went the extra mile to implement the most complete functional dependencies checking among the major SQL databases:

BigQueryDb2 (LUW)MariaDBaaaaaMySQLOracle DBPostgreSQLbSQL ServerSQLiteaaaaaBase table PRIMARY KEYBase table UNIQUEJoined tablesWHERE clauseGROUP BY clause
  1. ⚡Doesn’t fail if there is no functional dependency (per default)
  2. Not following joins to PRIMARY KEYs or UNIQUE constraints

About the same time MySQL 5.7 was released, I stopped bashing MySQL. Of course I’m kidding. I’m still bashing MySQL occasionally…but it has become a bit harder since then.

By the way, did you know MySQL still doesn’t support check constraints? Just as in previous versions, you can use check constraints in the create table statement but they are silently ignored. Yes—ignored without warning. Even MariaDB fixed that a year ago. (Update April 2019: MySQL 8.0.16 finally honors check constraints)

BigQueryDb2 (LUW)MariaDBMySQLOracle DBPostgreSQLSQL ServerSQLiteCHECK constraints

Uhm, I’m bashing again! Sorry—old habits die hard.

Nevertheless, the development philosophy of MySQL has visibly changed over the last few releases. What happened? You know the answer already: MySQL is under new management since Oracle bought it through Sun. I must admit: it might have been the best thing that happened to SQL in the past 10 years, and I really mean SQL—not MySQL.

The reason I think a single database release has a dramatic effect on the entire SQL ecosystem is simple: MySQL is the weakest link in the chain. If you strengthen that link, the entire chain becomes stronger. Let me elaborate.

MySQL is very popular. According to db-engines.com, it’s the second most popular SQL database overall. More importantly: it is, by a huge margin, the most popular free SQL database. This has a big effect on anyone who has to cope with more than one specific SQL database. These are often software vendors that make products like content management systems (CRMs), e-commerce software, or object-relational mappers (ORMs). Due to its immense popularity, such vendors often need to support MySQL. Only a few of them bite the bullet and truly support multiple database—Java Object Oriented Querying (jOOQ) really stands out in this regard. Many vendors just limit themselves to the commonly supported SQL dialect, i.e. MySQL.

Another important group affected by MySQL’s omnipresence are people learning SQL. They can reasonably assume that the most popular free SQL database is a good foundation for learning. What they don’t know is that MySQL limits their SQL-foo to the weakest SQL dialect among those being widely used. Based loosely on Joe Celko’s statement: these people know the keywords, but don’t understand their real meaning. Worse still, they have not heard anything about modern SQL features.

Last week, that all changed when Oracle finally published a generally available (GA) release of MySQL 8.0. This is a landmark release as MySQL eventually evolved beyond SQL-92 and the purely relational dogma. Among a few other standard SQL features, MySQL now supports window functions (over) and common table expressions (with). Without a doubt, these are the two most important post-SQL-92 features.

The days are numbered in which software vendors claim they cannot use these features because MySQL doesn’t support them. Window functions and CTEs are now in the documentation of the most popular free SQL database. Let me therefore boldly claim: MySQL 8.0 is one small step for a database, one giant leap for SQL.4

It gets even better and the future is bright! As a consequence of Oracle getting its hands on MySQL, some of the original MySQL team (among them the original creator) created the MySQL fork MariaDB. Apparently, their strategy is to add many new features to convince MySQL users to consider their competing product. Personally I think they sacrifice quality—very much like they did before with MySQL—but that’s another story. Here it is more relevant that MariaDB has been validating check constraints for a year now. That raises a question: how much longer can MySQL afford to ignore check constraints? Or to put it another way, how much longer can they endure my bashing ;)

Support My Work

I offer SQL training, tuning and consulting. Buying my book “SQL Performance Explained” (from €9.95) also supports my work on this website.

Besides check constraints, MariaDB 10.2 also introduced window functions and common table expressions (CTEs). At that time, MySQL had a beta with CTEs but no window functions. MariaDB is moving faster.5

In 10.3, MariaDB is set to release “system versioned tables”. In a nutshell: once activated for a table, system versioning keeps old versions for updated and deleted rows. By default, queries return the current version as usual, but you can use a special syntax (as of) to get older versions. Your can read more about this in MariaDBs announcement.

System versioning was introduced into the SQL standard in 2011. As it looks now, MariaDB will be the first free SQL database supporting it. I hope this an incentive for other vendors—and also for users asking their vendors to support more modern SQL features!

Now that the adoption of modern SQL has finally gained some traction, there is only one problem left: the gory details. The features defined by the standard have many subfeatures, and due to their sheer number, it is common practice to support only some of them. That means it is not enough to say that a database supports window functions. Which window functions does it actually support? Which frame units (rows, range, groups)? The answers to these questions make all the difference between a marketing gag and a powerful feature.

In my mission to make modern SQL more accessible to developers, I’m testing these details so I can highlight the differences between products. The results of these tests are shown in matrices like the ones above. The rest of this article will thus briefly go through the new standard SQL features introduced with MySQL 8.0 and discuss some implementation differences. As you will see, MySQL 8.0 is pretty good in this regard. The notable exception is its JSON functionality.

Window Functions

There is SQL before window functions and SQL after window functions. Without exaggeration, window functions are a game changer. Once you understood window functions, you cannot imagine how you could ever have lived without them. The most common use cases, for example finding the best N rows per group, building running totals or moving averages, and grouping consecutive events, are just the tip of the iceberg. Window functions are one of the most important tools to avoid self-joins. That alone makes many queries less redundant and much faster. Window functions are so powerful that even newcomers like several Apache SQL implementations (Hive, Impala, Spark), NuoDB and Google BigQuery introduced them years ago. It’s really fair to say that MySQL is pretty late to this party.

The following matrix shows the support of the over clause for some major SQL databases. As you can see, MySQL’s implementation actually exceeds the capabilities of “the world’s most advanced open source relational database”, as PostgreSQL claims on its new homepage. However, PostgreSQL 11 is set to recapture the leader position in this area.

BigQueryaDb2 (LUW)aMariaDBaMySQLOracle DBPostgreSQLSQL ServerbSQLiteaOVER (<window specification>)OVER <name> and WINDOW clauseFrame unit ROWSFrame unit RANGEFrame unit GROUPSShort framing: <UNIT>…PRECEDINGFraming: EXCLUDE
  1. Range frames don’t support datetime types
  2. Range frames don’t accept <distance> (only unbounded and current row)

The actual set of window functions offered by MySQL 8.0 is also pretty close to the state of the art:

BigQuerybchkDb2 (LUW)abdiMariaDBabejlMySQLabfjlOracle DBgPostgreSQLabcjlSQL ServeragSQLiteabcjlAggregates (count, sum, min, ...)row_number()rank()dense_rank()percent_rank()cume_dist()ntile()width_bucket()lead and lagfirst_value, last_valuenth_valueNested window functions
  1. DISTINCT aggregates not supported
  2. Also without ORDER BY clause
  3. No respect|ignore nulls
  4. No negative offsets • Proprietary nulls treatment: lead(<expr>, 'IGNORE NULLS') (it’s a string argument)
  5. No default possible (3rd argument) • No respect|ignore nulls
  6. No negative offsets • No ignore nulls
  7. No negative offsets
  8. Proprietary nulls treatment: first_value(<expr> IGNORE NULLS) (no comma)
  9. Proprietary nulls treatment: first_value(<expr>, 1, null, 'IGNORE NULLS') (it’s a string argument)
  10. No ignore nulls
  11. Proprietary nulls treatment: nth_value(<expr>, <off> IGNORE NULLS) (no comma) • No from last
  12. No ignore nulls • No from last

Common Table Expressions (with [recursive])

The next major enhancement for MySQL 8.0 are common table expressions or the with [recursive] clause. Important use cases are traversing graphs with a single query, generating an arbitrary number of rows, converting CSV strings to rows (reversed listagg / group_concat) or just literate SQL.

Again, MySQL’s first implementation closes the gap.

BigQueryaDb2 (LUW)bMariaDBMySQLOracle DBcPostgreSQLeSQL ServercdeSQLiteewith on top-levelwith in subquerieswith recursivefetch first in non-rec. subq.insert … with … selectwith masks schema objectswith doesn’t imply recursiveviews bypass withqualified names bypass with
  1. Without column list: WITH RECURSIVE query_name AS (SELECT…)
  2. Without keyword recursive • No join in recursive branch—use comma-join (,)
  3. Without keyword recursive
  4. Workaround: use row_number() over() to implement top-N query
  5. Supports proprietary syntax with … insert … select

Other Standard SQL Features

Besides window functions and the with clause, MySQL 8.0 also introduces some other standard SQL features. However compared to the previous two, these are by no means killer features.

BigQueryhDb2 (LUW)fhMariaDBadfMySQLadfijOracle DBbfhPostgreSQLcekSQL ServerghlSQLiteJSON_OBJECTAGGJSON_ARRAYAGGJSON_TABLEGROUPING functioncolumn names in FROM clause
  1. No standard syntax supported: uses proprietary comma (,) between keys and values; no on null specification
  2. No colon syntax (T814)
  3. No colon syntax (T814) • No key uniqueness constraint (T830): [with|without] unique [keys] • Has proprietary json_object_agg.
  4. No order by clause
  5. Has proprietary json_agg.
  6. Without plan clause
  7. Use OPENJSON
  8. Only as single-argument function
  9. When using the proprietary WITH ROLLUP clause
  10. Only for derived tables
  11. Accepts a <derived column list> with fewer columns than the base table has
  12. Not for regular tables or views

As you can see, Oracle pushes standard SQL JSON support. The Oracle database and MySQL are currently the leaders in this area (and both are from the same vendor!). The json_objectagg and json_arrayagg functions were even backported to MySQL 5.7.22. However, it’s also notable that MySQL doesn’t follow the standard syntax for these two functions. Modifiers defined in the standard (e.g. an order by clause) are generally not supported. Json_objectagg neither recognizes the keywords key and value nor accepts the colon (:) to separate attribute names and values. It looks like MySQL parses these as regular functions calls—as opposed to syntax described by the standard.

It’s also interesting to see that json_arrayagg handles null values incorrectly, very much like the Oracle database (they don’t default to absent on null6). Seeing the same issue in two supposedly unrelated products is always interesting. Adding the fact that both products come from the same vendor adds another twist.

The two last features in the list, grouping function (related to rollup) and column names in the from clause are solutions to pretty specific problems. Their MySQL 8.0 implementation is basically on par with that of other databases.

Furthermore, MySQL 8.0 also introduced standard SQL roles. The reason this is not listed in the matrix above is simple: the matrices are based on actual tests I run against all these databases. My homegrown testing framework does not yet support test cases that require multiple users—currently all test are run with a default user, so I cannot test access rights yet. However, the time for that will come—stay tuned.

Other Notable Enhancements

I’d like to close this article with MySQL 8.0 fixes and improvements that are not related to the SQL standard.

One of them is about using the desc modifier in index declarations:

CREATE INDEX … ON … (<column> [ASC|DESC], …)

Most—if not all—databases use the same logic in the index creation as for the order by clause, i.e. by default, the order of column values is ascending. Sometimes it is needed to sort some index columns in the opposite direction. That’s when you specify desc in an index. Here’s what the MySQL 5.7 documentation said about this:

An index_col_name specification can end with ASC or DESC. These keywords are permitted for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.

“They are parsed but ignored”? To be more specific: they are parsed but ignored without warning very much like check constraints mentioned above.

However, this has been fixed with MySQL 8.0. Now there is a warning. Just kidding! Desc is honored now.

There are many other improvements in MySQL 8.0. Please refer to “What’s New in MySQL 8.0?” for a great overview. How about a small appetizer:

You can’t catch up on 20 years of SQL evolution in one day. Subscribe the newsletter via E-Mail, Twitter or RSS to gradually catch up and to keep modern-⁠sql.com on your radar.

About the Author

Photo of Markus Winand

Markus Winand provides insights into SQL and shows how different systems support it at modern-sql.com. Previously he made use-the-index-luke.com, which is still actively maintained. Markus can be hired as trainer, speaker and consultant via winand.at.

Buy the Book

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy now!
(paperback and/or PDF)

Paperback also available at Amazon.com.

Hire Markus

Markus offers SQL training and consulting for developers working at companies of all sizes.
Learn more »

Footnotes

  1. Personally, I’m following Chris Dates argument in this regard: “Domains Can Contain Anything!” [“Date on Database: Writings 2000-2006”;ISBN 978-1590597460]. Values don’t need to be “atomic” anymore.

  2. I suspect that a substantial number of developers prefer to play around with the latest and greatest overhyped technology from an Internet giant instead of learning more about modern SQL.

  3. Actual name: MySQL vs PostgreSQL - Why you shouldn’t use MySQL.

  4. If you migrated your MySQL from an older version and did not explicitly enable the new defaults, you might still see some of the problems described in the WAT talk.

  5. Sadly, this is a more important leap than any new SQL standard since SQL-92.

  6. For the curious: Oracle did a great job in improving MySQLs foundation, which is a good reason to move a little slower.

  7. SQL-2:2016 §10.11 SR5a.

Connect with Markus Winand

Subscribe mailinglistsSubscribe the RSS feedMarkus Winand on LinkedInMarkus Winand on XINGMarkus Winand on TwitterMarkus Winand on Bluesky
Copyright 2015-2024 Markus Winand. All righs reserved.
Legal | Contact | NO WARRANTY | Trademarks | Privacy and GDPR