SQL: The Query Language That Outlived Everything

Summary

SQL (Structured Query Language) grew out of Edgar F. Codd’s 1970 relational model and was built into a usable language at IBM in the early 1970s. It is one of the most durable technologies in computing: a declarative language in which you describe what data you want rather than how to fetch it, leaving the optimization to the database. SQL survived the object-oriented era, weathered the “NoSQL” revolt of the late 2000s and 2010s, and then absorbed its challengers as they re-adopted it (see The Database Revolution and The Database Wars). Fifty years after its creation, SQL remains the common language of data, used in some form by the great majority of business applications.

Codd’s Relational Model

The story begins not with a language but with a theory. In 1970, Edgar F. “Ted” Codd, a mathematician at IBM’s San Jose Research Laboratory, published “A Relational Model of Data for Large Shared Data Banks.” Codd proposed that data be organized as relations — what we now call tables of rows and columns — governed by mathematical set theory and predicate logic, rather than the tangled hierarchical and network databases of the time, which required programmers to navigate explicit pointers between records.

Codd’s radical idea was data independence: applications should be able to ask for data by its logical content without knowing how it was physically stored. This decoupling — separating the logical query from the physical access path — is the foundation of everything SQL does.

SEQUEL Becomes SQL

To make Codd’s abstract model usable, IBM researchers Donald D. Chamberlin and Raymond F. Boyce designed a language in the early 1970s as part of IBM’s System R research project. They called it SEQUEL (Structured English Query Language), later shortened to SQL for trademark reasons (though it is still often pronounced “sequel”). Their goal was a language readable enough that people without deep programming training could query data using English-like statements.

The core SQL operations have remained recognizable for half a century:

  • SELECT ... FROM ... WHERE ... to query data
  • INSERT, UPDATE, DELETE to modify it
  • CREATE TABLE, ALTER, DROP to define structure
  • JOIN to combine related tables
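The operations listed above can be tried end to end with a small sketch. It uses Python's built-in `sqlite3` module purely as a convenient way to run SQL. The `departments` and `employees` tables and all of their rows are hypothetical illustration data, not taken from any real system.

```python
import sqlite3

# In-memory database; every table, column, and row here is made up for illustration.
conn = sqlite3.connect(":memory:")

# CREATE TABLE defines structure.
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employees ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES departments(id),"
    " salary INTEGER NOT NULL)"
)

# INSERT adds rows.
conn.executemany(
    "INSERT INTO departments (id, name) VALUES (?, ?)",
    [(1, "Research"), (2, "Sales")],
)
conn.executemany(
    "INSERT INTO employees (name, dept_id, salary) VALUES (?, ?, ?)",
    [("Ada", 1, 120), ("Grace", 1, 130), ("Linus", 2, 90), ("Temp", 2, 50)],
)

# UPDATE and DELETE modify existing rows.
conn.execute("UPDATE employees SET salary = salary + 20 WHERE name = 'Linus'")
conn.execute("DELETE FROM employees WHERE name = 'Temp'")

# SELECT ... FROM ... JOIN ... WHERE combines the two tables and filters the result.
rows = conn.execute(
    "SELECT e.name, d.name"
    " FROM employees AS e"
    " JOIN departments AS d ON d.id = e.dept_id"
    " WHERE e.salary > 100"
    " ORDER BY e.name"
).fetchall()

print(rows)  # [('Ada', 'Research'), ('Grace', 'Research'), ('Linus', 'Sales')]
```

The SQL statements carry the logic here. The Python around them only opens a connection and passes strings to the engine, and much the same statements should run on other relational databases with small dialect changes.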

Crucially, SQL is declarative: you state the result you want, and the database’s query optimizer decides how to compute it — which indexes to use, what order to join tables, how to scan the data. The programmer does not write the algorithm; the database does. This is the source of SQL’s longevity, because the same query can keep working — and getting faster — as the engine underneath it improves over decades.
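One way to watch the optimizer make that decision is to ask an engine for its plan before and after an index exists. The sketch below again uses Python's `sqlite3`. `EXPLAIN QUERY PLAN` is SQLite-specific, other engines expose similar commands under other names, and the exact plan text varies by SQLite version. The `orders` table, its data, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
# 1000 made-up orders spread across 50 made-up customers.
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 50}", float(i)) for i in range(1000)],
)

# The query text never changes in this example.
query = "SELECT total FROM orders WHERE customer = 'cust7'"

def plan(sql):
    """Return SQLite's human-readable plan; the last column of each row is the detail text."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

result_before = sorted(conn.execute(query).fetchall())
plan_before = plan(query)  # without an index the engine has to scan the table

# Change only the physical design: add an index on the filtered column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

result_after = sorted(conn.execute(query).fetchall())
plan_after = plan(query)  # the optimizer can now choose the index

print("before:", plan_before)
print("after: ", plan_after)
print("same rows:", result_before == result_after)
```

The query string is identical in both runs. Only the access path chosen by the engine changes, which is the separation of logical query from physical storage that the declarative model relies on.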

The Commercial Explosion and Standardization

System R proved the relational model practical, but the first commercially available SQL database was Oracle (1979), built by Larry Ellison’s company from the published System R papers and brought to market ahead of IBM’s own commercial offering. IBM followed with SQL/DS and DB2. The relational database became the multi-billion-dollar foundation of enterprise computing, with Oracle, IBM, Microsoft (SQL Server), and later the open-source engines MySQL and PostgreSQL dividing the market.

SQL was standardized by ANSI in 1986 and by ISO in 1987, with major revisions every few years since: SQL-92, SQL:1999, SQL:2003 (which added window functions and XML support), and later editions such as SQL:2016, which added JSON support. Although each vendor added its own dialect and extensions, the standardized core meant that SQL knowledge transferred across systems, a portability that few technologies achieve.

Relational databases paired the model with ACID guarantees (Atomicity, Consistency, Isolation, Durability) for transactions, ensuring that operations stay reliable even under concurrency and failure. These guarantees are the bedrock of banking, commerce, and any system where data correctness is non-negotiable.
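Atomicity, the first of these guarantees, is the easiest to show in a few lines. The following is a minimal sketch using Python's `sqlite3`. The `accounts` table, the `transfer` helper, and the balances are hypothetical, and a real banking system would involve far more than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances (a made-up business rule).
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first on purpose, so a later failure leaves something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was rolled back too.
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit fails, so neither update persists

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)  # True False {'alice': 70, 'bob': 80}
```

The second transfer credits one account before the debit fails, yet no partial change is left behind. That all-or-nothing behavior is what atomicity promises.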

The NoSQL Challenge — and SQL’s Counterattack

In the late 2000s and early 2010s, the rise of web-scale companies produced a revolt. The new argument was that relational databases could not scale horizontally across thousands of commodity servers, that rigid schemas were too inflexible for fast-moving applications, and that the CAP theorem forced trade-offs the relational model handled poorly. A wave of “NoSQL” databases emerged — MongoDB (documents), Cassandra and DynamoDB (wide-column/key-value), Redis (in-memory), Neo4j (graphs) — many abandoning SQL and strict consistency in favor of scale and flexibility.

For a few years it looked like SQL might be displaced. Instead, the opposite happened. The industry rediscovered why SQL and ACID existed in the first place: developers missed transactions, joins, and the expressive power of declarative queries, and found that eventual consistency pushed enormous complexity into application code.

The response was the NewSQL movement and a broad re-embrace of SQL:

  • Google Spanner delivered a globally distributed database that combined horizontal scale with SQL and strong consistency, refuting the claim that scale and consistency were incompatible.
  • CockroachDB, YugabyteDB, and TiDB built distributed SQL databases for the cloud era.
  • Even the NoSQL insurgents added SQL-like query languages: Cassandra has CQL, and the analytics world standardized on SQL through engines like Apache Spark SQL, Presto/Trino, BigQuery, and Snowflake (see The Cloud Computing Era).

“NoSQL” was quietly reinterpreted by many as “Not Only SQL.” SQL had won by being adopted by the very systems built to replace it.

Legacy

SQL is a study in the power of the right abstraction. By separating what data you want from how to retrieve it, Codd, Chamberlin, and Boyce created a language whose surface stayed stable while the machinery beneath it was reinvented many times — from single-server engines to globally distributed clusters — without breaking the queries written on top. It outlasted the languages and platforms that surrounded it, survived a generational attempt to dethrone it, and emerged more dominant than before. Half a century on, SQL remains the lingua franca of data, and the relational model the default way humanity organizes its structured information.
