Ken Thompson and Unix

Summary

Ken Thompson built the first version of Unix in about a month, partly to play a space travel game. He designed the B language, which Dennis Ritchie developed into C. He turned regular expressions from a theoretical notation into a practical tool. He co-built Belle, the world computer chess champion of 1980. He designed UTF-8 on a placemat in a New Jersey diner. He co-designed Go. He received the Turing Award in 1983. He worked at Bell Labs for more than thirty years, then quietly joined Google at sixty-three and kept building things. He is, by any measure, one of the most productive systems engineers in the history of computing, and among the least self-promotional.

Bell Labs and the Space Travel Game

Kenneth Lane Thompson was born in New Orleans, Louisiana on February 4, 1943. He studied electrical engineering at UC Berkeley, completing a BS in 1965 and an MS in 1966, then joined Bell Labs in Murray Hill, New Jersey, where he would spend more than thirty years.

Thompson was one of the Bell Labs researchers working on Multics, a time-sharing operating system developed jointly by MIT, General Electric, and Bell Labs. The project had become enormously complex and was consuming resources without delivering results, and Bell Labs withdrew from it in 1969. Thompson missed the interactive computing environment it had provided.

He had also written a space travel simulation — a program that calculated planetary orbits and let the user navigate a spacecraft through the solar system — on a GE 645 mainframe. When Bell Labs abandoned Multics, the mainframe went with it. Thompson needed somewhere to run his game.

He found an unused PDP-7 in a corner of the lab. In the summer of 1969, while his wife was away for several weeks visiting family in California, he gave himself roughly a week apiece for the pieces still missing, among them a shell, an editor, and an assembler, building on a file system he had already prototyped. The result ran the space travel game. He had also written an operating system.

The system was called Unics, a pun on Multics implying a simpler, castrated version. The name is usually credited to Brian Kernighan and was later respelled Unix.

B, C, and the Language Foundation

The PDP-7’s assembly language was not a good foundation for building a real system. Thompson needed a higher-level language that compiled to efficient machine code. He adapted BCPL (a systems programming language developed by Martin Richards at Cambridge) into a language he called B, stripping it down to fit the PDP-7’s constraints.

B was functional but limited. Dennis Ritchie extended it, adding data types and structures, to create C (1972). The collaboration is described fully in Dennis Ritchie and the C Language. Thompson contributed to both languages; the division of credit is roughly that Thompson conceived B and the underlying approach, and Ritchie designed C.

When Unix was rewritten in C in 1973, it became one of the first operating systems written primarily in a high-level language (Burroughs MCP and Multics itself were earlier examples). The decision was radical enough that Thompson had to argue strenuously for it against colleagues who insisted operating systems required assembly.

Regular Expressions

Thompson’s 1968 paper “Regular Expression Search Algorithm” described an algorithm for efficiently searching text using regular expressions, a compact notation for describing patterns. The notation had been invented by mathematician Stephen Kleene in the 1950s as a theoretical tool; Thompson made it practical by implementing it first in his version of the QED editor and then in a series of Unix tools.

His implementation of regular expressions in ed (the Unix text editor), grep (global regular expression print), and other tools gave every Unix programmer the ability to describe and search patterns in text with a concise, powerful notation. Regular expressions became a universal feature of text processing tools across every programming language and operating system. They remain in daily use by virtually every programmer.
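The core idea behind efficient regular expression search can be sketched as tracking every state the pattern could be in at once, rather than trying one alternative and backtracking. The following is a minimal illustration in Python, not a reconstruction of the 1968 paper's implementation: it assumes a tiny pattern language (literal characters, `.` for any character, and `*` after a single character), and the names `parse`, `closure`, and `full_match` are hypothetical.

```python
def parse(pattern):
    """Turn a pattern into a list of (character, starred) tokens."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            raise ValueError("'*' must follow a character")
        starred = i + 1 < len(pattern) and pattern[i + 1] == "*"
        tokens.append((ch, starred))
        i += 2 if starred else 1
    return tokens


def closure(states, tokens):
    """Add every state reachable without consuming input (skipping starred tokens)."""
    result = set()
    stack = list(states)
    while stack:
        state = stack.pop()
        if state in result:
            continue
        result.add(state)
        if state < len(tokens) and tokens[state][1]:
            stack.append(state + 1)
    return result


def full_match(pattern, text):
    """Return True if the whole text matches the pattern."""
    tokens = parse(pattern)
    current = closure({0}, tokens)
    for c in text:
        following = set()
        for state in current:
            if state < len(tokens):
                ch, starred = tokens[state]
                if ch == "." or ch == c:
                    # A starred token may repeat; a plain token advances.
                    following.add(state if starred else state + 1)
        current = closure(following, tokens)
        if not current:
            return False
    return len(tokens) in current


if __name__ == "__main__":
    print(full_match("a*b", "aaab"))  # True
```

Because the set of live states can never exceed the number of tokens plus one, each input character costs a bounded amount of work, so a pattern such as `a*a*a*a*b` cannot trigger the exponential blow-up that a naive backtracking matcher can suffer.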

Pipes and the Unix Philosophy

Thompson implemented pipes — the Unix mechanism that allows the output of one program to be fed as input to another — in 1973, following a suggestion from Doug McIlroy. The pipe operator (|) made it possible to chain programs together:

cat file.txt | sort | uniq | wc -l

This composability — small programs doing one thing, connected by a universal text interface — became the Unix philosophy. Thompson did not articulate the philosophy as doctrine; he built systems that embodied it.
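The pipeline above counts the distinct lines in a file. The same composition can be mimicked in Python as a rough sketch, with each stage written as a small function that takes lines in and passes a result on. The function names (`sort_lines`, `uniq`, `count_lines`, `pipeline`) are hypothetical stand-ins for illustration, not the Unix tools themselves.

```python
from functools import reduce


def sort_lines(lines):
    """Stand-in for sort: return the lines in sorted order."""
    return sorted(lines)


def uniq(lines):
    """Stand-in for uniq: drop a line only when it repeats the one just before it."""
    previous = object()  # sentinel that equals no string
    for line in lines:
        if line != previous:
            yield line
        previous = line


def count_lines(lines):
    """Stand-in for wc -l: count the lines that arrive."""
    return sum(1 for _ in lines)


def pipeline(source, *stages):
    """Feed the output of each stage into the next, left to right."""
    return reduce(lambda data, stage: stage(data), stages, source)


if __name__ == "__main__":
    lines = ["b", "a", "b", "c", "a"]
    print(pipeline(lines, sort_lines, uniq, count_lines))  # 3
```

The sketch also shows why `sort` comes before `uniq` in the shell version: `uniq` only collapses adjacent duplicates, so the stages are useful mainly in combination.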

Belle: World Chess Champion

In the late 1970s, Thompson and hardware engineer Joe Condon built Belle — a dedicated chess machine implemented in custom hardware designed specifically to evaluate chess positions. Belle used specialized circuits to generate and evaluate moves at speeds no general-purpose computer of the era could match.

Belle won the World Computer Chess Championship in 1980. In 1983 it held a USCF rating of about 2250, and the USCF awarded it the rank of Master, the first machine to reach that level. For a time it was the strongest chess-playing machine in the world.

Thompson used Belle to do a systematic analysis of chess endgames, computing all possible positions for a given set of pieces and determining whether each position was a win, loss, or draw for the player to move. The resulting endgame tablebases contained results that surprised expert chess players, including forced wins in positions previously thought to be draws. Strong modern chess engines rely on tablebases built on the same principle.
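The principle of classifying every position by working backward from the finished ones can be shown on a game far smaller than chess. The sketch below is an assumption-laden toy, not Thompson's method: it uses a subtraction game (players alternately remove 1, 2, or 3 stones, and a player who cannot move loses) and plain backward induction. Real chess tablebases also record draws and the distance to mate, which this example omits. The name `solve_subtraction_game` is hypothetical.

```python
def solve_subtraction_game(max_stones, moves=(1, 2, 3)):
    """Label each pile size as a win (True) or loss (False) for the player to move.

    Position 0 is terminal: the player to move has no legal move and loses.
    Every larger position is a win if at least one move reaches a lost position.
    """
    table = [False] * (max_stones + 1)
    for stones in range(1, max_stones + 1):
        table[stones] = any(
            take <= stones and not table[stones - take] for take in moves
        )
    return table


if __name__ == "__main__":
    table = solve_subtraction_game(12)
    losing = [stones for stones, wins in enumerate(table) if not wins]
    print(losing)  # [0, 4, 8, 12]
```

Once the table exists, perfect play is a lookup: from any winning position, move to a position the table marks as lost for the opponent.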

UTF-8: The Encoding on a Placemat

In September 1992, Thompson and Rob Pike were in a New Jersey diner when Thompson sketched the design of a new encoding scheme on a placemat. The encoding, UTF-8, was designed to represent all Unicode characters in variable-length byte sequences while maintaining backward compatibility with ASCII.

UTF-8’s key properties: ASCII characters (codes 0-127) are represented by single bytes identical to their ASCII encoding, so all ASCII text is automatically valid UTF-8. Non-ASCII Unicode characters use multi-byte sequences where the length is indicated by the leading byte. The encoding is self-synchronizing — you can identify character boundaries from any point in the stream without reading from the beginning.
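These properties are easy to see in code. The following Python sketch encodes a single code point by hand and finds the next character boundary from an arbitrary byte offset. It assumes the modern four-byte limit on UTF-8 sequences rather than the longer sequences the original 1992 design allowed, and the function names (`encode_utf8`, `is_continuation`, `next_boundary`) are hypothetical; in practice Python's built-in `str.encode("utf-8")` does this work.

```python
def encode_utf8(code_point):
    """Encode one Unicode code point as UTF-8 bytes (modern 4-byte limit assumed)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if code_point < 0x80:
        # ASCII: one byte, identical to the ASCII code.
        return bytes([code_point])
    if code_point < 0x800:
        # Leading byte 110xxxxx, then one continuation byte 10xxxxxx.
        return bytes([0xC0 | (code_point >> 6), 0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Leading byte 1110xxxx, then two continuation bytes.
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # Leading byte 11110xxx, then three continuation bytes.
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])


def is_continuation(byte):
    """Continuation bytes always have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def next_boundary(data, index):
    """From any offset, skip continuation bytes to reach the start of a character."""
    while index < len(data) and is_continuation(data[index]):
        index += 1
    return index


if __name__ == "__main__":
    print(encode_utf8(0x20AC).hex())  # e282ac
```

Self-synchronization follows from the bit patterns: a continuation byte can never be mistaken for a leading byte or an ASCII byte, so a reader dropped into the middle of a stream needs to skip at most three bytes to find the next character.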

Thompson and Pike implemented UTF-8 in Plan 9 within days of the diner sketch. The encoding was adopted by the internet and is now the dominant character encoding in use globally: over 97% of web pages use UTF-8.

Plan 9 and Go

In 1987, Thompson and his colleagues at Bell Labs began designing Plan 9 from Bell Labs — a next-generation operating system that took Unix’s ideas further. Everything in Plan 9 was a file, accessed through the same file-system interface: not just disk files but windows, network connections, processes, and hardware devices. Plan 9 also introduced the per-process namespace — each process could have its own view of the file system hierarchy.

Plan 9 was technically elegant and commercially unsuccessful. It was too different from Unix to attract the application developers necessary for adoption, and arrived at a time when Linux was capturing the market for Unix-like systems.

In 2006, at age sixty-three, Thompson joined Google. Starting in 2007, with Rob Pike and Robert Griesemer, he co-designed Go, a systems programming language announced publicly in 2009. Go was explicitly influenced by C’s simplicity but added garbage collection, goroutines for concurrent programming, and a simpler package system. Go’s design philosophy (small, orthogonal, practical) echoed Unix’s.

Thompson and Ritchie received the Turing Award jointly in 1983 “for their development of generic operating systems theory and specifically for the implementation of the UNIX operating system.”

Dead End: Plan 9 as Unix Successor

Plan 9’s clean design solved real problems with Unix — the inconsistency of special files, the complexity of network programming, the mess of terminal handling — but arrived too late and demanded too much of application developers.

The Clean Slate Problem

Plan 9 required programs to be written specifically for Plan 9 to benefit from its design. The enormous existing base of Unix/POSIX software, the momentum of Linux, and the commercial investment in POSIX compatibility made wholesale migration implausible. Plan 9 found users in research environments and among systems programmers who appreciated its design, but never achieved the adoption necessary to threaten Linux. The lesson: a cleaner design that requires starting from scratch rarely defeats an established system that can be incrementally improved, even if the incumbent has deep technical debt.

The broader history of the system Thompson started is covered in The Unix Story. The C language that emerged from this work is in Dennis Ritchie and the C Language.

