Jeff Dean and Google's Infrastructure
Summary
Jeff Dean joined Google in mid-1999, when the company was a few dozen people and its search engine regularly fell over. Over the next quarter century — most of it pair-programming with Sanjay Ghemawat, the collaborator he has described as half of one shared brain — he co-built the software that made planet-scale computing routine: MapReduce, Bigtable, Spanner, Protocol Buffers, LevelDB. Then he did it again for AI: co-founding Google Brain, driving DistBelief and TensorFlow, backing the TPU, and ending up Chief Scientist of Google. Inside Google his stature is measured in a unique currency: “Jeff Dean facts,” the Chuck Norris jokes engineers write about him.
The Epidemiologist’s Son
Jeffrey Adgate Dean (born 1968 in Hawaii) had a childhood in motion: his father was a tropical-disease researcher, his mother a medical anthropologist who spoke six languages, and the family lived everywhere from Hawaii to Somalia. As a teenager he wrote statistical software for epidemiologists, which led to a stint developing analysis software at the World Health Organization’s Global Programme on AIDS. He took a B.S. in computer science and economics at the University of Minnesota (1990) and a PhD at the University of Washington (1996) under Craig Chambers, on whole-program optimizing compilers for object-oriented languages. That unglamorous skill set turned out to be exactly what planetary-scale systems would need.
After a stretch at DEC’s Western Research Lab in Palo Alto, where he came to know Sanjay Ghemawat, a researcher at DEC’s sister lab, he joined Google in mid-1999, among its first few dozen employees (see Google: The Company).
Dean and Ghemawat: One Brain, Two Bodies
Early Google was held together with string. In one famous crisis, the index simply stopped picking up new pages, and Dean and Ghemawat traced the failure to memory corruption on cheap, unreliable hardware. Out of years of such fires came a design philosophy (assume everything fails, build reliability in software) and a working method: Dean and Ghemawat at one keyboard, a pair-programming partnership so productive that the New Yorker profiled it in 2018 as “The Friendship That Made Google Huge.”
The systems they built (with the Google File System team around them) became the canonical infrastructure stack of the big-data era (see The Big Data Revolution and The Cloud Computing Era):
- MapReduce (OSDI 2004): express a computation as a map function and a reduce function, and the framework handles distribution, failures, and retries across thousands of machines. The open-source clone Hadoop built an industry on the paper.
- Bigtable (OSDI 2006): a sparse, distributed, sorted map over petabytes of data and an ancestor of much of the NoSQL wave (HBase is modeled directly on it, and Cassandra borrows its data model; see The Database Revolution).
- Spanner (OSDI 2012): the “impossible” database, offering globally distributed yet externally consistent transactions by turning time itself into an engineering problem (the TrueTime API exposes clock uncertainty as an explicit bound, kept small with GPS receivers and atomic clocks; see Leslie Lamport and the Science of Distributed Systems for why this is hard).
- Plus the daily workhorses: Protocol Buffers, the LevelDB key-value store, and core pieces of Google’s ad serving and search indexing.
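To make the MapReduce model from the list above concrete, here is a minimal single-process sketch in Python built around word count, the example the MapReduce paper itself uses. The names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical and belong to no Google or Hadoop API. The sketch also leaves out exactly what the real framework exists to provide: distribution across machines, failure handling, and retries.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> int:
    """Reduce step: add up every count emitted for one word."""
    return sum(counts)


def run_mapreduce(
    inputs: Dict[str, str],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Hypothetical in-memory driver (illustration only, not a real API)."""
    # Shuffle: group all intermediate values by their key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Each key is reduced independently, which is what lets a real
    # framework spread the reduce step over many machines.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


docs = {"d1": "the cat sat", "d2": "the cat ran"}
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts)
```

The user-supplied parts are only the two small functions; everything in `run_mapreduce` stands in for the machinery the framework hides from the programmer.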
Dean’s talk-circuit list of “latency numbers every programmer should know” (L1 cache reference 0.5 ns, disk seek 10 ms) became a standard engineering catechism. He and Ghemawat shared the 2012 ACM-Infosys Foundation Award for this body of infrastructure work.
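The two figures quoted above are enough for the kind of back-of-envelope arithmetic the list is meant to encourage. The following is a small sketch in Python; the constant and variable names are illustrative, and the only inputs are the two latencies cited in the text.

```python
# Figures quoted in the text, converted to nanoseconds.
L1_CACHE_REFERENCE_NS = 0.5
DISK_SEEK_NS = 10 * 1_000_000  # 10 ms

NS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 86_400

# How many L1 cache references fit into the time of one disk seek.
slowdown = DISK_SEEK_NS / L1_CACHE_REFERENCE_NS

# Upper bound on strictly sequential seeks a single disk can do per second.
seeks_per_second = NS_PER_SECOND / DISK_SEEK_NS

# Human-scale analogy: if an L1 reference took one second,
# a disk seek would take this many days.
scaled_seek_days = slowdown / SECONDS_PER_DAY

print(f"One disk seek costs as much as {slowdown:,.0f} L1 cache references")
print(f"At most {seeks_per_second:.0f} sequential seeks per second")
print(f"Scaled to 1 s per L1 reference, a seek takes about {scaled_seek_days:.0f} days")
```

The gap of seven orders of magnitude is the point: a design that does one disk seek per request has a hard ceiling of roughly a hundred requests per second per disk, whatever the CPU is doing.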
Google Brain: The Second Career
In 2011, Dean ran into Andrew Ng in a Google microkitchen; Ng mentioned that neural networks, long dismissed, were starting to work. Dean, who had written his undergraduate thesis on parallel neural-network training in 1990, co-founded Google Brain with Ng and Greg Corrado to test one hypothesis: that scale was the missing ingredient (see Geoffrey Hinton and Deep Learning).
The 2012 “cat paper” made the point theatrically: a network spread across 16,000 CPU cores, trained on unlabeled YouTube frames, spontaneously developed a neuron that responded to cats. The infrastructure lineage followed the same arc as before: the internal DistBelief framework, then its public successor TensorFlow (open-sourced November 2015), and the Tensor Processing Unit — custom silicon Dean championed when it became clear that, at Google’s scale, neural networks on CPUs would not survive contact with the electricity bill (see The GPU Revolution).
Brain’s open-publication culture produced, among much else, the 2017 Transformer paper that underlies virtually all modern LLMs (see The Transformer Architecture). Dean became head of Google’s AI division in 2018; when Brain and DeepMind merged into Google DeepMind in April 2023 under Demis Hassabis, Dean moved up to Chief Scientist of Google, co-steering the technical side of the Gemini model effort (see The LLM Race).
Fun Fact: Jeff Dean Facts
For April Fools’ Day 2007, Google engineers built an internal page of “Jeff Dean facts” in the style of Chuck Norris jokes, and colleagues have added to the canon ever since: “Jeff Dean’s PIN is the last four digits of pi.” “Compilers don’t warn Jeff Dean. Jeff Dean warns compilers.” “Jeff Dean once failed a Turing test when he correctly identified the 203rd Fibonacci number in less than a second.” “The speed of light in a vacuum used to be about 35 mph. Then Jeff Dean spent a weekend optimizing physics.” The genre says something real: in a company of tens of thousands of engineers, the culture needed a folk hero, and it chose the one who writes the boring fast code underneath everything.
📚 Sources
- Wikipedia: Jeff Dean
- The New Yorker: The Friendship That Made Google Huge (December 2018)
- Dean & Ghemawat: MapReduce — Simplified Data Processing on Large Clusters (OSDI 2004)
- Chang, Dean, Ghemawat et al.: Bigtable — A Distributed Storage System for Structured Data (OSDI 2006)
- Corbett, Dean et al.: Spanner — Google’s Globally-Distributed Database (OSDI 2012)
- Le, Dean, Ng et al.: Building High-level Features Using Large Scale Unsupervised Learning (ICML 2012) — the “cat paper”
- NYT: How Many Computers to Identify a Cat? 16,000 (June 25, 2012)
- Google: TensorFlow — Google’s latest machine learning system, open sourced (November 9, 2015)
- ACM: Jeff Dean and Sanjay Ghemawat — 2012 ACM-Infosys Foundation Award
- Google DeepMind announcement — Brain/DeepMind merger (April 2023)