The Virtualization Revolution: One Machine, Many Worlds

Summary

This article tells the story of how engineers learned to make one computer behave as if it were many, and why this seemingly academic trick became the invisible foundation of the modern internet. It is the story of Gerald Popek and Robert Goldberg, who in 1974 formalized the conditions a processor architecture must meet to be virtualized, conditions the x86 architecture later turned out to violate; of Mendel Rosenblum and VMware, who found a way around that limitation anyway; of Amazon, which turned virtualized hardware into the engine of cloud computing; and of Solomon Hykes and Docker, who made the entire question of hardware feel almost irrelevant. Virtualization is the technology that nobody sees and everybody depends on.

The Problem of Sharing One Machine

In the beginning, a computer was dedicated to a single user, a single job. The earliest machines at universities and research labs ran one program at a time — you submitted your punch cards, you waited, you collected your output. When timesharing operating systems arrived in the 1960s, they introduced the radical idea that a single machine could serve multiple users simultaneously by rapidly switching between their processes. The CPU’s time was shared; the hardware was not.

But sharing hardware at a deeper level — allowing multiple complete operating systems to run simultaneously on the same physical machine, each believing it had exclusive access to the hardware — proved far more difficult. This problem was called virtualization.

The concept was not new. IBM had built the CP-40 and CP-67 systems in the mid-1960s, running multiple virtual machines on a single mainframe. IBM’s engineers called the software layer between the hardware and the virtual machines a Virtual Machine Monitor, or hypervisor — a name that persists today. The technology worked on IBM’s mainframe architecture because that architecture had been designed with virtualization in mind.

The x86 architecture, designed by Intel for the personal computer, had not.

The Popek-Goldberg Problem

In 1974, computer scientists Gerald Popek and Robert Goldberg published a formal paper — “Formal Requirements for Virtualizable Third Generation Architectures” — that defined, with mathematical precision, the conditions a processor architecture must satisfy for a hypervisor to work correctly.

Their theorem identified three categories of processor instructions:

  • Privileged instructions: instructions that trap (cause an interrupt) when executed in unprivileged (user) mode, forcing control back to the operating system.
  • Sensitive instructions: instructions that behave differently depending on the privilege level at which they execute, or that directly access or modify hardware state.
  • Innocuous instructions: all other instructions.

For virtualization to work correctly, Popek and Goldberg proved, every sensitive instruction must also be a privileged instruction. If a sensitive instruction could execute in user mode without trapping, the hypervisor could not intercept and emulate it — the guest operating system might read the real hardware state rather than the virtual state the hypervisor was maintaining, and the illusion would shatter.

The x86 architecture, which arrived four years after the paper, violated this requirement. By one widely cited count, Intel’s design contained seventeen sensitive but unprivileged instructions: instructions that behaved differently in user mode and kernel mode but did not trap when executed in user mode. A virtual machine monitor that relied on traps could not intercept these instructions, and therefore could not fully virtualize the architecture.
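The requirement can be read as a set inclusion: the sensitive instructions must be a subset of the privileged ones. The sketch below is a minimal illustration under stated assumptions. The function name `virtualization_holes` is hypothetical, and the two instruction sets are small hand-picked samples rather than a full classification of x86.

```python
# Toy check of the Popek-Goldberg condition: sensitive ⊆ privileged.
# The instruction sets below are small illustrative samples, not a
# complete classification of any real instruction set.

def virtualization_holes(sensitive: set[str], privileged: set[str]) -> set[str]:
    """Return the sensitive instructions that do NOT trap in user mode."""
    return sensitive - privileged


# Sample of classic (pre-VT) x86: POPF, SGDT, SIDT and SMSW touch or expose
# privileged state but execute in user mode without trapping.
x86_privileged = {"LGDT", "LIDT", "HLT", "MOV_CR3"}
x86_sensitive = {"LGDT", "LIDT", "MOV_CR3", "POPF", "SGDT", "SIDT", "SMSW"}

holes = virtualization_holes(x86_sensitive, x86_privileged)
print(sorted(holes))  # ['POPF', 'SGDT', 'SIDT', 'SMSW']
```

An empty result would mean every sensitive instruction traps, which is the case a trap-based hypervisor needs.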

The conclusion seemed inescapable: full, transparent virtualization of x86 by trapping and emulating alone was impossible.

Hypervisor Type 1 vs. Type 2

A Type 1 hypervisor (also called “bare-metal”) runs directly on the physical hardware, with no host operating system underneath it. Guest operating systems run on top of the hypervisor. IBM’s mainframe hypervisors were Type 1; VMware ESX Server is Type 1. A Type 2 hypervisor runs as a process within a conventional host operating system — the host OS manages hardware access, and the hypervisor sits above it. VMware Workstation and VirtualBox are Type 2. The distinction matters for performance and isolation, but the fundamental virtualization challenge — intercepting and emulating sensitive instructions — applies to both.

VMware and the Binary Translation Solution

For more than two decades after Popek and Goldberg, the problem remained largely academic. Personal computers were not shared resources; the idea of running multiple operating systems on a single PC had little commercial appeal in an era when RAM was measured in megabytes and disk space in gigabytes.

Then came Mendel Rosenblum.

Rosenblum was a computer science professor at Stanford, specializing in operating systems and storage systems. In 1998, he co-founded VMware with his wife Diane Greene and three others: Scott Devine, Ellen Wang, and Edouard Bugnion. Their central insight was not to solve the Popek-Goldberg problem but to route around it.

The technique was called binary translation. Rather than executing guest operating system code directly on the hardware, VMware’s hypervisor scanned the code before execution, identified any of the seventeen problematic sensitive but unprivileged instructions, and replaced them at runtime with equivalent safe code that could be intercepted and controlled. The translation happened dynamically, with results cached so that frequently executed code was translated only once. Everything else, the vast majority of instructions, ran directly on the hardware at near-native speed.
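The scan, rewrite, and cache cycle can be sketched as a toy model. This is not VMware’s implementation: the `Translator` class, the `VMM_EMULATE_` prefix, and the four-instruction `PROBLEMATIC` set are assumptions made for illustration, and instructions are plain strings rather than machine code.

```python
# Toy model of dynamic binary translation with a translation cache.
# Instruction names and the "VMM_EMULATE_" prefix are hypothetical.

PROBLEMATIC = {"POPF", "SGDT", "SIDT", "SMSW"}  # illustrative subset


class Translator:
    def __init__(self) -> None:
        self.cache: dict[tuple[str, ...], tuple[str, ...]] = {}
        self.translations = 0  # how many blocks were actually rewritten

    def translate(self, block: tuple[str, ...]) -> tuple[str, ...]:
        """Return a safe version of a code block, translating it at most once."""
        if block not in self.cache:
            self.translations += 1
            # Problematic instructions become calls into the monitor;
            # everything else is passed through unchanged.
            self.cache[block] = tuple(
                f"VMM_EMULATE_{insn}" if insn in PROBLEMATIC else insn
                for insn in block
            )
        return self.cache[block]


translator = Translator()
guest_block = ("MOV", "ADD", "POPF", "JMP")
first = translator.translate(guest_block)
second = translator.translate(guest_block)  # served from the cache
print(first)
print(translator.translations)
```

The second call returns the cached result, so the block is rewritten only once however often it runs.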

The result was a running x86 virtual machine, on x86 hardware, executing a guest operating system that had been compiled for bare metal and had no awareness that it was virtualized. The Popek-Goldberg constraint had not been violated; it had been bypassed.

VMware Workstation launched in 1999. It allowed a developer running Windows to simultaneously run Linux in a window, or vice versa, with copy-paste between the two environments. To software developers, who routinely needed to test across operating systems and configurations, this was immediately transformative. VMware went from zero to one million users in less than two years.

The enterprise products, VMware GSX Server and ESX Server, arrived in 2001. GSX ran on top of a host operating system, while ESX ran on bare metal. Both allowed a single physical server to run multiple independent virtual machines simultaneously, so a datacenter could consolidate its physical hardware, run multiple operating system instances on a single machine, and isolate applications from each other without buying more servers. Server consolidation ratios of 10:1 or 20:1 were achievable. The economics of datacenter management changed fundamentally.

Xen and the Paravirtualization Alternative

While VMware pursued full transparency — guest operating systems that required no modification — researchers at the University of Cambridge were developing a different approach.

Xen, first described in a 2003 paper by Paul Barham, Ian Pratt, and their colleagues, popularized paravirtualization: instead of transparently intercepting all sensitive instructions, Xen required the guest operating system kernel to be modified so that those instructions were replaced with direct calls to the hypervisor’s API, which the Xen team called hypercalls. The guest OS was aware it was running virtualized, but the result was significantly better performance than binary translation for certain workloads, because no dynamic rewriting was needed.
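The idea of a hypercall can be sketched as a guest kernel that calls the hypervisor explicitly instead of executing a sensitive instruction. This is a toy model, not Xen’s actual interface: the `Hypervisor` and `ParavirtGuestKernel` classes and the `set_page_table_base` operation name are all hypothetical.

```python
# Toy contrast: a paravirtualized guest calls the hypervisor explicitly.
# Class names and the operation name are hypothetical, not Xen's real API.

class Hypervisor:
    def __init__(self) -> None:
        self.page_table_base: dict[str, int] = {}  # per-guest virtual state

    def hypercall(self, guest: str, op: str, arg: int) -> int:
        """Single explicit entry point that the modified guest kernel calls."""
        if op == "set_page_table_base":
            self.page_table_base[guest] = arg
            return 0
        return -1  # unknown operation


class ParavirtGuestKernel:
    def __init__(self, name: str, hv: Hypervisor) -> None:
        self.name = name
        self.hv = hv

    def switch_address_space(self, base: int) -> int:
        # An unmodified kernel would execute a privileged register write here;
        # the paravirtualized kernel asks the hypervisor instead.
        return self.hv.hypercall(self.name, "set_page_table_base", base)


hv = Hypervisor()
guest = ParavirtGuestKernel("guest-1", hv)
status = guest.switch_address_space(0x1000)
print(status, hv.page_table_base)
```

Because the call is explicit, the hypervisor never has to discover the sensitive operation by scanning or trapping.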

The trade-off was that paravirtualization required modified guest kernels. Running an unmodified commercial operating system, such as a copy of Windows Server, was not possible with paravirtualization alone. Xen’s approach worked well for open-source operating systems like Linux, whose kernel could be modified, but struggled with proprietary software.

The debate between full virtualization and paravirtualization became largely moot in 2005 and 2006, when Intel and AMD built hardware virtualization extensions, Intel VT-x and AMD-V, directly into their processors. These extensions added a dedicated execution mode for guests in which sensitive operations either act on guest-private state or trap to the hypervisor, supplying the behavior that Popek and Goldberg’s theorem required. For the first time, x86 hardware natively supported full virtualization without binary translation or paravirtualization tricks. Both VMware and Xen adopted the hardware extensions as the foundation for their next-generation products.
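With hardware support, the hypervisor’s job reduces to a trap-and-emulate loop, sketched below as a toy model. The `VMExit` exception, the `cpu_execute` stand-in, and the two-instruction `TRAPPING` set are assumptions for illustration. On real hardware, which operations cause an exit is partly configurable by the hypervisor.

```python
# Toy trap-and-emulate loop: with hardware support, selected instructions
# cause an exit to the hypervisor, which emulates them on virtual state.

TRAPPING = {"CPUID", "HLT"}  # illustrative subset, assumed to be intercepted


class VMExit(Exception):
    """Stand-in for the hardware handing control back to the hypervisor."""

    def __init__(self, insn: str) -> None:
        super().__init__(insn)
        self.insn = insn


def cpu_execute(insn: str, guest_mode: bool) -> None:
    """Stand-in for the CPU: intercepted instructions trap in guest mode."""
    if guest_mode and insn in TRAPPING:
        raise VMExit(insn)


def run_guest(program: list[str]) -> list[str]:
    """Run guest code directly, emulating only the instructions that trap."""
    emulated = []
    for insn in program:
        try:
            cpu_execute(insn, guest_mode=True)
        except VMExit as exit_event:
            emulated.append(exit_event.insn)  # hypervisor handles it
    return emulated


print(run_guest(["MOV", "CPUID", "ADD", "HLT"]))  # ['CPUID', 'HLT']
```

Unlike binary translation, nothing is rewritten ahead of time: the guest code runs unmodified and the hardware decides when the hypervisor gets involved.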

Amazon and the Cloud

The insight that transformed virtualization from a useful IT tool into the infrastructure of the modern economy came not from a chip company or a hypervisor vendor, but from a bookseller.

Amazon Web Services grew from an internal Amazon project to improve how the company’s engineering teams provisioned infrastructure. In 2006, Amazon opened EC2 — the Elastic Compute Cloud — to the public. EC2 offered virtual machine instances that customers could rent by the hour, configure to their needs, and terminate when finished. The underlying infrastructure was built on Xen.

The economics were radical. Previously, a startup needing server capacity had two options: buy and rack physical hardware (capital-intensive, slow, permanent) or rent dedicated servers from a hosting provider (expensive, inflexible, over-provisioned). EC2 offered a third option: pay for exactly the capacity you need, for exactly as long as you need it, with no minimum commitment and no physical hardware to manage.

The implications for software development were profound. A startup could launch a product with a credit card and a laptop. A company expecting a traffic spike — a product launch, a Super Bowl advertisement — could provision hundreds of servers in minutes and release them hours later. The asymmetry between infrastructure cost and product experimentation collapsed almost overnight.

By 2010, AWS had become a significant revenue stream for Amazon — and a strategic mystery to every other technology company. Microsoft, Google, and IBM scrambled to build equivalent offerings: Azure (2010), Google Cloud (2011), IBM Bluemix (2014). The cloud computing market grew from near zero in 2006 to over half a trillion dollars annually by the mid-2020s. It is the defining infrastructure shift of twenty-first-century computing, and virtualization is its foundation.

Docker and the Container Revolution

By 2013, a new question had emerged. Virtual machines solved the problem of sharing hardware, but they carried significant overhead: each virtual machine included a complete guest operating system — kernel, system libraries, device drivers, init system — all of which consumed memory and CPU that had nothing to do with the actual application being run. A server hosting twenty virtual machines was running twenty full operating systems.

Solomon Hykes was the founder of a platform-as-a-service startup called dotCloud. In March 2013, at PyCon in Santa Clara, he demonstrated an internal tool his team had built to solve a problem familiar to every developer: the gap between “it works on my machine” and “it works in production.”

The tool was called Docker.

Docker packaged an application and all its dependencies (libraries, configuration files, runtime environment) into a self-contained unit called a container. Unlike a virtual machine, a container did not include a guest operating system kernel; it shared the host kernel while remaining isolated at the process and filesystem level, using Linux kernel features called namespaces and cgroups. Namespaces had been added to the kernel incrementally since 2002, and cgroups were merged in 2008.
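On a Linux host, the namespaces and cgroups a process belongs to are visible through the `/proc` filesystem. The sketch below reads them for the current process. The function names `namespace_ids` and `cgroup_membership` are hypothetical helpers, and on systems without `/proc` both simply return empty results.

```python
# List the Linux namespaces and cgroup membership of a process.
# On non-Linux systems (no /proc) both functions return empty results.
import os


def namespace_ids(pid: str = "self") -> dict[str, str]:
    """Map namespace type (pid, net, mnt, ...) to its kernel identifier."""
    ns_dir = f"/proc/{pid}/ns"
    result: dict[str, str] = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            result[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        pass  # not Linux, or not permitted to inspect this process
    return result


def cgroup_membership(pid: str = "self") -> list[str]:
    """Return the non-empty lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []


for name, ident in namespace_ids().items():
    print(f"{name:10} {ident}")
print(cgroup_membership())
```

Two processes in the same container report the same namespace identifiers, while a process in a different container, or on the host, reports different ones.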

The result was radical efficiency. A container typically started in well under a second, where a virtual machine could take minutes. A physical server that could run ten virtual machines could run hundreds of containers. A developer’s container image, built on a laptop, ran the same way in production because it carried its own dependencies. The environment inconsistencies that had plagued deployment, the “works on my machine” problem, largely disappeared, as long as the host kernel was compatible.

Docker’s release triggered an explosion of tooling. Kubernetes, released by Google in 2014, provided orchestration: it managed the deployment, scaling, and networking of thousands of containers across clusters of machines. By 2017, Kubernetes had become the de facto standard for container orchestration, running the application infrastructure of companies ranging from startups to the world’s largest banks.

The container revolution did not replace virtual machines — most Kubernetes clusters run containers inside virtual machines, adding a layer of hardware isolation beneath the software isolation containers provide. But it changed the unit of software deployment from the machine to the process, making the question of which physical hardware an application runs on almost entirely irrelevant.

Dead End: The Limits of x86 Virtualization and the Hardware-Specific Trap

The virtualization revolution was built on a fundamental insight: that the x86 architecture, though not designed for virtualization, could be made to support it through clever software engineering. This was correct, and the software ecosystem that grew up around x86 virtualization (VMware, Xen, KVM, the entire AWS infrastructure) represents one of the largest and most durable platform investments in computing history.

But the approach carried a constraint that became increasingly visible as workloads evolved.

The Overhead of Generality

Every virtual machine running on x86 hardware carries the overhead of an architecture designed in 1978 for single-user personal computers. The x86 instruction set has accumulated decades of backwards-compatible cruft: the protected mode introduced in the 286, the 32-bit extensions of the 386, the 64-bit extensions of the Opteron. A virtual machine running a cloud workload in 2024 is, at the hardware level, still running on an instruction set shaped by the assumption that a human might be typing on a keyboard. This overhead is small on a per-instruction basis and enormous at datacenter scale.

The response has been a gradual shift toward purpose-built architectures. Amazon designed the Nitro System (2017), which pairs a lightweight hypervisor with dedicated hardware cards that take networking, storage, and management functions off the main CPU, effectively moving most virtualization overhead off the critical path. Apple’s M-series chips (2020) demonstrated that an ARM-based system-on-chip designed holistically, with CPU, GPU, and I/O on one die and memory in the same package, could deliver x86-competitive performance at a fraction of the power consumption.

The deeper challenge for virtualization came with AI workloads. Neural network training and inference require GPU and TPU access — hardware that virtualization handles awkwardly. Passing a GPU through to a virtual machine (GPU passthrough) is possible but sacrifices most of the flexibility that virtualization provides. Running AI inference in containers is more practical but requires precise hardware-software co-design. The clean abstraction of “any workload, any hardware” that virtualization promised breaks down when the workload has an intimate hardware dependency.

Virtualization solved the problem of sharing general-purpose compute. The open question of the 2020s is whether general-purpose compute remains the relevant unit — or whether the shift to specialized accelerators requires new abstractions that the hypervisor model was never designed to provide.

For the container and orchestration layer that runs on top of virtualized infrastructure, see The Container Revolution. For the microservices architecture that containers enabled at scale, see The Microservices Revolution.

