RISC vs. CISC

Summary

In 1980, a Berkeley professor named David Patterson proposed that computer architects had been wrong for two decades. The prevailing wisdom held that processors should provide rich instruction sets: complex, powerful operations that compilers could use to produce compact programs. Patterson’s RISC hypothesis argued the opposite: simple, regular instructions that execute in one clock cycle, a compiler that generates more of them, and hardware that runs all of them fast. The RISC/CISC debate that followed produced the most consequential architectural divide in computing history and then, paradoxically, resolved itself. Modern x86 processors translate their instructions into RISC-like micro-operations internally, while the world’s most deployed architecture (ARM) is a RISC design that has shipped in more than 200 billion chips. The debate changed everything and then disappeared into the silicon.

The Complex Instruction Set Tradition

The history of complex instruction sets begins with a simple problem: memory was expensive. In the 1960s, main memory cost thousands of dollars per kilobyte. Every byte of program storage had economic value. Instruction sets were designed to pack as much meaning as possible into each instruction.

IBM’s System/360 (1964) established the template. A single instruction could perform a complex operation — move a block of memory, perform a decimal arithmetic operation on strings of digits, search a character array for a byte. Complex instructions took multiple clock cycles to execute, but they compressed programs into fewer bytes.

The pattern continued through the 1970s. DEC’s PDP-11 and later VAX architectures provided increasingly elaborate instruction sets. The VAX had instructions that could perform string operations, polynomial evaluation, and queue manipulation. A single VAX instruction could perform an operation that required dozens of RISC instructions. VAX programs were compact — important when memory cost money.

Intel’s 8086 (1978), whose architecture would eventually underpin the IBM PC and its descendants, was a 16-bit design shaped by the constraints of the 8-bit era. It had irregular registers, complex addressing modes, and variable-length instructions, a combination that made efficient implementation increasingly difficult as clock speeds rose. Those characteristics would haunt Intel engineers for forty years.

The RISC Hypothesis

Two research projects, nearly simultaneously, challenged the CISC orthodoxy.

At IBM, John Cocke led the 801 minicomputer project, which was originally intended for telephone switching equipment. The 801 had a simple instruction set with 32-bit fixed-length instructions, 32 general-purpose registers, and a single-cycle execution model. Cocke’s team found that the architecture performed better than complex-instruction machines on the same workloads. IBM kept the 801 research largely internal, but it eventually led to the POWER architecture.

At Berkeley, David Patterson and Carlo Sequin led the RISC project (1980–1984), coining the term “Reduced Instruction Set Computer.” Patterson’s team analyzed VAX programs and found a striking pattern: the complex instructions were almost never used. Compilers generated a handful of simple operations — loads, stores, arithmetic, branches — accounting for the vast majority of executed instructions. The complex instructions existed to help human assembly programmers, not compilers.

The RISC hypothesis followed: if compilers don’t use complex instructions anyway, build hardware optimized for the simple instructions compilers do use. Execute every instruction in a single clock cycle. Provide many registers (reducing memory accesses). Use fixed-length instructions (simplifying instruction fetching). Expose the pipeline to the compiler (let the compiler schedule instructions to avoid pipeline stalls).
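The fixed-length point can be made concrete with a small sketch. The byte encoding below is invented for illustration (the low two bits of the first byte give the instruction length) and does not correspond to any real instruction set. With a fixed width, every instruction boundary is known before anything is decoded, so several instructions can be fetched side by side; with variable lengths, each boundary depends on decoding the instruction before it.

```python
def fixed_length_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets of fixed-width instructions; no decoding needed."""
    return list(range(0, len(code), width))


def toy_length(first_byte: int) -> int:
    """Hypothetical encoding: low two bits of the first byte, plus one."""
    return (first_byte & 0b11) + 1


def variable_length_boundaries(code: bytes) -> list[int]:
    """Start offsets when each length is only known after decoding."""
    offsets = []
    pc = 0
    while pc < len(code):
        offsets.append(pc)
        pc += toy_length(code[pc])  # serial dependency on the previous decode
    return offsets


# Eight bytes of made-up machine code, interpreted both ways.
program = bytes([0x00, 0x01, 0xAA, 0x03, 0x10, 0x20, 0x30, 0x00])
print(fixed_length_boundaries(program))     # [0, 4]
print(variable_length_boundaries(program))  # [0, 1, 3, 7]
```

The first function is a pure arithmetic progression, while the second is a loop in which each step waits on the one before; that serial chain is the cost the fixed-length argument is pointing at.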

At Stanford, John Hennessy led a parallel project called MIPS (Microprocessor without Interlocked Pipeline Stages), applying similar principles with particular emphasis on pipeline efficiency.

The Benchmark Wars

Through the 1980s, RISC and CISC proponents fought through published benchmarks. The debates were acrimonious and the benchmarks were frequently cherry-picked.

RISC advocates pointed to strong performance on scientific workloads and compiled code. CISC advocates noted that RISC programs were larger, needing more instructions to accomplish the same task, and that instruction cache pressure could offset the per-instruction speed advantage. On some workloads, VAX systems outperformed early RISC machines.
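Both sides of this argument can be read off the standard CPU performance equation: execution time equals instruction count times cycles per instruction (CPI) times cycle time. The sketch below plugs in numbers that are purely illustrative assumptions, not measurements of any real machine, to show how a higher instruction count can be outweighed by a lower CPI, and how a CPI inflated by cache misses can reverse the result.

```python
def execution_time_ns(instructions: int, cpi: float, cycle_ns: float) -> float:
    """CPU time = instruction count x cycles per instruction x cycle time."""
    return instructions * cpi * cycle_ns


# All figures below are illustrative assumptions, not benchmark data.
cisc = execution_time_ns(1_000_000, cpi=6.0, cycle_ns=100.0)

# Hypothetical RISC version: 50% more instructions, far fewer cycles each.
risc = execution_time_ns(1_500_000, cpi=1.5, cycle_ns=100.0)

# Same RISC program if instruction-cache misses tripled the effective CPI.
risc_cache_bound = execution_time_ns(1_500_000, cpi=4.5, cycle_ns=100.0)

print(cisc, risc, risc_cache_bound)
```

With these made-up inputs the RISC program wins comfortably in the second case and loses in the third, which is why the choice of benchmark mattered so much.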

The argument was settled empirically from the late 1980s through the 1990s. RISC processors, fabricated in CMOS and designed with full-custom layout, demonstrated clear performance advantages. Sun Microsystems built SPARC (a RISC architecture) and positioned it against VAX. Hewlett-Packard built PA-RISC. IBM commercialized the POWER architecture (descending from the 801). DEC eventually built the Alpha, a RISC chip that held the performance crown through most of the 1990s.

The commercial RISC workstation market of the late 1980s and early 1990s ran circles around VAX and the aging x86. DEC eventually killed the VAX family. The remaining question was whether Intel could extend x86 fast enough to keep up.

Intel’s Response: CISC Outside, RISC Inside

Intel’s answer to RISC was indirect and ultimately decisive: keep the x86 instruction set for compatibility, but internally translate it to RISC-like microoperations (μops) at execution time.

The Pentium Pro (1995) introduced this approach. A Pentium Pro translated complex x86 instructions into multiple simple μops, which it then executed out-of-order using a RISC-style execution engine. The x86 instruction set remained the programming interface; RISC execution was the implementation.
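As a rough illustration of the translation step, the sketch below splits a two-operand instruction into load, arithmetic, and store micro-operations whenever an operand lives in memory. The `MicroOp` type, the `crack` function, and the temporary register names are hypothetical; this is a toy model of the idea, not Intel’s actual μop format or decoder, and real processors may produce a different number of μops for the same instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    kind: str               # "load", "store", or an ALU operation such as "add"
    dest: str               # register or memory operand written
    srcs: tuple[str, ...]   # operands read


def is_memory(operand: str) -> bool:
    """Toy convention: memory operands are written in brackets."""
    return operand.startswith("[")


def crack(op: str, dest: str, src: str) -> list[MicroOp]:
    """Split `op dest, src` into simple register-to-register micro-ops."""
    uops = []
    src_reg = src
    if is_memory(src):
        uops.append(MicroOp("load", "tmp0", (src,)))
        src_reg = "tmp0"
    if is_memory(dest):
        # Read-modify-write: load, operate on a temporary, store back.
        uops.append(MicroOp("load", "tmp1", (dest,)))
        uops.append(MicroOp(op, "tmp1", ("tmp1", src_reg)))
        uops.append(MicroOp("store", dest, ("tmp1",)))
    else:
        uops.append(MicroOp(op, dest, (dest, src_reg)))
    return uops


for uop in crack("add", "[counter]", "eax"):
    print(uop)
```

A register-to-register `add eax, ebx` passes through as a single micro-op in this model, while the memory form expands to three, each of which looks like an instruction a load/store RISC machine would execute directly.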

This architecture — sometimes called “CISC outside, RISC inside” — combined compatibility (every x86 program ever written still ran) with RISC’s performance advantages. AMD adopted the same approach with its K5 and Athlon processors. The two firms competed on μop execution efficiency while the instruction set remained fixed.

The approach worked spectacularly. By around 2000, x86 processors were competitive with, and on many workloads faster than, RISC workstation chips. Sun’s SPARC, HP’s PA-RISC, DEC’s Alpha, and IBM’s POWER retreated from the high-volume market. The RISC workstation era ended not because CISC won but because Intel’s RISC-inside-CISC was indistinguishable from RISC in practice.

ARM: RISC Everywhere

The lasting RISC victory came not in workstations but in mobile and embedded computing, where ARM reigned.

ARM (originally Acorn RISC Machine, then Advanced RISC Machines) was designed at Acorn in Cambridge with power efficiency as a design constraint. The first ARM chip (ARM1, 1985) consumed about a tenth of a watt, astonishing for the era, against roughly 2 watts for Intel’s contemporary 386. The instruction set was a clean 32-bit RISC design with a few clever extensions. Advanced RISC Machines Ltd, spun out of Acorn in 1990 as a joint venture with Apple and VLSI Technology, chose to license the architecture rather than manufacture chips, so any licensee could design and produce an ARM processor.

ARM chips powered the Apple Newton, the iPod, and then the iPhone. By 2020, ARM processors shipped in essentially every smartphone, tablet, smart speaker, and embedded system on earth — more than 200 billion chips since the architecture was created. Apple’s transition to Apple Silicon (M1, 2020) brought ARM-based chips to Mac laptops and desktops, where they performed comparably to or better than Intel’s x86 processors while consuming far less power.

The ARM victory confirmed the RISC thesis for constrained environments: when power consumption and chip area matter more than instruction set compatibility, RISC’s simplicity wins.

The Resolution That Wasn’t

The RISC/CISC debate is often declared over, but the resolution is stranger than either side predicted. The x86 — a CISC architecture — remains dominant in servers, desktops, and laptops, but executes RISC μops internally. ARM — a RISC architecture — dominates mobile and embedded computing, but modern ARM chips (ARMv8-A, AArch64) include sophisticated extensions that blur the RISC/CISC line. Apple’s M-series chips achieve their performance partly through very wide execution pipelines that find and parallelize independent operations — a technique that requires the “simple, regular instructions” that RISC provides.
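The idea of finding independent operations can be sketched with a toy dataflow scheduler. The model below is a deliberate simplification under stated assumptions: every instruction takes one cycle, the machine is unlimited in width, and only read-after-write dependencies matter (as if registers were perfectly renamed). The instruction tuples and register names are made up, and nothing here describes how Apple’s cores actually work.

```python
def schedule(instrs: list[tuple[str, tuple[str, ...]]]) -> list[int]:
    """Earliest start cycle per instruction, honoring read-after-write deps.

    Each instruction is (destination register, source registers).
    """
    ready = {}    # register -> cycle at which its value becomes available
    cycles = []
    for dest, srcs in instrs:
        # An instruction can start once all of its inputs have been produced.
        start = max((ready.get(s, 0) for s in srcs), default=0)
        cycles.append(start)
        ready[dest] = start + 1   # assumed one-cycle latency
    return cycles


# Hypothetical four-instruction program: only `c` depends on earlier results.
toy_program = [
    ("a", ("x", "y")),
    ("b", ("z", "w")),
    ("c", ("a", "b")),
    ("d", ("x", "z")),
]
starts = schedule(toy_program)
print(starts)            # [0, 0, 1, 0]
print(max(starts) + 1)   # 2 cycles instead of 4 when run one at a time
```

Because each instruction names its inputs and its single output explicitly, the dependency check is a dictionary lookup; instructions with implicit side effects or memory operands would make the same analysis considerably harder.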

The academic framing has evolved. Modern processor design literature rarely uses RISC/CISC as primary categories. Instead, it discusses out-of-order execution, speculative execution, branch prediction, cache hierarchy design, and memory bandwidth — questions that apply to any instruction set. The controversy that drove a decade of architecture research produced a generation of techniques that now power every processor, regardless of which “side” designed them.

Patterson and Hennessy received the Turing Award in 2017 for their foundational work on RISC architecture. Their textbook, Computer Organization and Design, remains the standard introduction to computer architecture worldwide.

📚 Sources

Patterson and Hennessy, Computer Organization and Design: the standard textbook; its first edition (1994) defined the pedagogy of computer architecture
David Patterson’s RISC Paper (1981): “RISC I: A Reduced Instruction Set VLSI Computer,” with Carlo Séquin, ISCA 1981
John Hennessy ACM Turing Award Lecture: lecture for the 2017 award, covering both RISC history and modern architecture
The Pentium Pro’s μop Architecture: Intel’s internal translation from CISC to RISC microoperations
ARM Architecture Reference Manual: ARM Holdings’ specification; all revisions from ARMv1 (1985) through AArch64
Apple M1 Architecture Overview: detailed breakdown of Apple’s ARM implementation and its performance characteristics