Computer Systems

To understand a program you must become both the machine and the program. Alan Perlis, “Epigrams” As software engineers, we study computer systems (also called "computer architecture") to be able to understand how our programs ultimately run and how the machine expects our data to be encoded. Our immediate reward is to be able to write faster, more memory-efficient and more secure code.

Longer term, the value of understanding computer systems may be even greater. Every abstraction between us and the hardware leaks, to one degree or another. This course aims to provide a set of first principles from which to build sturdier mental models and reason more effectively.

We’ll start by considering the machine's expectation of how our data is encoded, as well as some higher level binary representations like those for text. We'll continue through introductory C and assembly programming, to better understand the interface that a typical computer provides for executing programs. Finally we'll cover two important areas for improving program performance: utilizing the CPU microarchitecture, and CPU caches (the memory hierarchy).

At the core of this course are the sequences of problems for each topic. You should aim to solve each problem, using the worked solutions and supplementary explainers as needed. There are also some full seminars, which some find helpful to tie topics together. While no textbook is necessary for this course, we do recommend Computer Systems: A Programmer’s Perspective (“CS:APP”) as a supplement, and reference relevant CS:APP chapters below. For more suggestions on how to approach CS Primer, see the how-to guide.

Bits and bytes

The Ariane V rocket exploded in its maiden flight due in large part to integer overflow during conversion from a 64-bit floating point to a 16-bit signed integer. This module develops the skill of working with binary encodings of data, which will be invaluable in later courses like networking and databases and an important capability in its own right..

Key concepts covered include fixed width integer encodings and byte ordering, signed integer encodings, the IEEE-754 floating point scheme, Unicode and encodings such as UTF-8.

If time is limited or you are using this material for revision, we suggest you solve at least "Protobuf varint" or "TCP SYN flood" for integer encodings, "Sneaky NaN" for floating point and "UTF-8 truncate" for text encodings. Full seminars are included to provide motivating context and overviews for those who are less familiar with these concepts, although we still suggest attempting the problems first!

For those following along with CS:APP, we suggest chapter 2: "Representing and Manipulating Information" as a supplement

Problems

Protobuf varint	implement the variable length integer encoding in Protocol Buffers (57:39)
CSS color convert	convert CSS files from hex to rgb color declarations (36:17)
Beep beep boop	make the terminal beep… even ASCII isn't text! (28:14)
Image rotate	parse a BMP file and rotate it 90 degrees (57:50)
TCP SYN flood	parse pcap, IP and TCP headers to analyze a SYN flood attack (52:52)
UTF-8 truncate	cleanly truncate UTF-8 encoded strings by understanding the encoding (25:30)
Sneaky NaN FREE	encode a hidden message inside a NaN (26:33)

Seminars

Unsigned integers, alternatives, and overflow (sorry ma'am) (1:05:32)

Signed integers, integers in JavaScript, and ones' complement Internet checksum FREE (1:15:40)

Floating point, IEEE‑754 structure, and "safe integers" in floats in JavaScript FREE (1:10:20)

Unicode, UTF-8, UTF-32 and ugh UTF-16 (1:34:21)

Explainers

What are "standard in" and "standard out"? (02:43)

What are file descriptors 0, 1 and 2? (03:39)

Why does one byte correspond to two hexadecimal digits? (02:33)

What is a file descriptor? (01:57)

Why does a byte have values 0‑255? (04:12)

What does it mean for a value to "be" a certain number of bits? (09:19)

What's the difference between Unicode and something like UTF‑8? (01:53)

What exactly is hexadecimal? (06:12)

What do "big‑endian" and "little‑endian" mean? (08:07)

What are signed and unsigned integers? (10:33)

How to remember all the powers of two (03:48)

What do bitwise and, or, xor and not do? (08:04)

Why masking low order bits is like modulus of a power of 2 FREE (35:41)

Why do I sometimes see the sequence "\r\n" for a newline? (04:04)

How do I read a hexdump? (06:01)

What's the point of floating point? FREE (12:30)

Why do we care about teletype machines? (01:28)

What does it mean to flush a buffer? (02:16)

What is a byte exactly? (02:20)

What is the effect of shifting bits? (03:56)

Why are some bitwise operations faster than arithmetic equivalents? (07:43)

What's the basic difference between UTF‑8, UTF‑16 and UTF‑32? (09:24)

How does UTF-8 encode characters? (aka "artisinal Unicode sandwich") FREE (10:02)

How are IEEE 754 floating point numbers encoded? (aka "why is 0.1 + 0.2 = 0.30000000000000004?") FREE (20:05)

Introduction to C

This module is a crash course in the C programming language and its associated tooling. C allows us to write "high-level" codeHigher level than assembly, that is! with functions, arrays and pointers, structs, looping and conditional constructs and a syntax that isn’t tied to any one computer architecture. We take many of these abstractions for granted, but it's important to at least know that we rely upon them!

Our goal is to cover enough C syntax that we can later disassemble short C snippets to better understand the hardware/software interface. In later courses, we will also read code from a number of important open source projects written in C.Consider that Linux, PostgreSQL, the Apache HTTP Server and CPython (the main Python implementation) are all written in C. Others such as V8 are written in C++.

While not necessary, it's useful to have a copy of The C Programming Language (also known as K&R C) available as a reference, and for additional practice problems.

We suggest completing at least the first few "Bits and bytes" problems before starting this one, particularly if you're unfamiliar with bitwise operators.

Problems

Hello, World!	make sure you can compile and execute a C program (14:35)
Bitcount FREE	use basic C types, operators and control flow to count bits (23:40)
Fast pangram	rewrite a short Python program in C to compare execution time (34:10)
Dynamic array	implement the equivalent of a Python "list" or Ruby or JavaScript "Array" (43:14)
Varint C extension	write a Python module as a C extension (50:31)
Basic hashmap	implement your own hashmap for more struct and pointer practice (1:21:47)

Explainers

How do C compilers differ? (05:13)

What happens in the compilation pipeline? (10:11)

Loop syntax in C (05:54)

Type definitions and literals in C (12:06)

What is the effect of the compiler optimization level flag? (03:42)

What is a register? (06:40)

The C pre-processor, macros and conditional inclusion (06:12)

A brief overview of malloc and related functions (05:25)

Overview of pointers and arrays in C (14:44)

The generic pointer (void *) in C (16:21)

A brief overview of structs in C (15:42)

A brief tour of LLDB (33:26)

Helping Jessica understand void** (39:35)

A brief introduction to Valgrind (16:23)

Helping Max "rotate bits right" (20:59)

Low‑level programming

This module introduces the x86-64 architecture,This is the 64-bit version of Intel’s x86 architecture, an astonishingly popular architecture first introduced with the Intel 8086, released in 1978. Modern x86 machines are largely backwards compatible with programs written for these older machines, including the Intel 386 or 486 which were the first computers that many of us used. and provides an opportunity for us to write programs “close to the metal” in x86-64 assembly language.

Assembly language has become an abstraction layer, but it remains one of the best ways for us as software engineers to reason about the execution of our programs. In these exercises, we’ll primarily write short programs to help us understand how our higher level programming constructs are evaluated by the machine.

We will be using the nasm assembler; you may find this tutorial helpful to get started, and the nasm manual generally useful. You will almost certainly need a reference for the available instructions on x86-64, such as Felix Cloutier's x86 and amd64 instruction reference. While we link explanatory videos along the way, many students find this a new and challenging topic and respond positively to a variety of introductory guides: you may enjoy some assembly required, a fundamental introduction, or others you find yourself.

For those using CS:APP as a supplement, we suggest chapters 3.1-3.6.

Problems

Assembly "Hello world!"	Run your first program assembled from x86-64 assembly (31:53)
Sum to N	write a very simple assembly program to sum the integers to N (46:26)
Matrix access	simply make use of the x86-64 mov instruction for the first time (15:46)
x86‑64 pangram	convert your "fast pangram" C program to x86-64 assembly (17:08)
Binary convert	convert a string of ASCII 0's and 1's to an integer (09:47)
Cone volume	use floating point registers and operations to calculate the volume of a cone (11:12)
Low level recursion	understand recursion by writing a recursive Fibonacci implementation in assembly (26:13)

Explainers

Explaining the terms "computer architecture", "instruction set" and "microarchitecture" (04:26)

The System V AMD64 calling convention (08:40)

What's the difference between Intel and AT&T assembly syntax? (08:37)

What are the general purpose registers in x86‑64? (11:36)

What is the fetch‑decode‑execute cycle? (13:05)

CPU microarchitecture and low‑level performance

One of the most rewarding aspects of understanding computer architecture is being able to use a machine’s characteristics to one's advantage to write faster programs. While we often expect our compilers to find optimizations for us, there is both a limit to what a compiler can understand about your data, and a value to overseeing the compiler's decision making.

In this class, we explore some techniques for program optimization, mostly as an excuse to increase our understanding of how the CPU actually works, including microarchitectural considerations like micro-operations, pipelining, branch prediction and out-of-order execution.

For supplementary reading we suggest CS:APP chapter 5: Optimizing Program Performance. We also draw heavily upon Agner Fog's fantastic Optimization manuals.

Problems

Faster sum	Use your understanding of CPU microarchitecture to speed up a basic program (23:20)
Color quantizing	write a branchless version of a color conversion algorithm (40:56)

Explainers

Moving beyond the simple fetch/decode/execute model (32:34)

The memory hierarchy

In this module, we explore one of the most important practical aspects of modern computer architectures: the “memory hierarchy”.

Modern computers use a series of hardware caches to mitigate the cost of accessing main memory.Consider that on a 4GHz machine, the time taken for an unnecessary 100ns trip to RAM could have been used to execute 400 or more 0.25ns instructions! For many programs, the CPU cache miss rate can be a substantial cause of poor performance. We cover the reason for and basic operation of CPU caches, and most practically how to measure and best utilize them.

For supporting material, we suggest chapters 6.1-6.3 of CS:APP as well as the Algorithms for Modern Hardware chapter on RAM & CPU Caches.

Problems

Grayscale speedup	with an understanding of CPU caches, make a big speedup with a small change (10:34)
Pointer chase	speed up a "typical" Python program by reducing pointer chasing (29:04)
Bogosum	think concretely about (and measure!) cache use in linear vs random access patterns (1:00:14)

Explainers

What is a cache line? (brief explanation) (02:15)

Understanding CPU caches (33:23)

Why are there multiple levels of CPU cache? (13:46)

Measuring cache performance with perf and cachegrind (13:24)