The Prehistory of Kubernetes: From Cubic Equations to Cloud Orchestration
Wed 19 June 2024 by Moshe Zadka

The story of Kubernetes, the leading container orchestration platform, is a tale of mathematical innovation, wartime necessity, and the open-source revolution. It begins, perhaps unexpectedly, with the work of Omar Khayyam, the 11th-century Persian polymath known for his contributions to mathematics, astronomy, and poetry.
Khayyam's work on solving cubic equations laid the foundation for the development of algebraic geometry, which in turn led to the invention of Cartesian coordinates by René Descartes in the 17th century. This "algebrization of geometry" allowed for the mathematical description of physical phenomena, such as planetary motion, and paved the way for Isaac Newton's development of calculus.
However, Newton's calculus, while groundbreaking, lacked a rigorous mathematical foundation. It took the work of 19th-century mathematicians like Augustin-Louis Cauchy and Karl Weierstrass to establish the epsilon-delta definition of limits and place calculus on a solid footing. This development also opened up new questions about infinity, leading to Georg Cantor's work on set theory and the discovery of paradoxes, such as Russell's paradox by Bertrand Russell, that threatened the foundations of mathematics.
The quest to resolve these paradoxes and establish a secure foundation for mathematics led to Kurt Gödel's incompleteness theorems, published in 1931. Gödel's first incompleteness theorem showed that in any consistent axiomatic system that includes arithmetic, there are statements that can neither be proved nor disproved within the system. The second incompleteness theorem demonstrated that such a system cannot prove its own consistency.
Crucially, Gödel's theorems relied on the concept of computability, which he used to construct a formal system representing arithmetic. However, Gödel's definition of computability was not entirely convincing, as it relied on the intuitive notion of a "finite procedure." This left open the possibility that a non-computable axiomatization of number theory, capturing "all that is true about the natural numbers," could exist and potentially sidestep the incompleteness theorems.
It was Alan Turing who took up the challenge of formalizing the concept of computability. In his groundbreaking 1936 paper "On Computable Numbers, with an Application to the Entscheidungsproblem," Turing introduced the Turing machine, a simple yet powerful mathematical model of computation. Turing's work not only provided a more rigorous foundation for Gödel's ideas but also proved that certain problems, such as the halting problem, are undecidable by Turing machines.
Turing's formalization of computability had far-reaching implications beyond the foundations of mathematics. It laid the groundwork for the development of modern computer science and played a crucial role in the birth of the digital age. Turing's work took on new urgency with the outbreak of World War II, as the need to break German ciphers such as Enigma drove the development of early special-purpose computing machines, an effort in which Turing was directly involved.
After the war, the individuals who had worked on these machines helped to establish the first computing companies, leading to the industrialization of computing and the development of programming languages and operating systems. One notable example is Alan Turing himself, who joined the National Physical Laboratory (NPL) in London, where he worked on the design of the Automatic Computing Engine (ACE), one of the first stored-program computers.
Another key figure was John von Neumann, a mathematician and physicist who made significant contributions to the design of the EDVAC (Electronic Discrete Variable Automatic Computer), an early stored-program computer. Von Neumann's work on the EDVAC and his subsequent report, "First Draft of a Report on the EDVAC," laid the foundation for the von Neumann architecture, which became the standard design for modern computers.
In the United Kingdom, Maurice Wilkes, who had worked on radar systems during the war, led the development of the EDSAC (Electronic Delay Storage Automatic Calculator) at the University of Cambridge. The EDSAC, which became operational in 1949, was the first practical stored-program computer and inspired the development of similar machines in the United States and elsewhere.
In the United States, J. Presper Eckert and John Mauchly, who had worked on the ENIAC (Electronic Numerical Integrator and Computer) during the war, left the University of Pennsylvania in 1946 to found the company that became the Eckert-Mauchly Computer Corporation. The company developed the UNIVAC (Universal Automatic Computer), which became the first commercially available general-purpose computer in the United States.
The batch-oriented machines of the UNIVAC era were followed by time-sharing systems, most notably Multics (Multiplexed Information and Computing Service), a joint project of MIT, General Electric, and Bell Labs that began in the mid-1960s. Multics brought together several innovative features, such as time-sharing and memory protection, which allowed multiple users to share the same machine and provided a degree of isolation between their programs. These features would later inspire both positive and negative lessons for the creators of Unix, the influential operating system that Bell Labs began developing in 1969, after it withdrew from the Multics project.
Due to antitrust pressures, Bell Labs made Unix available to universities, where it became a standard teaching tool. This decision led to the development of Minix, a simplified Unix-like system, and eventually to the creation of Linux by Linus Torvalds.
As Linux grew in popularity, thanks to its open-source nature and ability to run on cheap hardware, it caught the attention of Google, which was looking for an operating system to power its "cloud-native" approach to computing. Google's engineers contributed key features to the Linux kernel, most notably cgroups, which together with kernel namespaces laid the groundwork for the development of containerization technologies like Docker.
Google, recognizing the potential of containers and the need for a robust orchestration platform, developed Kubernetes as an open-source system based on its experience with Borg and other orchestration tools. By establishing Kubernetes as the standard for container orchestration, Google aimed to reduce the barrier to entry for users looking to switch between cloud providers, challenging the dominance of Amazon Web Services.
Today, Kubernetes has become the de facto standard for managing containerized applications. As with earlier advances in this story, solving one problem opened a new one: generating and deploying manifests. Tools for generating manifests range from general templating solutions like Bash variable substitution, sed, and Jinja, through full-fledged programming languages like Jsonnet and Python, all the way to dedicated tools like Kustomize and Helm. Meanwhile, manifests can be deployed to Kubernetes by continuous integration platforms running "helm upgrade" or "kubectl apply", or by dedicated platforms like ArgoCD or FluxCD. Gradual roll-outs are supported by companion tools such as Argo Rollouts and Flagger.
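
To make the "full-fledged programming language" option concrete, here is a minimal sketch of generating a manifest from Python. It is one possible approach, not a recommended tool. The function names make_deployment and render, the application name "web", and the image registry.example.com/web:1.0 are all hypothetical placeholders. The output is JSON, which Kubernetes accepts alongside YAML, so the standard library is enough.

```python
import json


def make_deployment(name, image, replicas=1, port=8080):
    """Build an apps/v1 Deployment manifest as a plain dictionary.

    All arguments are illustrative; nothing here talks to a cluster.
    """
    # The same labels must appear on the selector and the pod template,
    # otherwise the Deployment would not match its own pods.
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def render(manifest):
    """Serialize the manifest as stable, human-readable JSON."""
    return json.dumps(manifest, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical application name and image, for illustration only.
    print(render(make_deployment("web", "registry.example.com/web:1.0", replicas=3)))
```

If this were saved as a script (say, a hypothetical generate.py), a continuous integration job could pipe its output into "kubectl apply -f -", which reads the manifest from standard input. The appeal of this style is that loops, functions, and tests come for free. The cost is that the manifest is no longer visible as a static file in the repository.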
From cubic equations to cloud orchestration, the story of Kubernetes is a reminder that the path of progress is rarely straightforward, but rather a winding journey through the realms of mathematics, computer science, and human ingenuity.