Meditations on the Zen of Python

Mon 30 December 2019 by Moshe Zadka

(This is based on the series published in opensource.com as 9 articles: 1, 2, 3, 4, 5, 6, 7, 8, 9)

Python contributor Tim Peters introduced us to the Zen of Python in 1999. Twenty years later, its 19 guiding principles continue to be relevant within the community.

The Zen of Python is not "the rules of Python" or "guidelines of Python". It is full of contradiction and allusion. It is not intended to be followed: it is intended to be meditated upon.

In this spirit, I offer this series of meditations on the Zen of Python.

Beautiful is better than ugly.

It was in Structure and Interpretation of Computer Programs (SICP) that the point was made: "Programs must be written for people to read and only incidentally for machines to execute." Machines do not care about beauty, but people do.

A beautiful program is one that is enjoyable to read. This means first that it is consistent. Tools like Black, flake8, and Pylint are great for making sure things are reasonable on a surface layer.

But even more important, only humans can judge what humans find beautiful. Code reviews and a collaborative approach to writing code are the only realistic way to build beautiful code. Listening to other people is an important skill in software development.

Finally, all the tools and processes are moot if the will is not there. Without an appreciation for the importance of beauty, there will never be an emphasis on writing beautiful code.

This is why this is the first principle: it is a way of making "beauty" a value in the Python community. It immediately answers: "Do we really care about beauty?" We do.

Explicit is better than implicit.

We humans celebrate light and fear the dark. Light helps us make sense of vague images. In the same way, programming with more explicitness helps us make sense of abstract ideas. It is often tempting to make things implicit.

"Why is self explicitly there as the first parameter of methods?"

There are many technical explanations, but all of them are wrong. It is almost a Python programmer's rite of passage to write a metaclass that makes explicitly listing self unnecessary. (If you have never done this before, do so; it makes a great metaclass learning exercise!)

The reason self is explicit is not because the Python core developers did not want to make a metaclass like that the "default" metaclass. The reason it is explicit is because there is one less special case to teach: the first argument is explicit.

Even when Python does allow non-explicit things, such as context variables, we must always ask: Are we sure we need them? Could we not just pass arguments explicitly? Sometimes, for many reasons, this is not feasible. But prioritizing explicitness means, at least, asking the question and estimating the effort.

Simple is better than complex.

When it is possible to choose at all, choose the simple solution. Python is rarely in the business of disallowing things. This means it is possible, and even straightforward, to design baroque programs to solve straightforward problems.

It is worthwhile to remember at each point that simplicity is one of the easiest things to lose and the hardest to regain when writing code.

This can mean choosing to write something as a function, rather than introducing an extraneous class. This can mean avoiding a robust third-party library in favor of writing a two-line function that is perfect for the immediate use-case. Most often, it means avoiding predicting the future in favor of solving the problem at hand.

It is much easier to change the program later, especially if simplicity and beauty were among its guiding principles, than to load the code down with all possible future variations.

Complex is better than complicated.

This is possibly the most misunderstood principle because understanding the precise meanings of the words is crucial. Something is complex when it is composed of multiple parts. Something is complicated when it has a lot of different, often hard to predict, behaviors.

When solving a hard problem, it is often the case that no simple solution will do. In that case, the most Pythonic strategy is to go "bottom-up." Build simple tools and combine them to solve the problem.

This is where techniques like object composition shine. Instead of having a complicated inheritance hierarchy, have objects that forward some method calls to a separate object. Each of those can be tested and developed separately and then finally put together.

Another example of "building up" is using singledispatch, so that instead of one complicated object, we have a simple, mostly behavior-less object and separate behaviors.

Flat is better than nested.

Nowhere is the pressure to be "flat" more obvious than in Python's strong insistence on indentation. Other languages will often introduce an implementation that "cheats" on the nested structure by reducing indentation requirements. To appreciate this point, let's take a look at JavaScript.

JavaScript is natively async, which means that programmers write code in JavaScript using a lot of callbacks.

a(function(resultsFromA) {
  b(resultsFromA, function(resultsfromB) {
    c(resultsFromC, function(resultsFromC) {
      console.log(resultsFromC)
   }
  }
}

Ignoring the code, observe the pattern and the way indentation leads to a right-most point. This distinctive "arrow" shape is tough on the eye to quickly walk through the code, so it's seen as undesirable and even nicknamed "callback hell." However, in JavaScript, it is possible to "cheat" and not have indentation reflect nesting.

a(function(resultsFromA) {
b(resultsFromA,
  function(resultsfromB) {
c(resultsFromC,
  function(resultsFromC) {
    console.log(resultsFromC)
}}}

Python affords no such options to cheat: every nesting level in the program must be reflected in the indentation level. So deep nesting in Python looks deeply nested. That means "callback hell" was a worse problem in Python than in JavaScript: nesting callbacks mean indenting with no options to "cheat" with braces.

This challenge, in combination with the Zen principle, has led to an elegant solution by a library I worked on. In the Twisted framework, we came up with the deferred abstraction, which would later inspire the popular JavaScript promise abstraction. In this way, Python's unwavering commitment to clear code forces Python developers to discover new, powerful abstractions.

future_value = future_result()
future_value.addCallback(a)
future_value.addCallback(b)
future_value.addCallback(c)

(This might look familiar to modern JavaScript programmers: Promises were heavily influenced by Twisted's deferreds.)

Sparse is better than dense.

The easiest way to make something less dense is to introduce nesting. This habit is why the principle of sparseness follows the previous one: after we have reduced nesting as much as possible, we are often left with dense code or data structures. Density, in this sense, is jamming too much information into a small amount of code, making it difficult to decipher when something goes wrong.

Reducing that denseness requires creative thinking, and there are no simple solutions. The Zen of Python does not offer simple solutions. All it offers are ways to find what can be improved in the code, without always giving guidance for "how."

Take a walk. Take a shower. Smell the flowers. Sit in a lotus position and think hard, until finally, inspiration strikes. When you are finally enlightened, it is time to write the code.

Readability counts.

In some sense, this middle principle is indeed the center of the entire Zen of Python. The Zen is not about writing efficient programs. It is not even about writing robust programs, for the most part. It is about writing programs that other people can read.

Reading code, by its nature, happens after the code has been added to the system. Often, it happens long after. Neglecting readability is the easiest choice since it does not hurt right now. Whatever the reason for adding new code -- a painful bug or a highly requested feature -- it does hurt. Right now.

In the face of immense pressure to throw readability to the side and just "solve the problem," the Zen of Python reminds us: readability counts. Writing the code so it can be read is a form of compassion for yourself and others.

Special cases aren't special enough to break the rules.

There is always an excuse. This bug is particularly painful; let's not worry about simplicity. This feature is particularly urgent; let's not worry about beauty. The domain rules covering this case are particularly hairy; let's not worry about nesting levels.

Once we allow special pleading, the dam wall breaks, and there are no more principles; things devolve into a Mad Max dystopia with every programmer for themselves, trying to find the best excuses.

Discipline requires commitment. It is only when things are hard, when there is a strong temptation, that a software developer is tested. There is always a valid excuse to break the rules, and that's why the rules must be kept the rules. Discipline is the art of saying no to exceptions. No amount of explanation can change that.

Although, practicality beats purity.

"If you think only of hitting, springing, striking, or touching the enemy, you will not be able actually to cut him.", Miyamoto Musashi, The Book of Water

Ultimately, software development is a practical discipline. Its goal is to solve real problems, faced by real people. Practicality beats purity: above all else, we must solve the problem. If we think only about readability, simplicity, or beauty, we will not be able to actually solve the problem.

As Musashi suggested, the primary goal of every code change should be to solve a problem. The problem must be foremost in our minds. If we waver from it and think only of the Zen of Python, we have failed the Zen of Python. This is another one of those contradictions inherent in the Zen of Python.

Errors should never pass silently...

Before the Zen of Python was a twinkle in Tim Peters' eye, before Wikipedia became informally known as "wiki," the first WikiWiki site, C2, existed as a trove of programming guidelines. These are principles that mostly came out of a Smalltalk programming community. Smalltalk's ideas influenced many object-oriented languages, Python included.

The C2 wiki defines the Samurai Principle: "return victorious, or not at all." In Pythonic terms, it encourages eschewing sentinel values, such as returning None or -1 to indicate an inability to complete the task, in favor of raising exceptions. A None is silent: it looks like a value and can be put in a variable and passed around. Sometimes, it is even a valid return value.

The principle here is that if a function cannot accomplish its contract, it should "fail loudly": raise an exception. The raised exception will never look like a possible value. It will skip past the returned_value = call_to_function(parameter) line and go up the stack, potentially crashing the program.

A crash is straightforward to debug: there is a stack trace indicating the problem as well as the call stack. The failure might mean that a necessary condition for the program was not met, and human intervention is needed. It might mean that the program's logic is faulty. In either case, the loud failure is better than a hidden, "missing" value, infecting the program's valid data with None, until it is used somewhere and an error message says "None does not have method split," which you probably already knew.

Unless explicitly silenced.

Exceptions sometimes need to be explicitly caught. We might anticipate some of the lines in a file are misformatted and want to handle those in a special way, maybe by putting them in a "lines to be looked at by a human" file, instead of crashing the entire program.

Python allows us to catch exceptions with except. This means errors can be explicitly silenced. This explicitness means that the except line is visible in code reviews. It makes sense to question why this is the right place to silence, and potentially recover from, the exception. It makes sense to ask if we are catching too many exceptions or too few.

Because this is all explicit, it is possible for someone to read the code and understand which exceptional conditions are recoverable.

In the face of ambiguity, refuse the temptation to guess.

What should the result of 1 + "1" be? Both "11" and 2 would be valid guesses. This expression is ambiguous: there is no single thing it can do that would not be a surprise to at least some people.

Some languages choose to guess. In JavaScript, the result is "11". In Perl, the result is 2. In C, naturally, the result is the empty string. In the face of ambiguity, JavaScript, Perl, and C all guess.

In Python, this raises a TypeError: an error that is not silent. It is atypical to catch TypeError: it will usually terminate the program or at least the current task (for example, in most web frameworks, it will terminate the handling of the current request).

Python refuses to guess what 1 + "1" means. The programmer is forced to write code with clear intention: either 1 + int("1"), which would be 2 or str(1) + "1", which would be "11"; or "1"[1:], which would be an empty string. By refusing to guess, Python makes programs more predictable.

There should be one -- and preferably only one -- obvious way to do it.

Prediction also goes the other way. Given a task, can you predict the code that will be written to achieve it? It is impossible, of course, to predict perfectly. Programming, after all, is a creative task.

However, there is no reason to intentionally provide multiple, redundant ways to achieve the same thing. There is a sense in which some solutions are "better" or "more Pythonic."

Part of the appreciation for the Pythonic aesthetic is that it is OK to have healthy debates about which solution is better. It is even OK to disagree and keep programming. It is even OK to agree to disagree for the sake of harmony. But beneath it all, there has to be a feeling that, eventually, the right solution will come to light. There must be the hope that eventually we can live in true harmony by agreeing on the best way to achieve a goal.

Although that way may not be obvious at first (unless you're Dutch).

This is an important caveat: It is often not obvious, at first, what is the best way to achieve a task. Ideas are evolving. Python is evolving. The best way to read a file block-by-block is, probably, to wait until Python 3.8 and use the walrus operator.

This common task, reading a file block-by-block, did not have a "single best way to do it" for almost 30 years of Python's existence.

When I started using Python in 1998 with Python 1.5.2, there was no single best way to read a file line-by-line. For many years, the best way to know if a dictionary had a key was to use .haskey until the in operator became the best way.

It is only by appreciating that sometimes, finding the one (and only one) way of achieving a goal can take 30 years of trying out alternatives that Python can keep aiming to find those ways. This view of history, where 30 years is an acceptable time for something to take, often feels foreign to people in the United States, when the country has existed for just over 200 years.

The Dutch, whether it's Python creator Guido van Rossum or famous computer scientist Edsger W. Dijkstra, have a different worldview according to this part of the Zen of Python. A certain European appreciation for time is essential.

Now is better than never.

There is always the temptation to delay things until they are perfect. They will never be perfect, though. When they look "ready" enough, that is when it is time to take the plunge and put them out there. Ultimately, a change always happens at some now: the only thing that delaying does is move it to a future person's "now."

Although never is often better than right now.

This, however, does not mean things should be rushed. Decide the criteria for release in terms of testing, documentation, user feedback, and so on. "Right now," as in before the change is ready, is not a good time.

This is a good lesson not just for popular languages like Python, but also for your personal little open source project.

If the implementation is hard to explain, it's a bad idea.

The most important thing about programming languages is predictability. Sometimes we explain the semantics of a certain construct in terms of abstract programming models, which do not correspond exactly to the implementation. However, the best of all explanations just explains the implementation.

If the implementation is hard to explain, it means the avenue is impossible.

If the implementation is easy to explain, it may be a good idea.

Just because something is easy does not mean it is worthwhile. However, once it is explained, it is much easier to judge whether it is a good idea.

This is why the second half of this principle intentionally equivocates: nothing is certain to be a good idea, but it always allows people to have that discussion.

Namespaces in Python

Python uses namespaces for everything. Though simple, they are sparse data structures -- which is often the best way to achieve a goal.

Modules are namespaces. This means that correctly predicting module semantics often just requires familiarity with how Python namespaces work. Classes are namespaces. Objects are namespaces. Functions have access to their local namespace, their parent namespace, and the global namespace.

The simple model, where the . operator accesses an object, which in turn will usually, but not always, do some sort of dictionary lookup, makes Python hard to optimize, but easy to explain.

Indeed, some third-party modules take this guideline and run with it. For example, the variants package turns functions into namespaces of "related functionality." It is a good example of how the Zen of Python can inspire new abstractions.