(Thanks to Mark Rice for his helpful suggestions. Any mistakes or omissions that remain are my responsibility.)
Some Python projects are designed to be libraries, consumed by other projects. These are most of the things people consider "Python projects": for example, Twisted, Flask, and most other open source tools. However, some things, like mu, are installed as end-user artifacts. More commonly, many web services are written as deployable Python applications. A good example is the issue-tracking project trac.
Projects that are deployed must be deployed with their dependencies, and with the dependencies of those dependencies, and so forth. At deployment time, a specific version of each dependency must be deployed.
If a project declares a dependency of, say, Django>=1.10, something needs to decide whether to deploy 1.10, 1.11, or 2.0. In this text, we will refer to the declared compatibility statements in something like setup.py as "intent" dependencies, since they document programmer intent. The specific dependencies that are eventually deployed will be referred to as the "expressed" dependencies, since they are expressed in the actual deployed artifact (for example, a Docker image).
Usually, "intent" dependencies are defined in
This does not have to be the case,
but it almost always is:
since there is usually some "glue" code at the top,
keeping everything together,
it makes sense to treat it as a library --
one that sometimes is not uploaded to any package index.
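As a sketch, the intent dependencies of such a top-level "glue" package might look like this (the project name and version ranges here are illustrative):

```python
# setup.py -- declares the "intent" dependencies: what the code is
# believed to be compatible with, not the exact versions deployed.
from setuptools import setup, find_packages

setup(
    name="myservice",  # illustrative: the top-level "glue" package
    version="1.0.0",
    packages=find_packages(),
    install_requires=[
        # Any sufficiently new Django satisfies the intent.
        "Django>=1.10",
    ],
)
```

The expressed dependencies, by contrast, pin exact versions: the deployed artifact would contain something like Django==1.11.4.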
When producing the deployed artifact, we need to decide how to generate the expressed dependencies. There are two competing forces. One is the desire to be current: using the latest version of Django means getting all the latest bug fixes, and means that getting fixes for future bugs will require moving across fewer versions. The other is the desire to avoid changes: when deploying a small bug fix, changing all library versions to the newest ones might introduce a lot of change.
For this reason, most projects will check the "artifact" (for example, a requirements.txt with fully pinned versions) into source control, produce actual deployed versions from it, and have some procedure for updating it.
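As a minimal sketch of what producing that artifact might look like, assuming a setup.py-based project and a fresh virtual environment (in practice, a tool like pip-compile from pip-tools does this job with more care):

```python
# regenerate_requirements.py -- recalculate the "expressed"
# dependencies from the "intent" dependencies.
import subprocess
import sys

# In a fresh virtual environment, installing the project lets pip
# pick the newest versions that satisfy the intent dependencies
# declared in setup.py.
subprocess.run(
    [sys.executable, "-m", "pip", "install", "-e", "."],
    check=True,
)

# Freeze the fully resolved dependency graph; this is the artifact
# that gets checked into source control.
frozen = subprocess.run(
    [sys.executable, "-m", "pip", "freeze", "--exclude-editable"],
    check=True,
    capture_output=True,
    text=True,
).stdout

with open("requirements.txt", "w") as f:
    f.write(frozen)
```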
A similar story can be told about the development dependencies, often defined as extra [dev] dependencies in setup.py, and resulting in a file, such as requirements-dev.txt, that is checked into source control. The pressures are a little different: sometimes nobody bothers to check in requirements-dev.txt even when checking in requirements.txt, but the basic dynamic is similar.
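A sketch of what declaring that extra might look like (the names are illustrative):

```python
# setup.py -- development-only dependencies declared as a "dev" extra,
# installed with: pip install -e ".[dev]"
from setuptools import setup

setup(
    name="myservice",  # illustrative project name
    version="1.0.0",
    install_requires=[
        "Django>=1.10",
    ],
    extras_require={
        "dev": [
            "pytest",
            "coverage",
        ],
    },
)
```

The expressed file, requirements-dev.txt, can then be generated the same way as requirements.txt, by freezing an environment that has the dev extra installed.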
The worst procedure is probably "when someone remembers to". This is not usually anyone's top priority, and most developers are busy with their regular day-to-day tasks. When an upgrade is necessary for some reason -- for example, because a bug fix is available -- this can mean a lot of disruption. Often this disruption manifests as the discovery that just upgrading one library does not work: it now depends on newer libraries, so the entire dependency graph has to be updated, all at once. All intermediate "deprecation warnings" that might have been there for several months have been skipped over, and developers are suddenly faced with several breaking upgrades, all at once. The size of the change only grows with time, and becomes less and less surmountable, making it less and less likely that it will be done, until it ends in a case of complete bitrot.
Sadly, however, "when someone remembers to" is the default procedure in the absence of any explicit procedure.
Some projects, having suffered through the disadvantages of "when someone remembers to", decide to go to the other extreme: never checking in the requirements.txt, and generating it on every artifact build. This means causing a lot of unnecessary churn: it is impossible to fix a small bug without making sure that the code is compatible with the latest versions of all libraries.
A better way to approach the problem is to have an explicit process for recalculating the expressed dependencies from the intent dependencies. One approach is to manufacture, with some cadence, code change requests that update the requirements.txt. This means they are resolved like all code changes: review, running automated tests, and whatever other local processes are implemented.
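A minimal sketch of manufacturing such a change request, assuming git, the GitHub gh command-line tool, and the illustrative regenerate_requirements.py script from above (hosted services such as Dependabot or Renovate implement a more polished version of the same idea):

```python
# weekly_update.py -- manufacture a code change request that updates
# requirements.txt; meant to be run on a schedule (e.g., from cron).
import subprocess

def run(*cmd):
    subprocess.run(cmd, check=True)

run("git", "checkout", "-b", "update-dependencies")
# Recalculate the expressed dependencies from the intent dependencies.
run("python", "regenerate_requirements.py")
# requirements.txt is already tracked, so -a picks up the change.
run("git", "commit", "-am", "Update expressed dependencies")
run("git", "push", "--set-upstream", "origin", "update-dependencies")
# Open a pull request so the update goes through review and CI
# like any other code change.
run("gh", "pr", "create", "--fill")
```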
Another is to do these updates on a calendar-based event. This can be anything from a manually-strongly-encouraged habit, where on Monday morning one of a developer's tasks is to generate updates for all projects they are responsible for, to including it as part of a time-based release process: generating the updates on a cadence that aligns with agile "sprints", as part of releasing the code changes in a particular sprint.
When updating does reveal an incompatibility, it needs to be resolved. One way is to update the local code: this certainly is the best thing to do when the problem is that the library changed an API, or changed an internal implementation detail that was being used accidentally (...or intentionally). Sometimes, though, the new version has a bug in it that needs to be fixed. In that case, the intent is now to avoid that version, and it is best to express the intent exactly as that: something like Django!=1.11.2. This means that when an even newer version is released, hopefully fixing the bug, it will be used.
If a new version is released without the bug fix, we add another != clause. This is painful, and intentionally so: either we need to get the bug fixed in the library, stop using the library, or fork it. Since we are falling further and further behind the latest version, this is introducing risk into our code, and the growing list of != clauses will indicate this pain -- and encourage us to resolve it.
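A sketch of how that accumulated intent might look (the versions are illustrative):

```python
# setup.py -- excluding known-bad versions while staying open to
# newer releases that might fix the bug.
from setuptools import setup

setup(
    name="myservice",  # illustrative
    version="1.0.0",
    install_requires=[
        # 1.11.2 introduced a bug that affects us; 1.11.3 did not
        # fix it. Each additional != clause is a visible reminder
        # of unresolved pain upstream.
        "Django>=1.10,!=1.11.2,!=1.11.3",
    ],
)
```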
The most important thing is to choose a specific process for updating the expressed dependencies, document it clearly, and follow it consistently. As long as such a process is chosen, documented, and followed, it is possible to avoid the bitrot issue.