A common way to expose an API in Python is through inheritance. Many projects do that, but there is a better way.
But first: how popular is inheritance-as-an-API, anyway?
Let's go to the Twisted website. Right at the center of the screen, at prime real-estate, we see:
What's there? The following is abridged:
    class Echo(protocol.Protocol):
        def dataReceived(self, data):
            self.transport.write(data)

    class EchoFactory(protocol.Factory):
        def buildProtocol(self, addr):
            return Echo()
(This is part of an example on building an echo-server protocol.)
If you are wondering who came up with this amazing API, it is the same person who is writing the words you are reading. I certainly thought it was an amazing API!
Look at how many smart people agreed with me.
Django takes a page of tutorial to get there, but sure enough:
    class Question(models.Model):
        question_text = models.CharField(max_length=200)
        pub_date = models.DateTimeField('date published')

    class Choice(models.Model):
        question = models.ForeignKey(Question, on_delete=models.CASCADE)
        choice_text = models.CharField(max_length=200)
        votes = models.IntegerField(default=0)
Jupyter's echo kernel starts:
    class EchoKernel(Kernel):
        implementation = 'Echo'
        implementation_version = '1.0'
        language = 'no-op'
Everyone is doing it. A project I have been a developer on for ~16 years. The most popular Python web library, responsible for who-knows-how-many requests per second in Instagram. A project that won the ACM award (and well deserved, at that).
However, popularity is not everything. This is not a good idea.
Exposing class inheritance as a public interface means committing to a level of backwards compatibility that is unheard of. Even adding private methods or attributes becomes dangerous.
Let's give a toy example:
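    class Writer:
        # Default sink: discard everything until set_output() is called.
        # staticmethod keeps the lambda from being bound as a method.
        _write = staticmethod(lambda message: None)

        def set_output(self, output):
            self._write = output.write

        def write(self, message):
            formatted = self.format(message)
            self._write(formatted)

        def format(self, message):
            raise NotImplementedError("format")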
This is a simple writer which, while initially sending everything down a black hole, can be set to write the output to a file-like object. It needs to format the messages, so the proper usage is to subclass (while taking care not to define methods called write or set_output) and override format:
    class BufferWriter(Writer):
        _buffer = False

        def format(self, message):
            if self._buffer:
                return 'Buffer: ' + message
            else:
                return 'Message: ' + message

        def switch_buffer(self):
            self._buffer = not self._buffer
The simplest formatting would return the message as is. However, this formatter is slightly less trivial: it prefixes the message with "Buffer: " or "Message: ", depending on an internal variable that can be switched.
Now we can do things like:
    >>> import sys
    >>> bp = BufferWriter()
    >>> bp.set_output(sys.stdout)
    >>> bp.write("hello")
    Message: hello
    >>> bp.switch_buffer()
    >>> bp.write("hello")
    Buffer: hello
This looks good so far. Of course, things are never so simple in real life. The writer library, naturally, gets thousands of stars on GitHub. It becomes popular. There's a development community, complete with a Discord channel and a mailing list. So naturally, important features get added.
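    class Writer:
        _buffer = ""  # new: accumulates short messages
        # staticmethod keeps the lambda from being bound as a method.
        _write = staticmethod(lambda message: None)

        def set_output(self, output):
            self._write = output.write

        def write(self, message):
            self._buffer += self.format(message)
            # Flush only once more than 10 characters have accumulated.
            if len(self._buffer) > 10:
                self._write(self._buffer)
                self._buffer = ""

        def format(self, message):
            raise NotImplementedError("format")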
It turns out people needed to buffer some of the shorter messages. This was a crucial performance improvement that all users were clamoring for, so version 2018.6.1 is highly anticipated.
For subclasses like BufferWriter, though, the symptoms of upgrading are weird: TypeErrors and other such fun. All because both the superclass and the subclass are competing to access self._buffer.
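The failure can be reproduced in a few lines. The sketch below puts the buffering Writer and the BufferWriter subclass from above side by side. The io.StringIO target and the staticmethod wrapper around the default sink are additions made here so the snippet runs on its own.

```python
import io


class Writer:
    _buffer = ""  # private attribute added by the library in the new release
    _write = staticmethod(lambda message: None)

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        self._buffer += self.format(message)
        if len(self._buffer) > 10:
            self._write(self._buffer)
            self._buffer = ""

    def format(self, message):
        raise NotImplementedError("format")


class BufferWriter(Writer):
    _buffer = False  # the subclass already used this name for a boolean flag

    def format(self, message):
        if self._buffer:
            return "Buffer: " + message
        return "Message: " + message

    def switch_buffer(self):
        self._buffer = not self._buffer


writer = BufferWriter()
writer.set_output(io.StringIO())
try:
    # Writer.write() evaluates False + "Message: hello", which is not defined.
    writer.write("hello")
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```

Neither class did anything unreasonable on its own: each picked an obvious private name, and the clash only shows up when the two versions meet.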
With enough care, these problems can be avoided. A library which exposes classes for inheritance must add all new private methods or attributes with double-underscore (name-mangled) names, and never ever add any public methods or attributes. In practice, nobody does that.
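For illustration, here is roughly what that discipline would look like for the buffering change, assuming the library author had used a double-underscore name. Python rewrites `__buffer` inside the class body to `_Writer__buffer`, so the subclass's own `_buffer` flag is left alone. The io.StringIO target and the staticmethod wrapper are additions for the sake of a runnable sketch.

```python
import io


class Writer:
    __buffer = ""  # stored as _Writer__buffer, so subclass names cannot clash
    _write = staticmethod(lambda message: None)

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        self.__buffer += self.format(message)
        if len(self.__buffer) > 10:
            self._write(self.__buffer)
            self.__buffer = ""

    def format(self, message):
        raise NotImplementedError("format")


class BufferWriter(Writer):
    _buffer = False  # unchanged subclass code keeps working

    def format(self, message):
        if self._buffer:
            return "Buffer: " + message
        return "Message: " + message

    def switch_buffer(self):
        self._buffer = not self._buffer


output = io.StringIO()
writer = BufferWriter()
writer.set_output(output)
writer.write("hello")
writer.switch_buffer()
writer.write("hello")
print(output.getvalue())  # Message: helloBuffer: hello
```

This only protects the one attribute that was mangled. Every future private addition needs the same treatment, which is exactly the level of care that rarely happens.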
So what's the alternative?
    from zope import interface

    class IFormatter(interface.Interface):

        def format(message):
            """show stuff"""
We define an abstract interface. This interface has only one method: format.
    import attr

    @attr.s
    class Writer:
        _buffer = ""
        # staticmethod keeps the lambda from being bound as a method.
        _write = staticmethod(lambda message: None)
        _formatter = attr.ib()

        def set_output(self, output):
            self._write = output.write

        def write(self, message):
            self._buffer += self._formatter.format(message)
            if len(self._buffer) > 10:
                self._write(self._buffer)
                self._buffer = ""
We use the attrs library to define our main functionality: a class that wraps other objects, which we expect to provide IFormatter.
We can verify this automatically by having the _formatter line say instead (with verify imported via "from zope.interface import verify"):

    _formatter = attr.ib(
        validator=lambda instance, attribute, value:
            verify.verifyObject(IFormatter, value))
Note that this separates the concerns: the "fake method" has moved to a "fake class" (an interface).
    @interface.implementer(IFormatter)
    class BufferFormatter:
        _buffer = False

        def format(self, message):
            if self._buffer:
                return 'All Channels: ' + message
            else:
                return 'Limited Channels: ' + message

        def switch_buffer(self):
            self._buffer = not self._buffer
Note that now, if we only have the Writer object, there is no way to switch prefixes. Correctly switching prefixes means keeping access to the original object.
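To make the wiring concrete, here is a sketch that uses only the standard library, so it runs without attrs or zope.interface installed. A plain `__init__` stands in for `attr.ib()`, no interface is declared, and io.StringIO is an assumed stand-in for a real file.

```python
import io


class BufferFormatter:
    _buffer = False  # the formatter's own flag

    def format(self, message):
        if self._buffer:
            return "All Channels: " + message
        return "Limited Channels: " + message

    def switch_buffer(self):
        self._buffer = not self._buffer


class Writer:
    def __init__(self, formatter):
        self._formatter = formatter
        self._buffer = ""  # the writer's own buffer, on a different object
        self._write = lambda message: None

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        self._buffer += self._formatter.format(message)
        if len(self._buffer) > 10:
            self._write(self._buffer)
            self._buffer = ""


# Keep a reference to the formatter: the writer does not expose switch_buffer().
formatter = BufferFormatter()
writer = Writer(formatter)
output = io.StringIO()
writer.set_output(output)

writer.write("hello")
formatter.switch_buffer()
writer.write("hello")
print(output.getvalue())  # Limited Channels: helloAll Channels: hello
```

Both objects have an attribute called `_buffer`, and it does not matter: they live in separate namespaces, so the library can add private state to Writer freely.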
If there is a need to "call back" to the original methods, the original object can be passed in to the wrapped object. One advantage is that, being a distinct object, it is obvious one should only call into public methods and only access public variables.
Passing ourselves to a method is, in general, not an ideal practice. What we really should do is pass specific methods or variables directly into the method. But this is funny: when using inheritance, we always effectively pass ourselves to every method. So even this refactoring is a net improvement. When the biggest criticism of a refactoring is "this could now be improved even more", it usually means it is a good idea.
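As a sketch of that further step, the writer below receives just the callables it needs rather than a whole object. The names CallableWriter, PrefixFormatter, format_message and write_output are hypothetical, chosen for this illustration only.

```python
import io


class PrefixFormatter:
    def __init__(self):
        self._buffer = False

    def format(self, message):
        if self._buffer:
            return "All Channels: " + message
        return "Limited Channels: " + message

    def switch_buffer(self):
        self._buffer = not self._buffer


class CallableWriter:
    """Buffers formatted messages; depends only on two callables."""

    def __init__(self, format_message, write_output):
        self._format_message = format_message
        self._write_output = write_output
        self._buffer = ""

    def write(self, message):
        self._buffer += self._format_message(message)
        if len(self._buffer) > 10:
            self._write_output(self._buffer)
            self._buffer = ""


# A bound method works as the formatting callable...
formatter = PrefixFormatter()
first_output = io.StringIO()
writer = CallableWriter(formatter.format, first_output.write)
writer.write("hello")

# ...and so does any plain function of one string.
second_output = io.StringIO()
shouting = CallableWriter(str.upper, second_output.write)
shouting.write("hello, world")

print(first_output.getvalue())   # Limited Channels: hello
print(second_output.getvalue())  # HELLO, WORLD
```

The writer can no longer reach anything on the formatter except the one method it was handed, which makes the contract between the two pieces about as small as it can be.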
- Thanks to Tom Goren for his feedback -- the original version was more aggressive.
- Thanks to Glyph Lefkowitz for pushing me to make the example better.
- Thanks to Augie Fackler and Nathaniel Manista for much of the inspiration.
Avoiding Private Methods
MyClass._dangerous(self) is a private method.
We could have implemented the same functionality without a private
method as follows:
- Define a class
InnerClass with the same
InnerClass.dangerous(self) with the same logic of
MyClass into a wrapper class over
PyCon US 2018 Twisted Birds of Feather Open Space Summary
PyCon 2018 US Docker Birds of Feather Open Space Summary
We started out the conversation with talking about writing good Dockerfiles. There is no list of "best practices" yet. Hynek reiterated for us "ship applications, not build environments". Moshe summarized it as "don't put gcc in the deployed image."
We discussed a little bit what we are trying to achieve …read more
Announcement: My book, from python import better, has been published. This post is based on one of the chapters from it.
When Python started out, one of the oft-touted benefits was "batteries included!". Gone were the days of searching for which XML parsing library was the best -- just use the …read more
Web Development for the 21st Century
(Thanks to Glyph Lefkowitz for some of the inspiration for this post, and to Mahmoud Hashemi for helpful comments and suggestions. All mistakes and issues that remain are mine alone.)
The Python REPL has always been touted as one of Python's greatest strengths. With Jupyter, Jupyter Lab in its latest …read more
(Thanks to Paul Ganssle for his suggestions and improvements. All mistakes that remain are mine.)
When exposing a Python program as a command-line application,
there are several ways to get the Python code to run.
The oldest way,
and the one people usually learn in tutorials,
is to run
Random Bites of Pi(e)
In today's edition of Pi day post, we will imagine we have a pie. (If you lack imagination, go out and get a pie.) (Even if you do not lack imagination, go out and get a pie.)
As is traditional, we got a round pie. Since pies are important, we …read more
The Python Toolbox
I have written before about Python tooling. However, as all software, things have changed -- and I wanted to write a new post, with my current understanding of best practices.
As of now,
pytest has achieved official victory.
Unless there are overwhelming reasons to use something else,
strongly consider using …
Jupyter for SRE
Jupyter is a tool that came out of the data science community. In science, being able to replicate experiments is of the utmost importance -- so a tool where you can "show your work" is helpful. However, being able to show your work -- have colleagues validate what you have done, repeat …read more