Imagine that you are a chef in a pizzeria. A customer has ordered a Hawaiian pizza. You put tomato sauce, mozzarella cheese, and ham onto thin pizza dough, bake everything in a professional oven and serve it with a smile. You hear the customer's voice saying:
Thanks! I love pineapple on the pizza!
And then comes the moment of realization.
You forgot about the pineapple.
What would you say to some automation to prevent this tragic mistake?
And what would you say if I told you that you often make such a mistake when creating subclasses?
Let's create a base class, Reptile. Every subclass of Reptile will have to either provide the skin_color attribute or override the get_skin_color method.
class Reptile:
    skin_color = None

    def get_skin_color(self):
        assert self.skin_color is not None, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % self.__class__.__name__
        )
        return self.skin_color
This pattern is well known to anyone who has worked with the Django REST framework for at least some time. We assume that the public interface is the get_skin_color method, which checks whether the class attribute has been set. It's cool and safe, isn't it?
Let's create three classes that inherit from Reptile:
import random

class Crocodile(Reptile):
    skin_color = "green"

class Chameleon(Reptile):
    def get_skin_color(self):
        return random.choice(["green", "brown", "purple"])

class Lizard(Reptile):
    ...
Instantiate them:
crocodile = Crocodile()
chameleon = Chameleon()
lizard = Lizard()
A skilled eye has already caught that we made a mistake in one of the previous steps. The Python interpreter, however, reported no problems. At least not until the next step, when we try to read the skin color of each reptile:
In [1]: crocodile.get_skin_color()
Out[1]: 'green'
In [2]: chameleon.get_skin_color()
Out[2]: 'green'
In [3]: chameleon.get_skin_color()
Out[3]: 'purple'
In [4]: lizard.get_skin_color()
---------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[4], line 1
----> 1 lizard.get_skin_color()
File ~/.virtualenvs/tmp-243ff0e324ab521/repts.py:5, in Reptile.get_skin_color(self)
4 def get_skin_color(self):
----> 5 assert self.skin_color is not None, (
6 "'%s' should either include a `skin_color` attribute, "
7 "or override the `get_skin_color()` method."
8 % self.__class__.__name__
9 )
11 return self.skin_color
AssertionError: 'Lizard' should either include a `skin_color` attribute, or override the `get_skin_color()` method.
We find out that the Lizard class was defined incorrectly only when we call the get_skin_color method. In a real application, this will probably happen when the user wants to perform a specific action. This mistake could also have been detected with unit tests. Because you write unit tests, right?
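Such a unit test could be sketched roughly as follows. This is only an illustration: the helper name find_invalid_subclasses is hypothetical, and it assumes that every subclass can be instantiated without arguments.

```python
import random


class Reptile:
    skin_color = None

    def get_skin_color(self):
        assert self.skin_color is not None, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % self.__class__.__name__
        )
        return self.skin_color


class Crocodile(Reptile):
    skin_color = "green"


class Chameleon(Reptile):
    def get_skin_color(self):
        return random.choice(["green", "brown", "purple"])


class Lizard(Reptile):
    ...


def find_invalid_subclasses(base):
    """Return the names of direct subclasses whose get_skin_color() fails.

    Hypothetical helper: it assumes each subclass has a no-argument
    constructor and only looks at direct subclasses of `base`.
    """
    invalid = []
    for subclass in base.__subclasses__():
        try:
            subclass().get_skin_color()
        except AssertionError:
            invalid.append(subclass.__name__)
    return invalid


print(find_invalid_subclasses(Reptile))  # ['Lizard']
```

A test case would then assert that the returned list is empty. The catch is that somebody has to remember to write and run that test, and it only covers subclasses that were imported before the check ran.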
Wouldn't it be better to be informed about the mistake earlier? The correctness of the Lizard class does not depend on anything that changes at runtime. So why not report it to the developer earlier, for example, at application startup?
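PEP 487, implemented in Python 3.6, introduced a new way of validating and customizing subclasses: the __init_subclass__ method. Python calls it on the base class every time a subclass is defined, passing the newly created subclass as cls.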
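To make the timing of the hook concrete, here is a minimal sketch. The Base, Child and GrandChild classes and the seen list are hypothetical names used only for this illustration.

```python
class Base:
    # Hypothetical registry, used only to make the hook's timing visible.
    seen = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once for every subclass, as soon as its `class` statement
        # has been executed. No instance has to be created.
        Base.seen.append(cls.__name__)


class Child(Base):
    pass


class GrandChild(Child):
    pass


print(Base.seen)  # ['Child', 'GrandChild']
```

Note that the hook fires for indirect subclasses too, and that nothing was instantiated: defining the classes was enough.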
Let's see it in action:
class Reptile:
    skin_color = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        attribute_overriden = cls.skin_color is not None
        method_overriden = (
            cls.get_skin_color != Reptile.get_skin_color
        )
        assert attribute_overriden or method_overriden, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % cls.__name__
        )

    def get_skin_color(self):
        return self.skin_color
The example above performs the same validation as before, but in the __init_subclass__ method, not in the body of the get_skin_color method.
Let's re-create the same classes that inherit from Reptile:
In [1]: import random

In [2]: class Crocodile(Reptile):
   ...:     skin_color = "green"
   ...:

In [3]: class Chameleon(Reptile):
   ...:     def get_skin_color(self):
   ...:         return random.choice(["green", "brown", "purple"])
   ...:

In [4]: class Lizard(Reptile):
   ...:     ...
   ...:
---------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[4], line 1
----> 1 class Lizard(Reptile):
2 ...
File ~/.virtualenvs/tmp-243ff0e324ab521/repts.py:10, in Reptile.__init_subclass__(cls, **kwargs)
7 attribute_overriden = cls.skin_color is not None
8 method_overriden = cls.get_skin_color != Reptile.get_skin_color
---> 10 assert attribute_overriden or method_overriden, (
11 "'%s' should either include a `skin_color` attribute, "
12 "or override the `get_skin_color()` method."
13 % cls.__name__
14 )
AssertionError: 'Lizard' should either include a `skin_color` attribute, or override the `get_skin_color()` method.
Excellent! Thanks to the validation in __init_subclass__, we were not allowed to create a subclass that does not meet the assumptions of the base class. The error message tells us that the Lizard class lacks skin color information.
The mistake has been caught early and can be corrected right away.
When you write a validator, it is always worth considering who will benefit from it. Input validation will almost always be for the end user, so it must be done at runtime. Subclass validation, which checks whether you have written the code correctly, is for the developer. It's a good idea to report such mistakes as soon as the class is defined.
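One caveat about the assert-based check: Python skips assert statements when it runs with the -O flag, so the validation would silently disappear in that mode. A minimal sketch, assuming you want the check to survive optimized mode, raises an exception explicitly instead. Choosing TypeError here is my assumption, not something the examples above prescribe.

```python
class Reptile:
    skin_color = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        attribute_overridden = cls.skin_color is not None
        method_overridden = cls.get_skin_color != Reptile.get_skin_color
        # An explicit raise is not removed by `python -O`, unlike `assert`.
        if not (attribute_overridden or method_overridden):
            raise TypeError(
                "'%s' should either include a `skin_color` attribute, "
                "or override the `get_skin_color()` method."
                % cls.__name__
            )

    def get_skin_color(self):
        return self.skin_color


class Crocodile(Reptile):
    skin_color = "green"


message = None
try:
    # The class statement itself fails, so `Lizard` is never bound.
    class Lizard(Reptile):
        ...
except TypeError as error:
    message = str(error)

print(message)
```

The behavior is otherwise the same as in the assert version: valid subclasses are created normally, and the invalid one is rejected at definition time.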
If you have additional questions related to this topic, feel free to contact me at lukasz.chojnacki [at] deployed.pl.
Hopefully, your base classes will have better validation from now on!