Better base classes with PEP 487

Łukasz Chojnacki | 2023-08-11

Imagine that you are a chef in a pizzeria. A customer has ordered a Hawaiian pizza. You put tomato sauce, mozzarella cheese, and ham onto thin pizza dough, bake everything in a professional oven and serve it with a smile. You hear the customer's voice saying:

Thanks! I love pineapple on the pizza!

And then comes the moment of realization.

You forgot about the pineapple.

Forgotten pineapple

What would you say to some automation to prevent this tragic mistake?

And what would you say if I told you that you often make such a mistake when creating subclasses?

Subclass validation at runtime

Let's create a base class, Reptile. Every subclass of Reptile must either set the skin_color attribute or override the get_skin_color method.

class Reptile:
    skin_color = None

    def get_skin_color(self):
        assert self.skin_color is not None, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % self.__class__.__name__
        )

        return self.skin_color

Anyone who has worked with Django REST framework for a while will recognize this pattern. The public interface is the get_skin_color method, which checks that the class attribute has been set before returning it. It's cool and safe, isn't it?

Let's create three classes that will inherit from Reptile:

import random


class Crocodile(Reptile):
    skin_color = "green"


class Chameleon(Reptile):
    def get_skin_color(self):
        return random.choice(["green", "brown", "purple"])


class Lizard(Reptile):
    ...

Instantiate them:

crocodile = Crocodile()
chameleon = Chameleon()
lizard = Lizard()

A skilled eye will already have spotted the mistake we made in one of the previous steps. The Python interpreter, however, has reported no problems. It stays silent until the next step, when we try to read the skin color of each reptile:

In [1]: crocodile.get_skin_color()
Out[1]: 'green'

In [2]: chameleon.get_skin_color()
Out[2]: 'green'

In [3]: chameleon.get_skin_color()
Out[3]: 'purple'

In [4]: lizard.get_skin_color()
---------------------------------------------------------------
AssertionError                Traceback (most recent call last)
Cell In[4], line 1
----> 1 lizard.get_skin_color()

File ~/.virtualenvs/tmp-243ff0e324ab521/repts.py:5, in Reptile.get_skin_color(self)
      4 def get_skin_color(self):
----> 5     assert self.skin_color is not None, (
      6         "'%s' should either include a `skin_color` attribute, "
      7         "or override the `get_skin_color()` method."
      8         % self.__class__.__name__
      9     )
     11     return self.skin_color

AssertionError: 'Lizard' should either include a `skin_color` attribute, or override the `get_skin_color()` method.

We only learn that the Lizard class was defined incorrectly when we call the get_skin_color method. In a real application, that will probably be the moment a user tries to perform a specific action. The mistake could also have been caught by unit tests. Because you write unit tests, right?

The new feature has been finished. With unit tests, right?
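For comparison, here is a minimal sketch of what such a unit-test check might look like. The helper name find_broken_subclasses is hypothetical (it does not come from any library), and the sketch assumes that every subclass can be instantiated without arguments.

```python
import random


class Reptile:
    skin_color = None

    def get_skin_color(self):
        assert self.skin_color is not None, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % self.__class__.__name__
        )
        return self.skin_color


class Crocodile(Reptile):
    skin_color = "green"


class Chameleon(Reptile):
    def get_skin_color(self):
        return random.choice(["green", "brown", "purple"])


class Lizard(Reptile):
    ...


def find_broken_subclasses(base):
    """Return the names of direct subclasses whose get_skin_color() fails.

    Hypothetical helper: only direct subclasses are inspected, and each
    one is assumed to be instantiable without arguments.
    """
    broken = []
    for subclass in base.__subclasses__():
        try:
            subclass().get_skin_color()
        except AssertionError:
            broken.append(subclass.__name__)
    return broken


print(find_broken_subclasses(Reptile))  # ['Lizard']
```

A test would then assert that the helper returns an empty list. It works, but only if someone remembers to write the test and keeps running it.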

Wouldn't it be better to learn about the mistake earlier? The correctness of the Lizard class does not depend on anything that changes at runtime. So why not report it to the developer sooner, for example at application startup?

Subclass validation at startup

PEP 487, implemented in Python 3.6, introduced a new way of validating and customizing subclasses: the __init_subclass__ method. You define it on the base class, and Python calls it every time a subclass is created. Let's see it in action:

class Reptile:
    skin_color = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)

        attribute_overriden = cls.skin_color is not None
        method_overriden = (
            cls.get_skin_color != Reptile.get_skin_color
        )

        assert attribute_overriden or method_overriden, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method."
            % cls.__name__
        )

    def get_skin_color(self):
        return self.skin_color

The example above performs the same validation as before, but in the __init_subclass__ method, not in the body of the get_skin_color method. Let's re-create the same classes that inherit from Reptile:

In [1]: import random

In [2]: class Crocodile(Reptile):
   ...:     skin_color = "green"
   ...:

In [3]: class Chameleon(Reptile):
   ...:     def get_skin_color(self):
   ...:         return random.choice(["green", "brown", "purple"])
   ...:

In [4]: class Lizard(Reptile):
   ...:     ...
   ...:
---------------------------------------------------------------
AssertionError                Traceback (most recent call last)
Cell In[4], line 1
----> 1 class Lizard(Reptile):
      2     ...

File ~/.virtualenvs/tmp-243ff0e324ab521/repts.py:10, in Reptile.__init_subclass__(cls, **kwargs)
      7 attribute_overriden = cls.skin_color is not None
      8 method_overriden = cls.get_skin_color != Reptile.get_skin_color
---> 10 assert attribute_overriden or method_overriden, (
     11     "'%s' should either include a `skin_color` attribute, "
     12     "or override the `get_skin_color()` method."
     13     % cls.__name__
     14 )

AssertionError: 'Lizard' should either include a `skin_color` attribute, or override the `get_skin_color()` method.

Excellent! Thanks to the validation in __init_subclass__, we were not allowed to create a subclass that does not meet the assumptions of the base class. We can see that the Lizard lacks skin color information.

Lizard without skin color

The mistake has been caught early and can be corrected right away.
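One caveat worth knowing: assert statements are removed when Python runs with the -O flag, so an assert-based check silently disappears in optimized mode. Below is a hedged variation of the same idea that raises an exception explicitly instead. Choosing TypeError here is my assumption, not something PEP 487 prescribes, and the try/except around the class definition is there only to show that the error arrives at definition time.

```python
class Reptile:
    skin_color = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)

        attribute_overridden = cls.skin_color is not None
        method_overridden = cls.get_skin_color is not Reptile.get_skin_color

        # Unlike `assert`, an explicit raise is not removed by `python -O`.
        if not (attribute_overridden or method_overridden):
            raise TypeError(
                f"'{cls.__name__}' should either include a `skin_color` "
                "attribute, or override the `get_skin_color()` method."
            )

    def get_skin_color(self):
        return self.skin_color


class Crocodile(Reptile):
    skin_color = "green"


# The class statement itself fails; no instance is ever created.
try:
    class Lizard(Reptile):
        ...
except TypeError as error:
    message = str(error)
    print(message)
```

Whether to use assert or an explicit exception is a matter of taste and of how the application is deployed. The point where the check runs is the same in both cases.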

Final thoughts

When you write a validator, it is always worth considering who will benefit from it. Input validation is almost always for the end user, so it must happen while the application is handling requests. Subclass validation checks whether you have written the code correctly, so it is for the developer. It's a good idea to report such mistakes as soon as the class is defined, which usually means when the module is imported.
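Validating at class definition has one practical consequence: an intermediate base class that deliberately leaves the skin color undefined would be rejected too. Here is a minimal sketch of one possible opt-out. It relies on the fact that keyword arguments in a class statement are passed to __init_subclass__, and the abstract keyword is a hypothetical name chosen for this example.

```python
class Reptile:
    skin_color = None

    def __init_subclass__(cls, abstract=False, **kwargs):
        super().__init_subclass__(**kwargs)

        if abstract:
            # Intermediate base class: skip the check here.
            # Its own subclasses are still validated as usual.
            return

        attribute_overridden = cls.skin_color is not None
        method_overridden = cls.get_skin_color is not Reptile.get_skin_color
        assert attribute_overridden or method_overridden, (
            "'%s' should either include a `skin_color` attribute, "
            "or override the `get_skin_color()` method." % cls.__name__
        )

    def get_skin_color(self):
        return self.skin_color


# `abstract=True` is the hypothetical opt-out keyword from this sketch.
class Snake(Reptile, abstract=True):
    def hiss(self):
        return "hiss"


class GrassSnake(Snake):
    skin_color = "olive"


print(GrassSnake().get_skin_color())  # olive

try:
    class Adder(Snake):  # concrete subclass, but no skin color
        ...
except AssertionError as error:
    print(error)
```

The keyword is not inherited, so every concrete subclass of Snake goes through the check again.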

If you have additional questions related to this topic, feel free to contact me at lukasz.chojnacki [at] deployed.pl.

Hopefully, your base classes will have better validation from now on!