6. Structural Subtyping
Meet generics first?
It can be useful to learn first about generics, especially the section describing Variance types.
6.1. Nominal Subtyping
Subtyping?
Subtyping is the process of creating a more specific type from a broader one.
In software engineering, we have two principal ways of subtyping:
nominal (i.e., by name),
structural (i.e., by shape).
In Python, nominal subtyping is far more popular and widely used. It relies on direct inheritance/generalization relationships. We won’t spend much time on this kind of subtyping since it’s well-known and, if you’re an experienced Python user, you’ve surely encountered and used it.
1  class User:
2      pass
3
4  class AuthorizedUser(User):
5      pass
| Line | Code | Explanation |
|---|---|---|
| 1 | class User | Here, we create a base class (superclass/broader type). |
| 4 | class AuthorizedUser(User): | Here, we explicitly state (in parentheses) that AuthorizedUser is a subtype of User. |
Multiple inheritance/MRO
Python supports multiple inheritance. In that case, methods are searched according to a predefined order. See the example below:
class A:
    pass

class B:
    pass

class C(A, B):
    ...

print(C.__mro__)
This results in a tuple like:
(<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>)
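showing the order of method lookup. In this simple hierarchy, the superclasses are searched from left to right, in the order they are listed in the class definition; in general, Python computes the order with the C3 linearization algorithm. See __mro__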
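To see what that lookup order means in practice, here is a minimal sketch. The hello method and the class D are hypothetical additions, not part of the example above; they only exist to make the effect of base-class order visible.

```python
class A:
    def hello(self) -> str:
        return "hello from A"


class B:
    def hello(self) -> str:
        return "hello from B"


class C(A, B):
    # C defines no hello(), so lookup walks C.__mro__ in order: A first.
    pass


class D(B, A):
    # Same bases in the opposite order: B is searched first.
    pass


mro_names = [cls.__name__ for cls in C.__mro__]
print(mro_names)      # ['C', 'A', 'B', 'object']
print(C().hello())    # hello from A
print(D().hello())    # hello from B
```

Swapping the order of the base classes is enough to change which implementation wins, which is why the order in the class statement matters.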
Important!
Subtyping is an asymmetrical relationship. This means AuthorizedUser is a User (it’s a specific type of user), but the reverse is not true: a User is not necessarily an AuthorizedUser. Formally, if S is a subtype of T (written S <: T), you can use S wherever T is expected, but not vice versa.
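The asymmetry can be sketched with two small functions. The name and token attributes and the greet and read_token functions are assumptions made up for this illustration; only the two class names come from the example above.

```python
class User:
    def __init__(self, name: str) -> None:
        self.name = name


class AuthorizedUser(User):
    def __init__(self, name: str, token: str) -> None:
        super().__init__(name)
        self.token = token


def greet(user: User) -> str:
    # Accepts a User or any subtype of User.
    return f"Hello, {user.name}"


def read_token(user: AuthorizedUser) -> str:
    # Requires the narrower type, because it relies on `token`.
    return user.token


admin = AuthorizedUser("alice", "secret")
greeting = greet(admin)      # OK: AuthorizedUser <: User
token = read_token(admin)    # OK
# read_token(User("bob"))    # rejected by a type checker; AttributeError at runtime
```

Passing the subtype where the supertype is expected is always safe; the commented-out call shows the direction that is not.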
6.2. Protocol Definition
Since version 3.8, Python officially supports structural subtyping through protocols, which were introduced by PEP 544 [Levkivskyi et al., 2017]. Let’s start with a definition:
Structural Subtyping
Structural subtyping is based only on shape: a class is a subtype if it follows the same interface, that is, it implements methods with compatible signatures and (optionally) the same attributes. Unlike nominal subtyping, no explicit inheritance relationship is required.
You might ask, why do we need it at all? Sometimes you don’t need to, or simply can’t, inherit directly from a broader type. This happens when:
you work with legacy classes or third-party library classes you can’t modify,
you want to decouple the interface from implementation (by creating protocols independent of class hierarchies),
you need static type checking for duck-typed code,
you create callbacks or plugins without needing to import or inherit from a superclass.
Let’s move to an example:
class DataConnector:

    def connect(self, ip: str, port: int) -> None:
        pass

    def retrieve(self) -> str:
        return ""

class WebPageConnector:

    def connect(self, ip: str, port: int) -> None:
        return None

    def retrieve(self) -> str:
        return "<html></html>"

class DatabaseConnector:

    def connect(self, ip: str, port: int) -> None:
        return None

    def retrieve(self) -> str:
        return "1, 2, 3"

def process_data(conn: DataConnector):
    conn.connect("127.0.0.1", 8888)
    data = conn.retrieve()

process_data(DatabaseConnector())
In our example, we’ve created a class DataConnector and defined a pair of methods to be implemented. The classes WebPageConnector and DatabaseConnector follow that interface, but there’s an issue here:
Static type check
Though the code works and no error will be raised at runtime, static type checking (which is the pivotal aspect of structural subtyping) will fail in this case. Running the mypy tool to check our code, you’ll see:
error: Argument 1 to "process_data" has incompatible type "DatabaseConnector"; expected "DataConnector" [arg-type]
Found 1 error in 1 file (checked 1 source file)
This happens because static type checkers don’t recognize DataConnector as a structural supertype of DatabaseConnector: it is an ordinary class, so only nominal subtyping applies.
To solve this, PEP 544 introduced the special Protocol base class, which marks a class as an interface for structural subtyping. Look at the refined implementation of the code above.
 1  from typing import Protocol
 2
 3  class DataConnector(Protocol):
 4
 5      def connect(self, ip: str, port: int) -> None:
 6          ...
 7
 8      def retrieve(self) -> str:
 9          ...
10
11  class WebPageConnector:
12
13      def connect(self, ip: str, port: int) -> None:
14          return None
15
16      def retrieve(self) -> str:
17          return "<html></html>"
18
19  class DatabaseConnector:
20
21      def connect(self, ip: str, port: int) -> None:
22          return None
23
24      def retrieve(self) -> str:
25          return "1, 2, 3"
26
27  def process_data(conn: DataConnector):
28      conn.connect("127.0.0.1", 8888)
29      data = conn.retrieve()
30
31  process_data(DatabaseConnector())
Note that our protocol must inherit from typing.Protocol and, by convention, interface methods’ bodies contain just the ellipsis (...) symbol. Let’s explore the emphasized lines more deeply:
| Line | Code | Explanation |
|---|---|---|
| 1 | from typing import Protocol | We need to import the Protocol class from the typing module. |
| 3 | class DataConnector(Protocol): | We use nominal subtyping (I know this can be confusing) to indicate our class will be used for structural subtyping. |
| 5-6 | def connect(...) -> None: ... | By convention, protocol methods should contain no body except the ellipsis literal (...). |
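Protocols also work for classes you cannot modify, which was one of the motivations listed earlier. The sketch below uses the standard library's io.StringIO as a stand-in for such a class; the Readable protocol and the first_chars function are hypothetical names invented for this illustration.

```python
import io
from typing import Protocol


class Readable(Protocol):
    # Hypothetical protocol: anything with a compatible read() method.
    def read(self, size: int = -1, /) -> str:
        ...


def first_chars(source: Readable, count: int) -> str:
    # Works with any object that has the right shape; no inheritance needed.
    return source.read(count)


# io.StringIO knows nothing about Readable, yet it has a matching read().
buffer = io.StringIO("structural subtyping")
result = first_chars(buffer, 10)
print(result)  # structural
```

Because the match is purely structural, the protocol can be defined in your own code, long after the class it describes was written.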
Use type hints
Since structural subtyping relies on method signatures, don’t forget to use type hints!
Signature matching requirements
To follow the structural subtyping mechanism, the entire signature must be compatible, including:
method name: must match exactly.
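parameters: the number, order, and kinds (e.g., positional-only, keyword-only) of parameters must be compatible. The names of positional-only parameters do not need to match; for parameters that can be passed by keyword, stricter type checkers may also require matching names.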
parameter types: must be compatible (contravariant).
return type: must be compatible (covariant).
Note: Parameter types are contravariant (can accept broader types/supertypes)[1] and return types are covariant (can accept narrower types/subtypes) in subtyping relationships.
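The parameter and return type rules can be sketched as follows. All class names here (Animal, Dog, Puppy, DogHandler, and the two handlers) are hypothetical and chosen only to give the type hierarchy three levels.

```python
from typing import Protocol


class Animal:
    pass


class Dog(Animal):
    pass


class Puppy(Dog):
    pass


class DogHandler(Protocol):
    # The protocol promises: accepts any Dog, returns some Animal.
    def handle(self, pet: Dog) -> Animal:
        ...


class BroadHandler:
    # Parameter is broader (Animal), return is narrower (Dog): compatible.
    def handle(self, pet: Animal) -> Dog:
        return Dog()


class NarrowHandler:
    # Parameter is narrower (Puppy): callers may pass any Dog, so a type
    # checker should reject this class where DogHandler is expected.
    def handle(self, pet: Puppy) -> Animal:
        return pet


def run(handler: DogHandler) -> Animal:
    return handler.handle(Dog())


result = run(BroadHandler())    # accepted by a static type checker
# run(NarrowHandler())          # expected to be flagged by a static type checker
```

Note that Python itself does not enforce any of this at runtime; the difference between the two handlers is visible only to a static type checker.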
Check if class implements a protocol
By default, you cannot use isinstance or issubclass checks to verify that a class implements a given protocol. Checks like:
isinstance(DatabaseConnector(), DataConnector)
will raise a TypeError unless the protocol is decorated with @runtime_checkable from the typing module. However, that’s not recommended, as it might slow down your code and performs only shallow checks: it verifies that the required members exist, not that their signatures or types match. It’s better to use the built-in hasattr function to check whether an object has the requested method, or to rely on static type checkers.
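The behavior described above can be sketched like this. PlainConnector, CheckableConnector, and BrokenConnector are hypothetical names used only for this illustration; they reduce the protocol to a single method to keep the example short.

```python
from typing import Protocol, runtime_checkable


class PlainConnector(Protocol):
    def retrieve(self) -> str:
        ...


@runtime_checkable
class CheckableConnector(Protocol):
    def retrieve(self) -> str:
        ...


class DatabaseConnector:
    def retrieve(self) -> str:
        return "1, 2, 3"


class BrokenConnector:
    # Same method name, but an incompatible signature and return type.
    def retrieve(self, table: str) -> int:
        return 0


try:
    isinstance(DatabaseConnector(), PlainConnector)
    plain_check_raised = False
except TypeError:
    # Protocols without @runtime_checkable cannot be used with isinstance.
    plain_check_raised = True

good_match = isinstance(DatabaseConnector(), CheckableConnector)
# Shallow check: only the presence of `retrieve` is verified.
shallow_match = isinstance(BrokenConnector(), CheckableConnector)
has_method = hasattr(DatabaseConnector(), "retrieve")

print(plain_check_raised, good_match, shallow_match, has_method)
```

The BrokenConnector case is the reason for caution: a runtime check accepts it, while a static type checker would not.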
See also decorators
To learn about decorators, see the Decorators chapter!
6.3. Protocol Attributes
Not only methods but also attributes can be indicators of structural subtyping. Let’s slightly change our protocol so it has two attributes and the connect method becomes argument-free:
from typing import Protocol

class DataConnector(Protocol):
    ip: str
    port: int

    def connect(self) -> None:
        ...

    def retrieve(self) -> str:
        ...
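Now, to satisfy the protocol, we need to define the attributes at the class level, initialize them in __init__, or expose them as properties (@property; since the protocol declares the attributes as writable, such a property also needs a setter). The example below uses the first two options: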
class WebPageConnector:
    ip: str = "127.0.0.1"
    port: int = 8888

    def connect(self) -> None:
        return None

    def retrieve(self) -> str:
        return "<html></html>"

class DatabaseConnector:

    def __init__(self) -> None:
        self.ip = "127.0.0.1"
        self.port = 8889

    def connect(self) -> None:
        return None

    def retrieve(self) -> str:
        return "1, 2, 3"

def process_data(conn: DataConnector):
    conn.connect()
    data = conn.retrieve()

process_data(WebPageConnector())
process_data(DatabaseConnector())
Attribute variance
Attributes in protocols are invariant by default (see Variance types in the Generics chapter to read more about variance types), meaning an exact type match is required. This is because mutable attributes can be both read from and written to, which requires invariance for type safety.
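A small sketch shows why writable attributes need invariance. HasPort, FlagPort, and reset_port are hypothetical names for this illustration; bool is used because it is a genuine subtype of int in Python.

```python
from typing import Protocol


class HasPort(Protocol):
    port: int


class FlagPort:
    # Declared with the narrower type bool (a subtype of int).
    port: bool = True


def reset_port(conn: HasPort) -> None:
    # Perfectly valid for the protocol: port is declared as an int.
    conn.port = 8888


flagged = FlagPort()
reset_port(flagged)    # a static type checker should reject this call
print(flagged.port)    # 8888, which is no longer a bool
```

If the narrower attribute type were accepted, the write inside reset_port would silently break the promise FlagPort makes about its own attribute.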
However, read-only attributes can be covariant. Use @property to make a read-only attribute:
from typing import Protocol

class DataConnector(Protocol):
    @property
    def some_attribute(self) -> int:
        ...
Then any class following the DataConnector protocol should have a some_attribute property of type int or any subtype of int. The property ensures the attribute is read-only, allowing covariance without sacrificing type safety.
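As a minimal sketch of that covariance, the hypothetical FlagConnector class below exposes some_attribute as a bool, which is a subtype of int; read_attribute is likewise an illustrative helper, not part of the original example.

```python
from typing import Protocol


class DataConnector(Protocol):
    @property
    def some_attribute(self) -> int:
        ...


class FlagConnector:
    # bool is a subtype of int, so this narrower property type is
    # acceptable for a read-only protocol member.
    @property
    def some_attribute(self) -> bool:
        return True


def read_attribute(conn: DataConnector) -> int:
    # Only reads the attribute, so a narrower type can never cause harm.
    return conn.some_attribute


value = read_attribute(FlagConnector())
print(value)  # True
```

Since nothing can be written through the protocol, every value read from some_attribute is still a valid int.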