Data Model
With the Python Data Model, user-defined types can mimic the behavior of built-in types seamlessly
. This can be achieved without relying on inheritance; instead, it follows the principle of duck typing
: you only need to implement the necessary methods for your objects to behave as intended.
Think of the data model as a description of Python as a framework: It formalizes interfaces for our objects to behave like standard Python data structures
such as sequences, iterators, coroutines, classes, context managers, and so on. The Python interpreter invokes these special methods to perform basic object operations, often triggered by special syntax
. These special method names are always written with leading and trailing double underscores, and they are also referred to as "dunder" methods
.
Protocols and Duck Typing
With duck typing, you don't need to inherit from any special class to create fully functional behaviors
like a sequence type in Python. You just need to implement the methods that fulfill the sequence protocol
. In the context of object-oriented programming, a protocol is an informal interface defined only in documentation, not in code.
For instance, the sequence protocol in Python requires only the __len__ and __getitem__ methods.
Any class that implements these methods with the standard signature and semantics can be used wherever a sequence is expected
. Whether the class is a subclass of something else is irrelevant; all that matters is that it provides the necessary methods.
Since protocols are informal and unenforced
, you can often implement just part of a protocol
, especially if you know the specific context where a class will be used
. For example, to support iteration, you only need to provide the __getitem__
method; there's no need to include __len__.
Sequence Protocol
To create a class that can behave like a sequence in Python, you only need to implement two methods: __len__
and __getitem__
.
__len__
When a class implements __len__
, you can use the len()
function with instances of that class.
__getitem__
When a class implements __getitem__
, you can access items in it just like you would with regular Python collections.
Example
class Vector:
pointers: list[int]
def __init__(self, pointers: list[int]):
self.pointers = pointers
def __len__(self):
return len(self.pointers)
def __getitem__(self, index):
return self.pointers[index]
v1 = Vector([1, 2, 3, 4, 5])
print(f"len of v1: {len(v1)}")
print(v1[-1])
print(v1[2])
print(v1[:2])
print(v1[1:])
print(f"Is 5 in v1? {5 in v1}")
print(f"Is 15 in v1? {15 in v1}")
More Complete Sequence Protocol
To ensure that when an object is sliced, the returned object is of the same type as the original object
, you need to implement a more comprehensive __getitem__
method. This method should handle slicing and return an instance of the same class with the sliced elements.
Example
class Vector:
pointers: list[int]
def __init__(self, pointers: list[int]):
self.pointers = pointers
def __repr__(self) -> str:
return f"Vector ({self.pointers})"
def __len__(self):
return len(self.pointers)
def __getitem__(self, index):
if isinstance(index, slice):
cls = type(self)
return cls(self.pointers[index])
return self.pointers[index]
v1 = Vector([1, 2, 3, 4, 5])
print(f"len of v1: {len(v1)}")
print(v1[-1])
print(v1[2])
print(v1[:2])
print(type(v1[:2]))
print(v1[1:])
print(type(v1[1:]))
Object Representations
Python offers two methods for obtaining a string representation from any object:
repr()
The repr()
function in Python returns a string that represents an object as the developer intends to see it. This is the string you typically see when the Python console or a debugger shows an object
.
str()
The str()
function in Python returns a string that represents an object as the user would like to see it
. This is the string you typically get when you use the print() function to display an object
. It's designed to provide a more human-readable and user-friendly representation of the object.
The special methods __repr__
and __str__
support repr() and str(). If the __str__
method is not implemented for an object, Python will use the implementation inherited from the base object class
, which, in turn, calls the __repr__
method as a fallback.
Example
>>> from array import array
>>> class Vector:
>>> def __init__(self, x=0, y=0):
... self.x = x
... self.y = y
>>> def __repr__(self):
... return f'Vector({self.x!r}, {self.y!r}) nice'
>>> def __str__(self):
... return str(tuple([self.x, self.y]))
>>> v1 = Vector(1, 2)
>>> print(v1)
>>> print(repr(v1))
(1, 2)
Vector(1, 2) nice
Alternative Representations
There are two additional special methods for supporting alternative representations of objects: __bytes__
and __format__
.
__bytes__
This method is analogous to __str__
. It is called by the bytes() function
to obtain the object represented as a byte sequence.
__format__
This method is used by f-strings, the built-in format() function, and the str.format() method
.
Formatted Displays
In Python, f-strings, the format() function, and str.format()
method handle formatting by calling the `.__format__(format_spec)
method of the respective data types. The format_spec
is a formatting specifier, which can be either:
- The second argument in
format(my_obj, format_spec)
. - Anything that comes after a colon inside a replacement field within curly braces
{} in an f-string
or thefmt in fmt.str.format()
.
If a class doesn't have a .__format__
method defined, it will inherit the behavior from the object class, which returns the string representation of the object using str(my_object).
Formatted Examples
Dunder __format__
example
>>> class nice_class:
... def __init__(self, str_value: str):
... self.str_value = str_value
... def __format__(self, fmt_spec=''):
... formatter = ""
... if fmt_spec.endswith("n"):
... formatter = "NICE STRING "
... return formatter + self.str_value
>>> obj_nice = nice_class("nice_str")
>>> print(format(obj_nice))
>>> print(f"obj: {obj_nice}")
>>> print(format(obj_nice, "n"))
nice_str
obj: nice_str
NICE STRING nice_str
Supporting Positional Pattern Matching
For a class to support positional pattern matching, you need to add a __match_args__
class attribute. This attribute should list the instance attributes in the order they will be used during pattern matching
. It's important to note that __match_args__ doesn't have to include all public instance attributes
. Typically, you include the required arguments from the __init__
method in __match_args, but optional arguments may be omitted
.
Matching Support
class Vector:
x: int
y: int
__match_args__ = ('x', 'y')
def __init__(self, x, y) -> None:
self.x = x
self.y = y
def __str__(self) -> str:
return f"Vector(x={self.x}, y={self.y})"
def __repr__(self) -> str:
return f"Vector(x={self.x}, y={self.y})"
def positional_pattern_demo(v: Vector):
match v:
case Vector(x=0, y=0):
print(f'{v!r} is null')
case Vector(x=0):
print(f'{v!r} is vertical')
case Vector(y=0):
print(f'{v!r} is horizontal')
case Vector(x=x, y=y) if x==y:
print(f'{v!r} is diagonal')
case _:
print(f'{v!r} is awesome')
positional_pattern_demo(Vector(0, 0))
positional_pattern_demo(Vector(0, 2))
positional_pattern_demo(Vector(2, 0))
positional_pattern_demo(Vector(1, 1))
positional_pattern_demo(Vector(2, 11))
Operator Overloading
Operator overloading enables user-defined objects
to work with infix operators like +
and |
, or unary operators such as -
and ~
. In Python, a thoughtful balance is maintained between flexibility, usability, and safety by introducing certain constraints on operator overloading:
- We cannot change the meaning of the operators for the
built-in types
. - We
cannot create new operators
, only overload existing ones. - A few operators can’t be overloaded:
is
,and
,or
, andnot
.
Unary Operators (+, -, and ~)
-
, implemented by __neg__
Arithmetic unary negation. If x
is -2 then -x == 2
.
+
, implemented by __pos__
Arithmetic unary plus. Usually x == +x
.
~
, implemented by __invert__
Bitwise not, or bitwise inverse of an integer, defined as ~x == -(x+1)
. If x
is 2 then ~x == -3
.
To implement unary operators (+, -, and ~) for your class, you need to define the appropriate special method that takes only one argument: self
.
Follow the General Rule for Operators
Always return a new object
. This means do not changes the receiver self
and instead creating and returning a new instance of a suitable type.
For unary -
and +
, the result will likely be an instance of the same class as self
. For unary +
, if the receiver is immutable
, you should return self; otherwise, return a copy of self.
Example
class Vector:
x: int
y: int
def __init__(self, x, y) -> None:
self.x = x
self.y = y
def __str__(self) -> str:
return f"Vector(x={self.x}, y={self.y})"
def __repr__(self) -> str:
return f"Vector(x={self.x}, y={self.y})"
def __neg__(self):
return Vector(-self.x, -self.y)
def __pos__(self):
return Vector(self.x + 1, self.y + 1)
def __invert__(self):
return Vector(-(self.x + 1), -(self.y + 1))
vec = Vector(2, 1)
print(f"Initial Vector: {vec}")
print(f"Plus Vector: {+vec}")
print(f"Negation Vector: {-vec}")
print(f"Bitwise not Vector: {~vec}")
print(f"FINAL Vector: {vec}")
Listing All Operators
Operator | Forward | Reverse | In-place | Description |
---|---|---|---|---|
+ | __add__ |
__radd__ |
__iadd__ |
Addition or concatenation |
- | __sub__ |
__rsub__ |
__isub__ |
Subtraction |
* | __mul__ |
__rmul__ |
__imul__ |
Multiplication or repetition |
/ | __truediv__ |
__rtruediv__ |
__itruediv__ |
True division |
// | __floordiv__ |
__rfloordiv__ |
__ifloordiv__ |
Floor division |
% | __mod__ |
__rmod__ |
__imod__ |
Modulo |
divmod() | __divmod__ |
__rdivmod__ |
__idivmod__ |
Returns tuple of floor division quotient and modulo |
**, pow() | __pow__ |
__rpow__ |
__ipow__ |
Exponentiation |
@ | __matmul__ |
__rmatmul__ |
__imatmul__ |
Matrix multiplication |
& | __and__ |
__rand__ |
__iand__ |
Bitwise and |
| | __or__ |
__ror__ |
__ior__ |
Bitwise or |
^ | __xor__ |
__rxor__ |
__ixor__ |
Bitwise xor |
<< | __lshift__ |
__rlshift__ |
__ilshift__ |
Bitwise shift left |
>> | __rshift__ |
__rrshift__ |
__irshift__ |
Bitwise shift right |
Listing All Rich Comparison Operators
Group | Infix operator | Forward method call | Reverse method call | Fallback |
---|---|---|---|---|
Equality | a == b |
a.__eq__(b) |
b.__eq__(a) |
Return id(a) == id(b) |
Equality | a != b |
a.__ne__(b) |
b.__ne__(a) |
Return not (a == b) |
Ordering | a > b |
a.__gt__(b) |
b.__lt__(a) |
Raise TypeError |
Ordering | a < b |
a.__lt__(b) |
b.__gt__(a) |
Raise TypeError |
Ordering | a >= b |
a.__ge__(b) |
b.__le__(a) |
Raise TypeError |
Ordering | a <= b |
a.__le__(b) |
b.__ge__(a) |
Raise TypeError |