"The Python data model formalizes the interfaces of the building blocks of the language itself." — Luciano Ramalho, Fluent Python
Chapter 1 doesn't just show you __len__ and __getitem__. It introduces the most important design principle in Python: the Data Model as a framework of protocols and hooks.
When you implement special methods (dunders), you're not just adding functionality — you're making your objects first-class citizens of the Python ecosystem. They gain access to:
- Built-in functions (
len,abs,bool,repr,str) - Operators (
+,*,[],in,<,>) - Language constructs (
for,with,if, unpacking) - Standard library functions (
random.choice,sorted,reversed)
The fundamental insight: Python doesn't check what you are — it checks what you can do. This is structural typing (protocols) rather than nominal typing (inheritance).
Python built-in function / operator
│
▼
Checks for special method
(e.g., __len__, __add__)
│
┌─────┴──────┐
│ │
Built-in User class
C shortcut Python dispatch
(~50ns) (~90ns)
Python's special methods are interceptors — they let your classes integrate into the language's core machinery without inheritance.
bool(obj) ──► __bool__() exists? ──YES──► return __bool__()
│
NO
│
__len__() exists? ──YES──► return len(obj) != 0
│
NO
│
return True (all objects are truthy by default)
This fallback chain means: if you define __len__ but not __bool__, Python uses length to determine truthiness. An empty custom collection is automatically falsy.
| Method | When Called | Audience |
|---|---|---|
__repr__ |
repr(obj), REPL, debugger |
Developer |
__str__ |
str(obj), print(obj) |
End user |
__format__ |
f"{obj:spec}", format(obj, spec) |
Custom formatting |
Rule: Always implement __repr__. Implement __str__ only when the user-facing representation should differ from the developer-facing one. If __str__ is missing, Python falls back to __repr__.
// Objects/abstract.c (simplified)
Py_ssize_t
PyObject_Size(PyObject *o)
{
PySequenceMethods *m;
if (o == NULL) { ... }
m = Py_TYPE(o)->tp_as_sequence;
if (m && m->sq_length) {
Py_ssize_t len = m->sq_length(o); // C-level direct call!
...
}
// Falls back to mp_length for mappings
return PyMapping_Size(o);
}For user-defined classes, Python calls __len__ through the type's tp_as_sequence slot populated during class creation. The key point: len() has a fast path for C built-ins that bypasses Python attribute lookup entirely.
This is why len([1,2,3]) is faster than [1,2,3].__len__() on lists — the former hits C directly, the latter goes through Python's attribute lookup machinery.
| Operation | Time (ns) | Notes |
|---|---|---|
len(list) |
~50 | C-level direct call |
list.__len__() |
~80 | Python attribute lookup overhead |
len(custom_class) |
~90 | Python dispatch via type |
repr(obj) |
~100–500 | Depends on __repr__ complexity |
Measured on Python 3.12, results vary by hardware. See benchmarks.py for full methodology.
Python could have required from collections.abc import Sequence; class MySeq(Sequence) — and ABCs do exist. But the data model was designed for structural subtyping first. This means:
- Third-party code remains compatible — old code that predates ABCs works because Python checks for methods, not inheritance
- Less ceremony — you implement only what you need; no abstract method stubs required
- More Pythonic — "if it walks like a duck and quacks like a duck, it's a duck"
The ABCs in collections.abc complement this: they let you register existing classes as virtual subclasses and provide mixin implementations.
- Django ORM:
QuerySet.__len__andQuerySet.__iter__make query results work seamlessly withlen()andforloops - NumPy arrays: Full data model implementation gives NumPy arrays operator overloading (
+,*), slicing, and boolean evaluation - Pandas DataFrames:
__getitem__enablesdf['column']syntax;__len__giveslen(df)as row count - SQLAlchemy:
Result.__iter__allowsfor row in query_result:
| File | Description |
|---|---|
examples.py |
Annotated implementations: FrenchDeck++, Vector2D, custom protocol demos |
exercises.py |
Original exercises extending data model concepts |
mini_project.py |
Card game engine using the full data model |
benchmarks.py |
Timing dunder dispatch, repr overhead, truthiness chain |
notes.md |
Structured reference notes |
pitfalls.md |
Common dunder method mistakes + production fixes |
interview_questions.md |
Senior-level Q&A for this topic |
architecture_notes.md |
Why Python made these design decisions |
- Chapter 9 (this repo): Decorators — another protocol-based system built on
__call__ - Chapter 11: Building custom sequences — extends
__len__and__getitem__further - Chapter 16: Operator overloading —
__add__,__mul__, reflected operators - python-internals-playground: Deep dives into each dunder method family