Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
architecture_notes.md	architecture_notes.md
benchmarks.py	benchmarks.py
examples.py	examples.py
exercises.py	exercises.py
interview_questions.md	interview_questions.md
mini_project.py	mini_project.py
notes.md	notes.md
pitfalls.md	pitfalls.md

Chapter 1 — The Python Data Model

"The Python data model formalizes the interfaces of the building blocks of the language itself." — Luciano Ramalho, Fluent Python

🧠 What This Chapter Is Really About

Chapter 1 doesn't just show you __len__ and __getitem__. It introduces the most important design principle in Python: the Data Model as a framework of protocols and hooks.

When you implement special methods (dunders), you're not just adding functionality — you're making your objects first-class citizens of the Python ecosystem. They gain access to:

Built-in functions (len, abs, bool, repr, str)
Operators (+, *, [], in, <, >)
Language constructs (for, with, if, unpacking)
Standard library functions (random.choice, sorted, reversed)

The fundamental insight: Python doesn't check what you are — it checks what you can do. This is structural typing (protocols) rather than nominal typing (inheritance).

📐 Core Concepts

1. The Protocol-Based Design Model

Python built-in function / operator
            │
            ▼
    Checks for special method
    (e.g., __len__, __add__)
            │
      ┌─────┴──────┐
      │            │
  Built-in      User class
  C shortcut    Python dispatch
   (~50ns)        (~90ns)

Python's special methods are interceptors — they let your classes integrate into the language's core machinery without inheritance.

2. The Truthiness Protocol Chain

bool(obj) ──► __bool__() exists? ──YES──► return __bool__()
                    │
                   NO
                    │
              __len__() exists? ──YES──► return len(obj) != 0
                    │
                   NO
                    │
              return True  (all objects are truthy by default)

This fallback chain means: if you define __len__ but not __bool__, Python uses length to determine truthiness. An empty custom collection is automatically falsy.

3. The Representation Protocol

Method	When Called	Audience
`__repr__`	`repr(obj)`, REPL, debugger	Developer
`__str__`	`str(obj)`, `print(obj)`	End user
`__format__`	`f"{obj:spec}"`, `format(obj, spec)`	Custom formatting

Rule: Always implement __repr__. Implement __str__ only when the user-facing representation should differ from the developer-facing one. If __str__ is missing, Python falls back to __repr__.

🔬 CPython Internals

How `len()` Really Works

// Objects/abstract.c (simplified)
Py_ssize_t
PyObject_Size(PyObject *o)
{
    PySequenceMethods *m;
    if (o == NULL) { ... }

    m = Py_TYPE(o)->tp_as_sequence;
    if (m && m->sq_length) {
        Py_ssize_t len = m->sq_length(o);  // C-level direct call!
        ...
    }
    // Falls back to mp_length for mappings
    return PyMapping_Size(o);
}

For user-defined classes, Python calls __len__ through the type's tp_as_sequence slot populated during class creation. The key point: len() has a fast path for C built-ins that bypasses Python attribute lookup entirely.

This is why len([1,2,3]) is faster than [1,2,3].__len__() on lists — the former hits C directly, the latter goes through Python's attribute lookup machinery.

📊 Performance Analysis

Operation	Time (ns)	Notes
`len(list)`	~50	C-level direct call
`list.__len__()`	~80	Python attribute lookup overhead
`len(custom_class)`	~90	Python dispatch via type
`repr(obj)`	~100–500	Depends on `__repr__` complexity

Measured on Python 3.12, results vary by hardware. See benchmarks.py for full methodology.

🏗️ Architecture Notes

Why Python Chose Protocols Over Abstract Base Classes

Python could have required from collections.abc import Sequence; class MySeq(Sequence) — and ABCs do exist. But the data model was designed for structural subtyping first. This means:

Third-party code remains compatible — old code that predates ABCs works because Python checks for methods, not inheritance
Less ceremony — you implement only what you need; no abstract method stubs required
More Pythonic — "if it walks like a duck and quacks like a duck, it's a duck"

The ABCs in collections.abc complement this: they let you register existing classes as virtual subclasses and provide mixin implementations.

🎯 Real-World Applications

Django ORM: QuerySet.__len__ and QuerySet.__iter__ make query results work seamlessly with len() and for loops
NumPy arrays: Full data model implementation gives NumPy arrays operator overloading (+, *), slicing, and boolean evaluation
Pandas DataFrames: __getitem__ enables df['column'] syntax; __len__ gives len(df) as row count
SQLAlchemy: Result.__iter__ allows for row in query_result:

📁 Files in This Chapter

File	Description
`examples.py`	Annotated implementations: FrenchDeck++, Vector2D, custom protocol demos
`exercises.py`	Original exercises extending data model concepts
`mini_project.py`	Card game engine using the full data model
`benchmarks.py`	Timing dunder dispatch, repr overhead, truthiness chain
`notes.md`	Structured reference notes
`pitfalls.md`	Common dunder method mistakes + production fixes
`interview_questions.md`	Senior-level Q&A for this topic
`architecture_notes.md`	Why Python made these design decisions

🔗 Related Chapters & Repos

Chapter 9 (this repo): Decorators — another protocol-based system built on __call__
Chapter 11: Building custom sequences — extends __len__ and __getitem__ further
Chapter 16: Operator overloading — __add__, __mul__, reflected operators
python-internals-playground: Deep dives into each dunder method family

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Chapter 1 — The Python Data Model

🧠 What This Chapter Is Really About

📐 Core Concepts

1. The Protocol-Based Design Model

2. The Truthiness Protocol Chain

3. The Representation Protocol

🔬 CPython Internals

How `len()` Really Works

📊 Performance Analysis

🏗️ Architecture Notes

Why Python Chose Protocols Over Abstract Base Classes

🎯 Real-World Applications

📁 Files in This Chapter

🔗 Related Chapters & Repos

FilesExpand file tree

chapter_01_python_data_model

Directory actions

More options

Directory actions

More options

Latest commit

History

chapter_01_python_data_model

Folders and files

parent directory

README.md

Chapter 1 — The Python Data Model

🧠 What This Chapter Is Really About

📐 Core Concepts

1. The Protocol-Based Design Model

2. The Truthiness Protocol Chain

3. The Representation Protocol

🔬 CPython Internals

How len() Really Works

📊 Performance Analysis

🏗️ Architecture Notes

Why Python Chose Protocols Over Abstract Base Classes

🎯 Real-World Applications

📁 Files in This Chapter

🔗 Related Chapters & Repos

How `len()` Really Works