In the previous post, we explored the basics of Pydantic: creating models, enforcing type validation, and ensuring data integrity with minimal boilerplate. But real-world applications often involve more complex, structured data—like API payloads, configuration files, or nested JSON. How do we handle a blog post with comments, an order with multiple items, or a user profile with nested addresses? This post dives into Pydantic’s powerful support for nested models and smart data structures, showing how to model, validate, and access complex data with ease.
We’ll cover practical examples, including a blog system with authors and comments, and touch on use cases like user profiles or e-commerce orders. Let’s get started!
Nested BaseModels
Pydantic allows you to define models within models, enabling clean, hierarchical data structures. Let’s model a blog system with an Author, Comment, and Blog model.
from datetime import datetime

# EmailStr needs the optional email-validator package:
#   pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr


class Author(BaseModel):
    name: str
    email: EmailStr  # validated as an email address, not just any string


class Comment(BaseModel):
    content: str
    author: Author
    created_at: datetime


class Blog(BaseModel):
    title: str
    content: str
    author: Author
    comments: list[Comment] = []  # list[...] syntax needs Python 3.9+


# Example usage
blog_data = {
    "title": "Nested Models in Pydantic",
    "content": "This is a blog post about Pydantic...",
    "author": {"name": "Jane Doe", "email": "jane@example.com"},
    "comments": [
        {
            "content": "Great post!",
            "author": {"name": "John Smith", "email": "john@example.com"},
            "created_at": "2025-05-04T10:00:00"
        }
    ]
}

blog = Blog(**blog_data)
print(blog.author.name)  # Jane Doe
print(blog.comments[0].author.email)  # john@example.com
Here, Comment and Blog embed the Author model, and Pydantic automatically validates the nested data. If author.email is invalid (e.g., missing or of the wrong type), validation fails and no Blog instance is created. This cascading validation ensures every layer of your data is checked.
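To see the cascade in action, here is a minimal sketch that leaves a required field out of the nested author. It redefines trimmed-down Author and Blog models so it runs on its own, and it uses plain str for email (an assumption made to avoid extra dependencies). The sample values are illustrative.

```python
from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str  # plain str here to keep the sketch dependency-free


class Blog(BaseModel):
    title: str
    author: Author


# The nested author is missing its required "email" field.
bad_data = {"title": "Cascading validation", "author": {"name": "Jane Doe"}}

try:
    Blog(**bad_data)
except ValidationError as exc:
    # Each error carries a "loc" tuple describing the path to the bad field.
    error_locations = [error["loc"] for error in exc.errors()]
    print(error_locations)  # [('author', 'email')]
```

The error is reported against the nested path, even though the top-level Blog fields were all fine.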
Lists, Tuples, and Sets of Models
Nested models often involve collections, like a list of comments on a blog. Pydantic supports List[T], Tuple[T, ...], and Set[T] for collections of models or other types.
Using our Blog model, notice the comments: list[Comment] = [] field. Pydantic validates each Comment in the list:
from pydantic import ValidationError

invalid_comment_data = {
    "title": "Invalid Comment Example",
    "content": "This blog has a bad comment...",
    "author": {"name": "Jane Doe", "email": "jane@example.com"},
    "comments": [
        {
            "content": "This is fine",
            "author": {"name": "John Smith", "email": "john@example.com"},
            "created_at": "2025-05-04T10:00:00"
        },
        {
            "content": "This is bad",
            "author": {"name": "Bad Author", "email": "not-an-email"},  # Invalid email
            "created_at": "2025-05-04T10:01:00"
        }
    ]
}

try:
    blog = Blog(**invalid_comment_data)
except ValidationError as e:  # ValidationError is a subclass of ValueError
    print(e)
With email typed as EmailStr, Pydantic raises a ValidationError pinpointing the invalid email in the second comment. You can also use Tuple[Comment, ...] for immutable sequences or Set[Comment] for unique items. Validation works the same way, although set members must be hashable, so models stored in a Set need to be frozen.
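As a small sketch of the other collection types, the example below stores comments in a tuple and tags in a set of strings. The tags field and the sample values are assumptions added for illustration. A set of plain strings sidesteps the hashability requirement that a set of models would have.

```python
from typing import Set, Tuple

from pydantic import BaseModel


class Comment(BaseModel):
    content: str


class Blog(BaseModel):
    title: str
    comments: Tuple[Comment, ...] = ()
    tags: Set[str] = set()  # hypothetical field, not part of the earlier models


blog = Blog(
    title="Collections",
    comments=[{"content": "First"}, {"content": "Second"}],  # list in, tuple out
    tags=["pydantic", "python", "pydantic"],  # duplicates collapse in the set
)

print(type(blog.comments).__name__)  # tuple
print(blog.comments[1].content)      # Second
print(sorted(blog.tags))             # ['pydantic', 'python']
```

Each dict in comments is still validated into a Comment instance. Only the container type changes.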
Optional Fields and Defaults
Real-world data often includes optional fields or defaults. Pydantic supports Optional[T] from typing and allows default values.
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None  # Email is optional
    bio: str = "No bio provided"  # Default value


class Blog(BaseModel):
    title: str
    content: str
    author: Author


# Example with missing email
blog_data = {
    "title": "Optional Fields",
    "content": "This blog has an author with no email.",
    "author": {"name": "Jane Doe"}
}

blog = Blog(**blog_data)
print(blog.author.email)  # None
print(blog.author.bio)  # No bio provided
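Optional[str] means the value may be a string or None; it is the = None default that lets the field be omitted from the input. Writing email: str = None also defaults to None, but the annotation no longer says that None is allowed, so prefer the explicit Optional[str] = None. In Pydantic V2, a field annotated Optional[str] with no default is still required. Pydantic also distinguishes between missing fields (not in the input) and fields explicitly set to None, ensuring precise control over data parsing.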
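The missing-versus-None distinction can be sketched as follows. This assumes Pydantic V2, where the set of supplied field names is exposed as model_fields_set (on V1 the equivalent attribute is __fields_set__). The Author model is a trimmed-down copy so the snippet runs on its own.

```python
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None


omitted = Author(name="Jane Doe")               # email not supplied at all
explicit = Author(name="Jane Doe", email=None)  # email supplied as None

# Both instances hold the same value...
print(omitted.email, explicit.email)  # None None

# ...but Pydantic remembers which fields the caller actually provided.
# model_fields_set is the Pydantic V2 name (assumption: V2 is installed).
print("email" in omitted.model_fields_set)   # False
print("email" in explicit.model_fields_set)  # True
```

This is the same bookkeeping that exclude_unset relies on during serialization.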
Dict and Map-Like Structures
Pydantic supports Dict[K, V] for key-value structures, perfect for feature flags, localized content, or other mappings.
from typing import Dict

from pydantic import BaseModel


class Blog(BaseModel):
    title: str
    content: str
    translations: Dict[str, str]  # Language code -> translated title


blog_data = {
    "title": "Pydantic Power",
    "content": "This is a blog post...",
    "translations": {
        "es": "El poder de Pydantic",
        "fr": "La puissance de Pydantic"
    }
}

blog = Blog(**blog_data)
print(blog.translations["es"])  # El poder de Pydantic
You can also nest models in dictionaries, like Dict[str, Author], for more complex mappings. Pydantic validates both keys and values according to their types.
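Here is a rough sketch of both ideas: models as dictionary values, and validation of the keys themselves. The contributors and views_by_year fields are hypothetical names invented for this example, and the models are trimmed down so the snippet is self-contained.

```python
from typing import Dict

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str


class Blog(BaseModel):
    title: str
    contributors: Dict[str, Author]        # hypothetical: role -> Author
    views_by_year: Dict[int, int] = {}     # hypothetical: year -> view count


blog = Blog(
    title="Mappings",
    contributors={"editor": {"name": "Jane Doe"}},
    views_by_year={"2024": 120, "2025": 340},  # numeric strings become int keys
)
print(blog.contributors["editor"].name)  # Jane Doe
print(blog.views_by_year)                # {2024: 120, 2025: 340}

try:
    Blog(title="Bad keys", contributors={}, views_by_year={"last year": 1})
except ValidationError as exc:
    # The first element of loc names the field whose key failed validation.
    bad_key_loc = exc.errors()[0]["loc"]
    print(bad_key_loc[0])  # views_by_year
```

The rest of the loc tuple for a bad key is formatted differently in Pydantic V1 and V2, so the sketch only relies on its first element.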
Accessing Nested Data Safely
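Once validated, Pydantic models provide typed access to nested attributes. You can read fields like blog.author.name or blog.comments[0].content without the KeyError you would risk with raw dictionaries, because every declared field exists on a validated instance. (Indexing an empty list still raises IndexError, and optional fields may be None.)
For serialization, use .dict() (renamed .model_dump() in Pydantic V2, where .dict() is deprecated) with options like exclude_unset, include, or exclude: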
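# Uses the Author and Blog models from "Optional Fields and Defaults"
blog = Blog(
    title="Pydantic Power",
    content="This is a blog post...",
    author=Author(name="Jane Doe"),
)

# Serialize only specific fields (a nested include must be a dict, not a set)
print(blog.dict(include={"title": True, "author": {"name"}}))
# Output: {'title': 'Pydantic Power', 'author': {'name': 'Jane Doe'}}

# Exclude unset fields
blog = Blog(
    title="Test",
    content="Content",
    author=Author(name="Jane")
)
print(blog.dict(exclude_unset=True))
# Only includes fields explicitly set, skips defaults like author.bio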
This makes it easy to control what data is serialized for APIs or storage.
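Since the method name differs between major versions, one possible approach is a tiny wrapper that picks whichever is available. The to_dict helper below is hypothetical (it is not part of Pydantic), and the models and the placeholder email address are illustrative. The sketch also shows exclude applied to a nested field.

```python
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None


class Blog(BaseModel):
    title: str
    author: Author


def to_dict(model: BaseModel, **kwargs) -> dict:
    """Hypothetical helper: model_dump() on Pydantic V2, dict() on V1."""
    if hasattr(model, "model_dump"):
        return model.model_dump(**kwargs)
    return model.dict(**kwargs)


blog = Blog(
    title="Serialization",
    author=Author(name="Jane Doe", email="jane@example.com"),
)

# Drop a nested field, e.g. to keep email addresses out of a public response.
public_view = to_dict(blog, exclude={"author": {"email"}})
print(public_view)
# {'title': 'Serialization', 'author': {'name': 'Jane Doe'}}
```

In a codebase pinned to a single Pydantic version, calling the native method directly is simpler than a wrapper like this.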
Validation and Error Reporting in Nested Structures
Pydantic’s error reporting is precise, even for nested data. Let’s revisit the invalid comment example:
from pydantic import ValidationError

try:
    blog = Blog(**invalid_comment_data)
except ValidationError as e:
    print(e.errors())
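With Pydantic V1 and email typed as EmailStr, the output might look like this (V2 reports the same loc but uses different msg and type values and adds keys such as input):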
[
    {
        'loc': ('comments', 1, 'author', 'email'),
        'msg': 'value is not a valid email address',
        'type': 'value_error.email'
    }
]
The loc field shows the exact path to the error (comments[1].author.email), making it easy to debug complex structures. This granularity is invaluable for APIs or user-facing validation.
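For user-facing messages, the loc tuple can be turned into the dotted path used above. The format_loc helper below is a hypothetical convenience function, not a Pydantic API, and the trimmed-down models trigger a missing-field error rather than an email error so the sketch needs no extra packages.

```python
from typing import List, Tuple, Union

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str


class Comment(BaseModel):
    content: str
    author: Author


class Blog(BaseModel):
    title: str
    comments: List[Comment] = []


def format_loc(loc: Tuple[Union[str, int], ...]) -> str:
    """Hypothetical helper: ('comments', 1, 'author') -> 'comments[1].author'."""
    path = ""
    for part in loc:
        if isinstance(part, int):
            path += f"[{part}]"          # list index
        else:
            path += f".{part}" if path else str(part)  # field name
    return path


try:
    Blog(
        title="Readable errors",
        comments=[
            {"content": "Fine", "author": {"name": "John Smith"}},
            {"content": "Missing author name", "author": {}},
        ],
    )
except ValidationError as exc:
    error_paths = [format_loc(error["loc"]) for error in exc.errors()]
    print(error_paths)  # ['comments[1].author.name']
```

Pairing each path with the error's msg gives a compact message that an API can return to the client.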
Recap and Takeaways
Nested models in Pydantic make it easy to handle complex, structured data with robust validation. Key techniques:
- Use BaseModel for nested structures like Author in Blog.
- Leverage List[T], Dict[K, V], and Optional[T] for flexible data shapes.
- Access nested data safely with dot notation or serialize with .dict().
- Rely on Pydantic’s detailed error reporting for debugging.
These tools are perfect for APIs, configuration files, or any scenario with hierarchical data.