DEV Community

Akash for MechCloud Academy

Going Deeper with Pydantic: Nested Models and Data Structures

In the previous post, we explored the basics of Pydantic: creating models, enforcing type validation, and ensuring data integrity with minimal boilerplate. But real-world applications often involve more complex, structured data—like API payloads, configuration files, or nested JSON. How do we handle a blog post with comments, an order with multiple items, or a user profile with nested addresses? This post dives into Pydantic’s powerful support for nested models and smart data structures, showing how to model, validate, and access complex data with ease.

We’ll cover practical examples, including a blog system with authors and comments, and touch on use cases like user profiles or e-commerce orders. Let’s get started!

Nested BaseModels

Pydantic allows you to define models within models, enabling clean, hierarchical data structures. Let’s model a blog system with an Author, Comment, and Blog model.

from pydantic import BaseModel
from datetime import datetime

class Author(BaseModel):
    name: str
    email: str

class Comment(BaseModel):
    content: str
    author: Author
    created_at: datetime

class Blog(BaseModel):
    title: str
    content: str
    author: Author
    comments: list[Comment] = []

# Example usage
blog_data = {
    "title": "Nested Models in Pydantic",
    "content": "This is a blog post about Pydantic...",
    "author": {"name": "Jane Doe", "email": "[email protected]"},
    "comments": [
        {
            "content": "Great post!",
            "author": {"name": "John Smith", "email": "[email protected]"},
            "created_at": "2025-05-04T10:00:00"
        }
    ]
}

blog = Blog(**blog_data)
print(blog.author.name)  # Jane Doe
print(blog.comments[0].author.email)  # [email protected]

Here, Comment and Blog embed the Author model, and Pydantic automatically validates the nested data. If a nested author is invalid (e.g., email is missing, or is a list rather than a string), validation fails and no Blog instance is created. This cascading validation ensures every layer of your data is checked.
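As a minimal sketch of that cascade, the snippet below builds a Comment whose nested author lacks the required email field. The payload is made up for illustration, and the models are trimmed copies of the ones above.

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str


class Comment(BaseModel):
    content: str
    author: Author
    created_at: datetime


# Illustrative payload: the nested author is missing its required "email".
bad_comment = {
    "content": "Great post!",
    "author": {"name": "John Smith"},
    "created_at": "2025-05-04T10:00:00",
}

try:
    Comment(**bad_comment)
    failed_locs = []
except ValidationError as exc:
    # Each error records the path to the offending field.
    failed_locs = [tuple(err["loc"]) for err in exc.errors()]

print(failed_locs)  # [('author', 'email')]
```

The outer Comment is never constructed, and the error path points into the nested model rather than at the top-level field.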

Lists, Tuples, and Sets of Models

Nested models often involve collections, like a list of comments on a blog. Pydantic supports List[T], Tuple[T, ...], and Set[T] for collections of models or other types.

Using our Blog model, notice the comments: list[Comment] = []. Pydantic validates each Comment in the list:

invalid_comment_data = {
    "title": "Invalid Comment Example",
    "content": "This blog has a bad comment...",
    "author": {"name": "Jane Doe", "email": "[email protected]"},
    "comments": [
        {
            "content": "This is fine",
            "author": {"name": "John Smith", "email": "[email protected]"},
            "created_at": "2025-05-04T10:00:00"
        },
        {
            "content": "This is bad",
            "author": {"name": "Bad Author", "email": "not-an-email"},  # Malformed address; rejected only if email is typed as EmailStr
            "created_at": "2025-05-04T10:01:00"
        }
    ]
}

from pydantic import ValidationError

try:
    blog = Blog(**invalid_comment_data)
except ValidationError as e:
    print(e)

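With email declared as a plain str, "not-an-email" is still a valid string, so this payload passes validation. Declare the field as Pydantic's EmailStr (which needs the optional email-validator package) and Pydantic raises a ValidationError pinpointing the invalid email in the second comment. You can also use Tuple[Comment, ...] for immutable sequences or Set[Comment] for unique items. Validation works the same way, though set members must be hashable, so Comment would need to be a frozen model.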
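Here is a small sketch of tuple and set fields in action. The co_authors and tags fields are hypothetical additions for illustration (they are not part of the Blog model above), and the email address is a placeholder.

```python
from typing import Set, Tuple

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: str


class Blog(BaseModel):
    title: str
    co_authors: Tuple[Author, ...] = ()  # hypothetical field
    tags: Set[str] = set()               # hypothetical field


blog = Blog(
    title="Collections",
    # A list of dicts is validated item by item and stored as a tuple of models.
    co_authors=[{"name": "Jane Doe", "email": "jane@example.com"}],
    # Duplicates collapse because the field is a set.
    tags=["pydantic", "python", "pydantic"],
)

print(type(blog.co_authors).__name__)  # tuple
print(blog.co_authors[0].name)         # Jane Doe
print(sorted(blog.tags))               # ['pydantic', 'python']
```

A set of strings is used here because strings are hashable out of the box; a set of models would additionally require the model to be frozen.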

Optional Fields and Defaults

Real-world data often includes optional fields or defaults. Pydantic supports Optional[T] from typing and allows default values.

from typing import Optional

class Author(BaseModel):
    name: str
    email: Optional[str] = None  # Email is optional
    bio: str = "No bio provided"  # Default value

class Blog(BaseModel):
    title: str
    content: str
    author: Author

# Example with missing email
blog_data = {
    "title": "Optional Fields",
    "content": "This blog has an author with no email.",
    "author": {"name": "Jane Doe"}
}

blog = Blog(**blog_data)
print(blog.author.email)  # None
print(blog.author.bio)    # No bio provided

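Optional[str] means the field accepts either a string or None; the = None default is what lets callers omit it. The two are separate concerns: in Pydantic V2, email: Optional[str] with no default is required but nullable, while email: str = None sets a None default without accepting None as an input value (Pydantic V1 silently treated that form as Optional[str]). Pydantic also tracks which fields were actually supplied, so it can distinguish a missing field from one explicitly set to None, for example when serializing with exclude_unset.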
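The difference between "missing" and "explicitly None" can be observed through exclude_unset. The sketch below uses a small dump helper, which is an assumption of this example rather than a Pydantic API, so the same code runs on V1 (.dict()) and V2 (.model_dump()).

```python
from typing import Optional

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: Optional[str] = None
    bio: str = "No bio provided"


def dump(model, **kwargs):
    """Hypothetical helper: use model_dump on Pydantic V2, dict on V1."""
    method = getattr(model, "model_dump", None) or model.dict
    return method(**kwargs)


omitted = Author(name="Jane Doe")               # email not supplied
explicit = Author(name="Jane Doe", email=None)  # email supplied as None

# Both instances hold the same attribute value...
print(omitted.email is None and explicit.email is None)  # True

# ...but only the explicitly supplied field counts as "set".
print(dump(omitted, exclude_unset=True))   # {'name': 'Jane Doe'}
print(dump(explicit, exclude_unset=True))  # {'name': 'Jane Doe', 'email': None}
```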

Dict and Map-Like Structures

Pydantic supports Dict[K, V] for key-value structures, perfect for feature flags, localized content, or other mappings.

from typing import Dict

class Blog(BaseModel):
    title: str
    content: str
    translations: Dict[str, str]  # Language code -> translated title

blog_data = {
    "title": "Pydantic Power",
    "content": "This is a blog post...",
    "translations": {
        "es": "El poder de Pydantic",
        "fr": "La puissance de Pydantic"
    }
}

blog = Blog(**blog_data)
print(blog.translations["es"])  # El poder de Pydantic

You can also nest models in dictionaries, like Dict[str, Author], for more complex mappings. Pydantic validates both keys and values according to their types.
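A short sketch of Dict[str, Author] follows. The translators field and the sample names are hypothetical, chosen only to show that each dictionary value is validated as a nested model and that the failing key shows up in the error path.

```python
from typing import Dict

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str


class Blog(BaseModel):
    title: str
    translators: Dict[str, Author] = {}  # hypothetical: language code -> translator


blog = Blog(
    title="Pydantic Power",
    translators={"es": {"name": "Ana", "email": "ana@example.com"}},
)
print(blog.translators["es"].name)  # Ana

try:
    # The "fr" entry is missing the required email field.
    Blog(title="Broken", translators={"fr": {"name": "Luc"}})
    error_locs = []
except ValidationError as exc:
    error_locs = [tuple(err["loc"]) for err in exc.errors()]

print(error_locs)  # [('translators', 'fr', 'email')]
```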

Accessing Nested Data Safely

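Once validated, Pydantic models provide typed access to nested attributes. Every declared field is guaranteed to exist with its declared type, so blog.author.name never raises KeyError or AttributeError. Indexing still follows the usual rules: blog.comments[0].content raises IndexError if the comments list is empty.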

For serialization, use .dict() in Pydantic V1 (.model_dump() in V2, where .dict() still works but is deprecated) with options like exclude_unset, include, or exclude:

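# Assumes the Author and Blog models from "Optional Fields and Defaults"
blog = Blog(
    title="Test",
    content="Content",
    author=Author(name="Jane")
)

# Serialize only specific fields (True keeps a whole field, a set picks nested fields)
print(blog.dict(include={"title": True, "author": {"name"}}))
# Output: {'title': 'Test', 'author': {'name': 'Jane'}}

# Exclude unset fields
print(blog.dict(exclude_unset=True))
# Output: {'title': 'Test', 'content': 'Content', 'author': {'name': 'Jane'}}
# Only includes fields explicitly set, skips defaults like author.bio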

This makes it easy to control what data is serialized for APIs or storage.
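One common case is hiding a nested field on every item of a list, such as stripping commenter emails from a public response. The sketch below relies on the special "__all__" key in exclude, which Pydantic V1 documents for this purpose and which is assumed here to behave the same way in V2. The dump helper and the trimmed models are assumptions of this example.

```python
from typing import List

from pydantic import BaseModel


class Author(BaseModel):
    name: str
    email: str


class Comment(BaseModel):
    content: str
    author: Author


class Blog(BaseModel):
    title: str
    comments: List[Comment] = []


def dump(model, **kwargs):
    """Hypothetical helper: use model_dump on Pydantic V2, dict on V1."""
    method = getattr(model, "model_dump", None) or model.dict
    return method(**kwargs)


blog = Blog(
    title="Test",
    comments=[
        {"content": "Great post!", "author": {"name": "John Smith", "email": "john@example.com"}},
        {"content": "Thanks!", "author": {"name": "Jane Doe", "email": "jane@example.com"}},
    ],
)

# "__all__" applies the nested exclusion to every item in the comments list.
public_view = dump(blog, exclude={"comments": {"__all__": {"author": {"email"}}}})
print(public_view)
```

The model itself is untouched; only the serialized view drops the email of each comment author.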

Validation and Error Reporting in Nested Structures

Pydantic’s error reporting is precise, even for nested data. Let’s revisit the invalid comment example:

from pydantic import ValidationError

try:
    blog = Blog(**invalid_comment_data)
except ValidationError as e:
    print(e.errors())

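With email declared as EmailStr, the output in Pydantic V1 looks like this (V2 reports the same loc, but with different msg and type strings and extra keys such as input):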

[
    {
        'loc': ('comments', 1, 'author', 'email'),
        'msg': 'value is not a valid email address',
        'type': 'value_error.email'
    }
]

The loc field shows the exact path to the error (comments[1].author.email), making it easy to debug complex structures. This granularity is invaluable for APIs or user-facing validation.
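To turn a loc tuple into the readable path mentioned above, a small formatter is enough. The format_loc function below is a hypothetical helper, not part of Pydantic, and the example triggers the error with a missing email so it does not depend on EmailStr.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    email: str


class Comment(BaseModel):
    content: str
    author: Author


class Blog(BaseModel):
    title: str
    comments: List[Comment] = []


def format_loc(loc):
    """Hypothetical helper: ('comments', 1, 'author') -> 'comments[1].author'."""
    path = ""
    for part in loc:
        if isinstance(part, int):
            path += f"[{part}]"            # list index
        else:
            path += f".{part}" if path else str(part)  # field name
    return path


payload = {
    "title": "Invalid Comment Example",
    "comments": [
        {"content": "This is fine", "author": {"name": "John Smith", "email": "john@example.com"}},
        {"content": "This is bad", "author": {"name": "Bad Author"}},  # email missing
    ],
}

try:
    Blog(**payload)
    error_paths = []
except ValidationError as exc:
    error_paths = [format_loc(err["loc"]) for err in exc.errors()]

print(error_paths)  # ['comments[1].author.email']
```

A path string like this can be returned directly in an API error response so clients know which input field to fix.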

Recap and Takeaways

Nested models in Pydantic make it easy to handle complex, structured data with robust validation. Key techniques:

  • Use BaseModel for nested structures like Author in Blog.
  • Leverage List[T], Dict[K, V], and Optional[T] for flexible data shapes.
  • Access nested data safely with dot notation or serialize with .dict().
  • Rely on Pydantic’s detailed error reporting for debugging.

These tools are perfect for APIs, configuration files, or any scenario with hierarchical data.
