Your Pydantic Models Are Already Test Specifications — You're Just Not Using Them
If you're using Pydantic in Python, you've already written your test specifications. You just haven't run them yet.
Every Field(gt=0), every validator, every Literal["pending", "active", "cancelled"] — these are machine-readable contracts. They describe exactly what valid data looks like and what should be rejected. That's the definition of a test specification.
The problem: most codebases have these models and no tests that verify the constraints actually hold.
What a Pydantic model tells you
Consider this model from a typical e-commerce backend:
from typing import Literal

from pydantic import BaseModel, Field, field_validator


class OrderRequest(BaseModel):
    amount: float = Field(gt=0, description="Order amount in pence")
    currency: Literal["GBP", "USD", "EUR", "INR"]
    customer_email: str
    items: list[str] = Field(min_length=1)

    # Pydantic v2 spelling of a custom validator; it also lower-cases the value.
    @field_validator("customer_email")
    @classmethod
    def email_must_be_valid(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("invalid email")
        return v.lower()
Without looking at any test files, this model tells you at least six things worth testing:
- amount must be greater than zero
- amount of exactly zero must be rejected
- currency must be one of the four listed values
- An unlisted currency must be rejected
- items must contain at least one element
- customer_email without @ must be rejected
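As a rough illustration, those six points could be written as pytest tests like the sketch below. It assumes Pydantic v2 and pytest; the VALID payload, the INVALID_CASES list and the test names are made up for this example. Each rejection test also checks which field the error points at, so it cannot pass because of an unrelated mistake in the payload.

```python
from typing import Literal

import pytest
from pydantic import BaseModel, Field, ValidationError, field_validator


class OrderRequest(BaseModel):
    amount: float = Field(gt=0)
    currency: Literal["GBP", "USD", "EUR", "INR"]
    customer_email: str
    items: list[str] = Field(min_length=1)

    @field_validator("customer_email")
    @classmethod
    def email_must_be_valid(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("invalid email")
        return v.lower()


# Hypothetical known-good payload; every test changes exactly one field of it.
VALID = {
    "amount": 10.0,
    "currency": "GBP",
    "customer_email": "a@example.com",
    "items": ["sku-1"],
}

# (field, value that the model should reject)
INVALID_CASES = [
    ("amount", 0),
    ("amount", -1),
    ("currency", "JPY"),
    ("items", []),
    ("customer_email", "no-at-sign"),
]


def test_valid_order_is_accepted():
    order = OrderRequest(**VALID)
    assert order.amount == 10.0
    assert order.currency == "GBP"


@pytest.mark.parametrize("field, bad_value", INVALID_CASES)
def test_invalid_value_is_rejected(field, bad_value):
    with pytest.raises(ValidationError) as exc_info:
        OrderRequest(**{**VALID, field: bad_value})
    # Only the field under test may be reported as invalid.
    assert {err["loc"][0] for err in exc_info.value.errors()} == {field}
```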
None of these are controversial. They're written right there. But in most codebases, these constraints have no tests. You rely on Pydantic to enforce them at runtime and assume that's enough.
It's not enough.
Why "Pydantic enforces it" isn't sufficient
Pydantic validates at model construction. If someone bypasses the model, whether by calling the underlying function directly, by passing along a raw dict that never went through the model, or by building the object with model_construct(), the constraints are never checked.
More importantly: downstream functions that accept an OrderRequest often assume the constraints hold and don't re-check. If validation is ever bypassed, the error surfaces somewhere unexpected, far from the source, with a confusing message.
A verified test for each constraint catches both paths: it proves the model rejects bad input and proves the downstream function handles it correctly.
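To make the bypass concrete, here is a minimal sketch, assuming Pydantic v2. The average_item_price helper is hypothetical; it stands in for any downstream function that trusts the model. The model is trimmed to two fields to keep the example short.

```python
from pydantic import BaseModel, Field


class OrderRequest(BaseModel):
    amount: float = Field(gt=0)
    items: list[str] = Field(min_length=1)


def average_item_price(order: OrderRequest) -> float:
    """Hypothetical downstream helper that assumes items is never empty."""
    return order.amount / len(order.items)


# Normal path: the model validated the data, so the helper is safe.
ok = average_item_price(OrderRequest(amount=30, items=["a", "b", "c"]))

# Bypassed path: model_construct() skips validation, so items=[] gets through.
unchecked = OrderRequest.model_construct(amount=30, items=[])

try:
    average_item_price(unchecked)
    failure = None
except ZeroDivisionError as exc:
    # The error shows up here, not where the bad data was created.
    failure = type(exc).__name__
```

The ZeroDivisionError says nothing about an empty items list, which is the kind of confusing, far-from-the-source failure described above.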
What verified means
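Generating a test is easy. Generating a test that actually works is harder. Consider a generated test like this:
import pytest
from pydantic import ValidationError

# OrderRequest is the model defined earlier.


def test_order_rejects_zero_amount():
    with pytest.raises(ValidationError):
        OrderRequest(amount=0, currency="GBP", customer_email="test@test.com", items=["a"])
This test passes on the current model, but a green run alone doesn't tell you why it passed. pytest.raises(ValidationError) is satisfied by any validation error, not only the one on amount. If the generator had also filled in an invalid value somewhere else, say items=[], the test would keep passing even after someone accidentally changed Field(gt=0) to Field(ge=0). The constraint changed. No one noticed.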
A properly verified test works differently: you first confirm the test passes on current code, then temporarily change the constraint to allow amount=0, run the test again, and confirm it fails. If it doesn't fail, the test wasn't proving the constraint — it was just passing by coincidence.
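The check described above can be sketched in a few lines. This is an illustration under stated assumptions, not Quell's implementation: it assumes Pydantic v2, uses a hand-written MutatedOrder class in place of automatically rewriting the source, and the names rejects, strong_test, weak_test and is_verified are invented for the example.

```python
from typing import Callable

from pydantic import BaseModel, Field, ValidationError


class Order(BaseModel):
    """The current code."""
    amount: float = Field(gt=0)
    items: list[str] = Field(min_length=1)


class MutatedOrder(BaseModel):
    """Same model with the amount constraint loosened from gt=0 to ge=0."""
    amount: float = Field(ge=0)
    items: list[str] = Field(min_length=1)


def rejects(model: type[BaseModel], **data) -> bool:
    """Return True if constructing the model raises ValidationError."""
    try:
        model(**data)
    except ValidationError:
        return True
    return False


def strong_test(model: type[BaseModel]) -> bool:
    # Only amount is invalid, so a pass can only come from the amount constraint.
    return rejects(model, amount=0, items=["a"])


def weak_test(model: type[BaseModel]) -> bool:
    # items=[] is invalid too, so this passes regardless of the amount constraint.
    return rejects(model, amount=0, items=[])


def is_verified(test: Callable[[type[BaseModel]], bool]) -> bool:
    """A test is verified if it passes on the original and fails on the mutant."""
    return test(Order) and not test(MutatedOrder)


strong_ok = is_verified(strong_test)
weak_ok = is_verified(weak_test)
```

Both tests are green on the current code. Only the strong one turns red when the constraint is loosened, so only the strong one is worth writing to disk.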
This is what Quell's verification engine does for every test it generates. The write-to-disk step only happens after both checks pass.
Running this on your codebase
pip install quelltest
quell find src/ --no-llm
Quell scans Pydantic models alongside docstrings and type annotations. Each Field constraint, each Literal, each validator becomes a Requirement. The coverage checker then determines which ones have a test and which don't.
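To see why these constraints are machine-readable, here is a rough sketch of pulling them out of a model by hand. It assumes Pydantic v2, where Field constraints are stored as metadata objects on each field. The list_requirements function is hypothetical and is not how Quell's scanner is implemented; it also ignores custom validators, which are functions rather than metadata.

```python
from typing import Literal, get_args, get_origin

from pydantic import BaseModel, Field


class OrderRequest(BaseModel):
    amount: float = Field(gt=0)
    currency: Literal["GBP", "USD", "EUR", "INR"]
    items: list[str] = Field(min_length=1)


# Constraint attributes this sketch knows how to read.
KNOWN_CONSTRAINTS = ("gt", "ge", "lt", "le", "min_length", "max_length")


def list_requirements(model: type[BaseModel]) -> list[str]:
    """Return one readable line per declarative constraint on the model."""
    found = []
    for name, info in model.model_fields.items():
        # A Literal annotation defines a closed set of allowed values.
        if get_origin(info.annotation) is Literal:
            found.append(f"{name} in {sorted(get_args(info.annotation))}")
        # Field(gt=..., min_length=...) ends up as objects in info.metadata.
        for meta in info.metadata:
            for attr in KNOWN_CONSTRAINTS:
                if hasattr(meta, attr):
                    found.append(f"{name} {attr} {getattr(meta, attr)}")
    return found


requirements = list_requirements(OrderRequest)
```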
Output on a typical FastAPI project:
OrderRequest   ENUM_VALID   currency ∈ {GBP,USD,EUR,INR}   ✗ no test
OrderRequest   GT_ZERO      amount > 0                     ✗ no test
OrderRequest   MIN_LENGTH   items: len >= 1                ✗ no test
UserProfile    MUST_RAISE   email: @ required              ✗ no test
→ 4 gaps. Run with --fix to generate and verify.
With --fix, each gap gets a generated, verified test written directly into your test file via AST-safe injection. No string concatenation. No formatting issues. Clean diffs.
The workflow going forward
The best time to use this is at PR review. Before merging a new model or a new field constraint, run quell find src/models.py and confirm there's a test for each constraint. If there isn't, add --fix and commit the generated tests.
This costs about 30 seconds per PR, and it closes off an entire class of production bugs: constraints that were loosened or removed without any test noticing.
Quell on PyPI — MIT licensed, no config, no LLM key needed for rule-based constraints.