v2.0.0 · three-bucket output · PRS

Untested edge cases
will bite you
in production.
Quell finds them first.

Run quell find src/ and get three buckets back: tests written to disk, stubs to finish, and gaps with a one-line reason. Every WRITTEN test passed five gates — including proving it actually catches the bug.

$ pip install quelltest
no LLM key required · runs offline · MIT licensed
$ quell find src/payments/ --fix
v2.0.0
13 requirements · 12 functions · 2 Pydantic models
8 WRITTEN · 3 SCAFFOLDED · 2 FLAGGED
  test_payment_rejects_zero_amount     94%
  test_payment_rejects_negative        91%
  test_user_email_must_not_be_empty    88%
~ test_refund_idempotency              stub
  src/billing.py:142 (external API)
PRS: 84 / 100 Production Ready
code never left your machine
5 verification gates
every WRITTEN test passes all five
3 output buckets
WRITTEN · SCAFFOLDED · FLAGGED
~75% of cases handled by rule engine
no LLM, no network, deterministic
100% local by default
no source code leaves your machine
THE OUTPUT

Three buckets.
Nothing dropped silently.

Edge Cases Found: 13 requirements extracted
WRITTEN (5 / 5 gates): 8. Tests written to disk
~ SCAFFOLDED (partial gates): 3. Stubs for you to complete
FLAGGED (cannot test): 2. Cannot auto-test, reason shown
WRITTEN
5 / 5 gates passed
Test written and proven

Fully generated, verified to pass on correct code and fail when the guard is removed. Written to your test file via libcst — no string pasting.

# tests/test_payments.py
import pytest

def test_payment_rejects_zero_amount():
    with pytest.raises(ValueError):
        process_payment(amount=0, currency="USD")

# ✓ WRITTEN: all 5 gates passed

Source file restored immediately after gate 5 verification.

~
SCAFFOLDED
Partial — gates 1-3 passed
Stub ready for you to finish

Gates 1-3 passed, but gate 4 or 5 could not be verified automatically. Quell writes a stub with a clear comment on what's needed, and you fill in the rest.

# tests/test_payments.py
def test_refund_idempotency():
    # TODO: verify idempotent refund behavior
    # Quell: external state makes gate 5 unprovable
    raise NotImplementedError("complete me")

# ~ SCAFFOLDED: gates 1-3 passed

Stubs are valid Python — they run (and fail) immediately.

FLAGGED
Cannot auto-test
Gap documented, reason given

The requirement exists and is documented, but no automatable test path exists. Quell explains why — side effects, non-determinism, external service — so you can decide.

# FLAGGED requirements:

✗ test_external_payment_gateway
    Reason: calls stripe.Charge.create()
    Side effect detected, cannot inject violation

# -5 PRS per unflagged requirement

Each flagged item documents exactly why it cannot be auto-tested.
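The three buckets above can be read as a function of the gate results. The following is a minimal sketch of that mapping, not Quell's actual implementation: `GateResults`, `Bucket`, and `classify` are hypothetical names, and sending a candidate with any failed gate to FLAGGED is an assumption made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Bucket(Enum):
    WRITTEN = "WRITTEN"
    SCAFFOLDED = "SCAFFOLDED"
    FLAGGED = "FLAGGED"


@dataclass
class GateResults:
    """Per-gate outcome: True = passed, False = failed, None = could not be verified."""
    ast_valid: Optional[bool]
    original: Optional[bool]
    secure: Optional[bool]
    passes_correct: Optional[bool]
    fails_violated: Optional[bool]


def classify(results: GateResults) -> Bucket:
    static = (results.ast_valid, results.original, results.secure)
    dynamic = (results.passes_correct, results.fails_violated)

    # WRITTEN: all five gates passed.
    if all(gate is True for gate in static + dynamic):
        return Bucket.WRITTEN

    # SCAFFOLDED: gates 1-3 passed, gate 4 or 5 could not be verified.
    if all(gate is True for gate in static) and not any(gate is False for gate in dynamic):
        return Bucket.SCAFFOLDED

    # Assumption: anything else is surfaced as FLAGGED rather than dropped.
    return Bucket.FLAGGED
```

For example, `classify(GateResults(True, True, True, True, None))` lands in SCAFFOLDED because gate 5 was unverifiable rather than failed.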

PRODUCTION READINESS SCORE

One number.
How production-ready
are your edge cases?

PRS (0–100) aggregates how many of your requirements have verified tests. It rewards WRITTEN tests, penalizes uncovered gaps, and gives partial credit for SCAFFOLDED stubs. One number in CI.

Score Tiers
80 – 100  Production Ready
60 – 79   Review Needed
0 – 59    Needs Work
+5 for each documented FLAGGED requirement (you know what can't be tested)
-10 for each skipped high-confidence test (coverage gap you chose to ignore)
PRS gauge (scale 0 to 100, tier boundaries at 60 and 80)
avg confidence 91% · 13 edge cases
CI gate config
# pyproject.toml
[tool.quell]
prs_threshold = 80
fail_on_below_threshold = true
 
# CI gate — fails if PRS < 80
$ quell score --gate
PRS: 84/100 Production Ready ✓
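The tiers, adjustments, and CI gate described above can be sketched in a few lines. This is an illustration under assumptions, not Quell's scoring code: the function names are hypothetical, and the base score is taken as an input because the page does not spell out how it is computed. Only the tier boundaries, the +5 and -10 adjustments, and the threshold of 80 come from the text.

```python
def prs_tier(score: int) -> str:
    """Map a PRS value (0-100) to the tier names listed above."""
    if not 0 <= score <= 100:
        raise ValueError("PRS must be between 0 and 100")
    if score >= 80:
        return "Production Ready"
    if score >= 60:
        return "Review Needed"
    return "Needs Work"


def apply_adjustments(base: int, documented_flagged: int, skipped_high_confidence: int) -> int:
    """Apply +5 per documented FLAGGED item and -10 per skipped high-confidence test.

    Assumption: the result is clamped to the 0-100 range.
    """
    adjusted = base + 5 * documented_flagged - 10 * skipped_high_confidence
    return max(0, min(100, adjusted))


def gate_exit_code(score: int, threshold: int = 80) -> int:
    """Return 0 when the score meets the threshold, 1 otherwise (CI convention)."""
    return 0 if score >= threshold else 1
```

With the numbers shown on this page, `prs_tier(84)` is "Production Ready" and `gate_exit_code(84)` is 0, while a score of 52 falls in "Needs Work" and would fail the gate.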
HOW IT WORKS

From spec to verified test
in a few seconds.

Input Readers (Docstrings · Pydantic · PySpark StructType)
→ Requirements (list[Requirement])
→ Test Synthesizer (Rule engine · LLM fallback)
→ Verification (5-gate pipeline)
→ Writer (libcst injection)
→ WRITTEN · ~ SCAFFOLDED · FLAGGED
01

Read existing specs

Quell AST-scans your source files. No annotations required. It reads Python docstrings (numpy/google/plain), Pydantic model field validators and constraints, and PySpark StructType schemas. Each reader returns [] on any error — it never crashes.

# quell reads what's already there
from pydantic import BaseModel, Field

class PaymentRequest(BaseModel):
    amount: float = Field(gt=0, description="Must be positive")
    currency: str = Field(min_length=3, max_length=3)

# Extracted: MUST_RAISE, BOUNDARY, ENUM_VALID
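To make the reader step concrete, here is a minimal sketch of a docstring reader built on the standard `ast` module. It is not Quell's reader: the `Requirement` fields, the `read_docstrings` name, and the simple "Raises X" pattern are assumptions for illustration. It does follow the contract described above of returning `[]` on any error.

```python
import ast
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    kind: str       # e.g. "MUST_RAISE"
    function: str   # name of the function the requirement belongs to
    detail: str     # e.g. the exception name


# Hypothetical pattern: matches phrases such as "Raises ValueError".
_RAISES = re.compile(r"Raises\s+([A-Za-z_][A-Za-z0-9_]*)")


def read_docstrings(source: str) -> list[Requirement]:
    """Extract MUST_RAISE requirements from function docstrings; never raises."""
    try:
        found = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                doc = ast.get_docstring(node) or ""
                for exc_name in _RAISES.findall(doc):
                    found.append(Requirement("MUST_RAISE", node.name, exc_name))
        return found
    except Exception:
        # Reader contract from the text: return [] on any error.
        return []


SAMPLE = '''
def process_payment(amount: float):
    """Process a payment.

    Args:
        amount: Must be > 0. Raises ValueError
            if zero or negative."""
'''
```

Calling `read_docstrings(SAMPLE)` yields one MUST_RAISE requirement for `process_payment`, and unparseable input yields an empty list instead of an exception.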
02

Rule engine generates candidates

~75% of cases are handled by the deterministic rule engine — no LLM, no network. Rules handle MUST_RAISE, MUST_RETURN, BOUNDARY, ENUM_VALID, NOT_NULL, and TYPE_CHECK constraints. LLM is only called as a fallback for complex cases.

# Rule engine: BOUNDARY constraint

# From: amount: float = Field(gt=0)
# Generates:
def test_payment_rejects_zero_amount():
    with pytest.raises(ValidationError):
        PaymentRequest(amount=0, currency='USD')
03

5-gate verification pipeline

Every candidate test runs the 5-gate pipeline. Gates 1-3 are static (AST valid, not duplicate, no side effects). Gate 4 runs the test on the original code — it must pass. Gate 5 injects a violation and runs again — it must fail. Gate 5 is the moat.

Gate 1: AST Valid ✓ parses
Gate 2: Original ✓ not duplicate
Gate 3: Secure ✓ no side effects
Gate 4: Passes correct ✓ test passes
Gate 5: Fails violated ✓ violation caught
 
→ WRITTEN (5/5 gates)
04

Written to disk with libcst

Tests that pass all 5 gates are injected into your test file using libcst — Concrete Syntax Tree safe injection. No string concatenation, no overwriting. Quell backs up the file before writing, validates the CST, and restores on any failure. An audit log entry is appended.

# libcst injection — CST-safe
 
$ quell find src/
→ tests/test_payments.py (+8 tests)
→ tests/test_users.py (+3 tests)
 
Audit log: .quell/audit.jsonl
Backup: .quell/backups/
THE MOAT

Every WRITTEN test passes five gates.
Most tools run one.

~18% of generated tests that look correct actually fail gate 5 — they don't catch the bug they're supposed to catch.
1. AST Valid: parses to valid Python AST before any execution
2. Original: test not already present in any test file
3. Secure: no shell calls, no file system writes, no network
4. Passes Correct (THE MOAT): runs against original code. ✓ MUST PASS
5. Fails Violated (THE MOAT): runs against code with injected violation. ✗ MUST FAIL
Gate 4: Passes correct code
# Original code, guard intact
if amount <= 0:
    raise ValueError
✓ test passes (correct behavior)
Gate 5: Fails violated code (THE MOAT)
# Violation injected, guard removed
# if amount <= 0:
#     raise ValueError
pass
✗ test fails (violation detected)
Gate 4: Passes correct code
# Gate 4: test on ORIGINAL code

def process_payment(amount: float):
    if amount <= 0:  # guard intact
        raise ValueError('amount must be positive')

$ pytest -k test_payment_rejects_zero_amount
PASSED ✓ (gate 4 passed)
Gate 5: Fails violated code (THE MOAT)
# Gate 5: test on VIOLATED code

def process_payment(amount: float):
    # if amount <= 0:        <- guard removed
    #     raise ValueError   <- violation
    pass  # nothing raised

$ pytest -k test_payment_rejects_zero_amount
FAILED ✗ (gate 5 passed, bug caught)
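Gates 4 and 5 boil down to running the same test twice and expecting opposite outcomes. The sketch below shows that idea in isolation. It uses `exec` on source strings instead of invoking pytest, and `run_test` and `dynamic_gates` are hypothetical helpers, so treat it as an illustration of the logic only.

```python
ORIGINAL = '''
def process_payment(amount):
    if amount <= 0:
        raise ValueError("amount must be positive")
'''

VIOLATED = '''
def process_payment(amount):
    pass  # guard removed
'''

TEST = '''
def test_payment_rejects_zero_amount():
    try:
        process_payment(0)
    except ValueError:
        return
    raise AssertionError("ValueError was not raised")
'''


def run_test(code_source: str, test_source: str, test_name: str) -> bool:
    """Load the code and the test into one namespace; True if the test passes."""
    namespace = {}
    exec(code_source, namespace)
    exec(test_source, namespace)
    try:
        namespace[test_name]()
        return True
    except Exception:
        return False


def dynamic_gates(original: str, violated: str, test_source: str, test_name: str) -> tuple[bool, bool]:
    """Gate 4 passes when the test passes on the original code.
    Gate 5 passes when the same test fails on the violated code."""
    gate4 = run_test(original, test_source, test_name)
    gate5 = not run_test(violated, test_source, test_name)
    return gate4, gate5
```

Here `dynamic_gates(ORIGINAL, VIOLATED, TEST, "test_payment_rejects_zero_amount")` returns `(True, True)`: the test passes with the guard and fails without it.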
COVERAGE VS PRS

High coverage. Low PRS.
Both can be true simultaneously.

Coverage tells you which lines executed. It says nothing about whether those lines have any checks at the edge cases that matter. A 91% coverage score can coexist with a 52 PRS — same codebase, same tests, different measures.

Line Coverage (coverage.py): 91%
Production Readiness (PRS, quell score): 52 / 100
Same codebase. Both numbers correct. Coverage measures which lines ran. PRS measures whether tests actually catch bugs.
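A small experiment makes the gap between the two measures tangible. The sketch below uses `sys.settrace` as a stand-in for coverage.py (an assumption made to keep the example self-contained) and records which lines of a function run when it is called without asserting anything. `executed_lines` is a hypothetical helper.

```python
import sys

SOURCE = '''def process_payment(amount):
    if amount <= 0:
        raise ValueError("amount must be positive")
    return "ok"
'''


def executed_lines(source: str, amounts: list) -> set[int]:
    """Call process_payment for each amount and return the line numbers that ran."""
    namespace = {}
    exec(compile(source, "<payments>", "exec"), namespace)
    seen = set()

    def tracer(frame, event, arg):
        # Only trace frames that belong to the compiled source above.
        if frame.f_code.co_filename != "<payments>":
            return None
        if event == "line":
            seen.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        for amount in amounts:
            try:
                namespace["process_payment"](amount)
            except ValueError:
                pass  # swallowed: the "test" asserts nothing
    finally:
        sys.settrace(None)
    return seen
```

Calling `executed_lines(SOURCE, [10, 0])` touches every line of the function body (lines 2, 3, and 4), so line coverage for it is complete. Yet nothing in the loop checks the outcome, so the same calls would run just as happily with the guard removed. That is the case gate 5 is designed to expose.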
Constraint kinds & violation injections
Constraint    What it checks           Violation injection
MUST_RAISE    Expected exception       Remove raise statement
MUST_RETURN   Expected return value    Change return to wrong value
BOUNDARY      Numeric boundary check   Negate comparison operator
ENUM_VALID    Allowed set membership   Remove enum validation
NOT_NULL      None rejection           Remove None check
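Two of the injections in the table, removing a raise statement and negating a comparison, can be sketched as `ast.NodeTransformer` subclasses. This is an illustrative approach using the standard library, not a description of how Quell injects violations. The class and function names are hypothetical.

```python
import ast


class RemoveRaise(ast.NodeTransformer):
    """MUST_RAISE violation: replace every raise statement with pass."""

    def visit_Raise(self, node: ast.Raise) -> ast.AST:
        return ast.copy_location(ast.Pass(), node)


class NegateComparison(ast.NodeTransformer):
    """BOUNDARY violation: swap each comparison operator for its logical opposite."""

    _OPPOSITE = {
        ast.Lt: ast.GtE, ast.LtE: ast.Gt,
        ast.Gt: ast.LtE, ast.GtE: ast.Lt,
        ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
    }

    def visit_Compare(self, node: ast.Compare) -> ast.AST:
        self.generic_visit(node)
        # Operators without a listed opposite are left unchanged.
        node.ops = [self._OPPOSITE.get(type(op), type(op))() for op in node.ops]
        return node


def inject(source: str, transformer: ast.NodeTransformer) -> str:
    """Return the source with the violation applied (requires Python 3.9+ for ast.unparse)."""
    tree = transformer.visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


GUARDED = (
    "def process_payment(amount):\n"
    "    if amount <= 0:\n"
    "        raise ValueError('amount must be positive')\n"
)
```

`inject(GUARDED, RemoveRaise())` leaves the `if` in place but turns its body into `pass`, and `inject(GUARDED, NegateComparison())` rewrites the condition to `amount > 0`. Either variant should make a correct MUST_RAISE or BOUNDARY test fail.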
SPEC SOURCES

No annotations needed.
Quell reads what's already there.

Your docstrings, type annotations, and schema definitions already contain testable requirements. Quell extracts them without any changes to your source code.

Python Docstrings
Numpy, Google, plain, reStructuredText
def process_payment(amount: float):
    """Process a payment.

    Args:
        amount: Must be > 0. Raises ValueError
            if zero or negative."""
# → MUST_RAISE (ValueError, amount <= 0)
MUST_RAISE · BOUNDARY · NOT_NULL
Pydantic Models
v1 and v2, Field validators, model validators
class OrderRequest(BaseModel):
    quantity: int = Field(ge=1, le=999)
    sku: str = Field(min_length=6, max_length=12)
    status: Literal['new','paid','shipped']

# → BOUNDARY, ENUM_VALID, TYPE_CHECK
BOUNDARY · ENUM_VALID · TYPE_CHECK
PySpark Schemas
StructType, StructField, nullable=False constraints
schema = StructType([
    StructField('user_id', LongType(), nullable=False),
    StructField('amount', DoubleType(), nullable=False),
    StructField('currency', StringType(), nullable=True),
])
# → NOT_NULL (user_id, amount)
NOT_NULL · TYPE_CHECK
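Reading a schema like the one above does not require importing PySpark, because the constraints are visible in the source text. The following sketch walks the AST and collects fields declared with `nullable=False`. It is a simplified illustration, not Quell's reader: `not_null_fields` is a hypothetical name, and only the keyword form `nullable=False` is handled.

```python
import ast


def not_null_fields(source: str) -> list[str]:
    """Return names of StructField(...) entries declared with nullable=False."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []  # reader contract: never crash on bad input

    fields = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "StructField" or not node.args:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue
        # Only the keyword form is recognized; a positional third argument is ignored.
        for keyword in node.keywords:
            if (
                keyword.arg == "nullable"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is False
            ):
                fields.append(first.value)
    return fields


SCHEMA_SOURCE = """
schema = StructType([
    StructField('user_id', LongType(), nullable=False),
    StructField('amount', DoubleType(), nullable=False),
    StructField('currency', StringType(), nullable=True),
])
"""
```

For `SCHEMA_SOURCE`, the function reports `user_id` and `amount`, matching the NOT_NULL extraction shown in the card above.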

OpenAPI, TypeScript types, and mutation results are on the roadmap.

VERSUS

Most tools run one gate.
We run five.

Feature · Quell (quelltest) · GitHub Copilot · Qodo (CodiumAI) · Hypothesis
Reads existing specs (no annotation)
Deterministic rule engine (no LLM needed)
Gate 4: test passes on correct code (partial, partial)
Gate 5: test fails on violated code
Works offline (no network required)
Writes verified tests to disk (libcst)
Three-bucket output (WRITTEN / SCAFFOLDED / FLAGGED)
PRS production readiness score
No LLM API key required
Source file restore guarantee (finally block)
Supports Pydantic + PySpark schemas (partial)
MIT licensed, runs in CI

Comparison as of May 2026. Information sourced from public documentation.

PRICING

Free to start.
Scale when you need to.

Hobby
Free

For individuals exploring Quell on personal projects.

  • 500 verifications / month
  • Python docstrings + Pydantic
  • 3-bucket output
  • CLI access
  • Community support
  • MIT licensed
Pro
Most Popular
$19/month

For individual developers shipping production Python.

  • Unlimited verifications
  • All spec sources (Pydantic, PySpark, docstrings)
  • PRS score + CI gate
  • Priority rule engine updates
  • GitHub Actions integration
  • Email support
Team
$79/month

For teams that need shared PRS tracking and audit logs.

  • Everything in Pro
  • Up to 10 team members
  • Shared PRS dashboard
  • Audit log export
  • Dedicated support

All plans include the offline rule engine. No LLM API key required for any tier. Cancel anytime.

FREQUENTLY ASKED

Questions, answered.

Do I need an LLM API key?

No. The rule engine handles ~75% of cases offline and deterministically. MUST_RAISE, MUST_RETURN, BOUNDARY, ENUM_VALID, NOT_NULL, and TYPE_CHECK are all purely rule-based. The LLM fallback is optional and only activates on complex cases you opt into.

Stop shipping
untested edge cases.

Quell reads your existing specs, generates verified tests, and writes them to disk. No LLM key. No internet. No guessing whether your tests actually catch bugs — they do.

$ pip install quelltest

MIT licensed · Python 3.11+ · runs offline · no LLM key required