Quell reads your docstrings, Pydantic models, and type annotations, extracts every testable requirement, finds which ones have no test, generates pytest tests via a rule engine, verifies each test through a 5-gate pipeline, and writes only proven tests to disk.

Does Quell require an LLM API key?

No. The rule engine runs entirely in-process, so no source code is ever transmitted. It handles ~75% of edge cases with no network call and no API key. The LLM fallback is opt-in and sends only the function signature, never the full body.

What is the 5-gate pipeline?

Every generated test must pass: Gate 1 (AST valid Python), Gate 2 (not already in a test file), Gate 3 (no shell calls or file writes), Gate 4 (passes against original code), Gate 5 (fails when the requirement is violated). Only gate-5-verified tests are written to disk.

What is the Production Readiness Score (PRS)?

PRS = (WRITTEN × 1.0 + SCAFFOLDED × 0.5) / total_requirements × 100. Tiers: 80-100 Production Ready, 60-79 Review Needed, 0-59 Needs Work.

How is Quell different from GitHub Copilot or Qodo for test generation?

Quell reads specifications that already exist in your code — it does not generate tests from scratch. It finds requirements already documented in your docstrings, Pydantic models, and type annotations that have no test. The 5-gate pipeline, especially Gate 5 (violation injection), verifies each test actually catches the bug it claims to catch. This verification step is not present in Copilot, Qodo, or Hypothesis.

Can Quell be used in CI pipelines?

Yes. Run quell ci src/ --threshold 80 to fail CI if PRS falls below 80. Set prs_threshold in pyproject.toml under [tool.quell]. Works with GitHub Actions, GitLab CI, and any system that checks exit codes.

✦ New: Quell now has a Claude connector. Ask Claude about your production readiness in plain English. See how it works →

v2.0.1 · three-bucket output · PRS

Untested edge cases
will bite you
in production.
Quell finds them first.

Run quell find src/ and get three buckets back: tests written to disk (WRITTEN), stubs to finish (SCAFFOLDED), and gaps with a one-line reason (FLAGGED). Every WRITTEN test passed five gates, including Gate 5, which injects the violation and confirms the test actually fails. Rule engine. No API key. Works offline.

$ pip install quelltest

rule engine, no API key · runs offline · MIT licensed

$ quell find src/payments/ --fix

v2.0.1
13 requirements · 12 functions · 2 Pydantic models

WRITTEN
✓ test_payment_rejects_zero_amount     94%
✓ test_payment_rejects_negative        91%
✓ test_user_email_must_not_be_empty    88%

SCAFFOLDED
~ test_refund_idempotency              stub

FLAGGED
✗ src/billing.py:142 (external API)

PRS: 84 / 100  Production Ready

scanner ran in-process. nothing transmitted.

5 verification gates: every WRITTEN test passes all five

3 output buckets: WRITTEN · SCAFFOLDED · FLAGGED

~75% of cases handled by rule engine: no network calls, deterministic

Local by default: no source code leaves your machine

THE OUTPUT

Three buckets.
Nothing dropped silently.

Edge cases found: 13 requirements extracted, grouped by verification result (5 / 5 gates · partial gates · cannot test).

✓ WRITTEN — Tests written to disk

~ SCAFFOLDED — Stubs for you to complete

✗ FLAGGED — Cannot auto-test, reason shown

✓ WRITTEN

5 / 5 gates passed

Test written and proven

Fully generated, verified to pass on correct code and fail when the guard is removed. Written to your test file via libcst — no string pasting.

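# tests/test_payments.py
import pytest

from payments import process_payment  # hypothetical module path, adjust to your project
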
def test_payment_rejects_zero_amount():
    with pytest.raises(ValueError):
        process_payment(amount=0, currency="USD")
 
# ✓ WRITTEN — all 5 gates passed

Source file restored immediately after gate 5 verification.

~ SCAFFOLDED

Partial — gates 1-3 passed

Stub ready for you to finish

Gates 1-3 passed, but gate 4 or 5 could not be verified automatically. Quell writes a stub with a clear comment on what's needed, and you fill in the rest.

# tests/test_payments.py
def test_refund_idempotency():
    # TODO: verify idempotent refund behavior
    # Quell: external state makes gate 5 unprovable
    raise NotImplementedError("complete me")  # fails until you finish the stub
 
# ~ SCAFFOLDED — gates 1-3 passed

Stubs are valid Python — they run (and fail) immediately.

✗ FLAGGED

Cannot auto-test

Gap documented, reason given

The requirement exists and is documented, but no automatable test path exists. Quell explains why — side effects, non-determinism, external service — so you can decide.

# FLAGGED requirements:
 
✗ test_external_payment_gateway
  Reason: calls stripe.Charge.create()
  Side effect detected — cannot inject violation
 
# -5 PRS per unflagged requirement

Each flagged item documents exactly why it cannot be auto-tested.

PRODUCTION READINESS SCORE

One number.
How production-ready
are your edge cases?

PRS (0–100) aggregates how many of your requirements have verified tests. It rewards WRITTEN tests, penalizes uncovered gaps, and gives partial credit for SCAFFOLDED stubs. One number in CI.

Score Tiers

80 – 100: Production Ready

60 – 79: Review Needed

0 – 59: Needs Work

+5 for each documented FLAGGED requirement (you know what can't be tested)

-10 for each skipped high-confidence test (coverage gap you chose to ignore)
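As a rough illustration, the base formula quoted in the FAQ, PRS = (WRITTEN × 1.0 + SCAFFOLDED × 0.5) / total_requirements × 100, and the three tiers can be sketched in a few lines of Python. The function names are made up for this example, and the +5 / -10 adjustments are left out because the page does not say how they combine with the base score.

```python
def production_readiness_score(written: int, scaffolded: int, total_requirements: int) -> float:
    """Base PRS: full credit for WRITTEN tests, half credit for SCAFFOLDED stubs."""
    if total_requirements <= 0:
        raise ValueError("total_requirements must be positive")
    return (written * 1.0 + scaffolded * 0.5) / total_requirements * 100


def tier(prs: float) -> str:
    """Map a score to the tier names listed above."""
    if prs >= 80:
        return "Production Ready"
    if prs >= 60:
        return "Review Needed"
    return "Needs Work"


# Hypothetical counts: 10 WRITTEN and 2 SCAFFOLDED out of 13 requirements.
score = production_readiness_score(written=10, scaffolded=2, total_requirements=13)
print(f"PRS: {score:.1f} / 100  {tier(score)}")
```

With these made-up counts the base score is 11 / 13 × 100, roughly 84.6, which lands in the Production Ready tier.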

WRITTEN · SCAFFOLDED · FLAGGED: avg confidence 91% · 13 edge cases

CI gate config

# pyproject.toml
[tool.quell]
prs_threshold = 80
fail_on_below_threshold = true
 
# CI gate — fails if PRS < 80
$ quell score --gate
  PRS: 84/100  Production Ready  ✓

HOW IT WORKS

From spec to verified test
in a few seconds.

Input Readers: Docstrings · Pydantic · PySpark StructType

Requirements: list[Requirement]

Test Synthesizer: Rule engine · LLM fallback

Verification: 5-gate pipeline

Writer: libcst injection

Output: ✓ WRITTEN · ~ SCAFFOLDED · ✗ FLAGGED

Read existing specs

Quell AST-scans your source files. No annotations required. It reads Python docstrings (numpy/google/plain), Pydantic model field validators and constraints, and PySpark StructType schemas. Each reader returns [] on any error — it never crashes.

# quell reads what's already there
from pydantic import BaseModel, Field
 
class PaymentRequest(BaseModel):
    amount: float = Field(gt=0, description="Must be positive")
    currency: str = Field(min_length=3, max_length=3)
 
# Extracted: MUST_RAISE, BOUNDARY, ENUM_VALID
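The reading step can be pictured with the standard-library `ast` module. The sketch below is an illustration only, not Quell's reader: the function name and the keyword set are hypothetical. It walks a source string, collects the literal constraints passed to `Field(...)`, and returns an empty list on a syntax error, mirroring the "returns [] on any error" behavior described above.

```python
import ast

SOURCE = '''
class PaymentRequest(BaseModel):
    amount: float = Field(gt=0, description="Must be positive")
    currency: str = Field(min_length=3, max_length=3)
'''

# Hypothetical subset of constraint keywords, chosen for this example.
CONSTRAINT_KEYWORDS = {"gt", "ge", "lt", "le", "min_length", "max_length"}


def extract_field_constraints(source: str) -> list[tuple[str, str, object]]:
    """Return (field_name, keyword, value) for each literal Field(...) constraint."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []  # never crash on unreadable input
    found = []
    for node in ast.walk(tree):
        # Only annotated assignments of the form `name: type = Field(...)`.
        if not isinstance(node, ast.AnnAssign) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        if not (isinstance(func, ast.Name) and func.id == "Field"):
            continue
        if not isinstance(node.target, ast.Name):
            continue
        for keyword in node.value.keywords:
            if keyword.arg in CONSTRAINT_KEYWORDS and isinstance(keyword.value, ast.Constant):
                found.append((node.target.id, keyword.arg, keyword.value.value))
    return found


for constraint in extract_field_constraints(SOURCE):
    print(constraint)
```

Because the source is parsed rather than imported, nothing in the scanned file is executed.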

Rule engine generates candidates

Deterministic rule engine. Runs in-process, no network calls, no tokens consumed. Covers ~75% of real edge cases — MUST_RAISE, BOUNDARY, ENUM_VALID, NOT_NULL, TYPE_CHECK, MUST_RETURN. LLM handles the rest, only if you configure one.

# Rule engine: BOUNDARY constraint
 
# From: amount: float = Field(gt=0)
# Generates:
import pytest
from pydantic import ValidationError

def test_payment_rejects_zero_amount():
    with pytest.raises(ValidationError):
        PaymentRequest(amount=0, currency='USD')

5-gate verification pipeline

Every candidate test runs the 5-gate pipeline. Gates 1-3 are static (AST valid, not duplicate, no side effects). Gate 4 runs the test on the original code — it must pass. Gate 5 injects a violation and runs again — it must fail. Gate 5 is the moat.

Gate 1: AST Valid        ✓ parses
Gate 2: Original         ✓ not duplicate
Gate 3: Secure           ✓ no side effects
Gate 4: Passes correct   ✓ test passes
Gate 5: Fails violated   ✓ violation caught
 
→ WRITTEN  (5/5 gates)

Written to disk with libcst

Tests that pass all 5 gates are injected into your test file using libcst — Concrete Syntax Tree safe injection. No string concatenation, no overwriting. Quell backs up the file before writing, validates the CST, and restores on any failure. An audit log entry is appended.

# libcst injection — CST-safe
 
$ quell find src/
  → tests/test_payments.py (+8 tests)
  → tests/test_users.py (+3 tests)
 
  Audit log: .quell/audit.jsonl
  Backup: .quell/backups/

THE MOAT

Every WRITTEN test passes five gates.
Most tools run one.

~18% of generated tests that look correct are actually wrong — they run green on CI but wouldn't catch the bug they claim to catch. Gate 5 injects the violation and demands the test fails. That's the gate nobody else runs.

1. AST Valid: Parses to valid Python AST before any execution
2. Original: Test not already present in any test file
3. Secure: No shell calls, no file system writes, no network
4. Passes Correct: Runs against original code (✓ MUST PASS)
5. Fails Violated: Runs against code with injected violation (✗ MUST FAIL)

THE MOAT — Only Quell verifies both

Gate 4 — Passes correct code

# Gate 4: test on ORIGINAL code
 
def process_payment(amount: float):
    if amount <= 0:  # guard intact
        raise ValueError('amount must be positive')
 
$ pytest -k test_payment_rejects_zero_amount
  PASSED  ✓ (gate 4 passed)

Gate 5 — Fails violated code (THE MOAT)

# Gate 5: test on VIOLATED code
 
def process_payment(amount: float):
    # if amount <= 0:  <- guard removed
    #     raise ValueError  <- violation
    pass  # nothing raised
 
$ pytest -k test_payment_rejects_zero_amount
  FAILED  ✗ (gate 5 passed — bug caught)
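The two dynamic gates can be reproduced end to end in a small, self-contained sketch. It is an illustration of the idea rather than Quell's mechanism: the transformer, the loader, and the check function are hypothetical names, and the "test" is a plain function returning True or False instead of a pytest run.

```python
import ast

ORIGINAL = '''
def process_payment(amount: float):
    if amount <= 0:
        raise ValueError("amount must be positive")
    return "ok"
'''


class RemoveRaise(ast.NodeTransformer):
    """MUST_RAISE violation: replace every raise statement with pass."""

    def visit_Raise(self, node: ast.Raise) -> ast.AST:
        return ast.copy_location(ast.Pass(), node)


def inject_violation(source: str) -> str:
    tree = RemoveRaise().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


def load(source: str):
    """Execute the source in a fresh namespace and return process_payment."""
    namespace: dict = {}
    exec(compile(source, "<sketch>", "exec"), namespace)
    return namespace["process_payment"]


def check_rejects_zero(process_payment) -> bool:
    """Stand-in for the generated test: True means the test passes."""
    try:
        process_payment(amount=0)
    except ValueError:
        return True
    return False


gate4_passed = check_rejects_zero(load(ORIGINAL))                        # must pass
gate5_passed = not check_rejects_zero(load(inject_violation(ORIGINAL)))  # must fail
print(gate4_passed, gate5_passed)
```

Working on a copy of the source in memory means the file on disk is never modified in this sketch.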

COVERAGE VS PRS

High coverage. Low PRS.
Both can be true simultaneously.

Coverage tells you which lines executed. It says nothing about whether those lines have any checks at the edge cases that matter. A 91% coverage score can coexist with a 52 PRS — same codebase, same tests, different measures.

Line Coverage (coverage.py): 91%

Production Readiness (PRS, quell score): 52 / 100

Same codebase. Both numbers correct. Coverage measures which lines ran. PRS measures whether tests actually catch bugs.
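A small, hypothetical example shows how both numbers can be right at once. The weak check below executes every line of `apply_discount`, including the `raise`, so line coverage is complete, yet it still passes when the guard is removed. The strong check asserts on the exception and therefore notices. All names are invented for illustration.

```python
def apply_discount(price: float, percent: float) -> float:
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


def apply_discount_violated(price: float, percent: float) -> float:
    # Same function with the guard removed.
    return price * (1 - percent / 100)


def weak_check(fn) -> bool:
    """Executes every line of fn but asserts nothing about the guard."""
    fn(100, 10)
    try:
        fn(100, 150)
    except ValueError:
        pass  # swallowed either way, so a missing guard goes unnoticed
    return True


def strong_check(fn) -> bool:
    """Passes only if the out-of-range percent is rejected."""
    try:
        fn(100, 150)
    except ValueError:
        return True
    return False


print(weak_check(apply_discount), weak_check(apply_discount_violated))
print(strong_check(apply_discount), strong_check(apply_discount_violated))
```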

Constraint kinds & violation injections

Constraint	What it checks	Violation injection
MUST_RAISE	Expected exception	Remove raise statement
MUST_RETURN	Expected return value	Change return to wrong value
BOUNDARY	Numeric boundary check	Negate comparison operator
ENUM_VALID	Allowed set membership	Remove enum validation
NOT_NULL	None rejection	Remove None check
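As one concrete instance of the table, the BOUNDARY injection ("negate comparison operator") can be sketched with an `ast.NodeTransformer`. The class and function names are hypothetical, and the sketch is only meaningful for single comparisons; negating each operator in a chained comparison such as `a < b < c` is not a true logical negation.

```python
import ast

# Each comparison operator mapped to its logical opposite.
NEGATED = {
    ast.Lt: ast.GtE, ast.LtE: ast.Gt,
    ast.Gt: ast.LtE, ast.GtE: ast.Lt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


class NegateComparisons(ast.NodeTransformer):
    def visit_Compare(self, node: ast.Compare) -> ast.AST:
        self.generic_visit(node)
        node.ops = [NEGATED.get(type(op), type(op))() for op in node.ops]
        return node


def negate_boundary(source: str) -> str:
    tree = NegateComparisons().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


ORIGINAL = (
    "def check_amount(amount):\n"
    "    if amount <= 0:\n"
    "        raise ValueError('amount must be positive')\n"
    "    return amount\n"
)
VIOLATED = negate_boundary(ORIGINAL)
print(VIOLATED)
```

After the injection the guard fires for positive amounts instead of non-positive ones, so a test that expects `ValueError` at `amount=0` no longer sees it.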

SPEC SOURCES

No annotations needed.
Quell reads what's already there.

Your docstrings, type annotations, and schema definitions already contain testable requirements. Quell extracts them without any changes to your source code.

Python Docstrings

Numpy, Google, plain, reStructuredText

def process_payment(amount: float):
    """Process a payment.
 
    Args:
        amount: Must be > 0. Raises ValueError
                if zero or negative."""
# → MUST_RAISE (ValueError, amount <= 0)

MUST_RAISE · BOUNDARY · NOT_NULL
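One way to picture docstring extraction is a pattern match over the text returned by `ast.get_docstring`. The regular expression below is a hypothetical heuristic written for this example; a real reader would need to understand the Numpy, Google, and reStructuredText layouts rather than a single phrase.

```python
import ast
import re

SOURCE = '''
def process_payment(amount: float):
    """Process a payment.

    Args:
        amount: Must be > 0. Raises ValueError
                if zero or negative."""
'''

# Hypothetical heuristic: the word "Raises" followed by an exception class name.
RAISES_PATTERN = re.compile(r"Raises\s+(\w+(?:Error|Exception))")


def docstring_raises(source: str) -> dict[str, list[str]]:
    """Map each function name to the exception names its docstring says it raises."""
    result: dict[str, list[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            doc = ast.get_docstring(node) or ""
            found = RAISES_PATTERN.findall(doc)
            if found:
                result[node.name] = found
    return result


print(docstring_raises(SOURCE))
```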

Pydantic Models

v1 and v2, Field validators, model validators

from typing import Literal

from pydantic import BaseModel, Field

class OrderRequest(BaseModel):
    quantity: int = Field(ge=1, le=999)
    sku: str = Field(min_length=6, max_length=12)
    status: Literal['new','paid','shipped']
 
# → BOUNDARY, ENUM_VALID, TYPE_CHECK

BOUNDARY · ENUM_VALID · TYPE_CHECK

PySpark Schemas

StructType, StructField, nullable=False constraints

from pyspark.sql.types import (
    DoubleType, LongType, StringType, StructField, StructType,
)

schema = StructType([
    StructField('user_id', LongType(), nullable=False),
    StructField('amount', DoubleType(), nullable=False),
    StructField('currency', StringType(), nullable=True),
])
# → NOT_NULL (user_id, amount)

NOT_NULL · TYPE_CHECK

OpenAPI, TypeScript types, and mutation results are on the roadmap.

VERSUS

Most tools run one gate.
We run five.

Feature	Quell (quelltest)	GitHub Copilot	Qodo (CodiumAI)	Hypothesis
Reads existing specs (no annotation)	✓	✗	✗	✗
Deterministic rule engine (no LLM needed)	✓	✗	✗	✓
Gate 4: test passes on correct code	✓	partial	partial	✗
Gate 5: test fails on violated code	✓	✗	✗	✗
Offline by default — rule engine needs no API key	✓	✗	✗	✓
Writes verified tests to disk (libcst)	✓	✗	✗	✗
Three-bucket output (WRITTEN / SCAFFOLDED / FLAGGED)	✓	✗	✗	✗
PRS production readiness score	✓	✗	✗	✗
Source code stays on disk — nothing transmitted	✓	✗	✗	✗
Supports Pydantic + PySpark schemas	✓	✗	partial	✗
Claude.ai native connector	✓	✗	✗	✗
MIT licensed, runs in CI	✓	✗	✗	✓

Comparison as of June 2026. Information sourced from public documentation.

What Quell does

Quell does not touch your source code.
It writes tests for it.

If you need to safely refactor existing code, that's a different tool. Quell's job is to find the edge cases in your code that have no tests yet, and write the pytest tests that prove they're covered.

CLAUDE CONNECTOR

Ask Claude about your edge cases.

Connect Quell to Claude.ai. Ask in plain English — no terminal required.

Claude + Quell connector

What's blocking us from 80 PRS on the payments module?

Two things: test_refund_idempotency is scaffolded and 11 days old (that's −5 PRS). And billing.py:42 is flagged because it calls stripe.Charge.create() directly — Quell can't inject a violation into a live Stripe call. If you mock that call, Quell can write the test next run.

Which tests should I review before merging?

3 tests are MEDIUM confidence (60–84). test_apply_discount_boundary at 72% and test_process_payment_zero at 68% are worth a glance — both rely on docstring specs rather than Pydantic constraints.

Sync your Quell reports to the cloud (Pro/Team). Connect the Quell Claude connector. Then ask Claude anything about your production readiness — no terminal required.

1. Run quell find src/ --fix --sync to push your report
2. Open Claude.ai → Connectors → search "Quell" → Connect
3. Ask Claude anything about your PRS, flagged items, or test confidence

🔒

Your source code stays on your machine. Always.

The connector sees test names, confidence scores, and flagged reasons. Never your source code, test bodies, or docstrings.

Full privacy details →

PRICING

Free to start.
Scale when you need to.

Hobby

Free

For individuals exploring Quell on personal projects.

✓ 500 verifications / month
✓ Python docstrings + Pydantic
✓ 3-bucket output
✓ CLI access
✓ Community support
✓ MIT licensed

Pro

Questions, answered.

Does Quell require an LLM API key?

No. The rule engine handles ~75% of cases with zero network calls and no API key. MUST_RAISE, MUST_RETURN, BOUNDARY, ENUM_VALID, NOT_NULL, and TYPE_CHECK are all purely rule-based. The LLM fallback is optional and only activates on complex cases you opt into.

Stop shipping
untested edge cases.

Quell reads your existing specs, finds the edge cases with no tests, and writes verified pytest tests to disk. Every test is proven to fail when the bug is injected. Gap-first. Confidence-first.

$ pip install quelltest

MIT licensed · Python 3.11+ · runs offline · rule engine, no API key
