API Testing with Contract Checks: Validate JSON Schemas in CI (Python + pytest)


Most teams test APIs by asserting a couple of fields (“status is 200”, “id exists”). That catches obvious breakage, but it won’t catch a subtle change like a field becoming nullable, a string turning into an integer, or a nested object shape shifting. Those are the changes that quietly break frontends and integrations.

A practical upgrade is contract testing at the HTTP layer: call real endpoints (staging/local), then validate responses against a JSON Schema. This is lightweight enough for junior/mid devs, and powerful enough to prevent accidental breaking changes.

In this hands-on guide, you’ll build a small pytest suite that:

  • Calls an API endpoint
  • Validates the response against a JSON Schema
  • Checks schema compatibility (additive changes OK, breaking changes fail)
  • Runs in CI with clear failures

What you’ll build

We’ll assume your API has an endpoint like GET /api/users/123 returning JSON:

{
  "id": 123,
  "email": "[email protected]",
  "name": "Dev User",
  "roles": ["editor", "admin"],
  "profile": {
    "bio": "Hello",
    "website": "https://example.com"
  },
  "created_at": "2026-03-01T12:34:56Z"
}

Your tests will validate that shape using a schema file (checked into your repo) so a PR that changes output must either remain compatible—or update the contract deliberately.

Project setup

Create a test folder and install dependencies:

pip install pytest requests jsonschema

Suggested structure:

your-repo/
  contracts/
    user.schema.json
  tests/
    test_contracts.py
    conftest.py

Define a JSON Schema contract

Create contracts/user.schema.json:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "User",
  "type": "object",
  "additionalProperties": false,
  "required": ["id", "email", "name", "roles", "profile", "created_at"],
  "properties": {
    "id": { "type": "integer" },
    "email": { "type": "string", "format": "email" },
    "name": { "type": "string", "minLength": 1 },
    "roles": {
      "type": "array",
      "items": { "type": "string" },
      "minItems": 0
    },
    "profile": {
      "type": "object",
      "additionalProperties": false,
      "required": ["bio", "website"],
      "properties": {
        "bio": { "type": "string" },
        "website": { "type": ["string", "null"], "format": "uri" }
      }
    },
    "created_at": { "type": "string", "format": "date-time" }
  }
}

Key choices:

  • additionalProperties: false forces the API to only return known fields. This is strict. If your team prefers “additive fields are OK”, set this to true.
  • required lists fields clients depend on. Be intentional here.
  • Use type: ["string","null"] for nullable fields so you don’t “accidentally” break clients by returning null.

Write the contract test with pytest

Create tests/conftest.py to configure a base URL:

import os

import pytest


@pytest.fixture(scope="session")
def base_url() -> str:
    # Example: http://localhost:8000 or https://staging.example.com
    return os.environ.get("API_BASE_URL", "http://localhost:8000")

Now create tests/test_contracts.py:

import json
from pathlib import Path

import requests
from jsonschema import Draft202012Validator, FormatChecker

CONTRACT_DIR = Path(__file__).resolve().parents[1] / "contracts"


def load_schema(name: str) -> dict:
    path = CONTRACT_DIR / name
    return json.loads(path.read_text(encoding="utf-8"))


def validate_json(instance: dict, schema: dict) -> None:
    validator = Draft202012Validator(schema, format_checker=FormatChecker())
    # Sort by the JSON path string; err.path itself is a deque that can mix
    # ints and strings, which makes direct comparison unreliable.
    errors = sorted(validator.iter_errors(instance), key=lambda e: e.json_path)
    if errors:
        pretty = []
        for err in errors:
            loc = "$" + "".join(
                f"[{p!r}]" if isinstance(p, int) else f".{p}"
                for p in err.path
            )
            pretty.append(f"{loc}: {err.message}")
        raise AssertionError("Schema validation failed:\n" + "\n".join(pretty))


def test_user_contract(base_url: str):
    schema = load_schema("user.schema.json")
    r = requests.get(f"{base_url}/api/users/123", timeout=10)
    assert r.status_code == 200, r.text
    data = r.json()
    validate_json(data, schema)

Run it:

API_BASE_URL=http://localhost:8000 pytest -q

If the API returns an extra field (and your schema has additionalProperties: false), you’ll get a clear failure. If the API returns id as a string, you’ll get a type error. This is exactly what you want in CI: fast, deterministic breakage detection.

Make tests robust: handle non-deterministic fields

Some endpoints include data that varies (timestamps, request IDs, pagination tokens). You have three common options:

  • Model it in the schema (e.g., timestamps as format: date-time).
  • Exclude it from the contract if clients don’t rely on it.
  • Write a small “normalizer” that strips noisy fields before validation.

Example normalizer for a trace ID you don’t want in contracts:

def normalize_user(payload: dict) -> dict:
    payload = dict(payload)
    payload.pop("trace_id", None)
    return payload


def test_user_contract(base_url: str):
    schema = load_schema("user.schema.json")
    r = requests.get(f"{base_url}/api/users/123", timeout=10)
    assert r.status_code == 200, r.text
    data = normalize_user(r.json())
    validate_json(data, schema)

Guard against breaking changes: compare “old vs new” schema

Validating one response against one schema is good. But you can also prevent PRs from introducing breaking contract changes by checking schema compatibility.

Here’s a simple (not perfect, but effective) compatibility check:

  • Removing a required field is breaking.
  • Changing a field’s type is breaking.
  • Tightening constraints (like increasing minLength) can be breaking.
  • Adding optional fields is usually safe.

Create tests/test_schema_compat.py:

import json
from pathlib import Path

CONTRACT_DIR = Path(__file__).resolve().parents[1] / "contracts"


def load(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


def types_of(schema: dict) -> set[str]:
    t = schema.get("type")
    if t is None:
        return set()
    if isinstance(t, list):
        return set(t)
    return {t}


def assert_schema_compatible(old: dict, new: dict, path: str = "$"):
    # Required fields must not be removed
    old_req = set(old.get("required", []))
    new_req = set(new.get("required", []))
    removed_required = old_req - new_req
    if removed_required:
        raise AssertionError(f"{path}: removed required fields: {sorted(removed_required)}")

    # If both are objects, compare properties recursively
    if "properties" in old and "properties" in new:
        old_props = old["properties"]
        new_props = new["properties"]
        for key, old_prop_schema in old_props.items():
            if key not in new_props:
                # Property removed: breaking if it was required
                if key in old_req:
                    raise AssertionError(f"{path}.{key}: required property removed")
                continue
            new_prop_schema = new_props[key]
            old_types = types_of(old_prop_schema)
            new_types = types_of(new_prop_schema)
            # Type narrowing is breaking (e.g., ["string","null"] -> ["string"])
            if old_types and new_types and not old_types.issubset(new_types):
                raise AssertionError(
                    f"{path}.{key}: type narrowed from {sorted(old_types)} to {sorted(new_types)}"
                )
            # Recurse into nested objects
            assert_schema_compatible(old_prop_schema, new_prop_schema, f"{path}.{key}")


def test_user_schema_is_backward_compatible():
    # Convention: commit previous schema copy in repo for comparison.
    # Example: contracts/user.schema.prev.json is the "last released" contract.
    old_path = CONTRACT_DIR / "user.schema.prev.json"
    new_path = CONTRACT_DIR / "user.schema.json"
    if not old_path.exists():
        # First run: no baseline to compare against.
        # In a real repo, you’d add the baseline once and then enforce compatibility.
        return
    old_schema = load(old_path)
    new_schema = load(new_path)
    assert_schema_compatible(old_schema, new_schema)

How to use it in practice:

  • When you release a contract, copy user.schema.json to user.schema.prev.json (or tag it in a release branch).
  • In PRs, developers update user.schema.json to reflect intended changes.
  • CI fails if the schema change is breaking, forcing a discussion (version bump, new endpoint version, or client migration plan).
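The "copy the released schema to the baseline" step can be a one-line cp, or a tiny helper committed next to the contracts. Here is a hedged sketch following the file conventions above (the freeze_baseline name is illustrative):

```python
# Helper to "freeze" the current contract as the new compatibility baseline.
# Paths follow the contracts/ layout from this guide; adjust to your repo.
import shutil
from pathlib import Path


def freeze_baseline(contract_dir: Path, name: str = "user") -> Path:
    """Copy <name>.schema.json over <name>.schema.prev.json; return the baseline path."""
    current = contract_dir / f"{name}.schema.json"
    baseline = contract_dir / f"{name}.schema.prev.json"
    shutil.copyfile(current, baseline)
    return baseline
```

Run it as part of your release checklist, e.g. freeze_baseline(Path("contracts")), then commit both files so the next PR is compared against the frozen copy.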

Run it in CI

The CI job should call your API (local container, docker-compose, or staging). The minimal GitHub Actions step looks like:

- name: Run contract tests
  env:
    API_BASE_URL: http://localhost:8000
  run: |
    pip install -r requirements-dev.txt
    pytest -q

If your API requires auth, inject a token via environment variables and add headers in the request:

import os


def auth_headers() -> dict:
    token = os.environ.get("API_TOKEN")
    if not token:
        return {}
    return {"Authorization": f"Bearer {token}"}


# usage:
# r = requests.get(f"{base_url}/api/users/123", headers=auth_headers(), timeout=10)

Practical tips (what actually helps teams)

  • Start with 1–3 critical endpoints. Don’t try to schema-validate your whole API on day one.
  • Be careful with strictness. additionalProperties: false is great for internal APIs with tightly controlled clients; for public APIs, allowing extra fields can be kinder.
  • Validate arrays carefully. If clients rely on “at least one item”, add minItems. If empty arrays are fine, keep it at 0.
  • Use formats as guardrails. date-time, email, and uri catch real bugs.
  • Make schema updates intentional. If a response shape changes, update the schema in the same PR so reviewers see it.

Takeaway

Contract checks with JSON Schema are one of the most cost-effective testing upgrades you can make. They’re simple enough to adopt quickly, but they protect you from the painful class of “it still returns 200, but clients broke anyway” regressions.

Once this is in place, you can extend it naturally: add more schemas, validate error responses (4xx/5xx payloads), and enforce backwards compatibility on your “public contract” endpoints.

