feat: add schema-compare command to test harness #2891

Merged
markphelps merged 5 commits into main from
schema-compare-harness
Apr 7, 2026

Contributor

@markphelps markphelps commented Mar 30, 2026

Summary

Adds a schema-compare command to the test harness (tools/test-harness/) that compares static (Go tree-sitter) vs runtime (Python) schema generation for regression detection.

What's new

--cog-ref flag

Build and test cog from any git ref (branch, tag, or commit SHA) without manual builds. The harness clones the repo, builds the CLI binary via go build and the SDK wheel via uv build, then uses both automatically:

# Test schema gen from main
uv run cog-test schema-compare --no-gpu --cog-ref main

# Test from a feature branch
uv run cog-test schema-compare --no-gpu --cog-ref feat/new-schema

# Test from a specific commit
uv run cog-test run --no-gpu --cog-ref abc1234
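
As a rough illustration of what --cog-ref implies, the sketch below lists the commands such a build might run. It is not the harness's actual code: the `Step` type, the `build_plan` name, the `./cmd/cog` package path, and the output layout are all assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import Path

# The repository URL is taken from the merge log on this page; everything
# else here (package path, directory names) is assumed for illustration.
REPO_URL = "https://github.com/replicate/cog"


@dataclass(frozen=True)
class Step:
    """One command to run, plus the directory to run it in."""
    argv: tuple[str, ...]
    cwd: Path


def build_plan(ref: str, workdir: Path, repo_url: str = REPO_URL) -> list[Step]:
    """Describe how to build the CLI binary and SDK wheel from a git ref."""
    src = workdir / "src"
    out = workdir / "out"
    return [
        # A full clone so that branches, tags, and commit SHAs can all be checked out.
        Step(("git", "clone", repo_url, str(src)), workdir),
        Step(("git", "checkout", ref), src),
        # Assumed location of the CLI's main package.
        Step(("go", "build", "-o", str(out / "cog"), "./cmd/cog"), src),
        # Build only the wheel and place it next to the binary.
        Step(("uv", "build", "--wheel", "--out-dir", str(out)), src),
    ]
```

Each step could then be handed to `subprocess.run(step.argv, cwd=step.cwd, check=True)`; keeping the plan separate from execution makes it easy to log or inspect before anything runs.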

Resolution priority:

  • CLI binary: --cog-binary > --cog-ref > --cog-version > manifest default > latest stable
  • SDK: --sdk-wheel > --cog-ref (auto-built) > --sdk-version > manifest default > latest stable
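
The priority order above amounts to "first value that is set wins". A minimal sketch of that rule follows; the function names and the returned (source, value) tuple are hypothetical and are not taken from the harness's `_resolve_versions()` helper.

```python
from typing import Optional


def first_set(*candidates: tuple[str, Optional[str]]) -> tuple[str, str]:
    """Return the (source, value) pair of the first candidate with a value."""
    for source, value in candidates:
        if value:
            return source, value
    raise ValueError("no candidate was set")


def resolve_cli(cog_binary=None, cog_ref=None, cog_version=None,
                manifest_default=None, latest_stable="latest"):
    """Pick the CLI source, highest priority first."""
    return first_set(
        ("--cog-binary", cog_binary),
        ("--cog-ref", cog_ref),
        ("--cog-version", cog_version),
        ("manifest default", manifest_default),
        ("latest stable", latest_stable),
    )


def resolve_sdk(sdk_wheel=None, cog_ref=None, sdk_version=None,
                manifest_default=None, latest_stable="latest"):
    """Pick the SDK source; a --cog-ref build outranks --sdk-version."""
    return first_set(
        ("--sdk-wheel", sdk_wheel),
        ("--cog-ref (auto-built)", cog_ref),
        ("--sdk-version", sdk_version),
        ("manifest default", manifest_default),
        ("latest stable", latest_stable),
    )
```

For example, `resolve_cli(cog_ref="main", cog_version="0.17.2")` picks the ref build, because --cog-ref sits above --cog-version in the list.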

Parallel schema builds

The static and runtime builds now run concurrently using distinct Docker image tags (:static and :runtime), roughly halving wall-clock time per model.
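
A sketch of how two builds can run side by side with ThreadPoolExecutor. The `build` callable stands in for whatever actually invokes the cog build and is an assumption of this example, as is the `build_both` name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# build(tag, extra_env) -> result; a placeholder for the real build step.
BuildFn = Callable[[str, dict[str, str]], str]


def build_both(model: str, build: BuildFn) -> dict[str, str]:
    """Run the static and runtime builds concurrently under distinct tags."""
    variants = {
        # Distinct tags keep the two images from overwriting each other.
        "static": (f"{model}:static", {"COG_STATIC_SCHEMA": "1"}),
        "runtime": (f"{model}:runtime", {}),
    }
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            name: pool.submit(build, tag, env)
            for name, (tag, env) in variants.items()
        }
        # result() re-raises any exception from the worker thread.
        return {name: future.result() for name, future in futures.items()}
```

Threads are sufficient here because each build mostly waits on an external process rather than running Python code.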

How it works

For each model, schema-compare:

  1. Builds the model twice in parallel: once with COG_STATIC_SCHEMA=1 (static) and once without (runtime)
  2. Extracts OpenAPI schemas from the Docker image labels
  3. Compares the two JSON schemas for exact equality
  4. Reports a structured diff on mismatch showing the exact JSON paths that diverge
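
Steps 3 and 4 can be illustrated with a small recursive diff that records the JSON path of every divergence. This is a sketch only: the `diff_json` name and the message wording are made up for the example and need not match the harness output.

```python
import json
from typing import Any


def diff_json(static: Any, runtime: Any, path: str = "$") -> list[str]:
    """Return one line per JSON path where the two documents differ."""
    if isinstance(static, dict) and isinstance(runtime, dict):
        lines: list[str] = []
        for key in sorted(static.keys() | runtime.keys()):
            child = f"{path}.{key}"
            if key not in runtime:
                lines.append(f"{child}: only in static")
            elif key not in static:
                lines.append(f"{child}: only in runtime")
            else:
                lines.extend(diff_json(static[key], runtime[key], child))
        return lines
    if isinstance(static, list) and isinstance(runtime, list):
        if len(static) != len(runtime):
            return [f"{path}: length mismatch ({len(static)} vs {len(runtime)})"]
        lines = []
        for index, (a, b) in enumerate(zip(static, runtime)):
            lines.extend(diff_json(a, b, f"{path}[{index}]"))
        return lines
    # Scalars (or mismatched container types): require the same type and value,
    # so that 1, 1.0, and true are not treated as equal.
    if type(static) is not type(runtime) or static != runtime:
        return [
            f"{path}: value mismatch "
            f"(static: {json.dumps(static)}, runtime: {json.dumps(runtime)})"
        ]
    return []
```

An empty list means the schemas are exactly equal; any entry names the first point on that branch where they diverge.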

cd tools/test-harness

# Compare all non-GPU models (latest release)
uv run cog-test schema-compare --no-gpu

# Compare against main branch
uv run cog-test schema-compare --no-gpu --cog-ref main

# Compare a specific fixture with a local binary
uv run cog-test schema-compare --model fixture-scalar-types --cog-binary /path/to/cog

Local fixture models

7 local fixture models (fixtures/models/) covering the full input type matrix:

Fixture                   Coverage
scalar-types              str, int, float, bool, Secret
optional-types            PEP 604 X | None and Optional[X] for all types
list-types                list[X] and List[X] for str, int, Path, File
optional-list-types       list[X] | None and Optional[List[X]]
constraints-and-choices   ge/le constraints, string/int choices
file-path-types           Path, File, optional Path/File
complex-output            BaseModel structured output

These use repo: local in the manifest and are loaded from fixtures/models/ without cloning.

Example output

Schema Comparison Report (CLI ref:main)
=======================================

  + fixture-scalar-types      schemas match (static 12.3s, runtime 11.8s)
  + fixture-optional-types    schemas match (static 11.5s, runtime 11.2s)
  x fixture-complex-output    schemas differ
      $.components.schemas.Output.properties.score.title: value mismatch
        static:  "Score"
        runtime: "score"

@markphelps markphelps force-pushed the schema-compare-harness branch 3 times, most recently from 2c85d07 to a3e6902 March 30, 2026 17:42
Add a new `schema-compare` command that builds each model twice — once
with COG_STATIC_SCHEMA=1 (Go tree-sitter) and once without (Python
runtime) — then compares the resulting OpenAPI schemas for exact JSON
equality. Differences are reported with a structured diff showing the
exact JSON paths that diverge.

Also add 7 local fixture models covering the full input type matrix:
scalar types, optional types (PEP 604 + typing.Optional), list types,
optional list types, constraints/choices, File/Path types, and
structured BaseModel output.
@markphelps markphelps force-pushed the schema-compare-harness branch from a3e6902 to 1ce7b49 March 30, 2026 17:51
@ask-bonk

ask-bonk bot commented Apr 7, 2026

After reviewing the PR thoroughly, I found the code to be well-structured with proper error handling, good type annotations, and comprehensive documentation. The schema comparison logic is sound, resource cleanup is handled correctly with finally blocks, and the fixture models provide good coverage of input types.

LGTM

github run

Add --cog-ref to build cog CLI + SDK wheel from any git ref (branch,
tag, or commit SHA), enabling testing against main or feature branches
without manual builds. When used, the SDK wheel is automatically built
from the same ref and passed via COG_SDK_WHEEL.

Also parallelize the static/runtime builds in schema-compare using
ThreadPoolExecutor with distinct image tags, roughly halving wall-clock
time per model.

Extract _resolve_versions() helper to consolidate version resolution
logic across all commands and produce clearer log output (no spurious
PyPI lookups when --cog-ref provides a wheel).

@markphelps markphelps marked this pull request as ready for review April 7, 2026 18:51
@markphelps markphelps requested a review from a team as a code owner April 7, 2026 18:51
@markphelps markphelps requested a review from michaeldwan April 7, 2026 18:58
The static schema generator was using TitleCaseSingle (which only
uppercases the first character) instead of TitleCase (which splits on
underscores) for the cancel endpoint path parameter. This produced
"Prediction_id" instead of "Prediction Id", diverging from the
Python runtime schema generator.

Found by the test harness schema-compare command.
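
The difference between the two helpers can be reproduced in a few lines. These are Python stand-ins written from the description above, not the Go functions themselves, whose actual code is not shown here.

```python
def title_case_single(name: str) -> str:
    """Uppercase only the first character; underscores are left in place."""
    return name[:1].upper() + name[1:]


def title_case(name: str) -> str:
    """Split on underscores, capitalise each part, and join with spaces."""
    return " ".join(part[:1].upper() + part[1:] for part in name.split("_") if part)


static_before_fix = title_case_single("prediction_id")  # "Prediction_id"
matches_runtime = title_case("prediction_id")           # "Prediction Id"
```

For a name with no underscore the two functions agree, which would explain why the mismatch only surfaced on a multi-word parameter such as `prediction_id`.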

@michaeldwan michaeldwan added this pull request to the merge queue Apr 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026
@michaeldwan michaeldwan enabled auto-merge April 7, 2026 21:05
@ask-bonk

ask-bonk bot commented Apr 7, 2026

After thoroughly reviewing all the changes in PR #2891, I found the code to be well-structured and properly implemented:

Summary: This PR adds a schema-compare command to the test harness that compares static (Go tree-sitter) vs runtime (Python) schema generation, with parallel builds, git ref support, and comprehensive fixture models.

Key findings:

  • Clean implementation with proper error handling and resource cleanup
  • Good type annotations throughout Python code
  • Parallel execution implemented correctly using ThreadPoolExecutor
  • 7 fixture models provide comprehensive input type coverage
  • Documentation is clear and complete
  • Minor bug fix in pkg/schema/openapi.go (OpenAPI spec compliance)
  • No breaking changes or security concerns

LGTM

github run

@michaeldwan michaeldwan added this pull request to the merge queue Apr 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026
@markphelps markphelps added this pull request to the merge queue Apr 7, 2026
Merged via the queue into main with commit 6019c1b Apr 7, 2026
34 checks passed
@markphelps markphelps deleted the schema-compare-harness branch April 7, 2026 21:31
markphelps added a commit that referenced this pull request Apr 8, 2026
…icate/cog into mphelps/push-phase-progress

* 'mphelps/push-phase-progress' of https://github.com/replicate/cog: (95 commits)
  feat: add metric name validation (#2911)
  Rename `cog run` to `cog exec` (#2916)
  chore(deps): bump github.com/google/go-containerregistry (#2884)
  fix: replace removed libgl1-mesa-glx in tensorflow integration test (#2914)
  ci: enforce stub freshness in CI, fix existing stub drift (#2912)
  feat: add schema-compare command to test harness (#2891)
  chore(deps): bump uuid from 1.22.0 to 1.23.0 in /crates (#2887)
  chore(deps): bump github.com/hashicorp/go-version from 1.7.0 to 1.9.0 (#2909)
  chore(deps): bump insta from 1.46.3 to 1.47.2 in /crates (#2908)
  fix: support list[X] | None inputs + integration tests for PEP 604 union File/Path coercion (#2882)
  ci: exclude Dependabot PRs from auto-code review (#2910)
  chore(deps): bump actions/checkout from 4 to 6 (#2904)
  chore(deps): bump github.com/testcontainers/testcontainers-go/modules/registry (#2886)
  fix: metrics bugs in coglet prediction server (#2896)
  Bump version to 0.17.2 (#2903)
  fix(coglet): propagate metric scope to async event loop thread (#2902)
  chore: remove unnecessary nolint directive in test (#2803)
  feat(coglet): add Sentry error reporting for infrastructure errors (#2865)
  fix: homebrew cask postflight xattr references wrong binary name (#2899)
  fix: include custom metrics in cog predict --json output (#2897)
  ...