feat: add schema-compare command to test harness by markphelps · Pull Request #2891 · replicate/cog

markphelps · 2026-03-30T16:34:07Z

Summary

Adds a schema-compare command to the test harness (tools/test-harness/) that compares static (Go tree-sitter) vs runtime (Python) schema generation for regression detection.

What's new

`--cog-ref` flag

Build and test cog from any git ref (branch, tag, or commit SHA) without manual builds. Clones the repo, builds the CLI binary via go build and the SDK wheel via uv build, then uses both automatically:

# Test schema gen from main
uv run cog-test schema-compare --no-gpu --cog-ref main

# Test from a feature branch
uv run cog-test schema-compare --no-gpu --cog-ref feat/new-schema

# Test from a specific commit
uv run cog-test run --no-gpu --cog-ref abc1234

Resolution priority:

CLI binary: --cog-binary > --cog-ref > --cog-version > manifest default > latest stable
SDK: --sdk-wheel > --cog-ref (auto-built) > --sdk-version > manifest default > latest stable
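
The precedence above reads as a first-match lookup. A minimal sketch, assuming hypothetical names (`pick_source`, the option keys); the harness's own `_resolve_versions()` helper may be organized differently:

```python
from typing import Optional

# Priority order as listed in the PR description, highest priority first.
# The key names are illustrative, not the harness's real option names.
CLI_PRIORITY = ["cog_binary", "cog_ref", "cog_version", "manifest_default"]
SDK_PRIORITY = ["sdk_wheel", "cog_ref", "sdk_version", "manifest_default"]


def pick_source(options: dict[str, Optional[str]], priority: list[str]) -> tuple[str, Optional[str]]:
    """Return (source_name, value) for the first option that is set.

    Falls back to ("latest_stable", None) when nothing is provided.
    """
    for name in priority:
        value = options.get(name)
        if value:
            return name, value
    return "latest_stable", None


cli_choice = pick_source({"cog_ref": "main", "cog_version": "0.17.2"}, CLI_PRIORITY)
sdk_choice = pick_source({"cog_ref": "main", "sdk_wheel": "dist/cog.whl"}, SDK_PRIORITY)
default_choice = pick_source({}, CLI_PRIORITY)
print(cli_choice, sdk_choice, default_choice)
```

Here `--cog-ref` wins over `--cog-version` for the CLI, while an explicit `--sdk-wheel` still overrides the wheel that `--cog-ref` would build.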

Parallel schema builds

The static and runtime builds now run concurrently using distinct Docker image tags (:static and :runtime), roughly halving wall-clock time per model.
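
A rough sketch of the concurrency pattern, assuming a hypothetical `build_schema` stand-in that only records the tag and environment a real build would use (the actual harness invokes a cog build here):

```python
from concurrent.futures import ThreadPoolExecutor


def build_schema(model: str, mode: str) -> dict:
    """Stand-in for a real build; returns the tag and env it would use."""
    # Only the static build sets COG_STATIC_SCHEMA=1.
    env = {"COG_STATIC_SCHEMA": "1"} if mode == "static" else {}
    # Distinct tags keep the two concurrent builds from overwriting each other.
    return {"tag": f"{model}:{mode}", "env": env}


def build_both(model: str) -> dict[str, dict]:
    """Run the static and runtime builds concurrently, one thread each."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            mode: pool.submit(build_schema, model, mode)
            for mode in ("static", "runtime")
        }
        # .result() re-raises any exception from the worker thread.
        return {mode: fut.result() for mode, fut in futures.items()}


results = build_both("fixture-scalar-types")
print(results)
```

Threads are sufficient for this kind of work because each build spends its time waiting on an external process rather than running Python code.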

How it works

For each model, schema-compare:

1. Builds with COG_STATIC_SCHEMA=1 and without, in parallel
2. Extracts OpenAPI schemas from the Docker image labels
3. Compares the two JSON schemas for exact equality
4. Reports a structured diff on mismatch showing the exact JSON paths that diverge
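
The comparison and diff steps can be approximated with a recursive walk over both JSON documents. This is a sketch, not the harness's implementation: the function name `diff_schemas` and the "only in ..." and "length mismatch" messages are assumptions, while the `$.a.b` path style and "value mismatch" wording follow the example output in this PR.

```python
from typing import Any


def diff_schemas(static: Any, runtime: Any, path: str = "$") -> list[str]:
    """Recursively compare two JSON values and list the paths that differ."""
    if isinstance(static, dict) and isinstance(runtime, dict):
        diffs = []
        for key in sorted(static.keys() | runtime.keys()):
            child = f"{path}.{key}"
            if key not in runtime:
                diffs.append(f"{child}: only in static")
            elif key not in static:
                diffs.append(f"{child}: only in runtime")
            else:
                diffs.extend(diff_schemas(static[key], runtime[key], child))
        return diffs
    if isinstance(static, list) and isinstance(runtime, list):
        if len(static) != len(runtime):
            return [f"{path}: length mismatch ({len(static)} vs {len(runtime)})"]
        diffs = []
        for i, (a, b) in enumerate(zip(static, runtime)):
            diffs.extend(diff_schemas(a, b, f"{path}[{i}]"))
        return diffs
    # Exact equality: differing types (e.g. 1 vs 1.0) count as a mismatch.
    if type(static) is not type(runtime) or static != runtime:
        return [f"{path}: value mismatch"]
    return []


static_schema = {"components": {"schemas": {"Output": {"properties": {"score": {"title": "Score"}}}}}}
runtime_schema = {"components": {"schemas": {"Output": {"properties": {"score": {"title": "score"}}}}}}
diffs = diff_schemas(static_schema, runtime_schema)
print(diffs)
```

An empty list means the two schemas match exactly; any entry names the JSON path where they diverge.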

cd tools/test-harness

# Compare all non-GPU models (latest release)
uv run cog-test schema-compare --no-gpu

# Compare against main branch
uv run cog-test schema-compare --no-gpu --cog-ref main

# Compare a specific fixture with a local binary
uv run cog-test schema-compare --model fixture-scalar-types --cog-binary /path/to/cog

Local fixture models

7 local fixture models (fixtures/models/) covering the full input type matrix:

| Fixture | Coverage |
| --- | --- |
| `scalar-types` | str, int, float, bool, Secret |
| `optional-types` | PEP 604 `X \| None` and `Optional[X]` for all types |
| `list-types` | `list[X]` and `List[X]` for str, int, Path, File |
| `optional-list-types` | `list[X] \| None` and `Optional[List[X]]` |
| `constraints-and-choices` | ge/le constraints, string/int choices |
| `file-path-types` | Path, File, optional Path/File |
| `complex-output` | BaseModel structured output |

These use repo: local in the manifest and are loaded from fixtures/models/ without cloning.
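
One way to picture the `repo: local` handling is a small dispatch on the manifest entry. The entry shape below (`name`, `repo`, `path` keys) and the function name `model_source` are hypothetical; only the `repo: local` marker and the `fixtures/models/` directory come from this PR.

```python
from pathlib import Path

FIXTURES_DIR = Path("fixtures/models")  # directory named in the PR description


def model_source(entry: dict) -> tuple[str, str]:
    """Return ("local", path) or ("clone", repo) for one manifest entry."""
    if entry.get("repo") == "local":
        # Local fixtures are read straight from the checkout; nothing is cloned.
        return "local", (FIXTURES_DIR / entry["path"]).as_posix()
    return "clone", entry["repo"]


local_entry = {"name": "fixture-scalar-types", "repo": "local", "path": "scalar-types"}
remote_entry = {"name": "some-model", "repo": "example-org/some-model"}
local_result = model_source(local_entry)
remote_result = model_source(remote_entry)
print(local_result, remote_result)
```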

Example output

Schema Comparison Report (CLI ref:main)
=======================================

  + fixture-scalar-types      schemas match (static 12.3s, runtime 11.8s)
  + fixture-optional-types    schemas match (static 11.5s, runtime 11.2s)
  x fixture-complex-output    schemas differ
      $.components.schemas.Output.properties.score.title: value mismatch
        static:  "Score"
        runtime: "score"

Add a new `schema-compare` command that builds each model twice — once with COG_STATIC_SCHEMA=1 (Go tree-sitter) and once without (Python runtime) — then compares the resulting OpenAPI schemas for exact JSON equality. Differences are reported with a structured diff showing the exact JSON paths that diverge. Also add 7 local fixture models covering the full input type matrix: scalar types, optional types (PEP 604 + typing.Optional), list types, optional list types, constraints/choices, File/Path types, and structured BaseModel output.

ask-bonk · 2026-04-07T17:36:23Z

After reviewing the PR thoroughly, I found the code to be well-structured with proper error handling, good type annotations, and comprehensive documentation. The schema comparison logic is sound, resource cleanup is handled correctly with finally blocks, and the fixture models provide good coverage of input types.

LGTM

Add --cog-ref to build cog CLI + SDK wheel from any git ref (branch, tag, or commit SHA), enabling testing against main or feature branches without manual builds. When used, the SDK wheel is automatically built from the same ref and passed via COG_SDK_WHEEL. Also parallelize the static/runtime builds in schema-compare using ThreadPoolExecutor with distinct image tags, roughly halving wall-clock time per model. Extract _resolve_versions() helper to consolidate version resolution logic across all commands and produce clearer log output (no spurious PyPI lookups when --cog-ref provides a wheel).

ask-bonk · 2026-04-07T17:56:24Z

LGTM

The static schema generator was using TitleCaseSingle (which only uppercases the first character) instead of TitleCase (which splits on underscores) for the cancel endpoint path parameter. This produced "Prediction_id" instead of "Prediction Id", diverging from the Python runtime schema generator. Found by the test harness schema-compare command.
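
The difference between the two helpers is easy to see in a Python approximation of the behavior described above. These are illustrative re-implementations, not the Go functions themselves, and may differ from them in edge cases:

```python
def title_case_single(name: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def title_case(name: str) -> str:
    """Split on underscores and capitalize each word, joined by spaces."""
    return " ".join(word[:1].upper() + word[1:] for word in name.split("_") if word)


single = title_case_single("prediction_id")
full = title_case("prediction_id")
print(single)  # Prediction_id
print(full)    # Prediction Id
```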

ask-bonk · 2026-04-07T20:02:17Z

LGTM

ask-bonk · 2026-04-07T21:07:15Z

After thoroughly reviewing all the changes in PR #2891, I found the code to be well-structured and properly implemented:

Summary: This PR adds a schema-compare command to the test harness that compares static (Go tree-sitter) vs runtime (Python) schema generation, with parallel builds, git ref support, and comprehensive fixture models.

Key findings:

Clean implementation with proper error handling and resource cleanup
Good type annotations throughout Python code
Parallel execution implemented correctly using ThreadPoolExecutor
7 fixture models provide comprehensive input type coverage
Documentation is clear and complete
Minor bug fix in pkg/schema/openapi.go (OpenAPI spec compliance)
No breaking changes or security concerns

LGTM

markphelps force-pushed the schema-compare-harness branch 3 times, most recently from 2c85d07 to a3e6902 Compare March 30, 2026 17:42

markphelps force-pushed the schema-compare-harness branch from a3e6902 to 1ce7b49 Compare March 30, 2026 17:51

Merge branch 'main' into schema-compare-harness

5da16ad

markphelps marked this pull request as ready for review April 7, 2026 18:51

markphelps requested a review from a team as a code owner April 7, 2026 18:51

markphelps requested a review from michaeldwan April 7, 2026 18:58

michaeldwan approved these changes Apr 7, 2026

michaeldwan added this pull request to the merge queue Apr 7, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026

Merge branch 'main' into schema-compare-harness

6fbddab

michaeldwan enabled auto-merge April 7, 2026 21:05

michaeldwan added this pull request to the merge queue Apr 7, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026

markphelps added this pull request to the merge queue Apr 7, 2026

Merged via the queue into main with commit 6019c1b Apr 7, 2026
34 checks passed

markphelps deleted the schema-compare-harness branch April 7, 2026 21:31

markphelps merged 5 commits into main from schema-compare-harness

2 participants
