2 changes: 1 addition & 1 deletion pkg/schema/openapi.go
@@ -229,7 +229,7 @@ func buildOpenAPISpec(info *PredictorInfo) map[string]any {
	"parameters": []any{
		map[string]any{
			"required": true,
-			"schema": map[string]any{"title": TitleCaseSingle(cancelParam), "type": "string"},
+			"schema": map[string]any{"title": TitleCase(cancelParam), "type": "string"},
			"name": cancelParam,
			"in": "path",
		},
115 changes: 89 additions & 26 deletions tools/test-harness/README.md
@@ -5,37 +5,58 @@ Designed to test any cog model from any repo.

## Quick Start

+All commands use `uv run` which automatically installs dependencies from
+`pyproject.toml`; no manual venv setup needed.

```bash
cd tools/test-harness

-# Create a venv and install dependencies
-python3 -m venv .venv
-source .venv/bin/activate
-pip install pyyaml

# List all models in the manifest
-python -m harness list
+uv run cog-test list

# Run all non-GPU models
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu

# Run a specific model
-python -m harness run --model hello-world
+uv run cog-test run --model hello-world

# Run GPU models only (requires NVIDIA GPU + nvidia-docker)
-python -m harness run --gpu-only
+uv run cog-test run --gpu-only

# Output JSON report
-python -m harness run --no-gpu --output json --output-file results/report.json
+uv run cog-test run --no-gpu --output json --output-file results/report.json

# Build images only (no predictions)
-python -m harness build --no-gpu
+uv run cog-test build --no-gpu

+# Compare static (Go) vs runtime (Python) schema generation
+uv run cog-test schema-compare --no-gpu

+# Compare schemas for a specific fixture model
+uv run cog-test schema-compare --model fixture-scalar-types

+# Use a locally-built cog binary
+uv run cog-test schema-compare --no-gpu --cog-binary /path/to/cog

+# Test from a git branch (builds CLI + SDK automatically)
+uv run cog-test schema-compare --no-gpu --cog-ref main
+uv run cog-test schema-compare --no-gpu --cog-ref feat/new-schema

+# Test from a specific commit
+uv run cog-test schema-compare --no-gpu --cog-ref abc1234

+# Test fully from source (manual build)
+mise run build:cog && mise run build:sdk
+uv run cog-test schema-compare --no-gpu \
+  --cog-binary dist/go/*/cog \
+  --sdk-wheel dist/python/cog-*.whl
```

## Prerequisites

-- Python 3.10+
+- [uv](https://docs.astral.sh/uv/) (or Python 3.10+ with `pip install pyyaml`)
- Docker
+- For `--cog-ref`: Go and uv on PATH (run `mise install` to set up)
- For GPU models: NVIDIA GPU + nvidia-docker runtime

### Version Resolution
@@ -47,19 +68,19 @@ skipping any alpha/beta/rc tags. You can override either via the CLI or in

```bash
# Use the latest stable CLI + SDK (default)
-python -m harness run --no-gpu
+uv run cog-test run --no-gpu

# Pin a specific CLI version
-python -m harness run --cog-version v0.16.12 --no-gpu
+uv run cog-test run --cog-version v0.16.12 --no-gpu

# Pin a specific SDK version
-python -m harness run --sdk-version 0.16.12 --no-gpu
+uv run cog-test run --sdk-version 0.16.12 --no-gpu

# Use a pre-release CLI
-python -m harness run --cog-version v0.17.0-rc.2 --no-gpu
+uv run cog-test run --cog-version v0.17.0-rc.2 --no-gpu

# Use a locally-built binary (overrides --cog-version)
-python -m harness run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
+uv run cog-test run --cog-binary ./dist/go/darwin-arm64/cog --no-gpu
```

You can also pin versions in `manifest.yaml` under `defaults`:
@@ -70,7 +91,9 @@ defaults:
  cog_version: "latest"  # or pin e.g. "v0.16.12"
```

-**Resolution priority** (for both CLI and SDK): CLI flag > manifest default > latest stable.
+**Resolution priority:**
+- CLI binary: `--cog-binary` > `--cog-ref` > `--cog-version` > manifest default > latest stable
+- SDK: `--sdk-wheel` > `--cog-ref` (auto-built) > `--sdk-version` > manifest default > latest stable
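
The priority chains above can be read as "the first source that is set wins". A minimal Python sketch of that rule follows. `resolve_source` and its argument names are hypothetical, chosen for illustration only; they are not taken from `harness/cog_resolver.py`, which may be structured differently.

```python
from typing import Optional


def resolve_source(
    candidates: list[tuple[str, Optional[str]]], fallback: str
) -> tuple[str, str]:
    """Return (origin, value) for the first candidate that is set.

    `candidates` is ordered from highest to lowest priority.
    Hypothetical helper, not the harness's real resolver.
    """
    for origin, value in candidates:
        if value:  # None and "" both mean "not provided"
            return origin, value
    return "latest stable", fallback


# CLI binary chain:
# --cog-binary > --cog-ref > --cog-version > manifest default > latest stable
origin, value = resolve_source(
    [
        ("--cog-binary", None),             # not passed
        ("--cog-ref", "feat/new-schema"),   # passed, so it wins
        ("--cog-version", "v0.16.12"),      # passed, but lower priority
        ("manifest default", "latest"),
    ],
    fallback="latest",
)
print(origin, value)  # --cog-ref feat/new-schema
```

The SDK chain would use the same helper with `--sdk-wheel` first and the wheel auto-built by `--cog-ref` second.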

## Manifest Format

@@ -156,12 +179,13 @@ No code changes required.
## CLI Reference

```
-usage: cog-test {run,build,list} [options]
+usage: cog-test {run,build,list,schema-compare} [options]

Commands:
-  run       Build and test models (full pipeline)
-  build     Build Docker images only (no predictions)
-  list      List models defined in the manifest
+  run              Build and test models (full pipeline)
+  build            Build Docker images only (no predictions)
+  list             List models defined in the manifest
+  schema-compare   Compare static (Go) vs runtime (Python) schema generation

Common options:
  --manifest PATH       Path to manifest.yaml
@@ -170,24 +194,63 @@ Common options:
  --gpu-only            Only run GPU models
  --sdk-version VER     SDK version (default: latest stable from PyPI)
  --cog-version TAG     CLI version to download (default: latest stable)
-  --cog-binary PATH     Path to local cog binary (overrides --cog-version)
+  --cog-binary PATH     Path to local cog binary (overrides --cog-version and --cog-ref)
+  --cog-ref REF         Git ref to build from source (branch, tag, or SHA; requires go + uv)
+  --sdk-wheel PATH      Local wheel, URL, or 'pypi[:ver]' (sets COG_SDK_WHEEL, overrides --sdk-version and --cog-ref wheel)
  --keep-images         Don't clean up Docker images after run

-Run-specific options:
+Run/schema-compare options:
  --output {console,json}  Output format (default: console)
  --output-file PATH       Write report to file
```

### Schema Comparison

The `schema-compare` command builds each model **twice in parallel** — once
with `COG_STATIC_SCHEMA=1` (Go tree-sitter parser) and once without (Python
runtime schema generation) — then compares the resulting OpenAPI schemas
for exact JSON equality. Any difference is reported as a failure with a
structured diff showing the exact paths that diverge.

This is useful for catching regressions when changing either the Go static
schema generator (`pkg/schema/`) or the Python SDK schema generation
(`python/cog/_adt.py`, `python/cog/_inspector.py`, `python/cog/_schemas.py`).
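
To illustrate what an exact-equality check with a path-level diff can look like, here is a small sketch. `diff_paths` is a hypothetical helper, not the harness's actual comparison code in `runner.py`, and the two schema fragments are made-up sample data. It walks two decoded JSON values and lists every path at which they differ.

```python
from typing import Any


def diff_paths(a: Any, b: Any, path: str = "$") -> list[str]:
    """Return the JSON paths at which `a` and `b` are not exactly equal."""
    if isinstance(a, dict) and isinstance(b, dict):
        out: list[str] = []
        for key in sorted(a.keys() | b.keys()):
            if key not in a or key not in b:
                out.append(f"{path}.{key}")  # key present on one side only
            else:
                out.extend(diff_paths(a[key], b[key], f"{path}.{key}"))
        return out
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [path]  # report the list itself when lengths differ
        out = []
        for i, (x, y) in enumerate(zip(a, b)):
            out.extend(diff_paths(x, y, f"{path}[{i}]"))
        return out
    # Scalars or mismatched container types. Strict on purpose (an assumption
    # of this sketch): 1 vs 1.0 and 1 vs True are reported as differences.
    return [] if type(a) is type(b) and a == b else [path]


# Made-up fragments standing in for the static and runtime schemas.
static = {"properties": {"top_k": {"type": "integer", "minimum": 1, "title": "Top K"}}}
runtime = {"properties": {"top_k": {"type": "integer", "minimum": 1, "title": "Top_K"}}}
print(diff_paths(static, runtime))  # ['$.properties.top_k.title']
```

An empty result means the two schemas match; any returned path would count as a failure for that model.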

### Local Fixture Models

Models with `repo: local` are loaded from `fixtures/models/<path>/` instead
of being cloned from GitHub. These are small predictors designed to cover
the full input type matrix for schema comparison testing:

| Fixture | What it covers |
|---------|----------------|
| `scalar-types` | str, int, float, bool, Secret |
| `optional-types` | PEP 604 `X \| None` and `Optional[X]` for all types |
| `list-types` | `list[X]` and `List[X]` for str, int, Path, File |
| `optional-list-types` | `list[X] \| None` and `Optional[List[X]]` |
| `constraints-and-choices` | ge/le constraints, string/int choices |
| `file-path-types` | Path, File, optional Path/File |
| `complex-output` | BaseModel structured output |

## Architecture

```
tools/test-harness/
├── manifest.yaml           # Declarative test definitions
-├── fixtures/               # Test input files (images, etc.)
+├── fixtures/
+│   ├── *.png               # Test input files (images, etc.)
+│   └── models/             # Local fixture models for schema comparison
+│       ├── scalar-types/
+│       ├── optional-types/
+│       ├── list-types/
+│       ├── optional-list-types/
+│       ├── constraints-and-choices/
+│       ├── file-path-types/
+│       └── complex-output/
├── harness/
│   ├── cli.py              # CLI entry point
│   ├── cog_resolver.py     # Resolves + downloads cog CLI and SDK versions
-│   ├── runner.py           # Clone -> patch -> build -> predict -> validate
+│   ├── runner.py           # Clone -> patch -> build -> predict -> validate + schema compare
│   ├── patcher.py          # Patches cog.yaml with sdk_version + overrides
│   ├── validators.py       # Output validation strategies
│   └── report.py           # Console + JSON report generation
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/complex-output/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
15 changes: 15 additions & 0 deletions tools/test-harness/fixtures/models/complex-output/predict.py
@@ -0,0 +1,15 @@
from cog import BaseModel, BasePredictor, Input


class Output(BaseModel):
    text: str
    score: float
    tags: list[str]


class Predictor(BasePredictor):
    def predict(
        self,
        prompt: str = Input(description="Input prompt"),
    ) -> Output:
        return Output(text=f"generated: {prompt}", score=0.95, tags=["a", "b"])
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/constraints-and-choices/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
23 changes: 23 additions & 0 deletions tools/test-harness/fixtures/models/constraints-and-choices/predict.py
@@ -0,0 +1,23 @@
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def predict(
        self,
        prompt: str = Input(description="The prompt", default="hello"),
        temperature: float = Input(
            description="Sampling temperature", ge=0.0, le=2.0, default=0.7
        ),
        top_k: int = Input(description="Top-K", ge=1, le=100, default=50),
        mode: str = Input(
            description="Quality mode",
            choices=["fast", "balanced", "quality"],
            default="balanced",
        ),
        style: int = Input(
            description="Style preset",
            choices=[1, 2, 3],
            default=1,
        ),
    ) -> str:
        return f"{prompt}-{temperature}-{top_k}-{mode}-{style}"
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/file-path-types/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
13 changes: 13 additions & 0 deletions tools/test-harness/fixtures/models/file-path-types/predict.py
@@ -0,0 +1,13 @@
from cog import BasePredictor, File, Input, Path


class Predictor(BasePredictor):
    def predict(
        self,
        image: Path = Input(description="An image path"),
        document: File = Input(description="A file upload"),
        # Optional variants
        mask: Path | None = Input(description="Optional mask path", default=None),
        attachment: File | None = Input(description="Optional file", default=None),
    ) -> str:
        return f"image={image} mask={mask}"
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/list-types/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
14 changes: 14 additions & 0 deletions tools/test-harness/fixtures/models/list-types/predict.py
@@ -0,0 +1,14 @@
from typing import List

from cog import BasePredictor, File, Input, Path


class Predictor(BasePredictor):
    def predict(
        self,
        tags: list[str] = Input(description="List of strings"),
        numbers: List[int] = Input(description="List of ints"),
        paths: list[Path] = Input(description="List of paths"),
        files: list[File] = Input(description="List of files"),
    ) -> str:
        return f"tags={len(tags)} numbers={len(numbers)}"
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/optional-list-types/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
25 changes: 25 additions & 0 deletions tools/test-harness/fixtures/models/optional-list-types/predict.py
@@ -0,0 +1,25 @@
from typing import List, Optional

from cog import BasePredictor, File, Input, Path


class Predictor(BasePredictor):
    def predict(
        self,
        text: str = Input(description="Required anchor field"),
        # PEP 604 optional lists
        opt_tags: list[str] | None = Input(
            description="Optional list of strings", default=None
        ),
        opt_paths: list[Path] | None = Input(
            description="Optional list of paths", default=None
        ),
        opt_files: list[File] | None = Input(
            description="Optional list of files", default=None
        ),
        # typing.Optional style
        opt_ints: Optional[List[int]] = Input(
            description="Optional list of ints", default=None
        ),
    ) -> str:
        return f"{text}"
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/optional-types/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
19 changes: 19 additions & 0 deletions tools/test-harness/fixtures/models/optional-types/predict.py
@@ -0,0 +1,19 @@
from typing import Optional

from cog import BasePredictor, File, Input, Path


class Predictor(BasePredictor):
    def predict(
        self,
        text: str = Input(description="Required string"),
        # PEP 604 style optionals
        opt_str: str | None = Input(description="Optional string", default=None),
        opt_int: int | None = Input(description="Optional int", default=None),
        opt_float: float | None = Input(description="Optional float", default=None),
        opt_bool: bool | None = Input(description="Optional bool", default=None),
        # typing.Optional style
        opt_path: Optional[Path] = Input(description="Optional path", default=None),
        opt_file: Optional[File] = Input(description="Optional file", default=None),
    ) -> str:
        return f"{text}"
3 changes: 3 additions & 0 deletions tools/test-harness/fixtures/models/scalar-types/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
predict: "predict.py:Predictor"
13 changes: 13 additions & 0 deletions tools/test-harness/fixtures/models/scalar-types/predict.py
@@ -0,0 +1,13 @@
from cog import BasePredictor, Input, Secret


class Predictor(BasePredictor):
    def predict(
        self,
        text: str = Input(description="A string input"),
        count: int = Input(description="An integer", default=5),
        temperature: float = Input(description="A float", default=0.7),
        flag: bool = Input(description="A boolean", default=True),
        api_key: Secret = Input(description="A secret key"),
    ) -> str:
        return f"{text}-{count}-{temperature}-{flag}"