- targets available locally or via Docker Compose
- parity contract fixtures up to date
- benchmark quality tools installed locally:
  - `hyperfine` (for `BENCH_ENGINE=hyperfine`)
  - `benchstat` (`go install golang.org/x/perf/cmd/benchstat@latest`)
```bash
make benchmark
make report
make benchmark-schema-validate
```

Per-target runs:

```bash
make benchmark-modkit
make benchmark-nestjs
```

Per-target runs also emit `results/latest/environment.fingerprint.json` and `results/latest/environment.manifest.json`.
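For orientation only, a fingerprint file of this kind typically records the runtime and toolchain versions in play; every field name below is an illustrative assumption, not the repo's actual schema:

```json
{
  "go": "1.22.4",
  "node": "20.14.0",
  "docker": "27.0.3",
  "platform": "linux/amd64"
}
```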
Use the GitHub Actions workflow `benchmark-manual` with bounded `workflow_dispatch` inputs:

- `frameworks`: comma-separated subset of `modkit,nestjs,baseline,wire,fx,do`
- `runs`: integer in the range `1..10`
- `benchmark_requests`: integer in the range `50..1000`
Runs that exceed bounds are rejected before benchmark execution.
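The same bounds can be checked locally before dispatching a run. A minimal sketch, assuming the inputs arrive as environment variables (`RUNS` and `BENCHMARK_REQUESTS` are assumed names, not the workflow's contract):

```bash
#!/usr/bin/env bash
# Pre-flight bounds check mirroring the workflow's input validation.
set -euo pipefail

if (( RUNS < 1 || RUNS > 10 )); then
  echo "runs must be in 1..10, got ${RUNS}" >&2
  exit 1
fi

if (( BENCHMARK_REQUESTS < 50 || BENCHMARK_REQUESTS > 1000 )); then
  echo "benchmark_requests must be in 50..1000, got ${BENCHMARK_REQUESTS}" >&2
  exit 1
fi
```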
Optional OSS measurement engine:
```bash
BENCH_ENGINE=hyperfine make benchmark
```

Framework services use shared default limits from `docker-compose.yml` (see the sketch after this list):

- CPU: `BENCHMARK_CPU_LIMIT` (default `1.00`)
- memory: `BENCHMARK_MEMORY_LIMIT` (default `1024m`)
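A minimal sketch of how such shared limits are commonly wired in a Compose file, assuming standard `${VAR:-default}` interpolation; the service name and build path are hypothetical:

```yaml
services:
  modkit:
    build: ./targets/modkit              # path is an assumption for illustration
    cpus: "${BENCHMARK_CPU_LIMIT:-1.00}"
    mem_limit: "${BENCHMARK_MEMORY_LIMIT:-1024m}"
```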
Override for local experimentation:
```bash
BENCHMARK_CPU_LIMIT=2.00 BENCHMARK_MEMORY_LIMIT=1536m docker compose up --build
```

Benchmark scripts must run parity first for each target. If parity fails, skip the benchmark for that target and record the skip reason.
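As an illustration of the parity-first rule, a sketch of the orchestration loop; the parity script name and skip-log location are assumptions, not repo paths:

```bash
#!/usr/bin/env bash
# Run parity before each target's benchmark; record skips instead of failing hard.
set -uo pipefail

for target in modkit nestjs baseline wire fx "do"; do
  if ./scripts/parity-check.sh "$target"; then          # assumed script name
    make "benchmark-${target}"
  else
    printf '{"target":"%s","skipped":true,"reason":"parity failed"}\n' "$target" \
      >> results/latest/skips.jsonl                      # assumed log location
  fi
done
```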
Benchmark runs produce the following artifacts and schema contracts:

- `results/latest/raw/*.json` - raw benchmark outputs
- `results/latest/environment.fingerprint.json` - runtime and toolchain versions for the run
- `results/latest/environment.manifest.json` - timestamped runner metadata and result index
- `results/latest/summary.json` - normalized summary
- `results/latest/report.md` - markdown report
- `results/latest/benchmark-quality-summary.json` - policy quality gate output
- `results/latest/tooling/benchstat/*.txt` - benchstat comparison outputs
- `schemas/benchmark-raw-v1.schema.json` - raw benchmark artifact contract
- `schemas/benchmark-summary-v1.schema.json` - summary artifact contract
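Outside of `make benchmark-schema-validate`, the same contracts can be checked with any JSON Schema validator. For example, with the `check-jsonschema` CLI (assuming it is installed, e.g. via `pip install check-jsonschema`):

```bash
check-jsonschema --schemafile schemas/benchmark-raw-v1.schema.json results/latest/raw/*.json
check-jsonschema --schemafile schemas/benchmark-summary-v1.schema.json results/latest/summary.json
```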
```bash
make benchmark-schema-validate
make benchmark-stats-check
make benchmark-variance-check
make benchmark-benchstat-check
make ci-benchmark-quality-check
make todo-debt-check
make report-disclaimer-check
make methodology-changelog-check
make publication-sync-check
```

Quality thresholds and required metrics are versioned in `stats-policy.json`.
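Purely as an illustration of what a versioned policy file of this kind can hold, a hypothetical shape; every field name below is an assumption, so consult the actual `stats-policy.json` for the real contract:

```json
{
  "version": 1,
  "required_metrics": ["p50_latency_ms", "p99_latency_ms", "requests_per_second"],
  "max_relative_variance": 0.05,
  "min_runs": 3
}
```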
- run from a clean working tree when possible
- keep runtime versions stable
- include host and Docker metadata in report notes
- benchmark smoke job timeout budget: 25 minutes
- benchmark quality summary artifact retention: 14 days
- expected CI compute envelope: one benchmark smoke run per ref due to concurrency cancellation; superseded runs are canceled before full benchmark execution
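The per-ref cancellation described above is typically expressed with a GitHub Actions `concurrency` block; a minimal sketch, where the group name is an assumption:

```yaml
concurrency:
  group: benchmark-smoke-${{ github.ref }}
  cancel-in-progress: true
```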