Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
There was a problem hiding this comment.
Pull request overview
Fixes a cyclopts CLI parsing crash triggered when --load-pattern is combined with load-pattern subfields like --target-qps / --concurrency in the online benchmark command.
Changes:
- Annotates
LoadPatternto adjust how cyclopts maps nested parameters (@cyclopts.Parameter(name="*")). - Updates the CLI parameter definition for
LoadPattern.typeto avoid the prior name collision.
Comments suppressed due to low confidence (1)
src/inference_endpoint/config/schema.py:360
- This change is a regression fix for a CLI crash when combining
--load-patternwith nested load-pattern fields (e.g.--target-qps). There’s existing automated test coverage for config validation intests/unit/commands/test_benchmark.py, but no test currently exercises cyclopts parsing for this flag combination.
Add a regression test that parses benchmark online ... --load-pattern poisson --target-qps 100 (or directly parses OnlineBenchmarkConfig via cyclopts) and asserts it no longer raises and that config.settings.load_pattern.type/target_qps are set as expected.
@cyclopts.Parameter(name="*")
class LoadPattern(BaseModel):
"""Load pattern configuration.
Different patterns use target_qps differently:
- max_throughput: target_qps used for calculating total queries (offline, optional with default)
- poisson: target_qps sets scheduler rate (online, required - validated)
- concurrency: issue at fixed target_concurrency (online, required - validated)
"""
model_config = ConfigDict(extra="forbid", frozen=True)
type: Annotated[
LoadPatternType,
cyclopts.Parameter(name="--load-pattern", help="Load pattern type"),
] = LoadPatternType.MAX_THROUGHPUT
target_qps: Annotated[
float | None, cyclopts.Parameter(alias="--target-qps", help="Target QPS")
] = Field(None, gt=0)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request modifies the LoadPattern class in the configuration schema by applying a class-level cyclopts.Parameter decorator and updating the type field's parameter definition to use the name argument instead of alias. I have no feedback to provide.
arekay-nv
left a comment
There was a problem hiding this comment.
Can we also add a test for this - seems like a change that shouldn't have gone in.
90fe9c8 to
80a79ef
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Review Council — Multi-AI Code ReviewReviewed by: Claude (Codex ran but produced investigation output, not structured findings) | Depth: standard Found 3 issues across 3 files:
Also addressed all Copilot review comments (pinned SHAs, quoted pip install, heredoc for inline Python, expanded pre-commit |
8915750 to
ffb87d9
Compare
| @pytest.mark.schema_fuzz | ||
| @pytest.mark.slow | ||
| @hyp_settings(max_examples=2000, deadline=5000) | ||
| @given(tokens=online_tokens()) |
There was a problem hiding this comment.
Fuzz test catches 53f08fc
The bug caused --load-pattern poisson --target-qps 100 to crash:
$ inference-endpoint benchmark online \
--endpoints http://localhost:8000 --model m --dataset d.pkl \
--load-pattern poisson --target-qps 100
IndexError: tuple index out of range
Reverted the fix and ran this test — Hypothesis finds it in 1.62s:
E IndexError: tuple index out of range
E Falsifying example: test_online_cli_no_crash(
E tokens=['benchmark', 'online', '--endpoints', 'http://h:80',
E '--model', 'm', '--dataset', 'd.pkl',
E '--load-pattern', 'poisson', '--target-qps', '100',
E '--name', 'test-val'],
E )
============================== 1 failed in 1.62s ===============================
| type: offline | ||
| model_params: | ||
| name: "meta-llama/Llama-3.1-8B-Instruct" | ||
| name: '<MODEL_NAME eg: meta-llama/Llama-3.1-8B-Instruct>' |
There was a problem hiding this comment.
Templates auto-generated from schema defaults by scripts/regenerate_templates.py.
Full YAML spec with placeholder overrides (model name, dataset)
Pre-commit validates templates are valid locally.
CI checks if they're up to date — if stale it will suggest to, run python scripts/regenerate_templates.py.
Is this overkill? Should we drop?
There was a problem hiding this comment.
This creates more burden to the end user to understand all the flags, and we should keep the template simple. cc: @arekay-nv to review as well
There was a problem hiding this comment.
I think this makes reproduction easier. Only the flags with values in <> need to be specified, all others have defaults.
We can even make this part of a pre-commit to have an integration test that takes in these templates, substitutes for an echo server just to make sure that the minimal version runs.
For beginners, they can always use the minimal command line which already has the defaults baked in.
| pip install -e ".[dev,test,performance]" | ||
| pip-audit | ||
|
|
||
| schema-updated: |
There was a problem hiding this comment.
new schema-updated CI job:
triggers on PRs touching schema.py/config.py/cli.py.
| - id: validate-templates | ||
| name: Validate YAML templates against schema | ||
| entry: python -c "from pathlib import Path; from inference_endpoint.config.schema import BenchmarkConfig; [BenchmarkConfig.from_yaml_file(f) for f in sorted(Path('src/inference_endpoint/config/templates').glob('*.yaml'))]" | ||
| - id: check-templates |
There was a problem hiding this comment.
reuse --check mode
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/inference_endpoint/config/templates/concurrency_template.yaml
Outdated
Show resolved
Hide resolved
Bug: LoadPattern.type had alias= instead of name= on cyclopts.Parameter, and class was missing @cyclopts.Parameter(name="*"). This caused any CLI invocation with --load-pattern to crash with IndexError. Tests: - Hypothesis fuzz tests auto-discover all CLI flags from cyclopts assemble_argument_collection() and test 4000 random combinations (offline + online/poisson + online/concurrency) - Added test_concurrency_benchmark with streaming on/off - hypothesis==6.151.10 added to test deps, schema_fuzz pytest marker CI & tooling: - schema-updated CI job: fuzz tests + template validation on schema changes - regenerate_templates.py: auto-generates YAML templates from schema defaults - Pre-commit checks templates are up to date (--check mode) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ffb87d9 to
b781ff7
Compare
What does this PR do?
Fixes CLI crash when
--load-pattern+--target-qpsare used together (IndexError: tuple index out of range), and adds test coverage to prevent regressions.Bug fix
LoadPattern.typeusedalias=instead ofname=oncyclopts.Parameter, and class was missing@cyclopts.Parameter(name="*")— caused cyclopts to fail resolving--load-patterninto a config key path.Test coverage
test_cli.py: Hypothesis fuzz tests auto-discover all CLI flags fromassemble_argument_collection()and test 4000 random combinations (up to 10 flags each) across offline + online/poisson + online/concurrency. Validated: catches this bug in 1.62s.test_benchmark_command.py: Addedtest_concurrency_benchmarkwith streaming on/off — all 3 execution modes now covered.hypothesis==6.151.10added to test deps,schema_fuzzpytest marker.CI & tooling
schema-updatedCI job: triggers on PRs touchingschema.py/config.py/cli.py— runs fuzz tests + validates YAML templates.regenerate_templates.py: auto-generates YAML templates from schema defaults + overrides. Pre-commit hook regenerates locally onschema.pychanges (skipped in CI).Type of change