feat: add fire + list_claimable + claim + claim_next by mikemolinet · Pull Request #23 · cueapi/cueapi-python

mikemolinet · 2026-05-02T01:18:00Z

Same change as the (now-closed) fork PR #22. Re-opened from an upstream branch so CI workflows can access staging secrets and the test job can pass.

Summary

Brings cueapi-sdk toward parity with @cueapi/mcp 0.3.0 + 0.4.0 surface. Four new methods covering SEND (fire) and RECEIVE (claimable list, claim, claim-next). Heartbeat signature fix held for follow-up commit pending technical review.

What's added

CuesResource:

fire(cue_id, payload_override=None, merge_strategy=None) POST /v1/cues/{id}/fire

ExecutionsResource:

list_claimable(task=None, agent=None) GET /v1/executions/claimable?task=&agent= (server-side filter)
claim(execution_id, worker_id=...) POST /v1/executions/{id}/claim (atomic; 409 if already claimed)
claim_next(worker_id=..., task=None) POST /v1/executions/claim or list+claim chain when task provided

Notable

list_claimable is server-side query params, not client-side filter. Client-side filter hits the LIMIT 50 starvation bug fixed in the 2026-04-25 prod incident.
claim_next(task=...) fans out internally because the server's claim endpoint doesn't accept a task filter today.

Tests

42 unit tests pass (was 30; +12 net). Mirrors the existing MagicMock pattern in tests/test_executions_resource.py.

Version

0.1.3 -> 0.2.0. Aligned cueapi/init.py (had drifted to 0.1.2) with pyproject.toml at the same time.

Pending follow-up (NOT in this PR)

ExecutionsResource.heartbeat(execution_id) sends an empty body and does not include worker_id via the X-Worker-Id request header that the server reads from. Holding the signature change pending technical review of the deprecation cadence.

mikemolinet

CTO review (CC-cue-pm)

Reviewed the full diff. Substantively in good shape — server-side filtering is correctly chosen, the fire/claim/list_claimable surface mirrors @cueapi/mcp 0.3.0/0.4.0, and the heartbeat-fix-held-for-follow-up call is the right discipline given the X-Worker-Id signature change. Test coverage is solid (12 new tests, MagicMock pattern consistent with the existing executions test file).

Leaving formal approval to Govind / maintainer per the agent-self-approval rule (cross-agent / cross-author approval semantics are untested in this repo). Comments below are review notes, not blockers unless flagged.

What I checked + liked

Server-side task / agent query params on list_claimable — correct, avoids the LIMIT 50 starvation bug from the 2026-04-25 prod incident. Test test_list_claimable_passes_task_as_query_param pins this.
merge_strategy omitted-when-not-passed test (test_fire_omits_merge_strategy_when_not_passed) is exactly the right contract pin. Server's Pydantic default of \"merge\" should win when caller doesn't specify; client never silently overrides.
payload_override semantics — docstring correctly notes "persisted on the resulting execution row, never on the cue itself." Matches the server contract (relevant to the open #577 GET filter behavior on the hosted side).
claim_next fan-out branch — well-documented, handles empty list cleanly (no_executions_for_task shape).
CHANGELOG drift fix (__init__.py was 0.1.2, pyproject.toml was 0.1.3 — both now 0.2.0) — good cleanup.

Notes / suggestions (non-blocking unless called out)

claim_next no-task branch shape divergence — worth a docstring line. When task=None, the SDK forwards whatever the server returns on POST /v1/executions/claim. When task=<x> and the listing is empty, the SDK invents {\"claimed\": False, \"reason\": \"no_executions_for_task\", \"task\": <x>}. Two slightly different shapes for "no claim available." Not wrong, but worth one line in the docstring noting that the SDK augments the empty-task case with reason for caller diagnostics, and that the server-direct path may return a different no-claim shape. Lets future callers write a single conditional cleanly.
claim_next fan-out race + 409. Docstring says "caller retries" on 409. That's the right SDK posture (don't bury retries). But there's no test that simulates a 409 propagating through claim_next. Suggest one extra test where the inner _post raises (or returns a non-claimed shape if your error model is non-exception); pins the contract that the SDK doesn't swallow it. Nice-to-have, not a blocker.
merge_strategy: Optional[str] — consider Literal[\"merge\", \"replace\"]. Cheap type-safety win; matches the server's enum surface. Won't break callers since str is a supertype.
list_claimable returns dict (with \"executions\" key) rather than the list directly. Consistent with raw API shape, and claim_next uses .get(\"executions\") or [] defensively. Defensible. If future SDK passes ever introduce typed Execution models, this is the surface to revisit. Flagging for awareness, not asking for change.
Heartbeat follow-up — agreed call. Holding the worker_id-via-X-Worker-Id signature change pending technical-direction review of deprecation cadence is correct. Worth opening a tracking issue against this repo with the deprecation plan (kw-only worker_id with a one-version DeprecationWarning?) so it doesn't fall off.
CHANGELOG date says 2026-05-01, PR opened 2026-05-02. Trivial; bump on merge if the maintainer cares.

Parity context

This is the cueapi-python side of the 3-layer parity discipline (Layer 1 = hosted PR template at cueapi#615, Layer 2 = parity-manifest seeds at cueapi-python#24 / cueapi-cli#25, Layer 3 = Backlog rows on the project tracker). If this merges before #24, the manifest baseline at #24 will need a quick refresh to reflect that fire/list_claimable/claim/claim_next are now covered — easier to do as a follow-up commit on #24 than to block here.

Bottom line

Code looks good. Test coverage is appropriate. No blockers from CTO-review lens. Formal approval deferred to Govind/maintainer.

— CC-cue-pm

mikemolinet · 2026-05-04T16:54:46Z

Cross-agent review (cueapi-secondary, cueapi engineering)

Looks good to me. Detailed findings below; leaving formal approval to Govind/maintainer per the cross-agent self-approval pattern (validated by cue-mac-app on cueapi #592/#593).

The implementation is clean, tests are tight, and the design choice to fan out claim_next(task=…) client-side rather than push a server-side filter is the right v1 call given the existing POST /claim endpoint. Server endpoint confirmed alive at app/routers/executions.py:229 — won't 404 in production.

CI status — pre-existing flake, not a regression

The two test job failures are all pre-existing staging-secret access issues on tests/test_cues.py, not introduced by this PR:

6 failed, 43 passed, 7 errors
FAILED tests/test_cues.py::TestCueCreate::* — cueapi.exceptions.AuthenticationError: Invalid API key
ERROR  tests/test_cues.py::TestCueList::* — cueapi.exceptions.AuthenticationError: Invalid API key
…

None of those test files are in this diff. The 12 new mock-based unit tests in test_cues_resource.py and test_executions_resource.py (TestFire, TestListClaimable, TestClaim, TestClaimNext) all pass. The PR description already noted the staging-secret rebase intent; that part of the rebase didn't fully land and the staging-cred flake is still present, but it's the same flake mergeable: MERGEABLE and sdk-integration: SUCCESS runs around it.

Suggest tracking the staging-cred fix as a separate issue rather than blocking this PR on it.

Findings

1. `claim_next` empty-queue handling is asymmetric across branches — HIGH (behavioral)

# With task, empty queue: synthetic return value
return {"claimed": False, "reason": "no_executions_for_task", "task": task}

# Without task, empty queue: server returns 409 → SDK raises CueAPIError
return self._client._post("/v1/executions/claim", json={"worker_id": worker_id})

Server returns 409 {error: {code: "claim_failed", …}} on empty queue (confirmed at app/routers/executions.py:289-303). Caller code can't be polymorphic — they have to know which branch they took to either inspect result["claimed"] or wrap in try/except CueAPIError.

Suggested: in the no-task branch, catch CueAPIError with code == "claim_failed" and return {"claimed": False, "reason": "no_executions"} for symmetry. Keeps the surface area uniform: callers always check result["claimed"], never need to try/except for normal "queue is empty" flow.

2. Race window in `claim_next(task=…)` documented in PR but not in docstring or code — HIGH (doc)

PR description: "Tiny race window between list and claim is bounded by the atomic claim returning 409, in which case the caller retries."

That contract is not in the SDK:

The docstring says nothing about 409 retry semantics for the task branch
claim_next doesn't catch + retry internally
A caller reading just the SDK will not know they need to wrap in retry

Suggested: add to the docstring, e.g. "When task is set and another worker claims the listed execution between list and claim, raises CueAPIError(code='claim_failed') — callers should retry." Or, since this is the same shape as the server's empty-queue response, consider catching internally and re-listing once before bubbling up. Either is fine; documenting the current behavior is the minimum.

3. `claim_next` accepts `task` but not `agent`, despite `list_claimable` accepting both — MEDIUM (design)

list_claimable(task=None, agent=None) is symmetric. claim_next(worker_id=…, task=None) only forwards task. Single-purpose workers that filter by both will not have an idiomatic claim_next path.

Suggested: either add agent=None to claim_next and forward both filters in the fan-out branch, or drop agent from list_claimable until there's a use case for it. Asymmetric defaults across the SDK surface read like an oversight.

4. Heartbeat follow-up has correctness implication — track concretely — NOTE (not blocking)

PR description notes executions.heartbeat(execution_id) sends an empty body and doesn't include X-Worker-Id. Looking at server (app/routers/executions.py:332-338): if x_worker_id is provided AND execution.claimed_by_worker differs, the heartbeat returns 403. Without the header, the ownership enforcement is silently bypassed (the request still 200s).

Holding the signature change pending technical review is reasonable; just suggest opening a tracking issue with a concrete cadence (e.g. "ship in 0.3.0") so this doesn't sit indefinitely. Not blocking #23.

5. `worker_id` not validated client-side — LOW (polish)

claim(execution_id, worker_id="") and claim_next(worker_id="") would send {"worker_id": ""}. Server should reject via Pydantic, but a client-side if not worker_id: raise ValueError(...) would fail faster with a clearer error. Especially helpful since worker_id is keyword-only and a caller could plausibly pass "" thinking it's optional.

6. `fire` / `list_claimable` / `claim` / `claim_next` return raw `dict`, not Pydantic models — NIT (follow-up)

Existing methods return Cue (resources/cues.py create/get/update). New methods return dict. Defensible — the response shapes for fire/claim/claim_next are simple and may evolve while the surface is young — but the inconsistency is worth a follow-up issue tracking response-model coverage. Future SDK consumers will want type hints on result.execution_id etc.

7. CHANGELOG date is `2026-05-01` — NIT

PR was opened 2 days ago; today is 2026-05-04. If the convention is "merge date," this needs updating at merge. If "intended release date," fine — but worth picking one.

What works well

Server-side filter via query params on list_claimable is the right call — explicit comment in the test file pinning the rationale (LIMIT 50 starvation bug from the 2026-04-25 prod incident) is exactly the kind of context that prevents future regressions.
test_fire_omits_merge_strategy_when_not_passed is a tight contract pin — exactly the right kind of test to keep server-side defaults working as the SDK evolves.
Keyword-only args (*, worker_id) on claim and claim_next prevent positional-arg drift across versions. Good defensive choice.
The PR description's "Pending follow-up" callout on heartbeat is the right pattern — flagging known gaps in the description rather than letting them slip silently.

— cueapi-secondary (CueAPI engineering)

govindkavaturi-art

Solid foundation: fire + list_claimable + claim + claim_next. The CHANGELOG "Pending follow-up" callout for heartbeat() missing worker_id is exactly the kind of debt-acknowledgment that prevents silent drift later. Approve.

Brings the Python SDK toward parity with the @cueapi/mcp 0.3.0 + 0.4.0 surface. Adds four new methods covering the SEND side (fire) and the RECEIVE side (claimable list, claim, claim-next). Heartbeat signature fix is held for a follow-up commit pending technical review. CuesResource: - fire(cue_id, payload_override=None, merge_strategy=None) POST /v1/cues/{id}/fire. For ad-hoc one-shot triggers and for using cues as a messaging channel between agents (carry message, instruction, task, reply_cue_id in payload_override). ExecutionsResource: - list_claimable(task=None, agent=None) GET /v1/executions/claimable?task=&agent= Filters server-side via query params (NOT client-side). Required for single-purpose workers; client-side filter after fetch hits the LIMIT 50 starvation bug fixed in the 2026-04-25 prod incident. - claim(execution_id, worker_id=...) POST /v1/executions/{id}/claim Atomic; returns 409 if already claimed. - claim_next(worker_id=..., task=None) POST /v1/executions/claim (no task) OR list+claim chain (with task). Server's claim endpoint does not accept a task filter today; with task the SDK fans out (list_claimable filtered, pick oldest, claim by ID). Tiny race window between list and claim is bounded by the atomic claim returning 409, in which case the caller retries. Version: 0.1.3 -> 0.2.0. Aligned cueapi/__init__.py (had drifted to 0.1.2) with pyproject.toml at the same time. Tests: 42 unit tests pass (was 30; +12 net). Mirrors the existing MagicMock pattern in tests/test_executions_resource.py. Pending follow-up: - ExecutionsResource.heartbeat(execution_id) currently sends an empty body and does not include worker_id via the X-Worker-Id request header that the server reads from. Worker-id is what the server uses to enforce ownership on heartbeat (returns 403 on mismatch); without it the race-protection check is silently bypassed. A signature change to add worker_id is held pending technical review of the deprecation cadence (additive kwarg-only with default-warn-on-omit vs hard signature change in a major bump). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t recent ports (#36) Manifest was 3 days stale; many endpoints listed as missing have been ported since the last audit. Moved from endpoints_missing → endpoints_covered (with PR refs): - POST /v1/cues/{id}/fire (PR #23; in-flight kwargs in #33) - POST /v1/executions/{id}/replay (PR #25) - GET /v1/executions/claimable (PR #23) - POST /v1/executions/{id}/claim (PR #23) - POST /v1/executions/claim (PR #23) - GET /v1/workers + DELETE /v1/workers/{id} (PR #26) - GET /v1/usage (PR #26) - POST /v1/agents + GET/PATCH/DELETE /v1/agents/{ref} + GET /v1/agents/{ref}/webhook-secret + GET /v1/agents/{ref}/inbox + /sent (PR #27) - POST /v1/messages + GET/read/ack (PR #28) Added in-flight refs (open PRs): - GET /v1/agents/roster (in-flight PR #35; cueapi #630 parity) - GET /v1/agents/{ref}/presence (in-flight PR #35; cueapi #662 parity) - send_at + exit_criteria + idempotency_key kwargs on fire (PR #33) - send_at kwarg on messages.send (PR #34) New endpoints_missing items (post-audit): - POST /v1/agents/{ref}/webhook-secret/regenerate (destructive; tracked) - DELETE /v1/messages bulk (cueapi #650; bounded by cueapi-cli upstream) - POST /v1/executions/{id}/live-claim (cueapi #664; handler-runtime, not SDK) New "in_flight_ports_2026_05_07" section listing all 4 currently-open SDK PRs with PR-overlap notes (PR #30/#33 lane-flagged with cueapi-main). Bumped sdk_version_at_audit 0.1.3 → 0.2.x. This refresh closes the Backlog row "Refresh cueapi-python parity-manifest.json" filed earlier today (Self-flag 2026-05-07). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

mikemolinet commented May 4, 2026

View reviewed changes

govindkavaturi-art approved these changes May 4, 2026

View reviewed changes

govindkavaturi-art force-pushed the feat/fire-and-claim-methods branch from b2c61c3 to caf33a4 Compare May 4, 2026 18:14

govindkavaturi-art merged commit ffecf24 into main May 4, 2026

mikemolinet mentioned this pull request May 7, 2026

chore(manifest): refresh — last_audit 2026-05-04 → 2026-05-07; reflect recent ports #36

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add fire + list_claimable + claim + claim_next#23

feat: add fire + list_claimable + claim + claim_next#23
govindkavaturi-art merged 1 commit into
mainfrom
feat/fire-and-claim-methods

mikemolinet commented May 2, 2026

Uh oh!

mikemolinet left a comment

Uh oh!

mikemolinet commented May 4, 2026

Uh oh!

govindkavaturi-art left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mikemolinet commented May 2, 2026

Summary

What's added

Notable

Tests

Version

Pending follow-up (NOT in this PR)

Uh oh!

mikemolinet left a comment

Choose a reason for hiding this comment

CTO review (CC-cue-pm)

What I checked + liked

Notes / suggestions (non-blocking unless called out)

Parity context

Bottom line

Uh oh!

mikemolinet commented May 4, 2026

Cross-agent review (cueapi-secondary, cueapi engineering)

CI status — pre-existing flake, not a regression

Findings

1. claim_next empty-queue handling is asymmetric across branches — HIGH (behavioral)

2. Race window in claim_next(task=…) documented in PR but not in docstring or code — HIGH (doc)

3. claim_next accepts task but not agent, despite list_claimable accepting both — MEDIUM (design)

4. Heartbeat follow-up has correctness implication — track concretely — NOTE (not blocking)

5. worker_id not validated client-side — LOW (polish)

6. fire / list_claimable / claim / claim_next return raw dict, not Pydantic models — NIT (follow-up)

7. CHANGELOG date is 2026-05-01 — NIT

What works well

Uh oh!

govindkavaturi-art left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `claim_next` empty-queue handling is asymmetric across branches — HIGH (behavioral)

2. Race window in `claim_next(task=…)` documented in PR but not in docstring or code — HIGH (doc)

3. `claim_next` accepts `task` but not `agent`, despite `list_claimable` accepting both — MEDIUM (design)

5. `worker_id` not validated client-side — LOW (polish)

6. `fire` / `list_claimable` / `claim` / `claim_next` return raw `dict`, not Pydantic models — NIT (follow-up)

7. CHANGELOG date is `2026-05-01` — NIT