Context
Fast lane runs currently bypass concurrency limits entirely. This is safe at current volumes — production data shows a peak of one simultaneous in-flight run across all sync webhook users — but it means there's no upper bound on how many fast lane runs a given project can have in flight.
As adoption grows, we need admission control at the point of entry (the webhook endpoint) to prevent any single project from monopolising fast lane worker capacity.
Desired behaviour
When a sync webhook request arrives and the project has exceeded its fast lane limit:
- Reject immediately — do not enqueue the run. The run should never enter available state.
- Return 429 Too Many Requests with standard rate limiting headers (Retry-After, RateLimit-Limit).
- The caller can then implement backoff/retry, same as any rate-limited API.
This should feel predictable to the caller — if you've got a script firing requests in a loop, you get a clear signal to back off rather than silently queueing work that may never complete within the sync timeout.
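To make the rejection path concrete, here is a minimal sketch of the admission decision at the webhook endpoint. All names (handle_sync_webhook, FAST_LANE_LIMIT, RETRY_AFTER_SECONDS, the dict-shaped response) are illustrative assumptions, not the actual codebase's API:

```python
# Hypothetical sketch: reject before enqueuing so the run never
# enters available state. Limits and header values are placeholders.
FAST_LANE_LIMIT = 5        # assumed per-project cap
RETRY_AFTER_SECONDS = 2    # assumed backoff hint for the caller

def handle_sync_webhook(project_id, in_flight_counts, enqueue):
    """Admit or reject a sync webhook request before any run is created."""
    if in_flight_counts.get(project_id, 0) >= FAST_LANE_LIMIT:
        # Reject up front: nothing is enqueued, so the caller gets a
        # clear back-off signal instead of silently queued work.
        return {
            "status": 429,
            "headers": {
                "Retry-After": str(RETRY_AFTER_SECONDS),
                "RateLimit-Limit": str(FAST_LANE_LIMIT),
            },
        }
    run_id = enqueue(project_id)
    return {"status": 200, "run_id": run_id}
```

A caller seeing 429 can apply standard backoff/retry logic keyed off Retry-After, exactly as with any rate-limited API.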
What we need
1. Per-project fast lane limit
A configurable cap on how many fast lane runs can be in-flight simultaneously for a given project. At minimum, it should be settable by super admins.
This is not the same as the existing project concurrency — it's a separate limit for the fast lane queue specifically.
2. Realtime capacity check at the webhook endpoint
Before enqueuing a sync run, the webhook handler needs to answer two questions:
- Has this project/workflow hit its fast lane limit? — a count of in-flight fast lane runs against the configured cap.
- Are there any fast lane worker slots available? — if no worker can pick this up, there's no point enqueuing it.
These are distinct checks. The first is about per-project policy, the second is about cluster-level capacity. Both need to be fast (webhook callers are waiting) and accurate enough to avoid over-admission.
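The two checks can be sketched as independent gates evaluated in order. This is a language-agnostic sketch; the function and reason strings are hypothetical, and the inputs are assumed to be snapshots from whatever counting mechanism is chosen:

```python
def admit(project_inflight, project_limit, free_worker_slots):
    """Two distinct gates: per-project policy first, then cluster capacity.

    Returns (admitted, reason) where reason explains a rejection.
    """
    # Gate 1: per-project policy — has this project hit its fast lane limit?
    if project_inflight >= project_limit:
        return (False, "project_limit")
    # Gate 2: cluster-level capacity — can any worker actually pick this up?
    if free_worker_slots <= 0:
        return (False, "no_capacity")
    return (True, None)
```

Keeping the gates separate matters for the response: a project-limit rejection is the caller's fault (back off), while a capacity rejection reflects cluster load, and the two may warrant different Retry-After values.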
Some mechanisms to consider:
- Database counters for the per-project limit (simple, consistent, but adds a write to the hot path)
- ETS/PubSub counters for a faster local view of in-flight counts (eventually consistent across nodes)
- Worker capacity feedback via the claim protocol for the slot availability question (workers are the authority on their own capacity, but stale between claim cycles)
These aren't mutually exclusive — the solution will likely combine more than one approach.
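As one illustration of how a fast local view could combine with an authoritative store, here is a sketch of an eventually-consistent in-flight counter. The class stands in for a per-node ETS table; reconcile() represents a periodic correction from the database. All names are hypothetical:

```python
class InFlightTracker:
    """Sketch of a node-local in-flight counter (ETS-like), reconciled
    periodically against an authoritative source such as the database.
    Eventually consistent: drift is bounded by the reconcile interval."""

    def __init__(self):
        self.counts = {}

    def increment(self, project_id):
        # Called when a fast lane run is admitted on this node.
        self.counts[project_id] = self.counts.get(project_id, 0) + 1

    def decrement(self, project_id):
        # Called when a run completes; floor at zero to absorb drift.
        self.counts[project_id] = max(0, self.counts.get(project_id, 0) - 1)

    def reconcile(self, authoritative_counts):
        # Periodic overwrite from the authoritative store corrects any
        # cross-node drift — the "accurate enough" trade-off in practice.
        self.counts = dict(authoritative_counts)

    def over_limit(self, project_id, limit):
        # Fast local read on the webhook hot path; no database write.
        return self.counts.get(project_id, 0) >= limit
```

The trade-off this sketch makes explicit: the hot path stays read-only and local, at the cost of brief over- or under-admission between reconciliations.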
Out of scope
- Changes to the existing concurrency exemption — that stays as-is
- Round-robin claim efficiency for dedicated queue workers (tracked separately)
Related