Fast lane admission control: per-project rate limiting at the webhook endpoint #4573

@stuartc

Description
Context

Fast lane runs currently bypass concurrency limits entirely. This is safe at current volumes — production data shows a peak of 1 simultaneous in-flight run across all sync webhook users — but it means there's no upper bound on how many fast lane runs a given project can have in flight.

As adoption grows, we need admission control at the point of entry (the webhook endpoint) to prevent any single project from monopolising fast lane worker capacity.

Desired behaviour

When a sync webhook request arrives and the project has exceeded its fast lane limit:

  1. Reject immediately — do not enqueue the run. The run should never enter available state.
  2. Return 429 Too Many Requests with standard rate limiting headers (Retry-After, RateLimit-Limit).
  3. The caller can then implement backoff/retry, same as any rate-limited API.

This should feel predictable to the caller — if you've got a script firing requests in a loop, you get a clear signal to back off rather than silently queueing work that may never complete within the sync timeout.
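The rejection path can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Lightning webhook handler — the function and field names are hypothetical, and the real implementation would live in the Elixir webhook pipeline:

```python
def handle_sync_webhook(project_id, in_flight_count, fast_lane_limit, retry_after_s=30):
    """Admission control at the webhook endpoint: reject before enqueuing.

    If the project is at its fast lane cap, the run is never enqueued,
    so it never enters the `available` state.
    """
    if in_flight_count >= fast_lane_limit:
        return {
            "status": 429,  # Too Many Requests
            "headers": {
                "Retry-After": str(retry_after_s),
                "RateLimit-Limit": str(fast_lane_limit),
            },
            "body": f"fast lane limit reached for project {project_id}",
        }
    # Under the limit: enqueue the run as usual (elided here).
    return {"status": 200, "headers": {}, "body": "run enqueued"}
```

A caller seeing 429 backs off for `Retry-After` seconds and retries, exactly as with any rate-limited API.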

What we need

1. Per-project fast lane limit

A configurable cap on how many fast lane runs can be in-flight simultaneously for a given project. At minimum, this should be settable by super admins.

This is not the same as the existing project concurrency — it's a separate limit for the fast lane queue specifically.

2. Realtime capacity check at the webhook endpoint

Before enqueuing a sync run, the webhook handler needs to answer two questions:

  1. Has this project/workflow hit its fast lane limit? — a count of in-flight fast lane runs against the configured cap.
  2. Are there any fast lane worker slots available? — if no worker can pick this up, there's no point enqueuing it.

These are distinct checks. The first is about per-project policy, the second is about cluster-level capacity. Both need to be fast (webhook callers are waiting) and accurate enough to avoid over-admission.
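The two gates can be sketched as a single admission function. Hypothetical names, in Python for illustration — the real check would run inside the Elixir webhook handler:

```python
def admit(project_in_flight, project_limit, free_worker_slots):
    """Two distinct gates, checked in order.

    Gate 1: per-project policy — has this project hit its fast lane cap?
    Gate 2: cluster capacity — is any fast lane worker slot free?
    Returns (admitted, reason) so the endpoint can build the right response.
    """
    if project_in_flight >= project_limit:
        return (False, "project fast lane limit reached")
    if free_worker_slots <= 0:
        return (False, "no fast lane worker slots available")
    return (True, "admitted")
```

Only the first rejection reason warrants a 429 with `RateLimit-Limit`; the second is a capacity signal and might carry only `Retry-After`.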

Some mechanisms to consider:

  • Database counters for the per-project limit (simple, consistent, but adds a write to the hot path)
  • ETS/PubSub counters for a faster local view of in-flight counts (eventually consistent across nodes)
  • Worker capacity feedback via the claim protocol for the slot availability question (workers are the authority on their own capacity, but stale between claim cycles)

These aren't mutually exclusive — the solution will likely combine more than one approach.
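Whichever store holds the count, the important detail is that the check and the increment happen atomically, otherwise two concurrent webhooks can both pass the check and over-admit. A minimal single-node sketch in Python (a lock standing in for what would be an ETS table or database row lock in practice; all names hypothetical):

```python
import threading

class FastLaneCounter:
    """Local in-flight counter per project, analogous to one node's ETS view.

    try_admit does check-and-increment under a lock, so concurrent
    webhook requests on this node cannot race past the cap. Cross-node
    consistency would still need PubSub syncing or a shared DB counter.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def try_admit(self, project_id, limit):
        with self._lock:
            n = self._counts.get(project_id, 0)
            if n >= limit:
                return False  # at the cap: caller returns 429
            self._counts[project_id] = n + 1
            return True

    def release(self, project_id):
        """Called when a fast lane run completes or fails."""
        with self._lock:
            n = self._counts.get(project_id, 0)
            self._counts[project_id] = max(0, n - 1)
```

The `release` on completion is what keeps the counter honest; a missed release (crashed run) would need a reconciliation sweep against the runs table.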

Out of scope

  • Changes to the existing concurrency exemption — that stays as-is
  • Round-robin claim efficiency for dedicated queue workers (tracked separately)
