docs: add state machine design doc (0005) #731

Open
pgrayy wants to merge 3 commits into strands-agents:main from pgrayy:designs/0005-state-machine
@pgrayy (Member) commented Apr 2, 2026

Description

Proposes restructuring the Agent loop into discrete steps coordinated by an orchestrator. The design decomposes the loop into five layers (Clients, Steps, Middleware, Plugins, Orchestrators) while keeping the public API unchanged.

Covers capabilities enabled by this decomposition: cross-cutting middleware (e.g., guardrails), checkpointing for durable execution, sub-orchestration for swappable execution strategies, and isolated invocation state for concurrent use.

github-actions bot (Contributor) commented Apr 2, 2026

Documentation Preview Ready

Your documentation preview has been successfully deployed!

Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-731/docs/user-guide/quickstart/overview/

Updated at: 2026-04-03T16:59:40.829Z

}
```

`AgentContext` holds read-only dependencies:

Contributor

nit: Having both AgentState and AgentContext feels a bit confusing to me. Also, context is an overloaded term in AI, imho. When I hear "agent context", the first thing I think of is the messages array.

Member Author

As I note in my comment below, I probably shouldn't have bothered introducing AgentContext. Everything could just be AgentState, which is something we already discussed as a team in #551.


### Plugins

Plugins register hook callbacks to observe and indirectly influence step execution. The SDK fires lifecycle events (e.g., `BeforeModelCallEvent`, `AfterToolCallEvent`) at the appropriate points, and plugin callbacks react to them by setting flags like `retry` or `cancel` that the step or middleware responds to.

Contributor

Is this different from Python plugins? Those also contain tools. I see plugins as a bundling of core agent concepts (tools, hooks, etc.).

| `@traced` | Creates a telemetry span around the step, records result or error |
| `@retryable` | Retries the step on transient errors with configurable backoff |

**Custom middleware** is user-provided via the `middleware` param on the Agent constructor. It implements the `Middleware` interface:

Contributor

What are the differences between hooks and middleware?

Contributor

I guess hooks trigger on complete events, whereas middleware runs on streaming? So would middleware be somewhat similar to ChunkStreamEvent?

Contributor

Hooks are based on events; I think middleware lets you touch the functions themselves?

Member Author

Middleware would be similar to a plugin that performs some action on the before and after events for a specific step. It is more flexible, however, in that it can directly alter execution by transforming inputs and outputs. With plugins, we would have to build logic into the steps themselves to react to transformations of inputs and outputs (think of event.cancelTool).
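
To make the distinction above concrete, here is a minimal sketch. The `Step`, `Middleware`, `Hook`, and `BeforeToolCallEvent` shapes are assumptions made up for illustration and are not taken from the design doc. The hook can only set a flag that the step itself must check, whereas the middleware wraps the step and transforms its input and output without the step knowing.

```typescript
// Hypothetical shapes for illustration only; not the SDK's actual API.
type Step<I, O> = (input: I) => Promise<O>
type Middleware<I, O> = (step: Step<I, O>) => Step<I, O>

// Hook style: the callback sets a flag, and the step has to honor it itself.
interface BeforeToolCallEvent {
  toolName: string
  cancel: boolean
}
type Hook = (event: BeforeToolCallEvent) => void

function makeToolStep(hooks: Hook[]): Step<string, string> {
  return async (toolName) => {
    const event: BeforeToolCallEvent = { toolName, cancel: false }
    for (const hook of hooks) hook(event)
    // The reaction to the flag lives inside the step.
    if (event.cancel) return `cancelled:${toolName}`
    return `ran:${toolName}`
  }
}

// Middleware style: the wrapper rewrites input and output; the step is unaware.
const normalize: Middleware<string, string> = (step) => async (input) => {
  const output = await step(input.trim())
  return output.toUpperCase()
}

// A hook that vetoes one specific tool by setting the cancel flag.
const blockDelete: Hook = (event) => {
  if (event.toolName === 'delete') event.cancel = true
}

const hooked = makeToolStep([blockDelete])
const wrapped = normalize(makeToolStep([]))
```

In this sketch, adding a new kind of transformation only requires a new wrapper, while adding a new kind of hook reaction requires editing `makeToolStep`.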


### Isolated Invocation State

Each invocation gets its own `AgentState` instance. Steps receive state explicitly, so concurrent invocations on the same agent don't share mutable data:

Contributor

But do we deep copy the state? I mean, if we are sharing through references, this doesn't work, right?

Member Author (@pgrayy, Apr 3, 2026)

I would defer to #551. I should have been clearer that we already discussed this state design in that doc, to avoid confusion. It is the same idea in principle.
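
The concern about shared references can be illustrated with a small sketch. The `AgentState` and `Message` shapes below are assumptions for illustration only; how state is actually copied is deferred to #551 and is not specified in this thread. The point is just that a shallow copy still shares the nested `messages` array, while a deep copy does not.

```typescript
// Hypothetical state shape for illustration; the real design is deferred to #551.
interface Message {
  role: string
  text: string
}
interface AgentState {
  messages: Message[]
}

const base: AgentState = { messages: [{ role: 'user', text: 'hi' }] }

// Shallow copy: a new outer object, but the same messages array underneath.
const shallow: AgentState = { ...base }

// Deep copy via a JSON round-trip, which is sufficient for plain data like this.
// (structuredClone is an alternative on runtimes that provide it.)
const deep: AgentState = JSON.parse(JSON.stringify(base)) as AgentState

// Mutating the shallow copy is visible through base; mutating the deep copy is not.
shallow.messages.push({ role: 'assistant', text: 'leaks into base' })
deep.messages.push({ role: 'assistant', text: 'stays isolated' })
```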

`AgentContext` holds read-only dependencies:

```typescript
interface AgentContext {

Contributor

I am not sure any of the params here are truly immutable, though, e.g.:

  • model: people might update model IDs, etc., based on throttling. This is a classic application concern.
  • system prompt: it is updated with skills, and also to inject context in agent applications.
  • tool registry: one of the biggest topics is tool context bloat and tool selectors, meaning we can expect the tools to be dynamic too.

Contributor

e.g. strands-agents/tools#389 and strands-agents/tools#390

This is all about context management, and to manage context you need to make it dynamic.

Member Author

Yeah, and we probably don't really need to distinguish. We could just have AgentState, with no need for AgentContext.


There are two kinds:

**Built-in middleware** ships with the SDK and is always present. It's configured through state or context at runtime. One possible way to manage built-in middleware is via decorator syntax (`@`) on step class methods, though the exact mechanism is an implementation detail. Examples:

Member Author

Comment from @zastrowm

nit: this is more of an implementation detail, and one I might push back against.
The idea is more about built-in middleware.

class RateLimiter implements Middleware {
constructor(private _maxPerSecond: number) {}

wrap(step: Step): Step {

Member Author

Comment from @zastrowm

This presumes that we have all steps pre-defined on this interface. Is there any way we can make it more generic, like the way hooks are (e.g. anyone can define a hook)?

I'm thinking of something like:

agent.addMiddleware(ModelInvocationState, async function* (ctx, state, next) {
    await this._acquireToken()
    yield* next(ctx, state)
})
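
One way the generic registration suggested above could look is sketched below. Everything here is hypothetical: `MiddlewareRegistry`, `add`, `wrap`, and the string step names are invented for illustration and are not part of the proposal. The sketch keys middleware by step name so that anyone can register against any step, including custom ones, and composes them around the step's stream.

```typescript
// Hypothetical sketch; names and signatures are assumptions, not the proposed API.
type State = Record<string, unknown>
type Stream<E> = AsyncGenerator<E, void, undefined>
type Handler<E> = (state: State) => Stream<E>
type MiddlewareFn<E> = (state: State, next: Handler<E>) => Stream<E>

class MiddlewareRegistry<E> {
  private readonly _byStep = new Map<string, MiddlewareFn<E>[]>()

  // Anyone can register middleware for any step name, including custom steps.
  add(stepName: string, fn: MiddlewareFn<E>): void {
    const list = this._byStep.get(stepName) ?? []
    list.push(fn)
    this._byStep.set(stepName, list)
  }

  // Compose so that the first registered middleware is the outermost wrapper.
  wrap(stepName: string, core: Handler<E>): Handler<E> {
    const list = this._byStep.get(stepName) ?? []
    return list.reduceRight<Handler<E>>((next, fn) => (state) => fn(state, next), core)
  }
}

const registry = new MiddlewareRegistry<string>()
registry.add('model', async function* (state, next) {
  yield 'before'
  yield* next(state) // forward every event from the wrapped step
  yield 'after'
})

// Stand-in for a step that streams a single event.
async function* modelStep(): Stream<string> {
  yield 'token'
}

// Helper that drains a stream into an array.
async function collect(stream: Stream<string>): Promise<string[]> {
  const out: string[] = []
  for await (const event of stream) out.push(event)
  return out
}
```

A step with no registered middleware is returned unchanged by `wrap`, so the registry adds no behavior unless something has been registered for that step name.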


### Plugins

Plugins register hook callbacks to observe and indirectly influence step execution. The SDK fires lifecycle events (e.g., `BeforeModelCallEvent`, `AfterToolCallEvent`) at the appropriate points, and plugin callbacks react to them by setting flags like `retry` or `cancel` that the step or middleware responds to.

Member Author

Comment from @zastrowm

I'd be curious to know when hooks are fired. Are they always at the "built-in" layer?


### Orchestrators

`Orchestrator` is a generic base class that coordinates steps and other orchestrators. Like `Step`, it provides `invoke` derived from `stream`. Orchestrators can nest: a parent orchestrator treats a sub-orchestrator the same as a step.

Member Author

Comment from @zastrowm

Is there any way that an Orchestrator could be a middleware, or be refactored to be implemented via middleware?
I guess I'm wondering whether we have to introduce Orchestrators in addition to Middleware, or whether we can simplify to just middleware.
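
For readers following this thread, the shared interface under discussion (`invoke` derived from `stream`, and orchestrators nesting like steps) might be sketched as follows. The `Streamable` base class, `EchoStep`, and `SequenceOrchestrator` are hypothetical names chosen for illustration; the doc's actual `Step` and `Orchestrator` signatures may differ.

```typescript
// Hypothetical base class; the real Step/Orchestrator signatures may differ.
abstract class Streamable<E, R> {
  // Subclasses yield events and return a final result.
  abstract stream(input: string): AsyncGenerator<E, R, undefined>

  // invoke drains the stream, discarding events and keeping the return value.
  async invoke(input: string): Promise<R> {
    const gen = this.stream(input)
    for (;;) {
      const next = await gen.next()
      if (next.done) return next.value
    }
  }
}

// A leaf step: emits one event per character and returns the input length.
class EchoStep extends Streamable<string, number> {
  async *stream(input: string): AsyncGenerator<string, number, undefined> {
    for (const ch of input.split('')) yield ch
    return input.length
  }
}

// An orchestrator has the same interface, so its children may be steps
// or other orchestrators.
class SequenceOrchestrator extends Streamable<string, number> {
  constructor(private readonly _children: Streamable<string, number>[]) {
    super()
  }

  async *stream(input: string): AsyncGenerator<string, number, undefined> {
    let total = 0
    for (const child of this._children) {
      // yield* forwards the child's events and evaluates to its return value.
      total += yield* child.stream(input)
    }
    return total
  }
}

const nested = new SequenceOrchestrator([
  new EchoStep(),
  new SequenceOrchestrator([new EchoStep(), new EchoStep()]),
])
```

Because the parent only calls `child.stream(input)`, it cannot tell whether a child is a step or a nested orchestrator, which is the property the design relies on.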

}
}
}
```

Member Author

We can do this today, of course, but what I really want to highlight is that it should be easier to set up if we had the right abstractions. Doing this in Python today, for example, requires placing awkward if/else blocks that wrap large amounts of code.

Contributor

+1, this also enables more fine-grained cancellation, I guess?


Because orchestrators and steps share the same `invoke`/`stream` interface, any slot in the step sequence can be a sub-orchestrator that coordinates its own steps internally. The agent loop doesn't distinguish between the two.

Tool execution is one example. The default `ToolOrchestrator` runs tools sequentially, but swapping in a `ConcurrentToolOrchestrator` changes the execution strategy without touching `ToolStep` or the agent loop:

Member Author

Comment from @zastrowm

I like this idea, but, poking at it again, I wonder if we can abstract this to be middleware. E.g. ToolOrchestrator is middleware that takes in a tool executor.
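
As a rough illustration of the swap being discussed, the sketch below puts a sequential and a concurrent strategy behind one interface. Only the class names `ToolOrchestrator` and `ConcurrentToolOrchestrator` come from the design doc; `ToolOrchestratorLike`, `ToolCall`, `RunTool`, and `executeTools` are hypothetical, and `runTool` stands in for whatever `ToolStep` would do.

```typescript
// Hypothetical types; only the two orchestrator class names come from the doc.
type ToolCall = { name: string; delayMs: number }
type RunTool = (call: ToolCall) => Promise<string>

interface ToolOrchestratorLike {
  run(calls: ToolCall[], runTool: RunTool): Promise<string[]>
}

// Sequential strategy: each tool starts only after the previous one finishes.
class ToolOrchestrator implements ToolOrchestratorLike {
  async run(calls: ToolCall[], runTool: RunTool): Promise<string[]> {
    const results: string[] = []
    for (const call of calls) results.push(await runTool(call))
    return results
  }
}

// Concurrent strategy: all tools start at once; results keep the input order.
class ConcurrentToolOrchestrator implements ToolOrchestratorLike {
  run(calls: ToolCall[], runTool: RunTool): Promise<string[]> {
    return Promise.all(calls.map((call) => runTool(call)))
  }
}

// The caller is written once against the shared interface.
async function executeTools(
  orchestrator: ToolOrchestratorLike,
  calls: ToolCall[],
  log: string[],
): Promise<string[]> {
  // Stand-in for ToolStep: logs start and end around a simulated delay.
  const runTool: RunTool = async (call) => {
    log.push(`start:${call.name}`)
    await new Promise<void>((resolve) => setTimeout(resolve, call.delayMs))
    log.push(`end:${call.name}`)
    return call.name
  }
  return orchestrator.run(calls, runTool)
}
```

Neither `executeTools` nor `runTool` changes when the strategy is swapped; only the object passed as `orchestrator` does.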


```typescript
interface AgentContext {
readonly model: Model

Contributor

Will we need a separate Bidi agent context? How does this idea work with bidi?

Member Author (@pgrayy, Apr 3, 2026)

For BidiAgent, I don't think so. I think there are enough similarities for them to share. I would need to think about it some more, though. I know we had trouble unifying them before, but I would like us to take another stab at it.

With all that said, it is to be expected that other orchestration patterns would require a separate state/context. Graph is an example: I wouldn't expect Graph to use AgentState. I would do as we do today and keep MultiAgentState.


Clients are stateless, reusable, and unaware of the agent loop.

### Steps

Contributor

What is the thinking around step concurrency in this mental model?

Member Author

That is all up to the orchestrator to define. And as an example of how this could look following this pattern, I would recommend checking out https://github.com/strands-agents/sdk-typescript/blob/main/src/multiagent/graph.ts. It utilizes a queue to organize concurrency of node executions (step executions) inside a graph (orchestrator).
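
As a loose illustration of an orchestrator deciding concurrency, the sketch below starts every task whose dependencies are complete and waits for whichever finishes first. It does not reproduce the linked graph.ts; `Task`, `runGraph`, and the scheduling loop are assumptions written for this example.

```typescript
// Hypothetical sketch; this does not reproduce the linked graph.ts.
type Task = { id: string; deps: string[] }

async function runGraph(tasks: Task[], run: (id: string) => Promise<void>): Promise<string[]> {
  const done = new Set<string>()
  const started = new Set<string>()
  const order: string[] = []
  const inFlight = new Map<string, Promise<string>>()

  while (done.size < tasks.length) {
    // Start every task whose dependencies have all completed.
    for (const task of tasks) {
      if (!started.has(task.id) && task.deps.every((d) => done.has(d))) {
        started.add(task.id)
        inFlight.set(
          task.id,
          run(task.id).then(() => task.id),
        )
      }
    }
    if (inFlight.size === 0) throw new Error('cycle or missing dependency')

    // Wait for whichever running task finishes first, then re-scan.
    const finished = await Promise.race(Array.from(inFlight.values()))
    inFlight.delete(finished)
    done.add(finished)
    order.push(finished)
  }
  // Completion order, which reflects how the tasks overlapped.
  return order
}
```

The steps themselves stay unaware of scheduling; all concurrency decisions live in the loop that plays the orchestrator role.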

**ModelStep**: calls the LLM, yields streaming events, and returns the stop reason and message.

```typescript
class ModelStep extends AgentStep<ModelStreamEvent, ModelStepResult> {

Contributor

Do you expect a MultiagentStep in the future? If so, will ModelStep extend MultiagentStep?

```typescript
class ToolStep extends AgentStep<ToolStreamEvent, ToolStepResult> {
readonly name = 'tool'

Contributor

Maybe we could add a status enum as a field?
From there we could derive the transition step (just imagine).
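
One possible reading of this suggestion is sketched below. The `StepStatus` enum, the `transitions` table, and the `transition` helper are entirely hypothetical; nothing like them appears in the design doc. The sketch shows how a status field plus a table of allowed moves could be used to validate or derive the next state.

```typescript
// Hypothetical enum and transition table; neither appears in the design doc.
enum StepStatus {
  Pending = 'pending',
  Running = 'running',
  Completed = 'completed',
  Failed = 'failed',
}

// For each status, the statuses it is allowed to move to.
const transitions: Record<StepStatus, StepStatus[]> = {
  [StepStatus.Pending]: [StepStatus.Running],
  [StepStatus.Running]: [StepStatus.Completed, StepStatus.Failed],
  [StepStatus.Completed]: [],
  [StepStatus.Failed]: [StepStatus.Pending], // assumption: failure allows a retry
}

// Returns the new status, or throws if the move is not in the table.
function transition(from: StepStatus, to: StepStatus): StepStatus {
  if (transitions[from].indexOf(to) === -1) {
    throw new Error(`illegal transition: ${from} -> ${to}`)
  }
  return to
}
```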

4 participants