Extend nitro-testnode to run consensus and execution separately #177

Open
bragaigor wants to merge 7 commits into master from braga/consensus-exec-diff-proc

@bragaigor bragaigor commented Feb 13, 2026

Extend nitro-testnode to run consensus and execution in different processes

relates to NIT-4241
fixes NIT-4202

Pulled by OffchainLabs/nitro#4386

Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
@bragaigor bragaigor force-pushed the braga/consensus-exec-diff-proc branch from 52beb21 to 3b1d3b7 on February 19, 2026 16:56
@bragaigor bragaigor changed the base branch from release to master February 19, 2026 16:56
Comment on lines +227 to +248
  regular-follower-node:
    pid: host # allow debugging
    image: nitro-node-dev-testnode
    entrypoint: /usr/local/bin/nitro
    ports:
      - "127.0.0.1:7447:8547"
      - "127.0.0.1:7548:8548"
    volumes:
      - "seqdata:/home/user/.arbitrum/local/nitro"
      - "l1keystore:/home/user/l1keystore"
      - "config:/config"
      - "tokenbridge-data:/tokenbridge-data"
    command:
      - --conf.file=/config/consensus_config.json
      - --http.api=net,web3,eth,txpool,debug,timeboost,auctioneer
      - --node.seq-coordinator.my-url=http://sequencer:8547
      - --http.api=net,web3,eth,txpool,debug,timeboost,auctioneer
      - --graphql.enable
      - --graphql.vhosts=*
      - --graphql.corsdomain=*
    depends_on:
      - geth
Author

We don't really need this, but it was very helpful in debugging. I've added it behind the --follower-node flag, but let me know if you prefer to remove it.

Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Extends nitro-testnode to support running a follower node with Nitro consensus and execution split across separate processes (communicating over RPC), and adds a convenience flag for starting a regular follower node.

Changes:

  • Add --run-consensus-and-execution-in-different-processes and --follower-node flags to test-node.bash to start additional follower node services in simple mode.
  • Generate an additional consensus_config.json in scripts/config.ts for the new follower node services.
  • Add new docker-compose services (consensus-follower-node, execution-follower-node, regular-follower-node) and corresponding volumes, plus a small README update.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| test-node.bash | Adds CLI flags/help text and wires them into the NODES set; adjusts init logging/sleep ordering. |
| scripts/config.ts | Writes a new consensus_config.json derived from the base config. |
| docker-compose.yaml | Introduces new follower node services for split consensus/execution and a regular follower node, with new volumes/ports. |
| README.md | Documents how to print an account private key via the print-private-key script. |

Member

@pmikolajczyk41 pmikolajczyk41 left a comment

So running ./test-node.bash --dev --init-force --run-consensus-and-execution-in-different-processes --simple did indeed spawn two follower nodes. Their logs are not very dynamic, but at least the consensus one seems to make progress, with lines like:

INFO [03-13|10:00:32.413] InboxTracker                             sequencerBatchCount=9 messageCount=472 l1Block=414 l1Timestamp=2026-03-13T10:00:32+0000

for the execution follower, I guess, I'd have to throw some RPC requests at it


the only thing that looked suspicious was miscommunication between the two:

# consensus follower:
INFO [03-13|09:59:02.958] Reading message result remotely.         msgIdx=294
INFO [03-13|09:59:02.958] rpc response                             method=nitroexecution_resultAtMessageIndex logId=2794 err="result not found"             result=null attempt=0 args=[294] errorData=null
ERROR[03-13|09:59:02.958] Error getting message result             msgIdx=294 err="result not found"

# execution follower
WARN [03-13|09:59:02.958] Served nitroexecution_resultAtMessageIndex reqid=2794 duration="43.625µs" err="result not found"

and this continues to happen every 1 to 2 minutes; in particular, the logs from the exec follower look like this:

WARN [03-13|09:59:02.958] Served nitroexecution_resultAtMessageIndex reqid=2794 duration="43.625µs" err="result not found"
WARN [03-13|10:01:02.619] Served nitroexecution_resultAtMessageIndex reqid=5215 duration="40.541µs" err="result not found"
WARN [03-13|10:02:33.140] Served nitroexecution_resultAtMessageIndex reqid=7036 duration="40.166µs" err="result not found"
WARN [03-13|10:03:33.533] Served nitroexecution_resultAtMessageIndex reqid=8318 duration="37.75µs"  err="result not found"

@bragaigor
Author

the only thing that looked suspicious was miscommunication between the two:

I noticed that as well. I'm not sure whether it is related to the docker configuration or whether we're missing something in the code. Maybe @diegoximenes could shed some light? The request above concerns the _resultAtMessageIndex endpoint, but I don't see any other requests being made, unless they are not being logged. Do we still need OffchainLabs/nitro#4264 to make some of that work, maybe?

I did some further investigation (added some logs to the client (consensus-follower) and the server (execution-follower)), and I was able to see constant successful communication between them; I was able to spot the following endpoint requests:

  • _markFeedStart
  • _headMessageIndex
  • _setConsensusSyncData
  • _setFinalityData
  • _digestMessage

all of which completed successfully. So it seems related to something on the server side (execution-follower-node) specific to _resultAtMessageIndex. After some more digging, it seems that whenever a _resultAtMessageIndex request is made while execution is behind consensus, execution can't find the block. Here are some of the logs I added on the execution side:

| Time | Requested msgIdx | currentHead | Gap |
| --- | --- | --- | --- |
| 13:18:59 | 137 | 128 | 9 |
| 13:20:29 | 314 | 257 | 57 |
| 13:21:59 | 492 | 475 | 17 |

Is it expected for currentHead to always be behind the requested msgIdx? I got currentHead by asking the blockchain via s.bc.CurrentBlock(), while msgIdx is the index requested by the client.

CC: @pmikolajczyk41

Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
pmikolajczyk41 previously approved these changes Mar 13, 2026
Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
Comment on lines +431 to +432
// Use a different persistent.chain name than the sequencer ("local") to avoid
// storage lock conflicts between services sharing the same volume.
Contributor

This comment is confusing to me.
Which services are sharing the same volume?

Author

When either execution-follower-node or regular-follower-node runs against a shared volume, the node acquires a file lock there:
https://github.com/OffchainLabs/go-ethereum/blob/c21e595c40ec1fdc84d4a73fb6e52d9679597698/node/node.go#L322-L330
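
The naming workaround described in this thread could be sketched roughly as follows. This is not the code from scripts/config.ts: the `NodeConfig` shape, the `withChainName` helper and the `"follower"` name are hypothetical, and only the `persistent.chain` key and the sequencer's `"local"` value come from the comment under review.

```typescript
// Hypothetical shape: only the field this sketch touches is typed.
interface NodeConfig {
  persistent: { chain: string };
  [key: string]: unknown;
}

// Return a copy of the base config whose persistent.chain differs from the
// base one, so two services on one volume do not open the same locked path.
function withChainName(base: NodeConfig, chainName: string): NodeConfig {
  if (chainName === base.persistent.chain) {
    throw new Error(`chain name "${chainName}" is already used by the base config`);
  }
  // Deep copy via JSON so the base config object is left untouched.
  const copy = JSON.parse(JSON.stringify(base)) as NodeConfig;
  copy.persistent.chain = chainName;
  return copy;
}

// "local" is the sequencer's name per the review comment; "follower" is illustrative.
const baseConfig: NodeConfig = { persistent: { chain: "local" } };
const followerConfig = withChainName(baseConfig, "follower");
```

Rejecting a name equal to the base one makes the conflict fail at config-generation time rather than when the second container tries to take the lock.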

- --conf.file=/config/base_node_config.json
- --node.feed.output.enable
- --http.api=net,web3,eth,txpool,debug,timeboost,auctioneer
- --node.seq-coordinator.my-url=http://sequencer:8547
Contributor

not necessary

Author

only kept --conf.file=

- "tokenbridge-data:/tokenbridge-data"
command:
- --conf.file=/config/base_node_config.json
- --node.seq-coordinator.my-url=http://sequencer:8547
Contributor

not necessary

Author

only kept --conf.file=

- "tokenbridge-data:/tokenbridge-data"
command:
- --conf.file=/config/base_node_config.json
- --node.seq-coordinator.my-url=http://sequencer:8547
Contributor

not necessary

Author

only kept --conf.file=

README.md Outdated

### `--run-consensus-and-execution-in-different-processes`

Adds a **split-process follower** where consensus and execution run as separate containers communicating over authenticated RPC.
Contributor

It is not authenticated; this is the flag being used: --node.rpc-server.authenticated=false

@diegoximenes
Contributor

diegoximenes commented Mar 23, 2026

Regarding "Error getting message result": this is related to this task.

Regarding the gap between the requested msgIdx and currentHead: this is also related to the task that I linked.
It is OK for the requested msgIdx to be bigger than the current head on the Execution side.
ConsensusExecutionSyncer can trigger that.
The issue is that the client does not handle it properly today.
It should log at debug level, not at error level as it does today.
The "real test" is comparing the block hashes from the sequencer and the follower node.
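
The client-side handling described here could look something like the sketch below. It is written in TypeScript purely for illustration (the Nitro client is Go), and `levelForResultError` is a hypothetical helper, not an existing Nitro function. It assumes that a "result not found" error for an index ahead of the execution head only means that execution has not caught up yet.

```typescript
type LogLevel = "debug" | "error";

// Decide how loudly to report a failed result lookup.
// Assumption: "result not found" for an index ahead of the execution head
// is a benign race, so it is reported at debug level; anything else is an error.
function levelForResultError(
  err: string,
  requestedIdx: number,
  executionHead: number
): LogLevel {
  const executionBehind = requestedIdx > executionHead;
  return err === "result not found" && executionBehind ? "debug" : "error";
}

// First row of the table earlier in this thread: msgIdx 137, head 128.
const levelWhenBehind = levelForResultError("result not found", 137, 128);
```

With the values from the earlier table, every logged request falls into the "execution behind" case, which matches the explanation that those errors are expected.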

Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
Signed-off-by: Igor Braga <5835477+bragaigor@users.noreply.github.com>
@bragaigor bragaigor requested a review from diegoximenes March 23, 2026 22:20
@bragaigor bragaigor assigned diegoximenes and unassigned bragaigor Mar 23, 2026
@bragaigor
Author

The "real test" is comparing the block hashes from the sequencer and the follower node.

I confirmed that both regular-follower-node and execution-only-follower-node are deriving the same block hashes as the sequencer
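
A check of this kind can be sketched as below. This is not the procedure actually used in the PR. It assumes the hashes were already collected from each node (for example with the standard `eth_getBlockByNumber` JSON-RPC call); `firstMismatch` and the sample hashes are made up for illustration.

```typescript
// Block number -> block hash, as collected from one node.
type HashesByBlock = Record<number, string>;

// Return the lowest block number present on both nodes whose hashes differ,
// or null if every shared block matches. Hex case is ignored.
function firstMismatch(
  sequencer: HashesByBlock,
  follower: HashesByBlock
): number | null {
  const shared = Object.keys(sequencer)
    .map(Number)
    .filter((n) => n in follower)
    .sort((a, b) => a - b);
  for (const n of shared) {
    if (sequencer[n].toLowerCase() !== follower[n].toLowerCase()) {
      return n;
    }
  }
  return null;
}

// Made-up hashes, for illustration only. Block 3 is skipped because the
// follower has not produced it yet.
const sequencerHashes: HashesByBlock = { 1: "0xaa", 2: "0xbb", 3: "0xcc" };
const followerHashes: HashesByBlock = { 1: "0xAA", 2: "0xbb" };
const mismatch = firstMismatch(sequencerHashes, followerHashes);
```

Only blocks present on both sides are compared, so a follower that is simply behind the sequencer is not reported as diverging.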
