
chore: store pending block separately#3073

Draft
alpe wants to merge 2 commits into main from alex/pending_block_fix

Conversation

@alpe
Contributor

@alpe alpe commented Feb 13, 2026

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors how pending blocks are handled within the system. It transitions from storing unconfirmed blocks as regular block data with empty signatures to using a dedicated metadata store. This change enhances the robustness of block production, particularly during node restarts, and includes a crucial migration path to ensure compatibility with existing data formats, thereby preventing potential issues like double-signing after a system upgrade. Additionally, module replacement directives were enabled in application-level go.mod files.
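
A minimal sketch of the idea, assuming a metadata store exposing Get/SetMetadata and hypothetical key names (not the PR's actual identifiers):

// Sketch only: pending blocks kept under dedicated metadata keys instead of
// the regular block store. Key names and the store interface are assumptions.
package pendingsketch

import (
    "context"
    "fmt"
)

// metadataStore is an assumed minimal view of the node's metadata store.
type metadataStore interface {
    GetMetadata(ctx context.Context, key string) ([]byte, error)
    SetMetadata(ctx context.Context, key string, value []byte) error
}

const (
    pendingHeaderKey = "pending/header" // hypothetical key names
    pendingDataKey   = "pending/data"
)

// savePendingBlock stores an unsigned block outside the canonical block store,
// so a restart can resume it without it ever looking like a committed block.
func savePendingBlock(ctx context.Context, s metadataStore, header, data []byte) error {
    if err := s.SetMetadata(ctx, pendingHeaderKey, header); err != nil {
        return fmt.Errorf("save pending header: %w", err)
    }
    if err := s.SetMetadata(ctx, pendingDataKey, data); err != nil {
        return fmt.Errorf("save pending data: %w", err)
    }
    return nil
}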

Highlights

  • Pending Block Storage Refactor: Introduced a dedicated mechanism for storing pending blocks in the metadata store, separating them from finalized block data.
  • Legacy Pending Block Migration: Implemented a migration strategy to convert old-style pending blocks (stored at height N+1 with empty signatures) to the new metadata format upon node initialization, preventing double-signing issues during upgrades.
  • Block Production Logic Update: Refactored the block production process to utilize the new getPendingBlock, savePendingBlock, and deletePendingBlock functions for managing unconfirmed blocks.
  • Module Replacement Directives: Uncommented replace directives in go.mod files for apps/evm and apps/testapp, likely for local development or testing setups.


Changelog
  • apps/evm/go.mod
    • Uncommented replace directives to enable local module replacements.
  • apps/evm/go.sum
    • Removed outdated module hashes for github.com/evstack/ev-node and github.com/evstack/ev-node/execution/evm.
  • apps/testapp/go.mod
    • Uncommented a replace directive to enable local module replacement.
  • apps/testapp/go.sum
    • Removed outdated module hashes for github.com/evstack/ev-node.
  • block/internal/executing/executor.go
    • Added a call to migrateLegacyPendingBlock during state initialization.
    • Updated pending block retrieval to use e.getPendingBlock instead of e.store.GetBlockData.
    • Refactored early block saving into a new e.savePendingBlock function.
    • Added a call to e.deletePendingBlock before committing the batch.
  • block/internal/executing/executor_restart_test.go
    • Updated context.Background() calls to t.Context() for better test isolation.
    • Modified the test to use exec1.savePendingBlock for simulating pending block storage.
  • block/internal/executing/pending.go
    • Added a new file to encapsulate pending block management logic.
    • Implemented getPendingBlock to retrieve pending blocks from metadata.
    • Implemented savePendingBlock to store blocks as pending in metadata.
    • Implemented deletePendingBlock to remove pending block metadata.
    • Implemented migrateLegacyPendingBlock to handle migration of old-style pending blocks.
  • pkg/store/header_store_adapter_test.go
    • Updated context.Background() calls to t.Context() for better test isolation.
  • pkg/store/keys.go
    • Made getHeaderKey, getDataKey, getSignatureKey, getMetaKey, and getIndexKey functions public (Get*Key) and added internal aliases.
  • pkg/store/store.go
    • Updated internal calls to getMetaKey to use the newly public GetMetaKey function.

@claude
Contributor

claude bot commented Feb 13, 2026

Claude finished @alpe's task in 2m 29s


Code Review Complete

I've performed a comprehensive review of this pending block storage refactor. The PR introduces a significant architectural improvement by separating pending blocks from finalized block storage, along with critical migration logic to prevent double-signing issues during upgrades.

Summary

Overall Assessment: The refactor is well-architected and addresses an important reliability issue. The migration logic is crucial for preventing double-signing after upgrades. However, there are several issues that need to be addressed before merging.


Critical Issues

1. Migration Safety: Signature Check is Too Strict 🔴

Location: block/internal/executing/pending.go:109-111

if len(pendingHeader.Signature) != 0 {
    return errors.New("pending block with signatures found")
}

Issue: This check returns an error if a legacy pending block somehow carries a non-empty signature, which halts node initialization and leaves no recovery path.

Risk: If a node has a partially committed block with a signature (perhaps from an interrupted commit or edge case), the node cannot start.

Recommendation: Either:

  1. Log a warning and skip migration if signature is present (assuming it's a valid committed block), or
  2. Provide more context in the error message about how to recover, or
  3. Make this a warning and proceed with migration anyway (overwriting the old data)

Suggested fix:

if len(pendingHeader.Signature) != 0 {
    e.logger.Warn().Uint64("height", candidateHeight).Msg("found block at N+1 with signature - likely already committed, skipping migration")
    return nil
}

2. Race Condition: Context Used for Delete Operation 🔴

Location: block/internal/executing/executor.go:535

if err := e.deletePendingBlock(e.ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}

Issue: The delete operation uses e.ctx (the executor's lifetime context) instead of the ctx parameter (the block production context). This is inconsistent and could lead to subtle bugs if the block production context has specific cancellation semantics.

Recommendation: Use the same context as other batch operations:

if err := e.deletePendingBlock(ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}

High Priority Issues

3. Inconsistent Error Handling in getPendingBlock 🟡

Location: block/internal/executing/pending.go:21-48

Issue: The function returns (nil, nil, nil) both when the header is not found and when the data is not found, so the two situations are indistinguishable to the caller.

headerBytes, err := e.store.GetMetadata(ctx, headerKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil  // <- Returns nil error
    }
    return nil, nil, err
}

dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil  // <- Also returns nil error
    }
    return nil, nil, err
}

In particular, if the header exists but the data does not, the caller gets (nil, nil, nil), which is indistinguishable from "no pending block". This inconsistent state should be treated as an error.

Recommendation:

dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, fmt.Errorf("pending header exists but data is missing: corrupt state")
    }
    return nil, nil, err
}

4. Missing Atomic Check-and-Delete for Pending Block 🟡

Location: block/internal/executing/pending.go:82-91

Issue: The deletePendingBlock function doesn't verify that the pending block being deleted matches the block being committed. If there's a race or corruption, you might delete the wrong pending block data.

Current code:

func (e *Executor) deletePendingBlock(ctx context.Context, batch store.Batch) error {
    if err := batch.Delete(ds.NewKey(store.GetMetaKey(headerKey))); err != nil {
        return fmt.Errorf("delete pending header: %w", err)
    }
    // ... deletes without verification
}

Recommendation: Add verification by checking the height or hash of the pending block before deletion to ensure you're deleting the correct data.
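
One possible shape for such a verified delete, reusing the PR's getPendingBlock and deletePendingBlock helpers (the exact signatures and the Height() accessor are assumptions):

// Hypothetical sketch: verify the pending block matches the committed height
// before deleting it. Helper names mirror the PR; signatures are assumed.
func (e *Executor) deletePendingBlockIfMatches(ctx context.Context, batch store.Batch, committedHeight uint64) error {
    pendingHeader, _, err := e.getPendingBlock(ctx)
    if err != nil {
        return fmt.Errorf("load pending block for verification: %w", err)
    }
    if pendingHeader == nil {
        return nil // nothing pending, nothing to delete
    }
    if pendingHeader.Height() != committedHeight {
        return fmt.Errorf("pending block height %d does not match committed height %d",
            pendingHeader.Height(), committedHeight)
    }
    return e.deletePendingBlock(ctx, batch)
}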


5. Store Key Functions Made Public Without Documentation 🟡

Location: pkg/store/keys.go:37-67

Issue: Multiple internal key generation functions were made public (GetHeaderKey, GetDataKey, etc.) but lack documentation explaining:

  • Why they're public
  • Who should use them
  • What the stability guarantees are

Impact: This expands the public API surface without clear contracts, making future refactoring harder.

Recommendation: Add godoc comments:

// GetHeaderKey returns the datastore key for a block header at the given height.
// This is exported for use by block management code that needs to construct
// keys for batch operations. The key format is considered stable.
func GetHeaderKey(height uint64) string {

Medium Priority Issues

6. Test Uses t.Context() in One Place, context.Background() in Others 🟡

Location: block/internal/executing/executor_restart_test.go & pkg/store/header_store_adapter_test.go

Issue: Inconsistent context usage in tests. Most tests were updated to use t.Context() but this is only partially applied.

In executor_restart_test.go: Uses t.Context() throughout (good!)
In header_store_adapter_test.go: Only line 602 uses t.Context(), rest use context.Background()

Recommendation: For consistency and better test lifecycle management, use t.Context() everywhere in tests. However, this is not critical since tests are isolated.
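
For illustration, a minimal test using t.Context() (available since Go 1.24); the test name here is hypothetical:

package store_test

import "testing"

// t.Context() is cancelled automatically when the test finishes, replacing
// ad-hoc context.Background() usage.
func TestHeaderStoreAdapter_Example(t *testing.T) {
    ctx := t.Context()
    _ = ctx // pass ctx to the store/adapter calls under test
}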


7. Migration Deletes Index Key Without Checking if it Exists 🟡

Location: block/internal/executing/pending.go:129-138

Issue: The migration unconditionally deletes the index key, but datastore Delete operations may error if the key doesn't exist (depending on the datastore implementation).

Current code:

for _, key := range []string{
    store.GetHeaderKey(candidateHeight),
    store.GetDataKey(candidateHeight),
    store.GetSignatureKey(candidateHeight),
    store.GetIndexKey(headerHash[:]),
} {
    if err := batch.Delete(ds.NewKey(key)); err != nil {
        return fmt.Errorf("delete legacy key %s: %w", key, err)
    }
}

Recommendation: Depending on datastore behavior, you may want to ignore ErrNotFound during migration cleanup:

if err := batch.Delete(ds.NewKey(key)); err != nil && !errors.Is(err, ds.ErrNotFound) {
    return fmt.Errorf("delete legacy key %s: %w", key, err)
}

8. go.mod Changes: Replace Directives Uncommented 🟡

Location: apps/evm/go.mod, apps/testapp/go.mod

Issue: The PR uncomments replace directives for local development. This is fine for development but:

  1. The PR description doesn't explain why
  2. These should typically not be committed unless intentional
  3. go.sum entries were removed, which could cause CI issues

Questions:

  • Are these replace directives meant to be permanent?
  • Is this for local testing only and should be reverted before merge?
  • Were the go.sum changes intentional?

Recommendation: Add a comment in the PR description explaining the go.mod changes.


Code Quality Observations

9. Good: Comprehensive Test Coverage

The test TestExecutor_RestartUsesPendingBlock is well-structured and thoroughly tests the restart scenario with pending blocks. It properly simulates:

  • Block production
  • Crash with pending block
  • Restart and recovery
  • Verification of correct behavior

10. Good: Clear Separation of Concerns

The new pending.go file cleanly encapsulates pending block management logic, making it easy to understand and maintain.

11. Good: Migration is Run Once at Startup

The migration logic runs during initializeState(), ensuring it happens exactly once per node startup, which is the correct approach.

12. Good: Logging for Migration

The migration logs when it migrates a block, which is helpful for operators to understand what happened.


Performance & Concurrency

13. Batch Operations are Properly Used

Both savePendingBlock and the migration use batches for atomic multi-key operations, which is correct.
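
For reference, the atomic pattern in question looks roughly like this; the batch interface shown is an assumption, not necessarily ev-node's store.Batch:

// Illustrative only: queue multiple writes and make them visible in one commit.
type kvBatch interface {
    Put(key string, value []byte) error
    Delete(key string) error
    Commit() error
}

func writePendingAtomically(b kvBatch, headerKey, dataKey string, header, data []byte) error {
    if err := b.Put(headerKey, header); err != nil {
        return err
    }
    if err := b.Put(dataKey, data); err != nil {
        return err
    }
    // Neither key becomes visible to readers unless Commit succeeds.
    return b.Commit()
}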

14. No Lock Contention Issues Detected

The pending block operations don't introduce new concurrency issues - they follow the same patterns as existing block production code.


Security Considerations

15. Double-Signing Prevention is the Core Goal

The migration correctly addresses the double-signing risk. However, see issue #1 about the signature check being too strict.

16. No Secrets Exposure

No sensitive data is logged or exposed inappropriately.


Documentation & Developer Experience

17. Missing PR Description 🟡

The PR has no description, making it hard for reviewers to understand:

  • The problem being solved
  • Why the refactor was needed
  • What the migration path is
  • Testing performed

Recommendation: Add a description covering:

  • Problem: Pending blocks stored as regular blocks can cause double-signing after restart
  • Solution: Store pending blocks in metadata with migration
  • Testing: Unit tests + manual testing
  • Breaking changes: None (migration handles upgrades)

Summary of Recommendations

Must Fix Before Merge (Critical):

  1. ✅ Fix signature check in migration to handle edge cases gracefully
  2. ✅ Fix context inconsistency in deletePendingBlock call
  3. ✅ Add inconsistent state detection in getPendingBlock

Should Fix (High Priority):

  4. Add godoc to newly public store functions
  5. Clarify go.mod replace directive changes in PR description

Consider (Medium Priority):

  6. Handle ErrNotFound gracefully in migration cleanup
  7. Add PR description


Final Verdict

This is a well-thought-out refactor that addresses a real reliability issue (double-signing on restart). The architecture is sound, and the migration logic is the right approach. However, the critical issues (#1, #2, #3) should be addressed before merging to ensure production robustness.

The changes demonstrate good understanding of the codebase and follow Go idioms well. With the recommended fixes, this will be a solid improvement to the system.


Tasks

  • Analyze PR context and scope
  • Review core pending block management logic
  • Review migration logic for legacy pending blocks
  • Review block production changes in executor.go
  • Review store package changes
  • Review test updates
  • Review go.mod changes
  • Check for security, performance, and concurrency issues
  • Compile final review feedback

@github-actions
Contributor

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed | Format: ⏩ skipped | Lint: ✅ passed | Breaking: ✅ passed | Updated (UTC): Feb 13, 2026, 2:44 PM

@codecov

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 37.50000% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.73%. Comparing base (6a7c84c) to head (522cdfd).

Files with missing lines               Patch %   Lines
block/internal/executing/pending.go    28.76%    40 Missing and 12 partials ⚠️
block/internal/executing/executor.go   25.00%    2 Missing and 4 partials ⚠️
pkg/store/store.go                     66.66%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3073      +/-   ##
==========================================
- Coverage   61.89%   61.73%   -0.16%     
==========================================
  Files         111      112       +1     
  Lines       11113    11189      +76     
==========================================
+ Hits         6878     6908      +30     
- Misses       3496     3530      +34     
- Partials      739      751      +12     
Flag       Coverage Δ
combined   61.73% <37.50%> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.


e.logger.Info().Uint64("height", state.LastBlockHeight).
Str("chain_id", state.ChainID).Msg("initialized state")

// Migrate any old-style pending block (stored at height N+1 via SaveBlockData
Member

Can we link #2795 as a TODO?

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the storage of pending blocks by moving them from the main block storage to a separate metadata area. This is a good architectural improvement that isolates uncommitted data. The changes include a new pending.go file with helpers for managing pending blocks, a migration path for old-style pending blocks, and updates to the block production logic to use the new mechanism.

My review includes a couple of suggestions to improve robustness and observability. One suggestion is to log a warning when an inconsistent pending block state is detected (e.g., a header without data). Another is to handle failures in deleting pending blocks more strictly by aborting the block commit, which would prevent leaving stale data in the store.

Comment on lines +535 to +537
if err := e.deletePendingBlock(e.ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}

Severity: medium

If deletePendingBlock fails, the current implementation logs a warning and proceeds to commit the batch. This leaves a stale pending block in the metadata store. While this might not cause issues in the next block production cycle (due to height checks), it indicates an underlying problem with the batch or store and pollutes the storage. It would be safer to treat this as a critical error and abort the commit by returning an error. This ensures that the state remains consistent.

Suggested change

-if err := e.deletePendingBlock(e.ctx, batch); err != nil {
-    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
-}
+if err := e.deletePendingBlock(e.ctx, batch); err != nil {
+    return fmt.Errorf("failed to delete pending block metadata for height %d: %w", newHeight, err)
+}

Comment on lines +30 to +36
dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil
    }
    return nil, nil, err
}

Severity: medium

The current logic correctly handles the case where pending data is not found after a pending header has been found, but it does so silently. If a pending header exists without corresponding data, it indicates an inconsistent state that should be logged as a warning. This would help in debugging potential issues where pending blocks are not saved correctly.

Suggested change

-dataBytes, err := e.store.GetMetadata(ctx, dataKey)
-if err != nil {
-    if errors.Is(err, ds.ErrNotFound) {
-        return nil, nil, nil
-    }
-    return nil, nil, err
-}
+dataBytes, err := e.store.GetMetadata(ctx, dataKey)
+if err != nil {
+    if errors.Is(err, ds.ErrNotFound) {
+        e.logger.Warn().Msg("found pending header without pending data, ignoring")
+        return nil, nil, nil
+    }
+    return nil, nil, err
+}
