Conversation
**Summary of Changes**

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request refactors how pending blocks are handled. It moves unconfirmed blocks out of regular block storage, where they were saved with empty signatures, and into a dedicated metadata store. The change makes block production more robust across node restarts and includes a migration path for the existing data format, preventing issues such as double-signing after an upgrade. Additionally, module replacement directives were enabled in the application-level go.mod files.
Claude finished @alpe's task in 2m 29s

**Code Review Complete**

I've performed a comprehensive review of this pending block storage refactor. The PR introduces a significant architectural improvement by separating pending blocks from finalized block storage, along with critical migration logic to prevent double-signing issues during upgrades.

### Summary

**Overall Assessment:** The refactor is well-architected and addresses an important reliability issue. The migration logic is crucial for preventing double-signing after upgrades. However, there are several issues that need to be addressed before merging.

### Critical Issues

#### 1. Migration Safety: Signature Check is Too Strict 🔴

```go
if len(pendingHeader.Signature) != 0 {
    return errors.New("pending block with signatures found")
}
```

**Issue:** This error check will fail if a legacy pending block somehow has a non-empty signature, halting node initialization. This creates a potential startup failure scenario with no recovery path.

**Risk:** If a node has a partially committed block with a signature (perhaps from an interrupted commit or edge case), the node cannot start.

**Recommendation:** Treat a signed block at N+1 as already committed and skip the migration instead of failing startup.

**Suggested fix:**

```go
if len(pendingHeader.Signature) != 0 {
    e.logger.Warn().Uint64("height", candidateHeight).Msg("found block at N+1 with signature - likely already committed, skipping migration")
    return nil
}
```

#### 2. Race Condition: Context Used for Delete Operation 🔴

```go
if err := e.deletePendingBlock(e.ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}
```

**Issue:** The delete operation uses the executor's long-lived `e.ctx` rather than the `ctx` passed into the surrounding function, so it can observe a different cancellation state than the rest of the batch.

**Recommendation:** Use the same context as the other batch operations, i.e. `e.deletePendingBlock(ctx, batch)` instead of `e.deletePendingBlock(e.ctx, batch)`.

### High Priority Issues

#### 3. Inconsistent Error Handling in getPendingBlock 🟡

**Issue:** The function returns `nil, nil, nil` whenever either the header or the data lookup misses:
```go
headerBytes, err := e.store.GetMetadata(ctx, headerKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil // <- Returns nil error
    }
    return nil, nil, err
}

dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil // <- Also returns nil error
    }
    return nil, nil, err
}
```

**Issue:** What if the header exists but the data doesn't? You get `nil, nil, nil`, and a half-written pending block is silently ignored rather than surfaced as corrupt state.

**Recommendation:**
```go
dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, fmt.Errorf("pending header exists but data is missing: corrupt state")
    }
    return nil, nil, err
}
```

#### 4. Missing Atomic Check-and-Delete for Pending Block 🟡

**Issue:** The `deletePendingBlock` helper removes the pending metadata keys without first verifying that they belong to the block being committed.

**Current code:**
```go
func (e *Executor) deletePendingBlock(ctx context.Context, batch store.Batch) error {
    if err := batch.Delete(ds.NewKey(store.GetMetaKey(headerKey))); err != nil {
        return fmt.Errorf("delete pending header: %w", err)
    }
    // ... deletes without verification
}
```

**Recommendation:** Add verification by checking the height or hash of the pending block before deletion to ensure you're deleting the correct data.
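One possible shape for that guard, as a rough sketch only: the helper name `deletePendingBlockAt`, the `header.Height()` accessor, and the exact return values of `getPendingBlock` are assumptions based on the snippets quoted in this review, not the PR's actual API.

```go
// Sketch of a guarded delete: load the pending header that is about to be
// removed and confirm it is the block that was just committed.
func (e *Executor) deletePendingBlockAt(ctx context.Context, batch store.Batch, expectedHeight uint64) error {
    header, _, err := e.getPendingBlock(ctx)
    if err != nil {
        return fmt.Errorf("load pending block before delete: %w", err)
    }
    if header == nil {
        // Nothing pending; nothing to delete.
        return nil
    }
    if header.Height() != expectedHeight {
        return fmt.Errorf("pending block height %d does not match committed height %d; refusing to delete",
            header.Height(), expectedHeight)
    }
    return e.deletePendingBlock(ctx, batch)
}
```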
#### 5. Store Key Functions Made Public Without Documentation 🟡

**Issue:** Multiple internal key generation functions were made public (e.g. `GetHeaderKey`, `GetDataKey`, `GetSignatureKey`, `GetIndexKey`, `GetMetaKey`).

**Impact:** This expands the public API surface without clear contracts, making future refactoring harder.

**Recommendation:** Add godoc comments:

```go
// GetHeaderKey returns the datastore key for a block header at the given height.
// This is exported for use by block management code that needs to construct
// keys for batch operations. The key format is considered stable.
func GetHeaderKey(height uint64) string {
    // ...
}
```

### Medium Priority Issues

#### 6. Test Uses t.Context() in One Place, context.Background() in Others 🟡

**Issue:** Context usage in the tests is inconsistent: `t.Context()` is used in one place and `context.Background()` in others, including `executor_restart_test.go`.

**Recommendation:** For consistency and better test lifecycle management, use `t.Context()` throughout.
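As an illustration, here is a minimal, self-contained sketch of the `t.Context()` pattern (requires Go 1.24 or newer; the package and test names are made up for the example, not taken from this PR):

```go
package block_test

import "testing"

// t.Context() is canceled automatically as the test finishes (just before
// registered Cleanup functions run), so there is no context.Background()
// plus manual cancel to keep in sync across test cases.
func TestUsesTestContext(t *testing.T) {
    ctx := t.Context()

    select {
    case <-ctx.Done():
        t.Fatal("context should still be live while the test body runs")
    default:
        // ctx remains valid for the duration of the test.
    }
}
```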
#### 7. Migration Deletes Index Key Without Checking if it Exists 🟡

**Issue:** The migration unconditionally deletes the index key, but datastore Delete operations may error if the key doesn't exist (depending on the datastore implementation).

**Current code:**

```go
for _, key := range []string{
    store.GetHeaderKey(candidateHeight),
    store.GetDataKey(candidateHeight),
    store.GetSignatureKey(candidateHeight),
    store.GetIndexKey(headerHash[:]),
} {
    if err := batch.Delete(ds.NewKey(key)); err != nil {
        return fmt.Errorf("delete legacy key %s: %w", key, err)
    }
}
```

**Recommendation:** Depending on datastore behavior, you may want to ignore ErrNotFound during migration cleanup:

```go
if err := batch.Delete(ds.NewKey(key)); err != nil && !errors.Is(err, ds.ErrNotFound) {
    return fmt.Errorf("delete legacy key %s: %w", key, err)
}
```

#### 8. go.mod Changes: Replace Directives Uncommented 🟡

**Issue:** The PR uncomments `replace` directives in the application-level go.mod files, which changes where those module dependencies are resolved from.
**Questions:** Are these replace directives meant to ship with the merge, or are they local development leftovers?

**Recommendation:** Add a comment in the PR description explaining the go.mod changes.
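For context, a `replace` directive in a go.mod file looks like the sketch below; the module paths are placeholders for illustration, not the ones actually uncommented in this PR.

```
// go.mod (illustrative only; module paths are placeholders)
module example.com/app

go 1.24

require example.com/upstream/node v1.2.3

// Redirects the dependency to a local checkout. Useful during development,
// but if it ships in the merged go.mod it changes what every build of this
// repo compiles against.
replace example.com/upstream/node => ../node
```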
### Code Quality Observations

#### 9. Good: Comprehensive Test Coverage ✅

The added tests, including `executor_restart_test.go`, give good coverage of the restart and migration behavior.

#### 10. Good: Clear Separation of Concerns ✅

The new `pending.go` file cleanly separates pending block handling from finalized block storage.

#### 11. Good: Migration is Run Once at Startup ✅

The migration logic runs during node initialization, so legacy pending blocks are handled once rather than on every block.

#### 12. Good: Logging for Migration ✅

The migration logs when it migrates a block, which is helpful for operators to understand what happened.

### Performance & Concurrency

#### 13. Batch Operations are Properly Used ✅

Both the migration cleanup and the pending block deletion go through batch operations, so the writes land atomically with the surrounding commit.

#### 14. No Lock Contention Issues Detected ✅

The pending block operations don't introduce new concurrency issues; they follow the same patterns as existing block production code.

### Security Considerations

#### 15. Double-Signing Prevention is the Core Goal ✅

The migration correctly addresses the double-signing risk. However, see issue #1 about the signature check being too strict.

#### 16. No Secrets Exposure ✅

No sensitive data is logged or exposed inappropriately.

### Documentation & Developer Experience

#### 17. Missing PR Description 🟡

The PR has no description, making it hard for reviewers to understand the motivation, scope, and upgrade implications of the change.
**Recommendation:** Add a description covering the motivation for the refactor, how the migration behaves on upgrade, and the reasoning behind the go.mod changes.
### Summary of Recommendations

**Must Fix Before Merge (Critical):** issues #1 and #2.

**Should Fix (High Priority):** issues #3-#5.

**Consider (Medium Priority):** issues #6-#8.

### Final Verdict

This is a well-thought-out refactor that addresses a real reliability issue (double-signing on restart). The architecture is sound, and the migration logic is the right approach. However, the most serious issues (#1, #2, #3) should be addressed before merging to ensure production robustness. The changes demonstrate a good understanding of the codebase and follow Go idioms well. With the recommended fixes, this will be a solid improvement to the system.
**Codecov Report**

❌ Patch coverage on this pull request is below 100%.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #3073      +/-   ##
==========================================
- Coverage   61.89%   61.73%   -0.16%
==========================================
  Files         111      112       +1
  Lines       11113    11189      +76
==========================================
+ Hits         6878     6908      +30
- Misses       3496     3530      +34
- Partials      739      751      +12
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
```go
e.logger.Info().Uint64("height", state.LastBlockHeight).
    Str("chain_id", state.ChainID).Msg("initialized state")

// Migrate any old-style pending block (stored at height N+1 via SaveBlockData
```
Code Review
This pull request refactors the storage of pending blocks by moving them from the main block storage to a separate metadata area. This is a good architectural improvement that isolates uncommitted data. The changes include a new pending.go file with helpers for managing pending blocks, a migration path for old-style pending blocks, and updates to the block production logic to use the new mechanism.
My review includes a couple of suggestions to improve robustness and observability. One suggestion is to log a warning when an inconsistent pending block state is detected (e.g., a header without data). Another is to handle failures in deleting pending blocks more strictly by aborting the block commit, which would prevent leaving stale data in the store.
```go
if err := e.deletePendingBlock(e.ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}
```
If deletePendingBlock fails, the current implementation logs a warning and proceeds to commit the batch. This leaves a stale pending block in the metadata store. While this might not cause issues in the next block production cycle (due to height checks), it indicates an underlying problem with the batch or store and pollutes the storage. It would be safer to treat this as a critical error and abort the commit by returning an error. This ensures that the state remains consistent.
Suggested change:

```diff
 if err := e.deletePendingBlock(e.ctx, batch); err != nil {
-    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
+    return fmt.Errorf("failed to delete pending block metadata for height %d: %w", newHeight, err)
 }
```
```go
dataBytes, err := e.store.GetMetadata(ctx, dataKey)
if err != nil {
    if errors.Is(err, ds.ErrNotFound) {
        return nil, nil, nil
    }
    return nil, nil, err
}
```
The current logic correctly handles the case where pending data is not found after a pending header has been found, but it does so silently. If a pending header exists without corresponding data, it indicates an inconsistent state that should be logged as a warning. This would help in debugging potential issues where pending blocks are not saved correctly.
Suggested change:

```diff
 dataBytes, err := e.store.GetMetadata(ctx, dataKey)
 if err != nil {
     if errors.Is(err, ds.ErrNotFound) {
+        e.logger.Warn().Msg("found pending header without pending data, ignoring")
         return nil, nil, nil
     }
     return nil, nil, err
 }
```
No description provided.