maintainer,scheduler: add dispatcher drain runtime #4760
hongyunyan wants to merge 3 commits into split/pr-4190-2a-drain-protocol
Code Review
This pull request introduces a drainScheduler to handle node evacuation by moving dispatchers away from a target node. It integrates this into the existing scheduling framework, ensuring that basic and balance schedulers avoid the drain target to prevent churn. It also adds configuration for BalanceMoveBatchSize and enhances status reporting to include drain progress. Feedback suggests using the operator controller's CountInflightDrainMovesFromNode method in the drainScheduler to eliminate redundant logic.
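The drain-aware behavior summarized above can be sketched as a candidate filter: when the basic or balance scheduler picks destinations, it skips the node being drained so that dispatchers are not moved back onto it. This is a minimal sketch, not the PR's code; `NodeID` and `eligibleNodes` are hypothetical names, and an empty drain target is assumed to mean "no drain in progress".

```go
package main

// NodeID stands in for the repository's node.ID type (assumption).
type NodeID string

// eligibleNodes returns the alive nodes that may receive dispatchers.
// The drain target, if any, is excluded so schedulers do not place new
// work on a node that is being evacuated. An empty drainTarget is
// treated as "no drain in progress" (assumption for this sketch).
func eligibleNodes(alive []NodeID, drainTarget NodeID) []NodeID {
	out := make([]NodeID, 0, len(alive))
	for _, id := range alive {
		if drainTarget != "" && id == drainTarget {
			continue
		}
		out = append(out, id)
	}
	return out
}
```

With nodes `a`, `b`, `c` and drain target `b`, the filter leaves `a` and `c` as destinations; with no drain target, all alive nodes remain eligible.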
maintainer/scheduler/drain.go
Outdated
    if availableSize <= 0 {
        return time.Now().Add(time.Millisecond * 200)
    }
    drainSlots := s.fixedDrainMoveLimit - s.countInflightDrainMoves(target)
The drainScheduler implements its own countInflightDrainMoves method, which is identical to the newly added oc.CountInflightDrainMovesFromNode(target). Using the controller's method is more efficient because it avoids creating a temporary slice of all operators and iterates over the internal map directly under a read lock.
Suggested change:

    - drainSlots := s.fixedDrainMoveLimit - s.countInflightDrainMoves(target)
    + drainSlots := s.fixedDrainMoveLimit - s.operatorController.CountInflightDrainMovesFromNode(target)
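To illustrate why counting inside the controller can be cheaper, here is a minimal sketch of a controller that walks its own operator map under a read lock instead of returning a copied slice. The types `controller` and `moveOperator`, the map key, and the zero clamp in `drainSlots` are all assumptions made for this example; the real `CountInflightDrainMovesFromNode` may differ.

```go
package main

import "sync"

// NodeID stands in for the repository's node.ID type (assumption).
type NodeID string

// moveOperator is a hypothetical, simplified move operator.
type moveOperator struct {
	origin   NodeID
	finished bool
}

// controller is a hypothetical operator controller that owns the
// operator map and guards it with a read-write mutex.
type controller struct {
	mu        sync.RWMutex
	operators map[string]*moveOperator
}

// countInflightMovesFrom counts unfinished moves whose origin is target.
// It iterates the internal map under a read lock, so no temporary slice
// of all operators is allocated.
func (c *controller) countInflightMovesFrom(target NodeID) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	count := 0
	for _, op := range c.operators {
		if op.finished || op.origin != target {
			continue
		}
		count++
	}
	return count
}

// drainSlots returns how many new drain moves may start this round.
// Clamping at zero is an assumption of this sketch.
func drainSlots(limit, inflight int) int {
	if slots := limit - inflight; slots > 0 {
		return slots
	}
	return 0
}
```

A scheduler would then compute its budget as `drainSlots(limit, c.countInflightMovesFrom(target))` and start at most that many new moves away from the target.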
maintainer/scheduler/drain.go
Outdated
    func (s *drainScheduler) countInflightDrainMoves(target node.ID) int {
        count := 0
        for _, op := range s.operatorController.GetAllOperators() {
            moveOp, ok := op.(*operator.MoveDispatcherOperator)
            if !ok {
                continue
            }
            if moveOp.IsFinished() {
                continue
            }
            if moveOp.OriginNode() != target {
                continue
            }
            count++
        }
        return count
    }
What problem does this PR solve?
The dispatcher-drain work split out of #4190 still mixes maintainer runtime behavior, coordinator scheduling, and public API orchestration in one review unit. This PR extracts the maintainer-side dispatcher drain runtime so reviewers can focus on target consumption, progress reporting, and drain-aware scheduling inside one changefeed after the protocol layer from #4759.
Issue Number: ref #4190
What is changed and how it works?
Background:
Motivation:
Summary:
- `DrainState` and wire it into controller state
- `DrainProgress` reporting to maintainer status
- `balance-move-batch-size` to scheduler config
- `config.NewDefaultSchedulerConfig()` so new scheduler defaults do not silently break them

How it works:
Check List
Tests
- `go test ./maintainer/...`
- `go test ./pkg/scheduler ./pkg/config`

Questions
Will it cause performance regression or break compatibility?
This PR does not expose a new public API by itself. It only adds maintainer-local dispatcher drain runtime and bounded balancing behavior on top of the internal drain target plumbing from #4759.
Do you need to update user documentation, design documentation or monitoring documentation?
No additional user-facing documentation is needed for this split. It is an internal decomposition of the drain-capture implementation.
Release note