bgpd: Replace update-delay with convergence-wait#21355
bgpd: Replace update-delay with convergence-wait#21355donaldsharp wants to merge 1 commit intoFRRouting:masterfrom
update-delay with convergence-wait#21355Conversation
The `update-delay` command implies that FRR is just waiting for some time before actually sending updates to a peer. This is not what actually happens. What actually happens is that all bestpath processing is delayed until after this timer pops as well. In an interest of clarifying behavior, especially in light of a desire to add more control to when certain actions happen in bgp. We are changing what the command name is in FRR. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
As discussed in the technical meeting. I would like to propose that we rename update-delay to convergence-wait inside of bgp. I've added a 3 year CPP_NOTICE and duplicated the json outputed as well. From my perspective we would need general agreement from a good chunk of the community before this could go forward. |
Greptile SummaryThis PR renames the BGP However, there are several concrete bugs that need to be addressed before merge:
Confidence Score: 2/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["BGP peer establishes"] --> B{"bgp_convergence_wait_configured?"}
B -- No --> Z["Normal route advertisement"]
B -- Yes --> C["bgp_convergence_wait_begin()\nPlug process_queue\nStart t_convergence_wait timer\nSet peer.convergence_wait_over = 0"]
C --> D{"t_establish_wait != t_convergence_wait?"}
D -- Yes --> E["Start t_establish_wait timer"]
D -- No --> F["Wait for EORs / peer events"]
E --> F
F --> G{"Peer EOR received or\nrestarted peer?"}
G -- Yes --> H["bgp_check_convergence_wait()\nCount received_eors + restarted_peers"]
G -- No --> I{"t_convergence_wait\nexpired?"}
I -- Yes --> J["bgp_convergence_wait_timer()\ncalls bgp_convergence_wait_end()"]
H --> K{"All peers accounted for?"}
K -- No --> F
K -- Yes --> J
J --> L["bgp_convergence_wait_end()\nUnplug process_queue\nSet convergence_wait_over=1\nResume zebra + peer updates\nUpdate timestamps"]
L --> Z
Prompt To Fix All With AIThis is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 22097-22098
Comment:
**New `convergence-wait` commands missing from `BGP_NODE`**
`bgp_convergence_wait_cmd` and `no_bgp_convergence_wait_cmd` are defined with `DEFPY` but are never installed into `BGP_NODE`. As a result, the new per-VRF `convergence-wait` command is inaccessible to users — only the hidden (legacy) `update-delay` command is registered here. The comment says `/* bgp convergence-wait command */` but the two new commands are never installed.
```suggestion
/* bgp convergence-wait command */
install_element(BGP_NODE, &bgp_convergence_wait_cmd);
install_element(BGP_NODE, &no_bgp_convergence_wait_cmd);
install_element(BGP_NODE, &bgp_update_delay_cmd);
install_element(BGP_NODE, &no_bgp_update_delay_cmd);
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 2651-2658
Comment:
**Hidden `no_bgp_update_delay` has wrong CLI string, breaking backward compatibility**
The hidden `no_bgp_update_delay_cmd` was changed to use `"no convergence-wait ..."` instead of keeping `"no update-delay ..."`. This has two problems:
1. **Backward compatibility is broken**: existing configs that saved `no update-delay` in a VRF context will now be rejected.
2. **Duplicate CLI string**: both `no_bgp_convergence_wait_cmd` and this hidden command register `"no convergence-wait ..."` for `BGP_NODE`. However, since only `no_bgp_update_delay_cmd` is installed in `BGP_NODE` (the new `no_bgp_convergence_wait_cmd` is not — see other comment), the deconfig path works only via this hidden command with the wrong name.
The correct fix is to keep the old CLI string for backward compat:
```suggestion
DEFPY_HIDDEN (no_bgp_update_delay,
no_bgp_update_delay_cmd,
"no update-delay [(0-3600) [(1-3600)]]",
NO_STR
"Force initial delay for best-path and updates\n"
"Max delay in seconds\n"
"Establish wait in seconds\n")
{
return bgp_convergence_wait_deconfig_vty(vty);
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/topotests/bgp_update_delay/test_bgp_update_delay.py
Line: 131-132
Comment:
**JSON key mismatch: test expects `convergenceDelayLimit` but code emits `convergenceWaitLimit`**
The `bgp_show_summary` function in `bgp_vty.c` emits `"convergenceWaitLimit"`, but the test (and the parallel `_bgp_check_update_delay_and_wait` and `_bgp_check_vrf_update_delay_and_wait` functions below) expect `"convergenceDelayLimit"`. These tests will fail.
Similarly, the test at lines 148–151 expects `"convergenceWaitEstablishWait"` but the code emits `"convergenceDelayEstablishWait"`.
The JSON key names are inconsistent throughout the new code — some use the `convergenceDelay*` prefix and others use `convergenceWait*`. Either the test expectations or the production code needs to be aligned to a single naming scheme.
Affected test assertions:
- `"convergenceDelayLimit"` (lines ~132, 159) — code emits `"convergenceWaitLimit"`
- `"convergenceWaitEstablishWait"` (line ~151, 164) — code emits `"convergenceDelayEstablishWait"`
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 12582
Comment:
**Duplicate JSON key with inconsistent casing: `bgpconvergenceWaitTime`**
Three JSON fields are now emitted for the same value:
- `"bgpConvergenceWaitTime"` (new canonical name)
- `"bgpUpdateDelayTime"` (old name, kept for backward compat)
- `"bgpconvergenceWaitTime"` (lowercase `c` — looks like an accidental typo)
The third entry with a lowercase `c` does not match any expected naming convention and appears to be a copy-paste error. It should be removed or, if intentional, should be documented.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "bgpd: Replace `update-delay` with `conve..." | Re-trigger Greptile |
| install_element(BGP_NODE, &bgp_update_delay_cmd); | ||
| install_element(BGP_NODE, &no_bgp_update_delay_cmd); |
There was a problem hiding this comment.
New
convergence-wait commands missing from BGP_NODE
bgp_convergence_wait_cmd and no_bgp_convergence_wait_cmd are defined with DEFPY but are never installed into BGP_NODE. As a result, the new per-VRF convergence-wait command is inaccessible to users — only the hidden (legacy) update-delay command is registered here. The comment says /* bgp convergence-wait command */ but the two new commands are never installed.
| install_element(BGP_NODE, &bgp_update_delay_cmd); | |
| install_element(BGP_NODE, &no_bgp_update_delay_cmd); | |
| /* bgp convergence-wait command */ | |
| install_element(BGP_NODE, &bgp_convergence_wait_cmd); | |
| install_element(BGP_NODE, &no_bgp_convergence_wait_cmd); | |
| install_element(BGP_NODE, &bgp_update_delay_cmd); | |
| install_element(BGP_NODE, &no_bgp_update_delay_cmd); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 22097-22098
Comment:
**New `convergence-wait` commands missing from `BGP_NODE`**
`bgp_convergence_wait_cmd` and `no_bgp_convergence_wait_cmd` are defined with `DEFPY` but are never installed into `BGP_NODE`. As a result, the new per-VRF `convergence-wait` command is inaccessible to users — only the hidden (legacy) `update-delay` command is registered here. The comment says `/* bgp convergence-wait command */` but the two new commands are never installed.
```suggestion
/* bgp convergence-wait command */
install_element(BGP_NODE, &bgp_convergence_wait_cmd);
install_element(BGP_NODE, &no_bgp_convergence_wait_cmd);
install_element(BGP_NODE, &bgp_update_delay_cmd);
install_element(BGP_NODE, &no_bgp_update_delay_cmd);
```
How can I resolve this? If you propose a fix, please make it concise.| no_bgp_update_delay_cmd, | ||
| "no update-delay [(0-3600) [(1-3600)]]", | ||
| "no convergence-wait [(0-3600) [(1-3600)]]", | ||
| NO_STR | ||
| "Force initial delay for best-path and updates\n" | ||
| "Max delay in seconds\n" | ||
| "Establish wait in seconds\n") | ||
| { | ||
| return bgp_update_delay_deconfig_vty(vty); | ||
| return bgp_convergence_wait_deconfig_vty(vty); |
There was a problem hiding this comment.
Hidden
no_bgp_update_delay has wrong CLI string, breaking backward compatibility
The hidden no_bgp_update_delay_cmd was changed to use "no convergence-wait ..." instead of keeping "no update-delay ...". This has two problems:
- Backward compatibility is broken: existing configs that saved
no update-delayin a VRF context will now be rejected. - Duplicate CLI string: both
no_bgp_convergence_wait_cmdand this hidden command register"no convergence-wait ..."forBGP_NODE. However, since onlyno_bgp_update_delay_cmdis installed inBGP_NODE(the newno_bgp_convergence_wait_cmdis not — see other comment), the deconfig path works only via this hidden command with the wrong name.
The correct fix is to keep the old CLI string for backward compat:
| no_bgp_update_delay_cmd, | |
| "no update-delay [(0-3600) [(1-3600)]]", | |
| "no convergence-wait [(0-3600) [(1-3600)]]", | |
| NO_STR | |
| "Force initial delay for best-path and updates\n" | |
| "Max delay in seconds\n" | |
| "Establish wait in seconds\n") | |
| { | |
| return bgp_update_delay_deconfig_vty(vty); | |
| return bgp_convergence_wait_deconfig_vty(vty); | |
| DEFPY_HIDDEN (no_bgp_update_delay, | |
| no_bgp_update_delay_cmd, | |
| "no update-delay [(0-3600) [(1-3600)]]", | |
| NO_STR | |
| "Force initial delay for best-path and updates\n" | |
| "Max delay in seconds\n" | |
| "Establish wait in seconds\n") | |
| { | |
| return bgp_convergence_wait_deconfig_vty(vty); | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 2651-2658
Comment:
**Hidden `no_bgp_update_delay` has wrong CLI string, breaking backward compatibility**
The hidden `no_bgp_update_delay_cmd` was changed to use `"no convergence-wait ..."` instead of keeping `"no update-delay ..."`. This has two problems:
1. **Backward compatibility is broken**: existing configs that saved `no update-delay` in a VRF context will now be rejected.
2. **Duplicate CLI string**: both `no_bgp_convergence_wait_cmd` and this hidden command register `"no convergence-wait ..."` for `BGP_NODE`. However, since only `no_bgp_update_delay_cmd` is installed in `BGP_NODE` (the new `no_bgp_convergence_wait_cmd` is not — see other comment), the deconfig path works only via this hidden command with the wrong name.
The correct fix is to keep the old CLI string for backward compat:
```suggestion
DEFPY_HIDDEN (no_bgp_update_delay,
no_bgp_update_delay_cmd,
"no update-delay [(0-3600) [(1-3600)]]",
NO_STR
"Force initial delay for best-path and updates\n"
"Max delay in seconds\n"
"Establish wait in seconds\n")
{
return bgp_convergence_wait_deconfig_vty(vty);
}
```
How can I resolve this? If you propose a fix, please make it concise.| output = json.loads(router.vtysh_cmd("show ip bgp sum json")) | ||
| expected = {"ipv4Unicast": {"updateDelayLimit": 20}} | ||
| expected = {"ipv4Unicast": {"convergenceDelayLimit": 20}} |
There was a problem hiding this comment.
JSON key mismatch: test expects
convergenceDelayLimit but code emits convergenceWaitLimit
The bgp_show_summary function in bgp_vty.c emits "convergenceWaitLimit", but the test (and the parallel _bgp_check_update_delay_and_wait and _bgp_check_vrf_update_delay_and_wait functions below) expect "convergenceDelayLimit". These tests will fail.
Similarly, the test at lines 148–151 expects "convergenceWaitEstablishWait" but the code emits "convergenceDelayEstablishWait".
The JSON key names are inconsistent throughout the new code — some use the convergenceDelay* prefix and others use convergenceWait*. Either the test expectations or the production code needs to be aligned to a single naming scheme.
Affected test assertions:
"convergenceDelayLimit"(lines ~132, 159) — code emits"convergenceWaitLimit""convergenceWaitEstablishWait"(line ~151, 164) — code emits"convergenceDelayEstablishWait"
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/bgp_update_delay/test_bgp_update_delay.py
Line: 131-132
Comment:
**JSON key mismatch: test expects `convergenceDelayLimit` but code emits `convergenceWaitLimit`**
The `bgp_show_summary` function in `bgp_vty.c` emits `"convergenceWaitLimit"`, but the test (and the parallel `_bgp_check_update_delay_and_wait` and `_bgp_check_vrf_update_delay_and_wait` functions below) expect `"convergenceDelayLimit"`. These tests will fail.
Similarly, the test at lines 148–151 expects `"convergenceWaitEstablishWait"` but the code emits `"convergenceDelayEstablishWait"`.
The JSON key names are inconsistent throughout the new code — some use the `convergenceDelay*` prefix and others use `convergenceWait*`. Either the test expectations or the production code needs to be aligned to a single naming scheme.
Affected test assertions:
- `"convergenceDelayLimit"` (lines ~132, 159) — code emits `"convergenceWaitLimit"`
- `"convergenceWaitEstablishWait"` (line ~151, 164) — code emits `"convergenceDelayEstablishWait"`
How can I resolve this? If you propose a fix, please make it concise.| @@ -12537,7 +12580,9 @@ DEFPY(show_bgp_router, | |||
| json_object_int_add(json, "bgpOutputQueueLimit", bm->outq_limit); | |||
| json_object_int_add(json, "zebraAnnounceCount", | |||
| zebra_announce_count(&bm->zebra_announce_head)); | |||
There was a problem hiding this comment.
Duplicate JSON key with inconsistent casing:
bgpconvergenceWaitTime
Three JSON fields are now emitted for the same value:
"bgpConvergenceWaitTime"(new canonical name)"bgpUpdateDelayTime"(old name, kept for backward compat)"bgpconvergenceWaitTime"(lowercasec— looks like an accidental typo)
The third entry with a lowercase c does not match any expected naming convention and appears to be a copy-paste error. It should be removed or, if intentional, should be documented.
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_vty.c
Line: 12582
Comment:
**Duplicate JSON key with inconsistent casing: `bgpconvergenceWaitTime`**
Three JSON fields are now emitted for the same value:
- `"bgpConvergenceWaitTime"` (new canonical name)
- `"bgpUpdateDelayTime"` (old name, kept for backward compat)
- `"bgpconvergenceWaitTime"` (lowercase `c` — looks like an accidental typo)
The third entry with a lowercase `c` does not match any expected naming convention and appears to be a copy-paste error. It should be removed or, if intentional, should be documented.
How can I resolve this? If you propose a fix, please make it concise.|
Is this really worth changing? The update-delay is a well-established concept (Please search for "Cisco IOS bgp update-delay"). It controls the initial best-path calculation, RIB installation and updates to peers. |
|
I think so yes, because it is confusingly named and we want to introduce more control to allow for bestpath processing but not sending route updates to the peers yet. We discussed this during the technical meeting and I was asked to give specific examples so I did. |
|
In addition to renaming, would the command be split into multiples? |
|
I do not understand your question, can you rephrase? |
The question is about whether the existing functionalities of "update-delay" would be replaced by multiple commands for "better control" as you mentioned, as renaming alone would not offer "better control". |
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
|
I'm inclined to agree with @enkechen-panw. While the cmd name is definitely misleading, it's been there forever and well understood. For the other feature, one option is to call it something like "startup advertisement-delay". |
|
I agree with Donald to rename it, because it's more "descriptive" of what it actually does. More context is here: #21430. update-delay defers FIB/best-path, which is not really what we imagine (BGP UPDATE), while #21430 introduces advertisement-delay, which is what we expect (to delay only UPDATEs). So, in general, two changes are coming: a) rename "update-delay" to "convergence-wait"; b) add a new command "advertisement-delay". |
|
Could this be addressed via documentation? It's always tricky (and difficult) to change a customer-facing configuration. |
The
update-delaycommand implies that FRR is just waiting for some time before actually sending updates to a peer. This is not what actually happens. What actually happens is that all bestpath processing is delayed until after this timer pops as well. In an interest of clarifying behavior, especially in light of a desire to add more control to when certain actions happen in bgp. We are changing what the command name is in FRR.