Eliminate forwarder stubs by reusing method precodes for call counting indirection #124664
Conversation
For methods versioned with vtable slot backpatching (virtual/interface methods), the call counting infrastructure previously allocated separate forwarder stub precodes to serve as stable indirection points between vtable slots and deletable call counting stubs. This added memory overhead and complexity for tracking, resetting, and trimming these per-method forwarder stubs.

This change reuses the method's own temporary entry point (precode) as the stable indirection during call counting. Instead of allocating a forwarder stub and backpatching vtable slots to it, vtable slots are reset to point to the method's precode, and the precode target is redirected to the call counting stub via SetTargetInterlocked. This preserves the safety property that vtable slots never point directly to deletable call counting stubs, while eliminating the separate forwarder stub allocation entirely.

Components updated:
- CallCountingManager::SetCodeEntryPoint: For backpatchable methods, replaced forwarder stub creation with ResetCodeEntryPoint (to ensure vtable slots point to the precode) followed by SetTargetInterlocked on the method's precode. Non-final-tier entry point transitions (threshold reached, pending completion) also update the precode target instead of backpatching vtable slots.
- CallCountingManager::CompleteCallCounting: For backpatchable methods, updates the precode target to native code or resets it to the prestub, rather than calling SetCodeEntryPoint/ResetCodeEntryPoint, which would backpatch vtable slots.
- CallCountingManager::StopAllCallCounting: For backpatchable methods, resets the precode target via ResetTargetInterlocked instead of calling ResetCodeEntryPoint. Removed the forwarder stub reset loop.
- CallCountingManager::DeleteAllCallCountingStubs: Removed forwarder stub lookup and removal from the per-method cleanup loop.
- CallCountingManager::TrimCollections: Removed forwarder stub hash table trimming.
- callcounting.h: Removed MethodDescForwarderStubHashTraits, MethodDescForwarderStubHash, and the m_methodDescForwarderStubHash member. Updated the header documentation to reflect the new design.
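To make the new flow concrete, here is a minimal, self-contained C++ sketch of the indirection chain. The types and functions in it (PrecodeModel, CallCountingStub, and so on) are invented stand-ins, not the actual MethodDesc/Precode APIs; the sketch only illustrates the slot -> precode -> call counting stub -> native code chain and the atomic precode retarget described above.

```cpp
#include <atomic>
#include <cstdio>

// Hypothetical model types for illustration; the real code uses MethodDesc, Precode, and native stubs.
using CodeEntryPoint = void (*)();

void Prestub()          { std::puts("prestub (counts calls / triggers jitting in the real VM)"); }
void CallCountingStub() { std::puts("call counting stub -> tier-0 code"); }
void NativeCode()       { std::puts("optimized native code"); }

// A precode modeled as a single retargetable indirection cell.
struct PrecodeModel
{
    std::atomic<CodeEntryPoint> target{&Prestub};

    // Loosely analogous in spirit to the SetTargetInterlocked call in the PR description.
    void SetTargetInterlocked(CodeEntryPoint newTarget) { target.store(newTarget, std::memory_order_release); }
    void Invoke() const { target.load(std::memory_order_acquire)(); }
};

int main()
{
    PrecodeModel precode;

    // Vtable slots are reset to point at the precode, never at the deletable stub...
    PrecodeModel *vtableSlot = &precode;

    // ...and the precode is then atomically retargeted to the call counting stub.
    precode.SetTargetInterlocked(&CallCountingStub);
    vtableSlot->Invoke();   // slot -> precode -> call counting stub

    // When counting completes, only the precode target changes; the slot is untouched.
    precode.SetTargetInterlocked(&NativeCode);
    vtableSlot->Invoke();   // slot -> precode -> native code

    // The call counting stub can now be deleted without leaving dangling pointers in slots,
    // because no vtable slot ever pointed at it directly.
    return 0;
}
```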
Tagging subscribers to this area: @agocke
Pull request overview
This PR aims to reduce memory overhead and simplify tiered call counting for methods that use vtable slot backpatching by eliminating per-method forwarder precode stubs and instead attempting to reuse the method’s existing temporary entry point (precode) as the stable indirection point.
Changes:
- Removed the forwarder-stub tracking hash table and associated traits/types from the call counting manager.
- Updated call counting setup/completion/stop paths to redirect the method’s temporary entry point (precode) to call counting stubs or native code, instead of allocating/using a separate forwarder precode.
- Simplified cleanup/trimming logic by removing forwarder-stub reset/removal and hash trimming.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/coreclr/vm/callcounting.h | Updates design documentation and removes forwarder-stub hash member/types. |
| src/coreclr/vm/callcounting.cpp | Removes forwarder-stub hash implementation and reworks call counting entrypoint manipulation to retarget the temporary entry point precode. |
```cpp
// For methods that may have entry point slots to backpatch, redirect the method's temporary entry point
// (precode) to the call counting stub. This reuses the method's own precode as the stable indirection,
// avoiding the need to allocate separate forwarder stubs.
//
// The call counting stub should not be the entry point stored directly in vtable slots:
// - Stubs should be deletable without leaving dangling pointers in vtable slots
// - On some architectures (e.g. arm64), jitted code may load the entry point into a register at a GC-safe
//   point, and the stub could be deleted before the register is used for the call
//
// Ensure vtable slots point to the temporary entry point (precode) so calls flow through
// precode → call counting stub → native code. Vtable slots may have been backpatched to native code
// during the initial publish or tiering delay.
//
// There is a benign race window between resetting vtable slots and setting the precode target: a thread
// may briefly see vtable slots pointing to the precode while the precode still points to its previous
// target (prestub or native code). This results in at most one uncounted call, which is acceptable since
// call counting is a heuristic.
methodDesc->ResetCodeEntryPoint();
Precode *precode = Precode::GetPrecodeFromEntryPoint(methodDesc->GetTemporaryEntryPoint());
precode->SetTargetInterlocked(callCountingCodeEntryPoint, FALSE);
```
For methods with vtable-slot backpatching, this retargets the method’s temporary entry point precode (from always forwarding to the prestub) to the call-counting stub. That conflicts with the existing backpatching design: prestub backpatching explicitly relies on the temporary entry point never being backpatched so that newly created/inherited vtable slots initially pointing to it will still flow through the prestub and get recorded for future backpatching (see prestub.cpp DoBackpatch comments around the “temporary entry point is never backpatched” rationale). With this change, new slots created while call counting is active can bypass the prestub and may never be recorded/backpatched, potentially leaving them stuck on whatever target the temporary entry point had at the time. Consider keeping the temporary entry point pointing at the prestub and using a separate stable indirection (e.g., the previous per-method forwarder precode) for call-counting, or otherwise preserving the prestub path needed for slot discovery/recording.
Suggested change (replacing the block quoted above):

```cpp
// For methods that may have entry point slots to backpatch, the temporary entry point (precode)
// must continue to forward to the prestub so that newly created or inherited vtable slots that
// initially point to it will still flow through the prestub and be recorded for future backpatching.
//
// Redirecting the temporary entry point itself to the call counting stub would violate the
// prestub backpatching design invariants (see prestub.cpp DoBackpatch comments about the
// temporary entry point never being backpatched) and could result in vtable slots that
// permanently bypass the prestub and are never recorded/backpatched.
//
// To preserve those invariants without introducing an additional stable indirection in this
// code path, avoid enabling call counting for such methods.
return false;
```
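A small, self-contained toy model of this concern (every name here, such as Method, Slot, and CallThroughTemporaryEntryPoint, is invented for illustration and does not correspond to CoreCLR code): slot recording only happens when a call through the temporary entry point actually reaches the prestub, so retargeting the temporary entry point removes the discovery path.

```cpp
#include <cstdio>
#include <cstring>
#include <unordered_set>

// Toy model of the concern above; all names are invented for illustration.
using Target = const char *;

struct Slot
{
    Target value = "temporary entry point";   // a freshly created/inherited vtable slot
};

struct Method
{
    std::unordered_set<Slot *> recordedSlots;       // slots known to the backpatch machinery
    Target temporaryEntryPointTarget = "prestub";   // what the temporary entry point forwards to
};

// In the existing design, a call through the temporary entry point reaches the prestub,
// which records the calling slot so later tier transitions can backpatch it.
void CallThroughTemporaryEntryPoint(Method &m, Slot &s)
{
    if (std::strcmp(m.temporaryEntryPointTarget, "prestub") == 0)
    {
        m.recordedSlots.insert(&s);        // slot discovered and recorded
        s.value = "tier-0 native code";    // prestub backpatches the slot
    }
    // Otherwise the temporary entry point has been retargeted (e.g. to the call counting stub),
    // the prestub is bypassed, and the slot is never recorded for future backpatching.
}

int main()
{
    // Case 1: invariant preserved -- a new slot flows through the prestub and is recorded.
    Method m1;
    Slot inheritedSlot;
    CallThroughTemporaryEntryPoint(m1, inheritedSlot);
    std::printf("recorded slots: %zu\n", m1.recordedSlots.size());   // 1

    // Case 2: temporary entry point retargeted while call counting is active.
    Method m2;
    m2.temporaryEntryPointTarget = "call counting stub";
    Slot newSlot;
    CallThroughTemporaryEntryPoint(m2, newSlot);
    std::printf("recorded slots: %zu\n", m2.recordedSlots.size());   // 0 -- stuck, never backpatched
    return 0;
}
```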
```cpp
if (methodDesc->MayHaveEntryPointSlotsToBackpatch())
{
    Precode *precode = Precode::GetPrecodeFromEntryPoint(methodDesc->GetTemporaryEntryPoint());
    precode->SetTargetInterlocked(codeEntryPoint, FALSE);
}
else
{
    methodDesc->SetCodeEntryPoint(codeEntryPoint);
}
```
In the Stage>=PendingCompletion fast path, backpatchable methods previously used MethodDesc::SetCodeEntryPoint(), which backpatches recorded vtable/entrypoint slots. Switching to only updating the temporary entry point precode target means any already-recorded slots that currently point directly to native code will not be updated to the new entry point, and it also further violates the invariant that the temporary entry point remains a prestub path for slot discovery/recording. If the intent is to keep vtable slots using slot-backpatching semantics outside of call-counting, this path should continue to use SetCodeEntryPoint()/ResetCodeEntryPoint rather than retargeting the temporary entry point precode.
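A minimal sketch of the difference being pointed out here, using an invented MethodModel type rather than the actual MethodDesc API: a SetCodeEntryPoint-style update backpatches every recorded slot, while retargeting only the precode leaves already-backpatched slots on their previous target.

```cpp
#include <cstdio>
#include <vector>

// Toy model contrasting the two update strategies; MethodModel and its members are illustrative only.
using Target = const char *;

struct MethodModel
{
    std::vector<Target *> recordedSlots;   // backpatchable slots previously recorded by the prestub
    Target precodeTarget = "prestub";      // target of the temporary entry point precode

    // Models SetCodeEntryPoint-style semantics: backpatch every recorded slot as well.
    void SetCodeEntryPointModel(Target t)
    {
        for (Target *slot : recordedSlots)
            *slot = t;
        precodeTarget = t;
    }

    // Models retargeting only the temporary entry point precode.
    void RetargetPrecodeOnly(Target t) { precodeTarget = t; }
};

int main()
{
    MethodModel m;
    Target slot = "old tier-0 native code";   // a recorded slot that was already backpatched earlier
    m.recordedSlots.push_back(&slot);

    // Retargeting only the precode leaves the recorded slot on its stale target.
    m.RetargetPrecodeOnly("new entry point");
    std::printf("slot after precode-only retarget: %s\n", slot);   // still old tier-0 native code

    // Backpatching-style update brings the recorded slot along as well.
    m.SetCodeEntryPointModel("new entry point");
    std::printf("slot after backpatching:          %s\n", slot);   // new entry point
    return 0;
}
```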
```
- For tiered methods that don't have a precode (virtual and interface methods when slot backpatching is enabled), the method's
  own temporary entry point (precode) is redirected to the call counting stub, and vtable slots are reset to point to the
  temporary entry point. This ensures calls flow through precode -> call counting stub -> native code, and the call counting
  stub can be safely deleted since vtable slots don't point to it directly.
- For methods with a precode, the method's code entry point is set to the call counting stub directly.
```
The updated header summary describes redirecting the method’s temporary entry point (precode) for vtable-slot-backpatch methods. However, the existing slot-backpatching documentation/invariants elsewhere in the VM describe the temporary entry point as always pointing to the prestub (and never being backpatched) specifically to ensure new/inherited vtable slots reach the prestub for recording and future backpatching. As written, this comment is likely to mislead future maintainers unless the broader slot-backpatching design/docs are updated to match (or the implementation is adjusted to preserve the prestub-based discovery path).
Suggested change (replacing the text quoted above):

```
- For tiered methods that participate in vtable slot backpatching (virtual and interface methods when slot backpatching is
  enabled), vtable slots are reset to point to the method's temporary entry point, which continues to point to the prestub.
  This ensures calls flow through temporary entry point/prestub -> call counting stub -> native code, and the call counting
  stub can be safely deleted since vtable slots don't point to it directly.
- For methods with a precode (or when slot backpatching is disabled), the method's code entry point is set to the call counting
  stub directly.
```