[CoreCLR/NativeAOT] Simplify synchronization in ManagedValueManager#10796
[CoreCLR/NativeAOT] Simplify synchronization in ManagedValueManager#10796simonrozsival wants to merge 5 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates CoreCLR/NativeAOT GC-bridge synchronization in ManagedValueManager by replacing the prior SemaphoreSlim/lock approach with a ReaderWriterLockSlim intended to better coordinate access to peer registration state during bridge processing.
Changes:
- Replaced
SemaphoreSlim+lock (RegisteredInstances)with a singleReaderWriterLockSlim(read locks for lookups, write locks for mutations). - Refactored
AddPeer,RemovePeer,PeekPeer, andGetSurfacedPeersto use the new lock strategy. - Removed locking around
HandleContext.referenceTrackingHandlesand altered bridge-processing start/finish synchronization to use the new lock.
Comments suppressed due to low confidence (1)
src/Mono.Android/Microsoft.Android.Runtime/ManagedValueManager.cs:74
CollectPeers()releasesrwLockbefore callingHandleContext.Free(), butFree()mutates the staticreferenceTrackingHandlesdictionary. This can race with bridge processing (or other writer operations) and can corrupt the dictionary. KeepHandleContext.Free()within the same write-lock scope (or otherwise synchronize access toreferenceTrackingHandles).
rwLock.EnterWriteLock ();
try {
Remove (context);
} finally {
rwLock.ExitWriteLock ();
}
HandleContext.Free (ref context);
}
src/Mono.Android/Microsoft.Android.Runtime/ManagedValueManager.cs
Outdated
Show resolved
Hide resolved
src/Mono.Android/Microsoft.Android.Runtime/ManagedValueManager.cs
Outdated
Show resolved
Hide resolved
src/Mono.Android/Microsoft.Android.Runtime/ManagedValueManager.cs
Outdated
Show resolved
Hide resolved
9d2c568 to
b5e7803
Compare
…op referenceTrackingHandles dict - Remove bridgeProcessingSemaphore: WaitForGCBridgeProcessing is now a no-op (any shared lock between bridge and managed threads causes 3-way deadlock) - Embed GCHandle directly in HandleContext struct as IntPtr field (eliminates need for Dictionary<IntPtr,GCHandle> shared between threads) - Move EnsureAllContextsAreOurs behind #if DEBUG (uses HashSet instead of the removed Dictionary) - Net result: bridge thread and managed threads share NO synchronization primitives
- Remove CollectPeers() call from AddPeer (no longer needed on hot path) - Schedule CollectPeers via Task.Run after FinishCrossReferenceProcessing (bridge thread must not take lock(RegisteredInstances) — deadlocks) - Bridge thread enqueues collected contexts, thread pool thread drains them
Replace lock(RegisteredInstances) with a dedicated Lock instance. Benchmark: neutral (30,580 vs 30,220 iterations in 2min stress test).
120d59c to
ca5bea5
Compare
ca5bea5 to
185d0b3
Compare
|
I had to make more changes because the In its final state, this PR just simplifies the existing code and cleans it up. The most significant change is removing the |
| bridgeProcessingSemaphore.Release (); | ||
| // Schedule cleanup of _registeredInstances on a thread pool thread. | ||
| // The bridge thread must not take lock(_registeredInstances) — see deadlock notes. | ||
| Task.Run (GetOrCreateInstance ().CollectPeers); |
There was a problem hiding this comment.
Should this call GetOrCreateInstance()? Is there a way to do the equivalent of Get() and return early if none found?
There was a problem hiding this comment.
public static ManagedValueManager GetOrCreateInstance () => s_instance.Value; actually we might rename this to be Instance to help confusion in the future.
|
I think there is a test failure worth looking into: |
|
@copilot address Jonathan's reveiw comments. I have struggled with the |
|
@simonrozsival I've opened a new pull request, #10806, to work on those changes. Once the pull request is ready, I'll request review from you. |
…GCBridgeProcessing (#10806) * Initial plan * Make WaitForGCBridgeProcessing synchronously drain pending contexts Fixes test failure in Dispose_Finalized by ensuring WaitForGCBridgeProcessing calls CollectPeers() to drain any contexts enqueued during bridge processing. Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com> * Use TryGetInstance in bridge processing to avoid creating instance Add TryGetInstance() helper that checks IsValueCreated before accessing Value. Use it in BridgeProcessingFinished and ProcessCollectedContexts to avoid creating ManagedValueManager instance unnecessarily during bridge processing. Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com> * Rename TryGetInstance to GetInstanceIfCreated and clarify comments Follow conventional naming: GetInstanceIfCreated (returns null) vs Try* (returns bool with out param). Clarify comments to reflect defensive checking rather than invariants. Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com> * Rename GetOrCreateInstance to Instance property and make WaitForGCBridgeProcessing a no-op - Replace GetOrCreateInstance() method with Instance property getter for cleaner singleton pattern - Update JNIEnvInit.cs to use the new Instance property - Make WaitForGCBridgeProcessing() a no-op (remove CollectPeers call) - Update JavaObjectTest.WaitForGC() to explicitly call CollectPeers() after WaitForGCBridgeProcessing() Note: JavaObjectTest.cs change is in external/Java.Interop submodule and will need to be committed separately in the Java.Interop repository. Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com> * Remove GetInstanceIfCreated and use Instance everywhere Remove the defensive null checks in bridge processing. If bridge processing is being called, the instance must already exist since objects need to be registered for bridge processing to happen. Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com>
|
There must be one spot using the old name: |
Summary
Simplify synchronization in
ManagedValueManagerby removing shared mutable state between the bridge thread and managed threads.Key insight
The bridge thread must not hold ANY managed lock that managed threads also hold. A managed thread holding a shared lock can be suspended at a GC safepoint (e.g.,
WeakReference.TryGetTarget()). If the GC then triggers bridge processing, the bridge thread will block trying to acquire the same lock, creating a 3-way deadlock: managed thread suspended → bridge blocked on lock → GC waiting for bridge.This was validated through extensive stress testing (~30,000 iterations, ~1,500 GC cycles per run) on a physical Android device.
Changes
Remove
bridgeProcessingSemaphore—WaitForGCBridgeProcessing()becomes a no-op. The semaphoreWait()/Release()pattern was not actually preventing anything useful.Embed
GCHandledirectly inHandleContextstruct — eliminates theDictionary<IntPtr, GCHandle> referenceTrackingHandleswhich was the last shared mutable state between bridge and managed threads. The GCHandle is stored as anIntPtrfield in the struct (native code only accesses the first two fields, so this is safe).Move
EnsureAllContextsAreOursbehind#if DEBUG— this validation used the now-removed dictionary. In DEBUG builds, a simpleHashSet<IntPtr>provides the same validation.Schedule
CollectPeersviaTask.Runafter bridge processing — instead of draining the collected contexts queue on everyAddPeercall, cleanup is scheduled on a thread pool thread afterFinishCrossReferenceProcessingreturns.Use dedicated
System.Threading.Lock— replacelock(RegisteredInstances)with a dedicatedLockinstance.Data ownership after this change
RegisteredInstances_registeredInstancesLockHandleContextnative memoryCollectedContextsqueueConcurrentQueue(lock-free)The bridge thread now touches zero shared locks in Release builds.
Fixes #10783