block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT #552
blktests-ci[bot] wants to merge 1 commit into linus-master_base
Upstream branch: c22e26b
519f160 to 49ecc64
Upstream branch: 37a93dd
b0802e7 to b57a767
Upstream branch: 37a93dd
b57a767 to bf69262
49ecc64 to 0525b37
Upstream branch: 7449f86
bf69262 to 607d496
Upstream branch: 7449f86
607d496 to a2259e8
0525b37 to 6de2940
Upstream branch: cee73b1
a2259e8 to ff3d6b3
6de2940 to bbff8a4
Upstream branch: ca4ee40
ff3d6b3 to cbc9798
bbff8a4 to be7af85
Upstream branch: 26a4cfa
cbc9798 to c60c125
be7af85 to bfa4f99
Upstream branch: 0f2acd3
c60c125 to a18d878
bfa4f99 to e2350d3
ecd10e2 to d0e1bed
Upstream branch: af4e9ef
c237dde to 7b14387
d0e1bed to 6b51c57
Upstream branch: 0031c06
7b14387 to ee76c58
6b51c57 to 78036b2
Upstream branch: ecc64d2
ee76c58 to 447dca3
Upstream branch: c107785
447dca3 to a152354
bbb3394 to 901a429
Upstream branch: 5ee8dbf
a152354 to 76a44a7
901a429 to 1f19ba6
Upstream branch: 1f318b9
76a44a7 to fe65ef4
1f19ba6 to e79276a
Upstream branch: None
… on RT

In RT kernel (PREEMPT_RT), commit 6bda857 ("block: fix ordering between
checking QUEUE_FLAG_QUIESCED request adding") causes severe performance
regression on systems with multiple MSI-X interrupt vectors.

The above change introduced spinlock_t queue_lock usage in
blk_mq_run_hw_queue() to synchronize QUEUE_FLAG_QUIESCED checks with
blk_mq_unquiesce_queue(). While this works correctly in standard kernel,
it causes catastrophic serialization in RT kernel where spinlock_t
converts to sleeping rt_mutex.

Problem in RT kernel:
- blk_mq_run_hw_queue() is called from IRQ thread context
- With multiple MSI-X vectors, all IRQ threads contend on the same
  queue_lock
- queue_lock becomes rt_mutex (sleeping) in RT kernel
- IRQ threads serialize and enter D-state waiting for lock
- Throughput drops from 640 MB/s to 153 MB/s

Solution:
Convert quiesce_depth to atomic_t and use it directly for quiesce state
checking, eliminating QUEUE_FLAG_QUIESCED entirely. This removes the
need for any locking in the hot path.

The atomic counter serves as both the depth tracker and the quiesce
indicator (depth > 0 means quiesced). This eliminates the race window
that existed between updating the depth and the flag.

Memory ordering is ensured by:
- smp_mb__after_atomic() after modifying quiesce_depth
- smp_rmb() before re-checking quiesce state in blk_mq_run_hw_queue()

Performance impact:
- RT kernel: eliminates lock contention, restores full throughput
- Non-RT kernel: atomic ops are similar cost to the previous spinlock
  acquire/release, no regression expected

Test results on RT kernel:
Hardware: Broadcom/LSI MegaRAID 12GSAS/PCIe Secure SAS39xx
(megaraid_sas driver, 128 MSI-X vectors, 120 hw queues)
- Before: 153 MB/s, IRQ threads in D-state
- After: 640 MB/s, no IRQ threads blocked

Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: 6bda857 ("block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding")
Cc: stable@vger.kernel.org
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
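
The idea in the commit message (one atomic counter acting as both the nesting depth and the quiesced indicator, with barriers instead of queue_lock) can be modelled in userspace. The sketch below is not the patch itself: it uses C11 <stdatomic.h> rather than the kernel's atomic_t, and every name in it (struct model_queue, model_quiesce(), model_unquiesce(), model_run_hw_queue(), model_demo()) is hypothetical. The C11 fences are only rough analogues of smp_mb__after_atomic() and smp_rmb(), and the shape of the re-check in model_run_hw_queue() is an assumption about how the hot path might look.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a request queue; not the kernel struct. */
struct model_queue {
    atomic_int quiesce_depth;  /* depth > 0 means quiesced */
    atomic_int pending;        /* requests waiting for dispatch */
    int dispatched;            /* touched by the dispatcher only */
};

static void model_quiesce(struct model_queue *q)
{
    atomic_fetch_add_explicit(&q->quiesce_depth, 1, memory_order_relaxed);
    /* Rough analogue of smp_mb__after_atomic(). */
    atomic_thread_fence(memory_order_seq_cst);
}

/* Returns true when the depth reached zero and the caller should rerun queues. */
static bool model_unquiesce(struct model_queue *q)
{
    int old = atomic_load_explicit(&q->quiesce_depth, memory_order_relaxed);

    do {
        if (old <= 0)
            return false;      /* unbalanced unquiesce: leave depth alone */
    } while (!atomic_compare_exchange_weak(&q->quiesce_depth, &old, old - 1));

    atomic_thread_fence(memory_order_seq_cst);
    return old == 1;
}

static bool model_need_run(struct model_queue *q)
{
    bool quiesced =
        atomic_load_explicit(&q->quiesce_depth, memory_order_relaxed) > 0;

    return !quiesced &&
           atomic_load_explicit(&q->pending, memory_order_relaxed) > 0;
}

/* Lock-free hot path: check, and if there is nothing to do, fence and re-check. */
static bool model_run_hw_queue(struct model_queue *q)
{
    if (!model_need_run(q)) {
        /* Rough analogue of smp_rmb() before the re-check. */
        atomic_thread_fence(memory_order_acquire);
        if (!model_need_run(q))
            return false;
    }
    q->dispatched += atomic_exchange(&q->pending, 0);
    return true;
}

/* Nested quiesce/unquiesce walk-through; returns requests dispatched or -1. */
static int model_demo(void)
{
    struct model_queue q;

    atomic_init(&q.quiesce_depth, 0);
    atomic_init(&q.pending, 3);
    q.dispatched = 0;

    model_quiesce(&q);
    model_quiesce(&q);                          /* depth 2 */
    bool ran_depth2 = model_run_hw_queue(&q);   /* expected: false */
    bool last1 = model_unquiesce(&q);           /* depth 1, expected: false */
    bool ran_depth1 = model_run_hw_queue(&q);   /* expected: false */
    bool last2 = model_unquiesce(&q);           /* depth 0, expected: true */
    bool ran_depth0 = model_run_hw_queue(&q);   /* expected: true */

    if (ran_depth2 || last1 || ran_depth1 || !last2 || !ran_depth0)
        return -1;
    return q.dispatched;
}
```

In this model, model_demo() only dispatches once the second model_unquiesce() brings the depth back to zero, which is the "depth > 0 means quiesced" rule from the commit message. Because there is no separate flag, there is no window in which the depth and the indicator can disagree.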
fe65ef4 to 09b0064
Pull request for series with
subject: block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT
version: 3
url: https://patchwork.kernel.org/project/linux-block/list/?series=1053247