added prefetch and minor updates to speedup pileup#1986
Open
vasudeva8 wants to merge 2 commits intosamtools:developfrom
Open
added prefetch and minor updates to speedup pileup#1986vasudeva8 wants to merge 2 commits intosamtools:developfrom
vasudeva8 wants to merge 2 commits intosamtools:developfrom
Conversation
jkbonfield
reviewed
Feb 16, 2026
sam.c
Outdated
| lbnode_t **pptr = &iter->head; | ||
| while (*pptr != iter->tail) { | ||
| if ((*pptr)->next) | ||
| __builtin_prefetch((*pptr)->next); |
Contributor
There was a problem hiding this comment.
I think maybe we should have an hts_prefetch defined somewhere with compiler checks. __builtin_prefetch is a gcc thing, albeit supported on many other platforms and compilers too.
I'll push a new commit containing autoconf changes.
This is to facilitate the use of prefetch on deep pileups to avoid
memory latency.
On a Covid sample (local 34005_1#18.fixmate.bam file) of 150-200k
depth, the perf report -n stats for bam_plp64_next are:
prefetch 16523
volatile 16839
none 17895
These are the 5th best time of 10 runs, showing 8% gain for
prefetching and a 6% gain for the poor-man's alternative.
On shallow data it's not beneficial, but at a cost of ~2% instead.
(We may wish to consider removing the volatile case as it's a little
more costly on shallow data and a little less beneficial on deep, but
it's going to be rarely used regardless.)
NB: speed has not been checked on ARM CPUs.
Also added a check for __builtin_clz which we already use elsewhere
via compiler version checks, as we may as well add both.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Added prefetch to improve pileup performance.
Updated overlap_remove to reduce hash search.
this is for samtools mpileup speed up (PR samtools#2301)