GH-49261: [C++][CI] Use differential fuzzing on IPC file fuzzer#49312

Draft
pitrou wants to merge 1 commit into apache:main from pitrou:gh49261-differential-ipc-fuzz

pitrou commented Feb 17, 2026

Rationale for this change

Enable differential fuzzing to strengthen the invariants exercised by the IPC file fuzzer.

What changes are included in this PR?

When the IPC file fuzzer reads the IPC file successfully, also read the underlying IPC stream and compare the resulting contents for equality. Inequality when reading is treated as a hard failure (crashing the process so that an issue is reported).

There is a caveat: a technically valid IPC file might read differently than the enclosed IPC stream. It seems unlikely that the fuzzer would generate such a file, but we'll see.
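The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Decoded`, `Reader`, `Outcome`, `Compare` and `FuzzOneInput` are hypothetical names, the readers are passed in as callbacks rather than being Arrow's IPC readers, and treating a stream read failure after a successful file read as a disagreement is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical decoded form; a real harness would compare record batches.
using Decoded = std::vector<std::string>;
// Hypothetical reader signature: returns nullopt when the input is rejected.
using Reader = std::function<std::optional<Decoded>(const uint8_t*, size_t)>;

enum class Outcome { kRejected, kAgree, kDisagree };

// Differential check: only compare when the primary (file) reader succeeds.
Outcome Compare(const Reader& read_file, const Reader& read_stream,
                const uint8_t* data, size_t size) {
  std::optional<Decoded> from_file = read_file(data, size);
  if (!from_file) {
    return Outcome::kRejected;  // invalid input: nothing to compare
  }
  std::optional<Decoded> from_stream = read_stream(data, size);
  // Assumption of this sketch: a stream failure also counts as a disagreement.
  if (!from_stream || *from_stream != *from_file) {
    return Outcome::kDisagree;
  }
  return Outcome::kAgree;
}

// Fuzz-entry-style wrapper: a disagreement crashes the process so that the
// fuzzing infrastructure records the input as a finding.
int FuzzOneInput(const Reader& read_file, const Reader& read_stream,
                 const uint8_t* data, size_t size) {
  if (Compare(read_file, read_stream, data, size) == Outcome::kDisagree) {
    std::abort();
  }
  return 0;
}
```

Separating `Compare` from the aborting wrapper keeps the comparison testable without crashing the test process.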

See discussion on the dev ML:
https://lists.apache.org/thread/jpxl3yzm96wkxzb1clokxklsy32b3plh

Are these changes tested?

By manually running the fuzz target against existing seed files.

Are there any user-facing changes?

No.

pitrou commented Feb 17, 2026

@addisoncrump FYI and if you're not bored of this :)

pitrou commented Feb 17, 2026

@github-actions crossbow submit fuzz

github-actions commented

Revision: 589a1ff

Submitted crossbow builds: ursacomputing/crossbow @ actions-c68bd13b91

| Task | Status |
|---|---|
| test-build-cpp-fuzz | GitHub Actions |

addisoncrump commented

Looks like a reasonable differential fuzzer impl. Are the two sources of information different? Or is one just a "wrapped" version of another? If so, your fuzzer might be exploring a lot of code that isn't relevant to the actual conversion bit (i.e., the bit you're actually testing).

pitrou commented Feb 17, 2026

The IPC file format is just the IPC stream format + a fixed-size header + a footer with additional metadata for random access (like a ZIP catalog, basically).

However, since the IPC file format allows for random access, the IPC file reader has specific shortcuts and heuristics to make better use of IO, so it takes different code paths than the IPC stream reader.

(it's not a problem to exercise the IPC stream reader code, either, even though we have a separate fuzz harness for it)
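To make the "stream + header + footer" relationship concrete, here is a deliberately simplified toy format. It is not Arrow's actual layout: the `TOY1` header, the one-byte length prefixes, the one-byte footer offsets and all function names are invented for illustration. It shows a sequential reader and a footer-driven reader over the same bytes, and how a footer that lists offsets in a different order makes the two disagree.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;
using Messages = std::vector<std::string>;

constexpr size_t kHeaderSize = 4;  // toy fixed-size header: "TOY1"

// Reads one length-prefixed message at `pos`; returns false if out of bounds.
bool ReadMessageAt(const Bytes& buf, size_t pos, std::string* out, size_t* next) {
  if (pos >= buf.size()) return false;
  size_t len = buf[pos];
  if (pos + 1 + len > buf.size()) return false;
  out->assign(buf.begin() + pos + 1, buf.begin() + pos + 1 + len);
  *next = pos + 1 + len;
  return true;
}

// Sequential path: walk messages from `start` until the zero-length end marker.
std::optional<Messages> ReadStream(const Bytes& buf, size_t start) {
  Messages result;
  size_t pos = start;
  while (true) {
    std::string msg;
    if (!ReadMessageAt(buf, pos, &msg, &pos)) return std::nullopt;
    if (msg.empty()) return result;
    result.push_back(msg);
  }
}

// Random-access path: trust the footer's offsets instead of scanning.
// Footer layout (toy): one offset byte per message, then a count byte.
std::optional<Messages> ReadFile(const Bytes& buf) {
  if (buf.size() < kHeaderSize + 1) return std::nullopt;
  if (std::string(buf.begin(), buf.begin() + kHeaderSize) != "TOY1") {
    return std::nullopt;
  }
  size_t count = buf.back();
  if (count > buf.size() - kHeaderSize - 1) return std::nullopt;
  size_t footer_start = buf.size() - 1 - count;
  Messages result;
  for (size_t i = 0; i < count; ++i) {
    std::string msg;
    size_t unused = 0;
    if (!ReadMessageAt(buf, buf[footer_start + i], &msg, &unused)) {
      return std::nullopt;
    }
    result.push_back(msg);
  }
  return result;
}

// Builds a well-formed toy file. Assumes non-empty messages and a total
// size under 256 bytes, since offsets are stored in a single byte.
Bytes WriteFile(const Messages& msgs) {
  Bytes buf = {'T', 'O', 'Y', '1'};
  Bytes offsets;
  for (const auto& m : msgs) {
    offsets.push_back(static_cast<uint8_t>(buf.size()));
    buf.push_back(static_cast<uint8_t>(m.size()));
    buf.insert(buf.end(), m.begin(), m.end());
  }
  buf.push_back(0);  // end-of-stream marker
  buf.insert(buf.end(), offsets.begin(), offsets.end());
  buf.push_back(static_cast<uint8_t>(offsets.size()));
  return buf;
}
```

For a file produced by `WriteFile`, `ReadFile(f)` and `ReadStream(f, kHeaderSize)` return the same messages. Swapping two offset bytes in the footer leaves the embedded stream untouched but changes the order seen by `ReadFile`, which is the kind of divergence a differential check would flag in this toy setting.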
