initdb: reset and preserve errno when scanning cdb_init.d#1589
initdb: reset and preserve errno when scanning cdb_init.d#1589tuhaihe wants to merge 2 commits intoapache:mainfrom
Conversation
initdb's setup_cdb_schema() checked errno after readdir() iteration, but did not clear errno before entering the loop. On some environments (observed on Ubuntu 24.04), a stale errno value can survive into this check and be misinterpreted as a readdir failure, causing: error while reading cdb_init.d directory: Function not implemented Fix by: - setting errno=0 before the readdir() loop - storing readdir errno immediately after the loop - handling closedir() errors separately - using the saved readdir errno for the post-loop error check This prevents false-positive failures during initdb while preserving proper I/O error reporting.
| errno = 0; | ||
| while ((file = readdir(dir)) != NULL) | ||
| { | ||
| int namelen = strlen(file->d_name); | ||
|
|
||
| if (namelen > 4 && | ||
| strcmp(".sql", file->d_name + namelen - 4) == 0 && | ||
| /* | ||
| * Since 7X, we do not load gp_toolkit.sql anymore but will run | ||
| * CREATE EXTENSION gp_toolkit to do the same thing. But existing | ||
| * installation could still have gp_toolkit.sql until e.g. uninstallation | ||
| * or a major version upgrade. Ignore that file in any cases. | ||
| * XXX: should be no longer needed after 8X. | ||
| */ | ||
| (namelen < 14 || strcmp("gp_toolkit.sql", file->d_name) != 0)) | ||
| { | ||
| scriptnames = pg_realloc(scriptnames, | ||
| sizeof(char *) * (nscripts + 1)); |
There was a problem hiding this comment.
Good Catch.
But the current approach still has the same stale errno problem — just from a different source.
With errno = 0 set only once before the loop, consider: if pg_realloc() or pg_strdup() internally sets errno (e.g. via mmap/mremap fallback paths) during, say, the 8th iteration, and subsequent readdir() calls succeed (success does not clear errno), then when the loop ends normally with readdir() returning NULL, errno still holds the stale value from pg_realloc(). The if (errno) check then misinterprets it as a readdir failure — which is the same class of bug this patch is trying to fix.
The comma-expression idiom resets errno before every readdir() call, so when the loop exits, errno can only come from the final readdir():
while (errno = 0, (file = readdir(dir)) != NULL)This is the standard pattern in Postgres.
| if (closedir(dir)) | ||
| { | ||
| pg_log_error("error while closing cdb_init.d directory: %m"); |
There was a problem hiding this comment.
closedir() rarely fails in practice — we have plenty of code that doesn't check its return value — but it's fine to keep the check here, no objection.
initdb's setup_cdb_schema() checked errno after readdir() iteration, but did not clear errno before entering the loop.
On some environments (observed on Ubuntu 24.04), a stale errno value can survive into this check and be misinterpreted as a readdir failure, causing:
error while reading cdb_init.d directory: Function not implemented
Fix by:
This prevents false-positive failures during initdb while preserving proper I/O error reporting.
Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
make installcheckmake -C src/test installcheck-cbdb-parallelImpact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions