Skip to content

initdb: reset and preserve errno when scanning cdb_init.d#1589

Open
tuhaihe wants to merge 2 commits intoapache:mainfrom
tuhaihe:fix-initdb
Open

initdb: reset and preserve errno when scanning cdb_init.d#1589
tuhaihe wants to merge 2 commits intoapache:mainfrom
tuhaihe:fix-initdb

Conversation

@tuhaihe
Copy link
Member

@tuhaihe tuhaihe commented Mar 2, 2026

initdb's setup_cdb_schema() checked errno after readdir() iteration, but did not clear errno before entering the loop.

On some environments (observed on Ubuntu 24.04), a stale errno value can survive into this check and be misinterpreted as a readdir failure, causing:
error while reading cdb_init.d directory: Function not implemented

Fix by:

  • setting errno=0 before the readdir() loop
  • storing readdir errno immediately after the loop
  • handling closedir() errors separately
  • using the saved readdir errno for the post-loop error check

This prevents false-positive failures during initdb while preserving proper I/O error reporting.

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


initdb's setup_cdb_schema() checked errno after readdir() iteration, but
did not clear errno before entering the loop.

On some environments (observed on Ubuntu 24.04), a stale errno value can
survive into this check and be misinterpreted as a readdir failure, causing:
  error while reading cdb_init.d directory: Function not implemented

Fix by:
- setting errno=0 before the readdir() loop
- storing readdir errno immediately after the loop
- handling closedir() errors separately
- using the saved readdir errno for the post-loop error check

This prevents false-positive failures during initdb while preserving proper
I/O error reporting.
Comment on lines +2027 to 2044
errno = 0;
while ((file = readdir(dir)) != NULL)
{
int namelen = strlen(file->d_name);

if (namelen > 4 &&
strcmp(".sql", file->d_name + namelen - 4) == 0 &&
/*
* Since 7X, we do not load gp_toolkit.sql anymore but will run
* CREATE EXTENSION gp_toolkit to do the same thing. But existing
* installation could still have gp_toolkit.sql until e.g. uninstallation
* or a major version upgrade. Ignore that file in any cases.
* XXX: should be no longer needed after 8X.
*/
(namelen < 14 || strcmp("gp_toolkit.sql", file->d_name) != 0))
{
scriptnames = pg_realloc(scriptnames,
sizeof(char *) * (nscripts + 1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Catch.
But the current approach still has the same stale errno problem — just from a different source.

With errno = 0 set only once before the loop, consider: if pg_realloc() or pg_strdup() internally sets errno (e.g. via mmap/mremap fallback paths) during, say, the 8th iteration, and subsequent readdir() calls succeed (success does not clear errno), then when the loop ends normally with readdir() returning NULL, errno still holds the stale value from pg_realloc(). The if (errno) check then misinterprets it as a readdir failure — which is the same class of bug this patch is trying to fix.
The comma-expression idiom resets errno before every readdir() call, so when the loop exits, errno can only come from the final readdir():

while (errno = 0, (file = readdir(dir)) != NULL)

This is the standard pattern in Postgres.

Comment on lines +2065 to +2067
if (closedir(dir))
{
pg_log_error("error while closing cdb_init.d directory: %m");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closedir() rarely fails in practice — we have plenty of code that doesn't check its return value — but it's fine to keep the check here, no objection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants