Skip to content

fix(penpal): eliminate unnecessary git check-ignore processes and full project walks#549

Merged
loganj merged 6 commits intomainfrom
penpal-git-check-ignore-timing
Mar 30, 2026
Merged

fix(penpal): eliminate unnecessary git check-ignore processes and full project walks#549
loganj merged 6 commits intomainfrom
penpal-git-check-ignore-timing

Conversation

@loganj
Copy link
Copy Markdown
Collaborator

@loganj loganj commented Mar 30, 2026

Eliminates `git check-ignore` subprocess spawns from the hot path by replacing full-project filesystem walks with targeted incremental cache mutations.

Key changes:

  • `UpsertFile` / `RemoveFile` update individual cache entries without a filesystem walk. The file watcher and `handleDeleteFile` use these instead of `RefreshProject`.
  • Watcher debounces file events (100ms per project) and flushes as batched incremental mutations. Directory Create/Rename events trigger a walk to pick up new `.md` files.
  • Unscanned projects fall back to full `RefreshProject` on file events rather than silently dropping them (previously changes were invisible until the user navigated to the project).
  • `ResolveFileInfo` applies the same scan rules for single-file resolution without spawning a subprocess.
  • `projectHasAnyMarkdown` drops the `git check-ignore` call — false positives from gitignored `.md` files are harmless since the full scan on first access filters properly.
  • `RescanWith` preserves cached files for unchanged projects on config changes; only rescans where sources changed or were never scanned.
  • `RefreshAllProjects` and `CheckAllProjectsHasFiles` capped at 4 goroutines to prevent subprocess storms.

Covered by 15 new unit tests for `UpsertFile`, `RemoveFile`, `ResolveFileInfo`, `RescanWith`, `SourcesChanged`, and `projectHasAnyMarkdown`.

Addresses PENPAL-2225.

loganj and others added 5 commits March 30, 2026 00:55
Add semaphore (cap 4) to RefreshAllProjects and CheckAllProjectsHasFiles
to prevent unbounded git check-ignore subprocess spawning at startup and
during workspace rescans.

Tighten the watcher Route 6 filter to only pass .md file events. Previously,
all Create events (including scanner temp files and backup artifacts) would
pass through and trigger a full project walk with a git check-ignore process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ResolveFileInfo for single-file source resolution without walking
the filesystem or spawning git check-ignore processes. Applies the same
source-priority, SkipDirs, RequireSibling, and ClassifyFile rules as
scanProjectSources.

Add UpsertFile and RemoveFile methods on Cache for incremental mutations.
UpsertFile updates ModTime/Title for existing entries or resolves source
membership for new files. RemoveFile removes entries by project-relative
path. Both update project metadata (HasFiles, LastModified).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the per-project debounceRefresh (which triggered full
RefreshProject walks) with an accumulating debounce that collects
per-file paths and ops during the 100ms window, then applies
incremental UpsertFile/RemoveFile mutations.

This eliminates filesystem walks and git check-ignore subprocess
spawning for the common case of editing .md files. The watcher now
handles Write → re-stat + re-title, Create → resolve source + insert,
and Remove/Rename → remove from cache, all without touching the
filesystem beyond the specific file that changed.

Add IsProjectScanned method to support the guard that skips incremental
updates for projects that haven't been lazily scanned yet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RescanWith now preserves cached file data for unchanged projects instead
of rescanning everything. It diffs old vs new project lists, compares
source configurations via SourcesChanged, and only walks projects that
are genuinely new or whose sources changed. Removed projects are cleaned
up from the cache.

handleDeleteFile uses RemoveFile for incremental cache removal instead
of a full RefreshProject walk.

refreshAfterConfigChange no longer calls populateProjects (which
redundantly ran CheckAllProjectsHasFiles). Instead it does targeted
git enrichment only for projects missing git info.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a file event fires for a project that hasn't been lazily scanned
yet, fall back to a full RefreshProject instead of silently dropping the
event. The previous guard skipped incremental updates entirely, which
meant file changes were invisible until the user navigated to the
project page to trigger the lazy scan.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 798066e873

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9489020eb0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@loganj loganj added the ai-outer-loop Managed by outer-loop label Mar 30, 2026
@loganj loganj force-pushed the penpal-git-check-ignore-timing branch from 9489020 to 137a93e Compare March 30, 2026 14:07
@loganj
Copy link
Copy Markdown
Collaborator Author

loganj commented Mar 30, 2026

outer-loop status

Feedback addressed — 1 thread(s) resolved.

  • apps/penpal/internal/cache/cache.go: **P2 BadgeGood catch. UpsertFile holds the write lock across os.Stat, extractTitle (

Updated 2026-03-30T18:55:10.296Z

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 137a93e287

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@loganj loganj force-pushed the penpal-git-check-ignore-timing branch from 137a93e to 0760f2b Compare March 30, 2026 14:49
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0760f2b404

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@loganj loganj force-pushed the penpal-git-check-ignore-timing branch from 0760f2b to 07acc3d Compare March 30, 2026 18:03
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 07acc3d18d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@loganj loganj force-pushed the penpal-git-check-ignore-timing branch from 07acc3d to cd0ad63 Compare March 30, 2026 18:49
Remove gitIgnoreChecker from projectHasAnyMarkdown — this lightweight
startup check no longer spawns a subprocess per project. A false positive
from .md files in gitignored dirs is harmless since the full scan on first
access applies proper filtering.

Remove IsProjectScanned guard from flushFileEvents — incremental updates
now apply regardless of scan state, avoiding fallback to full RefreshProject
which spawned git check-ignore. The lazy scan on first access reconciles.

Fix flaky TestSSE_BroadcastDelivery by draining background events before
asserting the expected broadcast.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@loganj loganj force-pushed the penpal-git-check-ignore-timing branch from cd0ad63 to bd0a2d5 Compare March 30, 2026 18:55
@loganj loganj merged commit 3b6e301 into main Mar 30, 2026
4 checks passed
@loganj loganj deleted the penpal-git-check-ignore-timing branch March 30, 2026 18:58
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd0a2d593e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +757 to +758
if hasSkippedDir(relToSource, st) {
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip .git directories in incremental file resolution

ResolveFileInfo does not apply the same unconditional .git directory exclusion that scanProjectSources uses, so incremental updates can admit markdown files from .git when they receive fsnotify events. In this path, filtering relies on SkipDirs, but __all_markdown__ does not include .git, so a file like .git/notes.md is accepted and cached even though a full scan would skip it. This creates inconsistent results between incremental and full refreshes and can surface phantom files until a later rescan.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-outer-loop Managed by outer-loop

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant