Fix Semantic UI removal cleanup issues by JSv4 · Pull Request #1158 · Open-Source-Legal/OpenContracts

JSv4 · 2026-03-24T03:52:00Z

Summary

Fix folderIconAlpha color mismatch in ExtractCellFormatter.tsx: StatusDot box-shadows used hardcoded rgba(245, 158, 11, ...) (#F59E0B) while getCellBackground() used folderIconAlpha() (#D97706). Now both use folderIconAlpha() consistently.
Fix Revoke button danger styling in WorkerTokensSection.tsx: Changed from variant="secondary" with inline color: danger to variant="danger" for proper destructive-action visual treatment (matching the confirmation dialog's Revoke button).
Fix EditMessageModal doubled styling: Removed box-shadow and border-radius from StyledModalInner that conflicted with the wrapping OsModal component, preventing doubled borders/shadows (especially on mobile).
Add accessibility to ExtractCellFormatter popup: Added role="menu", aria-label, aria-haspopup, aria-expanded, tabIndex, and keyboard Enter/Space support to the custom popup and its trigger. Implemented roving tabindex with ArrowUp/ArrowDown/Home/End/Tab/Escape navigation per WAI-ARIA menu pattern.
Restore z-index on HeaderContainer in HeaderAndLayout.tsx: Added z-index: Z_INDEX.HEADER (50) to prevent header from rendering behind the sidebar when backdrop-filter creates new stacking contexts. 50 < sidebar's 90 is fine since they don't spatially overlap (flex column layout).
Extract magic number maxHeight: "70vh" in SelectDocumentsModal.tsx to MODAL_BODY_MAX_HEIGHT constant in constants.ts.
Fix SelectionActionMenu z-index trapped in PDFContainer stacking context: Moved context menus to createPortal(_, document.body) and used Z_INDEX.CONTEXT_MENU.
CAML Interactive Article System (PR Add CAML interactive article system with published npm packages #1156): Frontend CAML article editor, renderer, corpus landing integration, and Playwright component tests.

Test plan

Verify amber StatusDot glow on hover matches the cell background tint in extract grid
Verify Revoke button in Worker Tokens section displays with red/danger styling
Verify EditMessageModal renders without doubled borders/shadows on desktop and mobile viewports
Verify ExtractCellFormatter popup is keyboard-accessible (Enter/Space to open, Arrow keys to navigate, Tab/Escape to close)
Verify document header renders above sidebar during collapse/expand transitions
Verify SelectDocumentsModal scroll behavior unchanged with constant

- Use folderIconAlpha() for StatusDot box-shadows to match getCellBackground() - Change Revoke button to variant="danger" for proper destructive styling - Remove conflicting box-shadow/border-radius from EditMessageModal StyledModalInner - Add ARIA attributes and keyboard support to ExtractCellFormatter popup - Restore z-index on HeaderContainer to prevent rendering behind sidebar - Extract maxHeight magic number to MODAL_BODY_MAX_HEIGHT constant https://claude.ai/code/session_01TqJ4SrqDyLX9bKGnMh5DNT

claude · 2026-03-24T03:56:10Z

PR Review

This is a clean, focused cleanup PR. The changes are well-scoped and each fix is properly motivated.

Positive

folderIconAlpha consistency (ExtractCellFormatter.tsx): Replacing hardcoded rgba(245, 158, 11, ...) with folderIconAlpha() is the right fix — the two values differed (#F59E0B vs #D97706), causing the hover glow color to mismatch the cell background tint.
Revoke button (WorkerTokensSection.tsx): Using variant="danger" instead of variant="secondary" with an inline style override is correct. Previously the danger color only applied to the label text, not the button chrome.
EditMessageModal double-styling (EditMessageModal.tsx): Removing the redundant box-shadow and border-radius from StyledModalInner is correct — letting the wrapping OsModal own the chrome prevents the doubled-border artifact on mobile.
Constants usage: Both MODAL_BODY_MAX_HEIGHT and Z_INDEX.DROPDOWN follow the project rule of extracting magic numbers to constants.ts.

Issues

1. Incomplete ARIA menu pattern (ExtractCellFormatter.tsx)

The PR adds role="menu" to the popup container but the individual action items inside don't have role="menuitem". Per ARIA 1.1, the menu role requires owned children of type menuitem, menuitemcheckbox, or menuitemradio. Without this, screen readers won't navigate the popup correctly. Each clickable row inside the popup needs role="menuitem" tabIndex={-1}.

2. Missing Escape key to close popup

The onKeyDown handler on StatusDot only handles Enter/Space to open — Escape is not handled. The existing click-outside handler via popupRef won't catch keyboard Escape. This should be added here or tracked as a follow-up issue.

3. Focus not moved into menu on keyboard open

When the menu opens via Enter/Space, focus stays on the trigger. The ARIA authoring practices pattern requires focus to move to the first menuitem on keyboard-open. Without this, keyboard-only users can't reach the menu items after opening. This could be a follow-up if out of scope for this PR.

4. HeaderContainer and popup share Z_INDEX.DROPDOWN (100)

HeaderContainer is now z-index: Z_INDEX.DROPDOWN (100), the same value used by the ExtractCellFormatter popup in this same PR. If a popup inside the extract grid ever overlaps the document header area, both at 100 means render order falls back to DOM position. Unlikely to cause visible issues in practice, but a dedicated Z_INDEX.HEADER tier would make intent explicit and prevent future silent conflicts.

Minor

CHANGELOG-only formatting changes (blank lines between section headers) add noise to the diff but are harmless.
Prettier trailing-comma normalization is consistent and correct.

Summary

The core fixes (color consistency, danger variant, double-border removal, z-index restore) are all correct and well-motivated. The accessibility additions are a meaningful improvement over the previous state (no ARIA at all), but role="menuitem" on children and the missing Escape key handler are real gaps worth addressing in this PR or a fast follow-up.

…1123-puDwk

codecov · 2026-03-24T04:37:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

… focus management - Add role="menuitem" and tabIndex={-1} to each IconButton in the ExtractCellFormatter popup menu for proper ARIA menu pattern - Add Escape key handler on StatusDot to close the popup and return focus to the trigger - Move focus to the first menu item when the popup opens via keyboard (Enter/Space), using requestAnimationFrame for reliable timing - Add Z_INDEX.HEADER (50) constant for sticky headers, separating them from DROPDOWN (100) to prevent z-index collisions - Update HeaderContainer to use Z_INDEX.HEADER instead of Z_INDEX.DROPDOWN

claude · 2026-03-24T04:41:38Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a clean, well-scoped PR. The color consistency fix, z-index restoration, modal double-styling removal, and constant extraction are all solid. One real bug in the accessibility implementation needs fixing before merge, and there are a few minor points worth noting.

Bug: `firstMenuItemRef` assigned to all menu items

In ExtractCellFormatter.tsx, the firstMenuItemRef is attached to all four menu items (Approve, Edit, View Sources, Reject):

// All four buttons receive this:
ref={firstMenuItemRef}
role="menuitem"
tabIndex={-1}

In React, when multiple elements share the same ref object, the ref ends up pointing to whichever element was last mounted in the DOM — i.e., the Reject button. So when the popup opens via keyboard, firstMenuItemRef.current?.focus() actually moves focus to the last item, not the first. This inverts the expected keyboard UX.

Fix: Only the first menu item (Approve) should receive ref={firstMenuItemRef}. Remove the ref prop from the other three buttons.

Minor: Incomplete ARIA menu keyboard pattern

The implementation adds role="menu" / role="menuitem" and tabIndex={-1} correctly, but the WAI-ARIA menu pattern requires arrow key navigation between items. Without it, a keyboard user who opens the menu gets focus on the first item but can't move to the others (Tab skips tabIndex={-1} items).

This doesn't have to be done in this PR if it's tracked as a follow-up, but worth noting. At minimum a comment in the code would help future contributors understand the intended navigation model.

Minor: CHANGELOG has spurious blank lines

The diff for CHANGELOG.md includes ~50+ consecutive blank lines added between sections (visible in the + lines in the diff). These appear to be an editor artifact rather than intentional formatting. Recommend cleaning these up before merge to keep the changelog readable.

Looks good

Color fix (folderIconAlpha replacing hardcoded rgba(245, 158, 11, ...)): correct and consistent.
variant="danger" on Revoke button: proper use of the component system; removes the inline style override.
Removed duplicate border-radius/box-shadow from StyledModalInner: right call — the OsModal wrapper already provides these.
Z_INDEX.HEADER: 50 constant: appropriate value (above sidebar at 90? — wait, 50 < 90). See note below.
MODAL_BODY_MAX_HEIGHT extraction: good DRY improvement.

Question: `Z_INDEX.HEADER` value

HEADER: 50 was added, but the PR description says the header was rendering behind the sidebar (z-index: 90). If that's the case, Z_INDEX.HEADER = 50 is still below the sidebar. Was the sidebar's z-index also changed, or is the sidebar's stacking context scoped so that 50 is sufficient? Worth a quick comment in the constants file clarifying the intended layering relationship between HEADER (50), DROPDOWN (100), and the sidebar.

Summary: The firstMenuItemRef shared-ref bug is the only change needed before this is ready to merge. The rest of the fixes are solid.

- CHANGELOG: corrected Z_INDEX.DROPDOWN -> Z_INDEX.HEADER to match actual code after prior review commit changed the constant - HeaderAndLayout.tsx: added comment explaining why Z_INDEX.HEADER (50) is sufficient despite sidebar using z-index 90 (flex column layout, no visual overlap)

claude · 2026-03-24T04:52:14Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a clean, focused PR with good attention to the no-magic-numbers convention and accessibility improvements. A few things worth flagging:

Bugs / Correctness

PR description inconsistency — Z_INDEX.HEADER vs Z_INDEX.DROPDOWN

The PR summary says:

Added z-index: Z_INDEX.DROPDOWN (100) to prevent header from rendering behind the sidebar (z-index: 90)

But the actual code in HeaderAndLayout.tsx uses Z_INDEX.HEADER (50), and the CHANGELOG entry correctly says Z_INDEX.HEADER. The inline comment also explains why 50 is sufficient (flex column layout, no visual overlap with the sidebar). The implementation looks correct — the description just has the wrong constant name.

Accessibility: Incomplete ARIA menu keyboard navigation

ExtractCellFormatter.tsx adds role="menu" on the popup div and role="menuitem" on the four IconButton children. The WAI-ARIA authoring practices for a menu widget require arrow-key navigation between items (Up/Down Arrow), not just Tab. Currently:

Enter/Space opens the popup and moves focus to firstMenuItemRef (the Approve button). ✅
Escape closes the popup and returns focus to the trigger. ✅
Missing: Down/Up Arrow to move between menu items while the menu is open.
Missing: Tab handling — pressing Tab while the popup is open moves focus out of the menu without closing it, which is unexpected for a role="menu".

For a strict WAI-ARIA menu, Tab should either close the menu (moving focus to the next element outside) or cycle within it. Screen reader users expect arrow navigation inside menus. If full roving-tabindex arrow nav is out of scope for this PR, consider either (a) using role="group" + role="button" instead (no arrow-nav expectation) or (b) noting the limitation in the PR as a follow-up.

Minor: Redundant Escape handler

When isPopupOpen is true and the user presses Escape on the StatusDot, both handlers fire:

onKeyDown on StatusDot → setIsPopupOpen(false) + statusDotRef.current?.focus()
The global document.addEventListener("keydown", handleEscape) (registered in the useEffect) → setIsPopupOpen(false) + statusDotRef.current?.focus()

React batches the state update so there's no real breakage, but .focus() is called twice. Easy fix: remove the Escape branch from StatusDot's onKeyDown since the global handler covers it — or remove the global handler and keep only the onKeyDown. The current global handler already correctly returns focus to statusDotRef.

Test coverage

The PR adds no automated tests for the new keyboard/accessibility behaviour. Given that .ct.tsx Playwright component tests exist for this codebase, it would be worth adding at least:

Keyboard-open popup (Tab to StatusDot → Enter → first item focused)
Escape closes popup and returns focus to trigger

Not a blocker, but consistent with the project's test-plan checkboxes staying open.

Positives

folderIconAlpha() replacement correctly removes the two hardcoded rgba(245, 158, 11, …) values that were diverging from the getCellBackground() source of truth.
variant="danger" over variant="secondary" + style={{ color: OS_LEGAL_COLORS.danger }} is the right way to use the component API.
Removing box-shadow + border-radius from StyledModalInner is the correct fix for the doubled-border issue — those responsibilities belong to the OsModal wrapper.
MODAL_BODY_MAX_HEIGHT and Z_INDEX.HEADER extractions follow the project's no-magic-numbers convention properly.
The comment on z-index: ${Z_INDEX.HEADER} explaining the flex-layout stacking context is genuinely helpful for future readers.

…1123-puDwk

…d arrow key nav - Remove z-index from HeaderContainer in HeaderAndLayout.tsx to fix the component test failure where the TXT annotator's action menu (inside PDFContainer's stacking context at z-index: 1) was painted below the header (z-index: 50), causing Playwright's click to be intercepted. - Remove redundant Escape handler from StatusDot's onKeyDown since the global document keydown listener already handles it with focus restore. - Add arrow key navigation (ArrowLeft/Right/Up/Down) to the cell action popup menu via onKeyDown on the menu container, implementing roving tabindex with wraparound. - Add three component tests for keyboard accessibility: Enter to open popup with auto-focus, arrow key navigation between menu items, and Escape to close with focus return to status dot. - Update CHANGELOG to reflect the accessibility improvements.

claude · 2026-03-24T13:37:32Z

PR Review - Fix Semantic UI removal cleanup issues

Overall a focused, well-scoped cleanup PR. A few issues worth addressing.

Issue 1 - Bug: Z_INDEX.HEADER constant added but never used

constants.ts adds Z_INDEX.HEADER: 50 but the HeaderAndLayout.tsx fix described in the PR description is absent from the diff. The current HeaderAndLayout.tsx has position: relative but no z-index, while the sidebar has z-index: 90. Either add the HeaderAndLayout.tsx change or remove the unused constant (per the CLAUDE.md no dead code rule).

Issue 2 - Minor: onMouseLeave timeout not cleared on unmount

The pre-existing onMouseLeave handler (touched by this PR) calls setTimeout without cleanup. If the component unmounts during the 300ms window React may log a stale-state warning. Worth fixing with a useRef-tracked timeout while this code is being modified.

Issue 3 - Observation: arrow key behavior for role=menu

WAI-ARIA menu pattern specifies ArrowDown/ArrowUp for item navigation and Home/End for first/last item. ArrowRight/ArrowLeft are for sub-menu navigation, not item-to-item within the same menu. The current implementation handles all four arrow keys for item navigation, which diverges from the spec. Consider limiting to ArrowDown/ArrowUp and adding Home/End support.

Positives

The folderIconAlpha() fix is correct: both box-shadow values now match getCellBackground().
Switching to variant="danger" is the right pattern for destructive-action buttons and removes the inline OS_LEGAL_COLORS.danger style dependency.
The StyledModalInner double-shadow removal is clean and well-motivated.
Accessibility additions (role="menu", aria-haspopup, aria-expanded, focus-return on Escape, roving tabindex) are a genuine improvement.
Three new Playwright tests provide solid coverage of the keyboard interactions.
CHANGELOG updated as required.

- Fix failing component test: provide mock cell with fullSourceList to the arrow key navigation test so the View Sources button is enabled and focusable (disabled buttons cannot receive focus, causing the :not(:disabled) selector in the arrow key handler to skip them). - Fix onMouseLeave timeout leak: track the 300ms auto-close timeout via a ref and clear it on popup close, effect cleanup, and mouse re-enter. - Remove unused Z_INDEX.HEADER constant (added but never referenced).

claude · 2026-03-25T02:05:40Z

Code Review - See details below (posted in parts due to formatting constraints)

claude · 2026-03-25T02:06:14Z

Code Review - Full Review

Overall this is a clean, well-scoped cleanup PR with focused changes, good adherence to project conventions, and solid test coverage. A few observations below.

Issues to Address

HeaderAndLayout.tsx change is missing from the diff -- The PR description and CHANGELOG both reference a z-index fix for HeaderAndLayout.tsx (adding Z_INDEX.DROPDOWN to prevent the header from rendering behind the sidebar), but HeaderAndLayout.tsx does not appear in the file list for this PR. Either the change was accidentally dropped before pushing, or the description/CHANGELOG entry is stale. If the z-index fix is still needed it should be included; otherwise the references should be removed.
Tab key not handled inside the menu -- The PR correctly implements roving tabindex for arrow-key navigation per the ARIA menu pattern. That same pattern requires that pressing Tab while focus is inside the menu closes the menu and moves focus to the next element in the natural tab order. Without this, keyboard users have no standard way to exit the menu other than Escape. A small addition to the popup onKeyDown handler covers this: check for Tab key, call e.preventDefault(), setIsPopupOpen(false), and return focus to statusDotRef. Not a blocker but a noticeable gap given the otherwise thorough a11y work here.
:not(:disabled) selector may miss aria-disabled items -- The arrow-key navigation handler queries for menuitem elements using :not(:disabled). That pseudo-class only matches native form elements with the HTML disabled attribute. If any IconButton uses aria-disabled=true for visual-only disabling (e.g. View Sources when fullSourceList is empty), those buttons will still be included in the navigation cycle. Consider also excluding [aria-disabled=true], or confirm all disabled states go through the native disabled prop.
Minor: openedViaKeyboard state reset before rAF fires -- The useEffect sets openedViaKeyboard to false immediately after scheduling the requestAnimationFrame callback, meaning a re-render is scheduled before the rAF fires. This works in practice but the ordering is slightly fragile. Moving the state reset inside the rAF callback (after the focus call) makes the intent explicit. Alternatively, a plain useRef avoids the extra render entirely.

Positive Observations

Color fix: replacing hardcoded rgba(245, 158, 11, ...) with folderIconAlpha() in both box-shadow rules is the right fix -- the mismatch with getCellBackground() would have produced a visible color discrepancy on hover.
Revoke button: variant=danger is the correct way to express destructive actions via the component API instead of overriding with an inline style.
EditMessageModal: removing box-shadow and border-radius from StyledModalInner correctly eliminates the doubled-styling artifact; those properties belong on the outer OsModal.
MODAL_BODY_MAX_HEIGHT constant: clean application of the no-magic-numbers rule.
Test coverage: the three new Playwright keyboard tests (open via Enter, arrow navigation with wrap-around, Escape with focus return) cover the meaningful interaction paths introduced here.
mouseLeaveTimeout cleanup: the ref-based approach with explicit cleanup in both the effect body and the cleanup return correctly handles the race condition between the hover-close delay and popup re-open.

Documents the design for completing the batch embedding pipeline that PR 1148 left unfinished. The task layer still processes annotations one at a time despite batch API methods existing on embedders. Design: new _batch_embed_text_annotations() helper, EMBEDDING_API_BATCH_SIZE constant, BaseEmbedder.embed_texts_batch() default, MicroserviceEmbedder override, and task refactoring with text-only/multimodal partitioning.

8-task plan covering: EMBEDDING_API_BATCH_SIZE constant, BaseEmbedder embed_texts_batch() default, MicroserviceEmbedder override, test embedder updates, _batch_embed_text_annotations() helper, task refactoring with text-only/multimodal partitioning, and 11 integration tests.

…y annotations

- Fix content_modalities NOT NULL violation in TestBatchTaskIntegration by not passing None explicitly (let Django field default apply) - Remove unused result variable flagged by flake8 - Apply black formatting to 3 files - Add CHANGELOG entries for batch embedding aggregation

claude · 2026-03-25T04:57:10Z

PR Review: Fix Semantic UI removal cleanup issues + Batch Embedding

This PR bundles two distinct feature areas: frontend UI/accessibility cleanup and backend batch embedding. Both are well-executed with good test coverage.

Frontend Changes

ExtractCellFormatter.tsx - Accessibility + Color Fix

Good:

Using folderIconAlpha(0.15/0.25) instead of hardcoded rgba(245, 158, 11, ...) keeps both color usages in sync.
The keyboard navigation (roving tabIndex, arrow key wrapping with role=menuitem) is a solid ARIA menu pattern.
Mouse-leave debounce refactor (mouseLeaveTimeoutRef) correctly cancels the timeout on mouseEnter, preventing the race condition where moving from the status dot to the popup closed it prematurely.
requestAnimationFrame before focusing firstMenuItemRef ensures the popup DOM is rendered before focus is attempted.
Escape returning focus to statusDotRef follows expected keyboard flow.

Minor nit: The arrow key handler uses querySelectorAll with :not(:disabled). CSS :disabled only matches form elements with a disabled attribute; aria-disabled=true is not covered. Since IconButton renders as button this works today, but :not([disabled]):not([aria-disabled=true]) would be more complete.

EditMessageModal.tsx

Removing border-radius and box-shadow from StyledModalInner is correct. Letting OsModal own those properties avoids doubled styling.

WorkerTokensSection.tsx

variant=danger over variant=secondary with inline color override is the right approach: the variant should drive the visual treatment.

SelectDocumentsModal.tsx + constants.ts

Extracting "70vh" to MODAL_BODY_MAX_HEIGHT is consistent with the project no-magic-numbers policy.

Component Tests

Three new Playwright tests cover keyboard open + first item focus, arrow key wrap-around navigation, and Escape + focus return. These are the right test cases for the new accessibility behavior.

Backend Changes - Batch Embedding

Architecture

The two-level batching design (Celery task grouping at 100, HTTP API sub-batching at 50) is clean and well-documented.

BaseEmbedder.embed_texts_batch()

Well-designed: always returns a list, catches per-text exceptions, returns None per-slot for per-text failure. Minor: The return type annotation uses Optional at the outer level but the base implementation never returns None there - that is only relevant for overrides.

MicroserviceEmbedder.embed_texts_batch()

NaN detection and 3D-to-2D squeeze handling are thorough. The 4xx vs 5xx log distinction aids observability.

_batch_embed_text_annotations()

Correctly handles all failure modes: whitespace skip, None batch return, per-element None vectors, add_embedding returning None. Never raises.

Style issue: EMBEDDING_API_BATCH_SIZE is imported lazily inside the function body. Constants should be at module level - move it to the top of embeddings_task.py.

calculate_embeddings_for_annotation_batch() refactor

Partitioning logic is correct. The content_modalities default correctly treats None as text-only. The error message fix from annotation_id (raw string) to annotation.id (ORM attribute) is a subtle correctness improvement.

Tests (test_batch_embedding.py)

Comprehensive: happy path, empty text skip, sub-batching, partial/full failure, exception propagation, full task integration, dual-embedding regression guard, multimodal partitioning, sequential fallback. Good use of uuid4 for test isolation.

Documentation Concern

docs/superpowers/plans/2026-03-24-batch-embedding-aggregation.md contains developer-local absolute paths like /home/jman/Code/OpenContracts. These should be stripped before merging.

Summary - Items to address before merging

Style: Move EMBEDDING_API_BATCH_SIZE import to module level in embeddings_task.py.
Docs: Strip developer-local absolute paths from the planning doc.
Nit: Arrow key nav - consider :not([disabled]):not([aria-disabled=true]) for full a11y coverage.

Test coverage is thorough and the implementation is correct. Good work on the accessibility improvements and batch embedding architecture.

…1123-puDwk # Conflicts: # CHANGELOG.md

…clean imports - ARIA menu: limit to ArrowDown/ArrowUp (remove Left/Right per spec), add Home/End support, add Tab to close menu, exclude aria-disabled items - Move openedViaKeyboard reset inside rAF callback for correct ordering - Add Z_INDEX.HEADER (50) constant and apply to HeaderContainer - Move EMBEDDING_API_BATCH_SIZE import to module level in embeddings_task.py - Strip developer-local absolute paths from planning doc - Update tests to use ArrowDown instead of ArrowRight

claude · 2026-03-29T02:47:27Z

Code Review

This PR bundles two feature areas: frontend Semantic UI cleanup fixes and backend batch embedding. Both are well-executed.

BACKEND - Issue 1: NaN in batch response drops whole sub-batch. When any embedding has NaN, embed_texts_batch() returns None, marking every annotation in the sub-batch failed. A single bad text corrupts 49 valid embeddings. Fix: replace NaN indices with per-item None instead of returning None for the whole batch. _batch_embed_text_annotations() already handles per-item None.

BACKEND - Issue 2: BaseEmbedder.embed_texts_batch() docstring says outer list is always returned (never None), but MicroserviceEmbedder can return None on network error. Docstring should be relaxed or the signature should use Optional to match actual subclass behavior.

BACKEND - Issue 3: The 100-text truncation guard in embed_texts_batch() can never trigger (EMBEDDING_API_BATCH_SIZE=50). If it ever did activate it would silently drop annotations. ERROR-level logging rather than WARNING would make this data-loss scenario visible.

FRONTEND - Issue 4: mouseLeaveTimeoutRef in ExtractCellFormatter.tsx holds a setTimeout ID with no cleanup useEffect. If the component unmounts while the timer is pending this triggers a state update on an unmounted component. A cleanup effect calling clearTimeout(mouseLeaveTimeoutRef.current) would fix this.

FRONTEND - Issue 5: PR description says Z_INDEX.DROPDOWN (100) was added to HeaderContainer, but the actual change adds Z_INDEX.HEADER (50). The implementation is correct per the code comment, but the PR description is misleading.

POSITIVES: 8 thorough backend integration tests covering happy path, sub-batching, partial and full-batch failure, empty text skipping. 3 new Playwright keyboard-nav tests for ExtractCellFormatter. Full WAI-ARIA support in ExtractCellFormatter (role=menu, aria-haspopup, aria-expanded, roving tabindex, Enter/Space/Arrow/Escape, focus restoration). All magic numbers properly centralized. variant=danger button is the correct treatment for destructive actions.

Summary: Issues 1 and 4 are most actionable before merge. Issues 2, 3, 5 are minor polish.

…1123-puDwk

PDFContainer has position:relative + z-index:1, creating a CSS stacking context that trapped the SelectionActionMenu's z-index inside it. External elements (like EnhancedLabelSelector) intercepted clicks on the menu. Fix: render SelectionActionMenu via React portal (document.body) in all three annotators (TXT, PDF, DOCX) so it escapes the stacking context. Replace hardcoded zIndex:1000 with Z_INDEX.CONTEXT_MENU constant. Also add unmount cleanup for mouseLeaveTimeoutRef in ExtractCellFormatter to prevent state updates on unmounted components.

…element Previously, any NaN in the batch response caused the entire batch to fail (returning None). Now NaN rows return None per-element while valid rows are preserved. This prevents a single bad embedding from losing up to 50 valid annotation embeddings. Addresses Claude review feedback.

claude · 2026-03-29T12:46:08Z

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

OpenContracts is an AGPL-3.0 enterprise document analytics platform for PDFs and text-based formats. It features a Django/GraphQL backend with PostgreSQL + pgvector, a React/TypeScript frontend with Jotai state management, and pluggable document processing pipelines powered by machine learning models.

Baseline Commit Rules

Always ensure all affected (or new) tests pass - backend tests suite should only be run in its entirety for good reason as it takes 30+ minutes.
Always make sure typescript compiles and pre-commits pass before committing new code.
Never credit Claude or Claude Code in commit messages, PR messages, comments, or any other artifacts. This includes Co-Authored-By lines, "Generated by Claude", etc.

Essential Commands

Backend (Django)

# Run backend tests (sequential, use --keepdb to speed up subsequent runs)
docker compose -f test.yml run django python manage.py test --keepdb

# Run backend tests in PARALLEL (recommended - ~4x faster)
# Uses pytest-xdist with 4 workers, --dist loadscope keeps class tests together
docker compose -f test.yml run django pytest -n 4 --dist loadscope

# Run parallel tests with auto-detected worker count (uses all CPU cores)
docker compose -f test.yml run django pytest -n auto --dist loadscope

# Run parallel tests with fresh databases (first run or after schema changes)
docker compose -f test.yml run django pytest -n 4 --dist loadscope --create-db

# Run specific test file
docker compose -f test.yml run django python manage.py test opencontractserver.tests.test_notifications --keepdb

# Run specific test file in parallel
docker compose -f test.yml run django pytest opencontractserver/tests/test_notifications.py -n 4 --dist loadscope

# Run specific test class/method
docker compose -f test.yml run django python manage.py test opencontractserver.tests.test_notifications.TestNotificationModel.test_create_notification --keepdb

# Apply database migrations
docker compose -f local.yml run django python manage.py migrate

# Create new migration
docker compose -f local.yml run django python manage.py makemigrations

# Django shell
docker compose -f local.yml run django python manage.py shell

# Code quality (runs automatically via pre-commit hooks)
pre-commit run --all-files

Frontend (React/TypeScript)

cd frontend

# Start development server (proxies to Django on :8000)
yarn start

# Run unit tests (Vitest) - watches by default
yarn test:unit

# Run component tests (Playwright) - CRITICAL: Use --reporter=list to prevent hanging
yarn test:ct --reporter=list

# Run component tests with grep filter
yarn test:ct --reporter=list -g "test name pattern"

# Run E2E tests
yarn test:e2e

# Coverage report
yarn test:coverage

# Linting and formatting
yarn lint
yarn fix-styles

# Build for production
yarn build

# Preview production build locally
yarn serve

Production Deployment

# CRITICAL: Always run migrations FIRST in production
docker compose -f production.yml --profile migrate up migrate

# Then start main services
docker compose -f production.yml up

High-Level Architecture

Backend Architecture

Stack: Django 4.x + GraphQL (Graphene) + PostgreSQL + pgvector + Celery

Key Patterns:

GraphQL Schema Organization:
- config/graphql/graphene_types.py - All GraphQL type definitions
- config/graphql/queries.py - Query resolvers
- config/graphql/*_mutations.py - Mutation files (organized by feature)
- config/graphql/schema.py - Schema composition
Permission System (CRITICAL - see docs/permissioning/consolidated_permissioning_guide.md):
- Annotations & Relationships: NO individual permissions - inherited from document + corpus
- Documents & Corpuses: Direct object-level permissions via django-guardian
- Analyses & Extracts: Hybrid model (own permissions + corpus permissions + document filtering)
- Formula: Effective Permission = MIN(document_permission, corpus_permission)
- Structural items are ALWAYS read-only except for superusers
- Use Model.objects.visible_to_user(user) pattern (NOT resolve_oc_model_queryset - DEPRECATED)
AnnotatePermissionsForReadMixin:
- Most GraphQL types inherit this mixin (see config/graphql/permissioning/permission_annotator/mixins.py)
- Adds my_permissions, is_published, object_shared_with fields
- Requires model to have guardian permission tables ({model}userobjectpermission_set)
- Notifications use simple ownership model and DON'T use this mixin
Django Signal Handlers:
- Automatic notification creation on model changes (see opencontractserver/notifications/signals.py)
- Must be imported in app's apps.py ready() method
- Use _skip_signals attribute to prevent duplicate notifications in tests
Pluggable Parser Pipeline:
- Base classes in opencontractserver/pipeline/base/
- Parsers, embedders, thumbnailers auto-discovered and registered
- Multiple backends: Docling (ML-based), LlamaParse, Text
- All convert to unified PAWLs format for frontend
Agent Tool Architecture (see docs/architecture/llms/README.md):
- CoreTool (framework-agnostic) → PydanticAIToolWrapper (pydantic-ai specific)
- All production tools MUST be async (a-prefixed in core_tools.py). The wrapper supports sync functions for lightweight helpers/tests but does NOT wrap them in a thread pool — sync ORM calls will raise SynchronousOnlyOperation.
- Tool fault tolerance (issue Agent Tools Not Fault Tolerant #820): operational exceptions are caught and returned as error strings to the LLM; security exceptions (PermissionError, ToolConfirmationRequired) propagate.
- Pre-execution checks run on every call (not cached): permission validation, resource ID validation, approval gates.

Frontend Architecture

Stack: React 18 + TypeScript + Apollo Client + Jotai (atoms) + PDF.js + Vite

Key Patterns:

State Management - Jotai Atoms:
- Global state via atoms in frontend/src/atoms/ (NOT Redux/Context)
- Key atoms: selectedCorpusIdAtom, selectedFolderIdAtom, currentThreadIdAtom
- Derived atoms automatically update when dependencies change
- Apollo reactive vars in frontend/src/graphql/cache.ts for UI state
- AuthGate pattern ensures auth completes before rendering
Central Routing System (see docs/frontend/routing_system.md):
- Single source of truth: frontend/src/routing/CentralRouteManager.tsx
- URL paths → Entity resolution via GraphQL slug queries
- URL params ↔ Reactive vars (bidirectional sync)
- Components consume state via reactive vars, never touch URLs directly
- Deep linking and canonical redirects handled automatically
PDF Annotation System (see .cursor/rules/pdf-viewer-and-annotator-architecture.mdc):
- Virtualized rendering: Only visible pages (+overscan) rendered for performance
- Binary search to find visible page range (O(log n))
- Height caching per zoom level
- Two-phase scroll-to-annotation system
- Dual-layer architecture: Document layer (annotations) + Knowledge layer (summaries)
Unified Filtering Architecture:
- useVisibleAnnotations hook provides single-source-of-truth filtering for both annotations and relationship-connected annotations
- Reads from Jotai atoms via useAnnotationDisplay, useAnnotationControls, and useAnnotationSelection (from UISettingsAtom)
- Ensures consistency across all components
- Forced visibility for selected items and their relationship connections
Component Testing (see .cursor/rules/test-document-knowledge-base.mdc):
- ALWAYS mount components through test wrappers (e.g., DocumentKnowledgeBaseTestWrapper)
- Wrapper provides: MockedProvider + InMemoryCache + Jotai Provider + asset mocking
- Use --reporter=list flag to prevent hanging
- Increase timeouts (20s+) for PDF rendering in Chromium
- GraphQL mocks must match variables EXACTLY (null vs undefined matters)
- Mock same query multiple times for refetches
- Use page.mouse for PDF canvas interactions (NOT locator.dragTo)
- Add settle time after drag operations (500ms UI, 1000ms Apollo cache)
Development Server Configuration:
- Vite dev server on :3000 proxies to Django on :8000
- WebSocket proxy for /ws → ws://localhost:8000
- GraphQL proxy for /graphql → http://localhost:8000
- REST API proxy for /api → http://localhost:8000
- Auth0 optional via REACT_APP_USE_AUTH0 environment variable

Data Flow Architecture

Document Processing:

Upload → Parser Selection (Docling/LlamaParse/Text)
Parser generates PAWLs JSON (tokens with bounding boxes)
Text layer extracted from PAWLs
Annotations created for structure (headers, sections, etc.)
Relationships detected between elements
Vector embeddings generated for search

GraphQL Permission Flow:

Query resolver filters objects with .visible_to_user(user)
GraphQL types resolve my_permissions via AnnotatePermissionsForReadMixin
Frontend uses permissions to enable/disable UI features
Mutations check permissions and return consistent errors to prevent IDOR

Critical Security Patterns

IDOR Prevention:
- Query by both ID AND user-owned field: Model.objects.get(pk=pk, recipient=user)
- Return same error message whether object doesn't exist or belongs to another user
- Prevents enumeration via timing or different error messages
Permission Checks:
- NEVER trust frontend - always check server-side
- Use visible_to_user() manager method for querysets
- Check user_has_permission_for_obj() for individual objects (in opencontractserver.utils.permissioning)
XSS Prevention:
- User-generated content in JSON fields must be escaped on frontend
- GraphQL's GenericScalar handles JSON serialization safely
- Document this requirement in resolver comments

Critical Concepts

No dead code - when deprecating or replacing code, always try to fully replace older code and, once it's no longer in use, delete it and related texts.
DRY - please always architect code for maximal dryness and always see if you can consolidate related code and remove duplicative code.
Single Responsibility Principle - Generally, ensure that each module / script has a single purpose or related purpose.
No magic numbers - we have constants files in opencontractserver/constants/ (backend) and frontend/src/assets/configurations/constants.ts (frontend). Use them for any hardcoded values.
Don't touch old tests without permission - if pre-existing tests fail after changes, try to identify why and present user with root cause analysis. If the test logic is correct but expectations need updating due to intentional behavior changes, document the change clearly.
Utility functions belong in utility files:
- Before writing new utilities: Check existing utility files first (frontend/src/utils/, opencontractserver/utils/)
- When reviewing code: If you find utility functions defined inline in components/views, check if they already exist in utility files. If not, consider whether they should be extracted there for reuse.
- Frontend utilities: frontend/src/utils/formatters.ts (formatting), frontend/src/utils/files.ts (file operations), etc.
- Backend utilities: opencontractserver/utils/ contains permissioning, PDF processing, and other shared utilities

Testing Patterns

Manual Test Scripts

Location: docs/test_scripts/

When performing manual testing (e.g., testing migrations, verifying database state, testing API endpoints interactively), always document the test steps in a markdown file under docs/test_scripts/. These scripts will later be used to build automated integration tests.

Format:

# Test: [Brief description]

## Purpose
What this test verifies.

## Prerequisites
- Required state (e.g., "migration at 0058")
- Required data (e.g., "at least one document exists")

## Steps
1. Step one with exact command
   ```bash
   docker compose -f local.yml run --rm django python manage.py shell -c "..."

Step two...

Expected Results

What should happen after each step
Success criteria

Cleanup

Commands to restore original state if needed.


**When to document**:
- Migration testing (rollback, create test data, migrate forward)
- Database constraint validation
- Race condition verification
- Any manual verification requested during code review

### Backend Tests

**Location**: `opencontractserver/tests/`

**Parallel Testing** (pytest-xdist):
- Run with `-n 4` (or `-n auto`) for parallel execution across workers
- Use `--dist loadscope` to keep tests from the same class on the same worker (respects `setUpClass`)
- Each worker gets its own database (test_db_gw0, test_db_gw1, etc.)
- Use `@pytest.mark.serial` to mark tests that cannot run in parallel
- First run or after DB schema changes: add `--create-db` flag

**Patterns**:
- Use `TransactionTestCase` for tests with signals/asynchronous behavior
- Use `TestCase` for faster tests without transaction isolation
- Clear auto-created notifications when testing moderation: `Notification.objects.filter(recipient=user).delete()`
- Use `_skip_signals` attribute on instances to prevent signal handlers during fixtures

### Frontend Component Tests

**Location**: `frontend/tests/`

**Critical Requirements**:
- Mount through test wrappers that provide all required context
- GraphQL mocks must match query variables exactly
- Include mocks for empty-string variants (unexpected boot calls)
- Wait for visible evidence, not just network-idle
- Use `page.mouse` for PDF canvas interactions (NOT `locator.dragTo`)
- Add settle time after drag operations (500ms UI, 1000ms Apollo cache)

**Test Wrapper Pattern**:
```typescript
// ALWAYS use wrappers, never mount components directly
const component = await mount(
  <DocumentKnowledgeBaseTestWrapper corpusId="corpus-1" documentId="doc-1">
    <DocumentKnowledgeBase />
  </DocumentKnowledgeBaseTestWrapper>
);

Automated Documentation Screenshots

Location: docs/assets/images/screenshots/auto/ (output) | frontend/tests/utils/docScreenshot.ts (utility)

Screenshots for documentation are automatically captured during Playwright component tests and committed back to the PR branch by the screenshots.yml CI workflow.

How it works:

Import docScreenshot from ./utils/docScreenshot in any .ct.tsx test file
Call await docScreenshot(page, "area--component--state") after the component reaches the desired visual state
Reference the image in markdown: ![Alt text](../assets/images/screenshots/auto/area--component--state.png)
CI runs tests on every PR, captures screenshots, and auto-commits any changes

Naming convention (-- separates segments, - within words):

Segment	Purpose	Examples
`area`	Feature area	`landing`, `badges`, `corpus`, `versioning`, `annotations`
`component`	Specific view or component	`hero-section`, `celebration-modal`, `list-view`
`state`	Visual state captured	`anonymous`, `with-data`, `empty`, `auto-award`

At least 2 segments required, 3 recommended. All lowercase alphanumeric with single hyphens.

Example:

import { docScreenshot } from "./utils/docScreenshot";

// After the component renders and assertions pass:
await docScreenshot(page, "badges--celebration-modal--auto-award");

Rules:

Place docScreenshot() calls AFTER assertions that confirm the desired visual state
The filename IS the contract between tests and docs — keep names stable
Never manually edit files in docs/assets/images/screenshots/auto/ — they are overwritten by CI
Manually curated screenshots stay in docs/assets/images/screenshots/ (parent directory)

Release Screenshots (Point-in-Time)

For release notes, use releaseScreenshot to capture screenshots that are locked in amber — they show the UI at a specific release and never change.

Location: docs/assets/images/screenshots/releases/{version}/ (output)

import { releaseScreenshot } from "./utils/docScreenshot";

await releaseScreenshot(page, "v3.0.0.b3", "landing-page", { fullPage: true });

Key differences from docScreenshot:

Output: docs/assets/images/screenshots/releases/{version}/{name}.png
Write-once: If the file already exists, the function is a no-op (won't overwrite)
CI never touches the releases/ directory
Name is a simple kebab-case string (no -- segment convention)
Version must match v{major}.{minor}.{patch} format (with optional suffix)

When to use which:

docScreenshot → README, quickstart, guides (always fresh)
releaseScreenshot → Release notes (frozen at release time)

Authenticated Playwright Testing (Live Frontend Debugging)

When you need to interact with the running frontend as an authenticated user (e.g., debugging why a query returns empty results), use Django admin session cookies to authenticate GraphQL requests.

Architecture context: The frontend uses Auth0 for authentication, but the Django backend also accepts session cookie auth. Apollo Client sends GraphQL requests directly to http://localhost:8000/graphql/ (cross-origin from the Vite dev server at localhost:5173). Since fetch defaults to credentials: 'same-origin', browser cookies aren't sent cross-origin. The workaround is to inject the session cookie into request headers via Playwright's route interception.

Step 1: Set a password for the superuser (one-time setup):

docker compose -f local.yml exec django python manage.py shell -c "
from django.contrib.auth import get_user_model
User = get_user_model()
u = User.objects.filter(is_superuser=True).first()
u.set_password('testpass123')
u.save()
print(f'Password set for {u.username}')
"

Step 2: Playwright script pattern:

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext();
  const page = await context.newPage();

  // Collect console messages for debugging
  const consoleMsgs = [];
  page.on('console', msg => consoleMsgs.push('[' + msg.type() + '] ' + msg.text()));

  // 1. Login to Django admin to get session cookie
  await page.goto('http://localhost:8000/admin/login/');
  await page.fill('#id_username', '<superuser-username>');
  await page.fill('#id_password', 'testpass123');
  await page.click('input[type=submit]');
  await page.waitForTimeout(2000);

  // 2. Extract the session cookie
  const cookies = await context.cookies();
  const sessionCookie = cookies.find(c => c.name === 'sessionid');

  // 3. Intercept GraphQL requests to inject the session cookie
  //    (needed because Apollo sends cross-origin requests to :8000)
  await page.route('**/graphql/**', async (route) => {
    const headers = {
      ...route.request().headers(),
      'Cookie': 'sessionid=' + sessionCookie.value,
    };
    await route.continue({ headers });
  });

  // 4. Navigate to the frontend page under test
  await page.goto('http://localhost:5173/extracts');
  await page.waitForTimeout(5000);

  // 5. Inspect results
  const bodyText = await page.textContent('body');
  console.log(bodyText);

  await browser.close();
})();

Run from the frontend directory (where playwright is a dependency):

cd frontend && node /path/to/script.js

Key details:

The Django admin login sets a sessionid cookie for localhost
page.route('**/graphql/**') intercepts Apollo's requests to localhost:8000/graphql/ and injects the cookie header
The AuthGate will still show anonymous state (no Auth0 session), but GraphQL queries execute as the authenticated Django user
This is useful for verifying backend query results, permission filtering, and data flow through the real frontend

For verifying just the GraphQL response without the full frontend, curl with the session cookie also works:

curl -s -X POST http://localhost:8000/graphql/ \
  -H "Content-Type: application/json" \
  -H "Cookie: sessionid=<session-key>" \
  -d '{"query":"query { extracts { edges { node { id name } } } }"}' | python3 -m json.tool

Alternative — create a session programmatically (no admin login needed):

docker compose -f local.yml exec django python manage.py shell -c "
from django.contrib.sessions.backends.db import SessionStore
from django.contrib.auth import get_user_model
User = get_user_model()
user = User.objects.filter(is_superuser=True).first()
session = SessionStore()
session['_auth_user_id'] = str(user.pk)
session['_auth_user_backend'] = 'django.contrib.auth.backends.ModelBackend'
session['_auth_user_hash'] = user.get_session_auth_hash()
session.save()
print(f'Session key: {session.session_key}')
"

Then use the printed session key directly in curl or Playwright route interception.

Documentation Locations

Permissioning: docs/permissioning/consolidated_permissioning_guide.md
Frontend Routing: docs/frontend/routing_system.md
Frontend Auth Flow: docs/frontend/auth_flow.md
PDF Data Layer: docs/architecture/PDF-data-layer.md
PAWLs Format (v1/v2): docs/architecture/pawls-format.md
Parser Pipeline: docs/pipelines/pipeline_overview.md
LLM Framework: docs/architecture/llms/README.md
Collaboration System: docs/commenting_system/README.md
Auth Pattern (detailed): frontend/src/docs/AUTHENTICATION_PATTERN.md
Documentation Screenshots: docs/development/screenshots.md
Query Permission Patterns: docs/architecture/query_permission_patterns.md
OS-Legal-Style Migration Guide: docs/frontend/os-legal-style-migration-guide.md

Branch Strategy

This project follows trunk-based development:

Work directly on main branch or use short-lived feature branches
Feature branches: feature/description-issue-number
Merge feature branches quickly (within a day or two)
Commit message format: Descriptive with issue references (e.g., "Closes Epic: Notification System #562")

Changelog Maintenance

IMPORTANT: Always update CHANGELOG.md when making significant changes to the codebase.

The changelog follows Keep a Changelog format:

## [Unreleased] - YYYY-MM-DD

### Added
- New features

### Fixed
- Bug fixes with file locations and line numbers

### Changed
- Changes to existing functionality

### Technical Details
- Implementation specifics, architectural notes

When to update:

New features or models added
Production code bugs fixed (document file location, line numbers, and impact)
Breaking changes to APIs or data models
Test suite fixes that reveal production issues
Database migrations
Architecture changes

What to include:

File paths and line numbers for code changes
Clear description of the issue and fix
Impact on system behavior
Migration notes if applicable

Pre-commit Hooks

Automatically run on commit:

black (Python formatting)
isort (import sorting)
flake8 (linting)
prettier (frontend formatting)
pyupgrade (Python syntax modernization)

Run manually: pre-commit run --all-files

Common Pitfalls

Frontend tests hanging: Always use --reporter=list flag
Permission N+1 queries: Use .visible_to_user() NOT individual permission checks. For Conversation list queries in GraphQL resolvers, use ConversationQueryOptimizer from opencontractserver.conversations.query_optimizer - it provides request-level caching to avoid repeated corpus/document visibility subqueries
Missing GraphQL mocks: Check variables match exactly (null vs undefined matters), add duplicates for refetches
Notification duplication in tests: Moderation methods auto-create ModerationAction records
Structural annotation editing: Always read-only except for superusers
Missing signal imports: Import signal handlers in apps.py ready() method
PDF rendering slow in tests: Increase timeouts to 20s+ for Chromium
Cache serialization crashes: Keep InMemoryCache definition inside wrapper, not test file
Backend Tests Waiting > 10 seconds on Postgres to be Ready: Docker network issue. Fix with: docker compose -f test.yml down && docker kill $(docker ps -q) && docker compose -f test.yml down
Empty lists on direct navigation: AuthGate pattern solves this (don't check auth status, it's always ready)
URL desynchronization: Use CentralRouteManager, don't bypass routing system
Jotai state not updating: Ensure atoms are properly imported and used with useAtom hook
Writing sync agent tools: All agent tools must be async. The PydanticAIToolWrapper accepts sync functions but calls them directly (no thread pool) — sync Django ORM calls will raise SynchronousOnlyOperation. Use the a-prefixed async versions in core_tools.py.
PydanticAI system_prompt silently dropped: When creating PydanticAIAgent, use instructions= NOT system_prompt=. The system_prompt parameter is only included when message_history is None, but OpenContracts' chat() persists a HUMAN message before calling pydantic-ai's run(), so history is always non-empty. This causes the system prompt to be silently dropped. See docs/architecture/llms/README.md for full details.
Apollo cache keyArgs must use field argument names, not variable names: In cache.ts, relayStylePagination(["corpus", "name_Contains"]) uses the GraphQL field argument names (e.g., extracts(corpus: $id, name_Contains: $text)), NOT the variable names ($id, $text). Mismatched keyArgs silently fail to isolate cache entries, causing queries with different filters to share stale cached results.

Corrupted Docker iptables chains (RARE): If you see Chain 'DOCKER-ISOLATION-STAGE-2' does not exist errors, Docker's iptables chains have been corrupted during docker cycling. Run this nuclear fix:

sudo systemctl stop docker && sudo systemctl stop docker.socket && sudo ip link delete docker0 2>/dev/null || true && sudo iptables -t nat -F && sudo iptables -t nat -X && sudo iptables -t filter -F && sudo iptables -t filter -X 2>/dev/null || true && sudo iptables -t mangle -F && sudo iptables -t mangle -X && sudo iptables -t filter -N INPUT 2>/dev/null || true && sudo iptables -t filter -N FORWARD 2>/dev/null || true && sudo iptables -t filter -N OUTPUT 2>/dev/null || true && sudo iptables -P INPUT ACCEPT && sudo iptables -P FORWARD ACCEPT && sudo iptables -P OUTPUT ACCEPT && sudo systemctl start docker

This completely resets Docker's networking and iptables state. Docker will recreate all required chains on startup.

claude · 2026-03-29T12:46:31Z

test

claude · 2026-03-29T12:48:08Z

Simple review test

claude · 2026-03-29T12:49:05Z

CorpusHome Query Performance Optimization Plan

Executive Summary

After tracing all 3 GraphQL queries from the CorpusHome frontend through to their backend resolvers, I identified 6 concrete performance bottlenecks — 3 critical, 2 moderate, and 1 low severity. All proposed fixes are backend-only, respect the consolidated permission system, and require no frontend query changes.

Queries Analyzed

#	Query	Component	Backend Entry Point
1	`GET_CORPUS_WITH_HISTORY`	CorpusLandingView, CorpusDetailsView	`OpenContractsNode.get_node_from_global_id()` → `Corpus.objects.visible_to_user(user).get(id=pk)`
2	`GET_DOCUMENT_RELATIONSHIPS`	DocumentTableOfContents	`resolve_document_relationships()` → `DocumentRelationshipQueryOptimizer`
3	`GET_CORPUS_DOCUMENTS_FOR_TOC`	DocumentTableOfContents	`resolve_documents()` → `Document.objects.visible_to_user(user)` + `DocumentFilter.in_corpus()`

Bottleneck Analysis

CRITICAL 1: `CountableConnection.resolve_total_count()` materializes entire queryset

File: config/graphql/base.py:65-76
Affects: Queries 2 and 3 (both use CountableConnection)

Current code:

def resolve_total_count(root, info, **kwargs):
    if isinstance(root.iterable, django.db.models.QuerySet):
        return root.iterable.model.objects.filter(
            pk__in=[obj.pk for obj in root.iterable]
        ).count()
    else:
        return len(root.iterable)

Problem: [obj.pk for obj in root.iterable] iterates and materializes every object in the unpaginated queryset into Python, extracts PKs into a list, then sends WHERE id IN (pk1, pk2, ..., pk10000) to the database. For a corpus with 1,000 documents, this instantiates 1,000 Document model objects plus their select_related data just to count them.

Fix: Replace with root.iterable.count() which generates a single SELECT COUNT(*) SQL query without materializing any objects.

Estimated improvement: For a 1,000-document corpus, eliminates ~1,000 object instantiations + 1 redundant COUNT query. Cost drops from O(N) Python + O(N) SQL to O(1) SQL.

CRITICAL 2: `DocumentRelationshipQueryOptimizer` materializes ALL visible doc/corpus IDs into Python sets

File: opencontractserver/documents/query_optimizer.py:340-406
Affects: Query 2

Current code:

visible_ids = set(
    Document.objects.visible_to_user(user).values_list("id", flat=True)
)
# ... used as:
queryset = DocumentRelationship.objects.filter(
    source_document_id__in=visible_doc_ids,
    target_document_id__in=visible_doc_ids,
)

Problems:

Materializes ALL visible document IDs into a Python set(). For a user with 10,000 visible documents, creates a set of 10,000 integers and generates WHERE source_document_id IN (1, 2, ..., 10000) AND target_document_id IN (1, 2, ..., 10000).
Same issue for _get_visible_corpus_ids().
Document.objects.visible_to_user(user) adds unnecessary select_related("creator", "user_lock") JOINs even though we only need IDs.
The resolver at queries.py:2631 does NOT pass context=info.context to the optimizer, so the request-level caching mechanism is never used.

Fix:

Change _get_visible_document_ids() / _get_visible_corpus_ids() to return lazy QuerySet values for use as SQL subqueries instead of materialized Python sets.
Pass context=info.context from the resolver.

Estimated improvement: Eliminates materializing N document IDs into Python. Replaces 2 separate queries + 2 IN-clause queries with a single query containing efficient SQL subqueries. ~4x fewer DB round-trips, O(1) Python memory.

CRITICAL 3: N+1 on nested FKs in DocumentRelationship `select_related`

File: opencontractserver/documents/query_optimizer.py:481-487
Affects: Query 2

Current code:

return queryset.select_related(
    "source_document",
    "target_document",
    "annotation_label",
    "corpus",
    "creator",
)

Problem: The frontend query requests sourceDocument.creator.slug, targetDocument.creator.slug, and corpus.creator.slug. These nested FK accesses are NOT covered by the current select_related. Each access triggers a separate DB query. For 20 relationships, that's up to 60 extra queries (3 per relationship).

Fix: Add nested select_related paths:

return queryset.select_related(
    "source_document__creator",
    "target_document__creator",
    "annotation_label",
    "corpus__creator",
    "creator",
)

Note: source_document__creator automatically includes source_document in the JOIN, so explicit listing of the parent is not needed.

Estimated improvement: Eliminates up to 3N queries. For 20 results, saves ~60 DB round-trips.

MODERATE 1: `myPermissions` on DocumentRelationshipType silently errors for every result

File: config/graphql/permissioning/permission_annotator/mixins.py:266-278
Affects: Query 2

Problem: DocumentRelationshipType inherits AnnotatePermissionsForReadMixin, but DocumentRelationship has NO guardian permission tables. The mixin tries to access self.documentrelationshipuserobjectpermission_set which raises AttributeError, caught by the outer try/except — silently returning an empty permission list. For each result this means wasted attribute lookup + exception handling + a guardian group permission query that also fails.

Fix: Add "documentrelationship" to the pre-computed permissions check in the mixin (alongside "annotation" and "relationship"). In the resolve_document_relationships resolver, annotate permission flags (_can_read, _can_create, etc.) on the queryset based on the user's document/corpus permissions. This follows the exact pattern already established by AnnotationQueryOptimizer.

MODERATE 2: Unnecessary heavy `prefetch_related` on Document `visible_to_user()` for TOC query

File: opencontractserver/shared/Managers.py:133-175
Affects: Query 3

Problem: Document.objects.visible_to_user(user) always applies 7+ prefetch_related lookups (doc_annotations with nested select_related, rows, source_relationships, target_relationships, notes, plus guardian permission prefetches). The TOC query only needs id, title, slug, icon, fileType, creator.slug — none of these prefetches are used. When Django evaluates the queryset to return paginated edges, ALL prefetch queries fire. That's ~7 extra DB queries returning data that's never accessed.

Fix: Add a lightweight=False parameter to visible_to_user(). When True, skip the heavy prefetches (doc_annotations, rows, relationships, notes) but keep select_related("creator", "user_lock") and the guardian permission prefetch. Update the resolve_documents resolver to accept an inCorpusWithId hint and use lightweight mode when it detects a TOC-pattern query.

Note: This is the most invasive change. An alternative minimal approach: clear prefetches in a new TOC-specific resolver rather than modifying the shared visible_to_user().

LOW: `descriptionRevisions` N+1 on author FK

File: config/graphql/graphene_types.py:2037-2039
Affects: Query 1

Current code:

def resolve_description_revisions(self, info):
    return self.revisions.all() if hasattr(self, "revisions") else []

Problem: Each revision's author { id, email } field triggers a separate FK query. For 10 revisions, that's 10 extra queries.

Fix: Add select_related("author"):

def resolve_description_revisions(self, info):
    return self.revisions.select_related("author").all() if hasattr(self, "revisions") else []

Implementation Plan

Phase 1: Zero-Risk, High-Impact Fixes (4 changes, 4 files)

These are purely additive backend optimizations that don't change query semantics or permission behavior.

#	Fix	File	Risk	Impact
1	`CountableConnection.resolve_total_count()` — use `.count()` on queryset directly	`config/graphql/base.py`	Very Low	Critical — affects ALL paginated queries
2	DocumentRelationship nested `select_related` for creator FKs	`opencontractserver/documents/query_optimizer.py`	Very Low	High — eliminates 3N queries
3	`descriptionRevisions` `select_related("author")`	`config/graphql/graphene_types.py`	Very Low	Low — eliminates N revision author queries
4	Pass `context=info.context` to DocumentRelationshipQueryOptimizer	`config/graphql/queries.py`	Very Low	Medium — enables existing request-level caching

Phase 2: Medium-Risk, High-Impact Fixes (2 changes)

#	Fix	Files	Risk	Impact
5	DocumentRelationship optimizer uses subqueries instead of materialized Python sets	`opencontractserver/documents/query_optimizer.py`	Low	High — O(N) → O(1) memory, fewer DB round-trips
6	Pre-compute DocumentRelationship permissions to avoid silent errors	`opencontractserver/documents/query_optimizer.py` + `config/graphql/permissioning/permission_annotator/mixins.py`	Low	Medium — eliminates N exceptions per response

Phase 3: Higher-Risk Optimization (1 change)

#	Fix	Files	Risk	Impact
7	Slim down Document `visible_to_user()` prefetch for lightweight queries	`opencontractserver/shared/Managers.py`	Medium	High — saves ~7 queries on every TOC request

Security Considerations

No permission bypasses: All fixes keep the existing permission filtering intact. Subqueries contain the exact same WHERE clauses as the materialized sets.
No new guardian-table access: The DocumentRelationship permission pre-computation follows the established pattern from AnnotationQueryOptimizer.
No frontend changes needed: All fixes are backend-only query optimizations.
Consolidated permission formula preserved: Effective Permission = MIN(document_permission, corpus_permission) is unchanged.
IDOR prevention maintained: All queries still go through visible_to_user() or equivalent optimizer methods.

What I'm NOT Changing

Frontend GraphQL queries (no changes needed)
The permission model or guardian table structure
The visible_to_user() filtering logic (only optimizing HOW it's executed, not WHAT it filters)
Any existing test expectations
The AnnotatePermissionsForReadMixin behavior for models that DO have guardian tables

claude · 2026-03-29T12:49:13Z

placeholder review content without hash headers

claude · 2026-03-29T12:49:18Z

This is a test header

Some content here

Another header

This is a fourth-level header

Content here

claude · 2026-03-29T12:50:15Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a well-structured PR with a clear scope.

ExtractCellFormatter.tsx

Good:

The folderIconAlpha() color fix is correct.
The roving tabindex implementation follows WAI-ARIA menu pattern properly.
Cleanup of the mouseLeaveTimeoutRef on unmount is good defensive coding.

claude · 2026-03-29T12:50:19Z

Test with backtick code: folderIconAlpha()
Another line with getCellBackground()

claude · 2026-03-29T12:50:24Z

The folderIconAlpha() color fix is correct. Hardcoding rgba(245, 158, 11, ...) while getCellBackground() used folderIconAlpha() was a genuine inconsistency.
The roving tabindex implementation follows WAI-ARIA menu pattern properly (Home/End/ArrowUp/ArrowDown/Tab/Escape all handled).

claude · 2026-03-29T12:50:28Z

Good: Moving context menus to createPortal(_, document.body) correctly escapes the PDF container's stacking context. zIndex: 1000 to Z_INDEX.CONTEXT_MENU follows the project's constants convention.

Observation: After portaling, onMouseDown={(e) => e.stopPropagation()} is no longer needed to shield the PDF canvas, but it's harmless.

claude · 2026-03-29T12:50:33Z

Z_INDEX.HEADER = 50 is clear and correctly documented. One suggestion: if the sidebar's z-index (90) is not already a named constant, extracting it would make the stated invariant ("50 < sidebar's 90") self-enforcing. Not blocking.

All three changes are correct and clean:

Removing duplicate box-shadow/border-radius from StyledModalInner prevents doubled styling with OsModal.
MODAL_BODY_MAX_HEIGHT = "70vh" follows the "no magic numbers" rule.
variant="danger" is the right semantic choice for a destructive action.

claude · 2026-03-29T12:50:45Z

embedder.py — `embed_texts_batch()` base implementation

Good: Sequential fallback catches per-text exceptions and returns None for that slot. Docstring is clear about the contract.

Observation: Return type annotation is Optional[list[Optional[list[float]]]] but the implementation always returns a list (never None). The docstring says "The outer list is always returned (never None)" — the return type could be tightened to list[Optional[list[float]]] to match.

sent_transformer_microservice.py — `embed_texts_batch()`

Good:

Per-element NaN handling preserves valid vectors in the same batch.
The 3D squeeze handles microservices that wrap each embedding in an extra list.

Potential issue — silent truncation:

The caller (_batch_embed_text_annotations) uses EMBEDDING_API_BATCH_SIZE = 50, so the > 100 guard won't trigger in practice. But if it ever fires, the caller still expects len(vectors) == len(chunk). The zip(chunk, vectors) call will silently skip the truncated annotations — no failed count increment, no error. Consider raising a ValueError instead of silently truncating, since the caller controls batch size and this shouldn't happen.

claude · 2026-03-29T12:50:51Z

embeddings_task.py — Refactored `calculate_embeddings_for_annotation_batch`

Good:

Partitioning (text-only vs multimodal) is clear.
Dual-embedding fallback path is unchanged.
Error messages now correctly use annotation.id instead of annotation_id.

Redundant inner import to remove:

EMBEDDING_API_BATCH_SIZE is already imported at module level (line 9 of the diff). The inner import inside if text_only: is unnecessary and can be removed.

claude · 2026-03-29T12:51:00Z

Tests

The test coverage is excellent:

test_batch_embedding.py: happy path, empty text, sub-batching, partial failure, full batch None, exception propagation, add_embedding returning None, add_embedding raising, base class sequential fallback, per-text exception in fallback.
TestMicroserviceEmbedderBatchUnit: 2D/3D shapes, NaN per-element, 4xx/5xx, network exception, truncation, API key header, endpoint URL.

One gap: No test covers embed_texts_batch returning a list shorter than expected (e.g., 4 vectors for a 5-text batch). The zip(chunk, vectors) in _batch_embed_text_annotations silently omits the trailing annotation with no failed count increment. Worth adding a defensive length check and a corresponding test.

claude · 2026-03-29T12:51:08Z

Summary

Solid PR. Fixes are targeted, and the ExtractCellFormatter accessibility work correctly follows WAI-ARIA patterns with good Playwright test coverage.

Actionable before merge:

Remove the redundant inner import of EMBEDDING_API_BATCH_SIZE inside the if text_only: block (it's already at module level).
Harden the >100 truncation in MicroserviceEmbedder.embed_texts_batch — raise ValueError instead of silently truncating, since silent truncation produces a length mismatch the caller doesn't detect.
(Optional) Add a length-mismatch guard in _batch_embed_text_annotations for the case where embed_texts_batch returns fewer vectors than inputs.

claude · 2026-03-29T12:51:32Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a well-structured PR with a clear scope. Here's my feedback organized by area.

claude · 2026-03-29T12:51:39Z

Some content here

A section header

PR Review: Fix Semantic UI removal cleanup issues

Some content here

Frontend Fixes

claude · 2026-03-29T12:51:52Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a well-structured PR with a clear scope. Here is my feedback organized by area.

Frontend Fixes

ExtractCellFormatter.tsx

claude · 2026-03-29T12:52:25Z

PR Review: Fix Semantic UI removal cleanup issues

Overall this is a well-structured PR with a clear scope. Here is my feedback organized by area.

Frontend Fixes

ExtractCellFormatter.tsx — Accessibility improvements

Good:

The folderIconAlpha() color fix is correct. Hardcoding rgba(245, 158, 11, ...) while getCellBackground() used folderIconAlpha() was a genuine inconsistency.
The roving tabindex implementation follows WAI-ARIA menu pattern properly (Home/End/ArrowUp/ArrowDown/Tab/Escape all handled).
Cleanup of the mouseLeaveTimeoutRef on unmount is good defensive coding.

Minor observations:

The mouseLeaveTimeoutRef cleanup is duplicated across the unmount effect and the isPopupOpen effect. Not a bug, just slightly redundant.
openedViaKeyboard resets via setOpenedViaKeyboard(false) inside requestAnimationFrame, causing an extra re-render. Using a ref instead of state for this flag would avoid it, but the current approach is correct.

DocxAnnotator.tsx / SelectionLayer.tsx / TxtAnnotator.tsx — Portal rendering

Good: Moving context menus to createPortal(_, document.body) correctly escapes the PDF container stacking context. zIndex: 1000 to Z_INDEX.CONTEXT_MENU follows the project constants convention.

Observation: After portaling, onMouseDown={(e) => e.stopPropagation()} is no longer needed to shield the PDF canvas, but it is harmless.

HeaderAndLayout.tsx — z-index fix

Z_INDEX.HEADER = 50 is clear and correctly documented. One suggestion: if the sidebar z-index (90) is not already a named constant, extracting it would make the stated invariant ("50 < sidebar 90") self-enforcing. Not blocking.

EditMessageModal.tsx / SelectDocumentsModal.tsx / WorkerTokensSection.tsx

All three changes are correct and clean:

Removing duplicate box-shadow/border-radius from StyledModalInner prevents doubled styling with OsModal.
MODAL_BODY_MAX_HEIGHT = "70vh" follows the "no magic numbers" rule.
variant="danger" is the right semantic choice for a destructive action.

Backend Changes (Batch Embedding)

embedder.py — `embed_texts_batch()` base implementation

Good: Sequential fallback catches per-text exceptions and returns None for that slot. Docstring is clear about the contract.

Observation: Return type annotation is Optional[list[Optional[list[float]]]] but the implementation always returns a list (never None). The docstring says "The outer list is always returned (never None)" — the return type could be tightened to list[Optional[list[float]]] to match.

sent_transformer_microservice.py — `embed_texts_batch()`

Good:

Per-element NaN handling preserves valid vectors in the same batch.
The 3D squeeze handles microservices that wrap each embedding in an extra list.

Potential issue — silent truncation:

The caller (_batch_embed_text_annotations) uses EMBEDDING_API_BATCH_SIZE = 50, so the > 100 guard will not trigger in practice. But if it ever fires, the caller still expects len(vectors) == len(chunk). The zip(chunk, vectors) call will silently skip the truncated annotations — no failed count increment, no error. Consider raising a ValueError instead of silently truncating, since the caller controls batch size and this should not happen.

embeddings_task.py — Refactored `calculate_embeddings_for_annotation_batch`

Good:

Partitioning (text-only vs multimodal) is clear.
Dual-embedding fallback path is unchanged.
Error messages now correctly use annotation.id instead of annotation_id.

Redundant inner import to remove:

EMBEDDING_API_BATCH_SIZE is already imported at module level (line 9 of the diff). The inner import inside if text_only: is unnecessary and can be removed.

Tests

The test coverage is excellent:

test_batch_embedding.py: happy path, empty text, sub-batching, partial failure, full batch None, exception propagation, add_embedding returning None, add_embedding raising, base class sequential fallback, per-text exception in fallback.
TestMicroserviceEmbedderBatchUnit: 2D/3D shapes, NaN per-element, 4xx/5xx, network exception, truncation, API key header, endpoint URL.

One gap: No test covers embed_texts_batch returning a list shorter than expected (e.g., 4 vectors for a 5-text batch). The zip(chunk, vectors) in _batch_embed_text_annotations silently omits the trailing annotation with no failed count increment. Worth adding a defensive length check and a corresponding test.

Summary

Solid PR. Fixes are targeted, and the ExtractCellFormatter accessibility work correctly follows WAI-ARIA patterns with good Playwright test coverage.

Actionable before merge:

Remove the redundant inner import of EMBEDDING_API_BATCH_SIZE inside the if text_only: block (it is already at module level).
Harden the >100 truncation in MicroserviceEmbedder.embed_texts_batch — raise ValueError instead of silently truncating, since silent truncation produces a length mismatch the caller does not detect.
(Optional) Add a length-mismatch guard in _batch_embed_text_annotations for the case where embed_texts_batch returns fewer vectors than inputs.

Reverts all backend files (embeddings batch, markdown parser, requirements, pipeline changes) to main. Removes batch embedding plan/spec docs. Cleans CHANGELOG of backend-specific entries. PR now contains only the Semantic UI cleanup fixes and CAML frontend work.

Previous commit used stale local main which included unmerged dependabot bumps. Now restoring from origin/main so the PR introduces zero backend changes.

claude · 2026-03-29T14:48:47Z

Code Review

Overall this PR is clean and well-motivated. The accessibility work on ExtractCellFormatter is especially thorough. A few concerns worth discussing before merge.

Potential Regression: Markdown/CAML Backend Support Removed

The most significant issue in the backend changes: the PR removes MarkdownParser, the text/markdown MIME type from FileTypeEnum/MIME_TO_FILE_TYPE, and the TEXT_MIMETYPES set that included "text/markdown". The changelog reflects this — the CAML feature description was edited from "Full-stack support" to "Frontend support", silently removing mention of MarkdownParser, text_to_frontmatter, and TEXT_MIMETYPES.

Concerns:

Any user who previously uploaded .md or .caml files will have them reprocessed as text/plain going forward.
document_mutations.py removed the markdown-detection block, so new .md/.caml uploads also receive text/plain MIME type.
DocumentLabelFilter had the title field removed — worth confirming this is intentional and not a diff collision.

If CAML is purely frontend-rendered and the backend parser was never load-bearing, that is fine — but it should be called out explicitly. If any deployments used MarkdownParser, removing it without a migration path is a regression.

Dependency Version Downgrades

Several packages are being rolled back:

Package	From	To
`redis`	7.4.0	7.3.0
`celery`	5.6.3	5.6.2
`cryptography`	46.0.6	46.0.5
`django-tree-queries`	0.24.0	0.23.1
`djangorestframework`	3.17.1	3.17.0
`gunicorn`	25.2.0	25.1.0

The cryptography change is the one worth explaining: the changelog calls out upgrading to 46.0.5 to fix CVE-2026-26007, but main must have been at 46.0.6 already. Rolling back to 46.0.5 still covers the CVE but is worth confirming. These look like merge-conflict resolutions reverting upstream bumps — a brief note confirming intentionality would help reviewers.

Codecov Action Downgrade

- uses: codecov/codecov-action@v6
+ uses: codecov/codecov-action@v5

Downgrading a CI action is unusual. Intentional (v6 broke something?) or a merge conflict artifact?

Portal Rendering for Context Menus

createPortal(_, document.body) in SelectionLayer.tsx, TxtAnnotator.tsx, and DocxAnnotator.tsx is the correct fix for z-index trapping. One thing to verify: portaled menus need fixed/absolute positioning recalculated via getBoundingClientRect() to stay visually anchored to their trigger. If the existing positioning was relative to PDFContainer, moving to document.body without updating the coordinate calculation will render menus at the wrong position. The test plan should cover this.

ExtractCellFormatter Accessibility — Minor Notes

The keyboard navigation implementation looks well-considered overall. Two small items:

mouseLeaveTimeoutRef stale ID: The useEffect cleanup clears the timeout — good. Just verify the ref is also nulled inside the callback itself (mouseLeaveTimeoutRef.current = null) so guard checks do not act on a stale timer ID after it fires.
aria-expanded type: Confirm it is set as a string ("true"/"false") rather than a JS boolean. React serializes both identically to the DOM, but WAI-ARIA spec calls for string attribute values.

Positive Highlights

variant="danger" on Revoke button: correct semantic approach over inline style override.
EditMessageModal doubled-shadow fix: clean and correct.
MODAL_BODY_MAX_HEIGHT constant extraction: good no-magic-numbers compliance.
Z_INDEX.HEADER on HeaderContainer: right fix for the backdrop-filter stacking context issue.
New ExtractCellFormatter.ct.tsx tests: the three keyboard test cases (open, arrow nav, Escape) are meaningful and well-targeted.

Summary: The UI/UX fixes are solid. Main questions before merge: (1) is removing markdown backend support intentional and are existing markdown documents unaffected? (2) are the dependency downgrades deliberate? (3) what drove the codecov v6 → v5 change?

claude · 2026-03-29T14:54:48Z

Code Review

Overall this is a clean, well-scoped cleanup PR. The changes are targeted, the rationale is clear, and test coverage for the new keyboard navigation is solid.

What is done well

Color consistency fix (ExtractCellFormatter.tsx): Replacing the two hardcoded amber rgba values with folderIconAlpha(0.15) and folderIconAlpha(0.25) is exactly the right fix.

WAI-ARIA menu implementation: The roving tabindex pattern (ArrowUp/ArrowDown/Home/End/Tab/Escape) follows the WAI-ARIA menu button pattern correctly. aria-haspopup, aria-expanded, role="menu", and role="menuitem" with tabIndex={-1} are all in place.

Portal fix for stacking context (SelectionLayer, TxtAnnotator, DocxAnnotator): Using createPortal to escape PDFContainer stacking context is the correct solution. The comment in each file explains why.

Keyboard focus restoration: Returning focus to statusDotRef on Escape and after Tab-to-close is correct per ARIA guidelines.

mouseLeaveTimeoutRef cleanup: The useEffect cleanup on unmount and the isPopupOpen effect cleanup both guard against dangling timers.

Minor issues

mouseLeaveTimeoutRef not nulled after the timer fires

In the onMouseLeave handler the timer is stored but never cleared from the ref once it fires. The stale ID is harmless (clearTimeout on a fired ID is a no-op), but it means the if (mouseLeaveTimeoutRef.current) guards are truthy even when no timer is pending. Consider clearing the ref inside the callback itself.

Z-index constant inconsistency between popup and portaled menus

The annotator menus moved to portals use Z_INDEX.CONTEXT_MENU, while the ExtractCellFormatter popup (not portaled) uses Z_INDEX.DROPDOWN. This is likely intentional since the popup lives in normal flow rather than inside PDFContainer, but a brief inline comment would make the distinction explicit — similar to the Z_INDEX.HEADER JSDoc added in constants.ts.

openedViaKeyboard + requestAnimationFrame edge case

If the user opens the popup via keyboard then immediately closes it (e.g. double-press Enter), the rAF fires after the popup has closed and tries firstMenuItemRef.current?.focus(). Optional chaining handles it safely, but setOpenedViaKeyboard(false) still runs. Consider checking isPopupOpen inside the rAF callback before calling focus.

ExtractCellFormatter popup is not portaled

The PR description frames the fix as moving context menus to createPortal, but the extract cell popup retains position: "absolute". This is fine if it is not nested inside PDFContainer, but a comment confirming the intent would help future readers understand why this one is treated differently.

Nits

EditMessageModal.tsx: The removed border-radius: 0 in the mobile media query was only compensating for the now-removed desktop border-radius. Clean removal.
MODAL_BODY_MAX_HEIGHT as a string constant is consistent with how other dimension strings are handled in constants.ts.
Z_INDEX.HEADER = 50 with the JSDoc comment explaining the flex-column non-overlap is a great documentation pattern.

Test coverage

The three new Playwright tests cover the key keyboard-accessibility paths:

Enter to open + focus first item
ArrowDown navigation + wrap-around
Escape to close + focus return to trigger

Missing cases (not blocking, worth noting for future work): ArrowUp wrap-around, Home/End keys, Tab-to-close focus return.

Overall: approve with minor suggestions. The nits above are non-blocking.

claude · 2026-03-29T14:54:53Z

Code Review

Overall this is a clean, focused cleanup PR with well-scoped changes. A few things worth flagging:

Bugs / Potential Issues

1. ExtractCellFormatter popup is not portaled — inconsistent with other menus

The three SelectionActionMenu instances are correctly moved to createPortal(_, document.body) to escape the PDFContainer stacking context. But the status-cell popup in ExtractCellFormatter still uses position: absolute with z-index: Z_INDEX.DROPDOWN. If the datagrid cell's container ever establishes its own stacking context (e.g., via transform, filter, or will-change), this popup will still be clipped. Given the motivating bug was stacking-context entrapment, it may be worth portaling this one too — or at least adding a comment explaining why it is safe to leave it in-flow.

2. Tab key traps focus — consider not preventing default

Trapping Tab and redirecting focus back to the trigger deviates from natural focus order and can surprise keyboard users. The WAI-ARIA menu pattern recommends closing on Tab but not preventing default — let the browser handle natural tab movement. Consider removing e.preventDefault() (or handling Shift+Tab separately to go backward).

3. Missing aria-controls on trigger

StatusDot has aria-haspopup="menu" and aria-expanded, but no aria-controls pointing to the popup. Adding id="cell-actions-menu" to the popup div and aria-controls="cell-actions-menu" to the trigger would make the relationship explicit for assistive technology.

Minor / Nits

4. Mobile border-radius removal in EditMessageModal

Removing border-radius: 0 from the mobile media query is fine since border-radius was already stripped from StyledModalInner. Just worth confirming the OsModal wrapper applies border-radius: 0 on mobile independently, otherwise rounded corners could reappear on small viewports.

5. querySelectorAll in onKeyDown handler

The DOM query runs on every keypress inside the menu. Not a real perf concern at this scale, but since the item list is static a stable ref would be cleaner.

6. MODAL_BODY_MAX_HEIGHT constant

Works correctly. A short inline comment noting the unit would help future readers.

Positives

The folderIconAlpha fix is a real correctness issue, not just cosmetic — good catch.
Mouse-leave timeout now stores the ID and clears it on re-entry; the old bare setTimeout without storing the return value was a genuine leak.
WAI-ARIA roving tabindex (Home/End/ArrowUp/ArrowDown, focus return on Escape) follows the spec correctly.
Playwright tests cover keyboard open, arrow navigation, and Escape-to-close — solid coverage for the new accessibility behaviour.
Portaling SelectionActionMenu via createPortal is the right fix and Z_INDEX.CONTEXT_MENU usage is consistent with the constants pattern.
CHANGELOG entry is accurate and appropriately scoped.

Merge remote-tracking branch 'origin/main' into claude/resolve-issue-…

d34c540

…1123-puDwk

JSv4 added 2 commits March 23, 2026 23:38

Fix prettier formatting to pass CI lint check

3caec94

JSv4 added 2 commits March 24, 2026 07:47

Merge remote-tracking branch 'origin/main' into claude/resolve-issue-…

7438029

…1123-puDwk

JSv4 added 9 commits March 24, 2026 23:20

Add EMBEDDING_API_BATCH_SIZE constant for batch embedding sub-batching

a847436

Add embed_texts_batch() sequential fallback to BaseEmbedder

5f73af9

Add embed_texts_batch() to test embedders for batch path testing

0ef8c98

Add _batch_embed_text_annotations() helper for batch embedding

4892e5b

Refactor batch task to partition and use batch embedding for text-onl…

f7563b1

…y annotations

Add integration tests for batch embedding aggregation

a4e5919

JSv4 added 2 commits March 28, 2026 21:33

Merge remote-tracking branch 'origin/main' into claude/resolve-issue-…

3c464d9

…1123-puDwk # Conflicts: # CHANGELOG.md

JSv4 added 2 commits March 28, 2026 22:19

Merge remote-tracking branch 'origin/main' into claude/resolve-issue-…

278cbab

…1123-puDwk

JSv4 added 2 commits March 29, 2026 09:44

Fix backend file revert to match origin/main exactly

3b61259

Previous commit used stale local main which included unmerged dependabot bumps. Now restoring from origin/main so the PR introduces zero backend changes.

Restore oc_markdown_parser and test_markdown_parser to match origin/main

cae3a87

JSv4 merged commit 3ec94b4 into main Mar 30, 2026
16 checks passed

JSv4 deleted the claude/resolve-issue-1123-puDwk branch March 30, 2026 03:40

Uh oh!

Conversation

JSv4 commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

claude bot commented Mar 24, 2026

PR Review

Positive

Issues

Minor

Summary

Uh oh!

codecov bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

claude bot commented Mar 24, 2026

PR Review: Fix Semantic UI removal cleanup issues

Bug: firstMenuItemRef assigned to all menu items

Minor: Incomplete ARIA menu keyboard pattern

Minor: CHANGELOG has spurious blank lines

Looks good

Question: Z_INDEX.HEADER value

Uh oh!

claude bot commented Mar 24, 2026

PR Review: Fix Semantic UI removal cleanup issues

Bugs / Correctness

Accessibility: Incomplete ARIA menu keyboard navigation

Minor: Redundant Escape handler

Test coverage

Positives

Uh oh!

claude bot commented Mar 24, 2026

Uh oh!

claude bot commented Mar 25, 2026

Uh oh!

claude bot commented Mar 25, 2026

Code Review - Full Review

Uh oh!

claude bot commented Mar 25, 2026

PR Review: Fix Semantic UI removal cleanup issues + Batch Embedding

Frontend Changes

ExtractCellFormatter.tsx - Accessibility + Color Fix

EditMessageModal.tsx

WorkerTokensSection.tsx

SelectDocumentsModal.tsx + constants.ts

Component Tests

Backend Changes - Batch Embedding

Architecture

BaseEmbedder.embed_texts_batch()

MicroserviceEmbedder.embed_texts_batch()

_batch_embed_text_annotations()

calculate_embeddings_for_annotation_batch() refactor

Tests (test_batch_embedding.py)

Documentation Concern

Summary - Items to address before merging

Uh oh!

claude bot commented Mar 29, 2026

Uh oh!

claude bot commented Mar 29, 2026

CLAUDE.md

Project Overview

Baseline Commit Rules

Essential Commands

Backend (Django)

Frontend (React/TypeScript)

Production Deployment

High-Level Architecture

Backend Architecture

Frontend Architecture

Data Flow Architecture

Critical Security Patterns

Critical Concepts

Testing Patterns

Manual Test Scripts

Expected Results

Cleanup

Automated Documentation Screenshots

JSv4 commented Mar 24, 2026 •

edited

Loading

codecov bot commented Mar 24, 2026 •

edited

Loading

Bug: `firstMenuItemRef` assigned to all menu items

Question: `Z_INDEX.HEADER` value

CRITICAL 1: `CountableConnection.resolve_total_count()` materializes entire queryset

CRITICAL 2: `DocumentRelationshipQueryOptimizer` materializes ALL visible doc/corpus IDs into Python sets

CRITICAL 3: N+1 on nested FKs in DocumentRelationship `select_related`

MODERATE 1: `myPermissions` on DocumentRelationshipType silently errors for every result

MODERATE 2: Unnecessary heavy `prefetch_related` on Document `visible_to_user()` for TOC query

LOW: `descriptionRevisions` N+1 on author FK

embedder.py — `embed_texts_batch()` base implementation

sent_transformer_microservice.py — `embed_texts_batch()`

embeddings_task.py — Refactored `calculate_embeddings_for_annotation_batch`