Skip to content

Extract render + factory from Conversation god-object (#6423)#6562

Merged
beastoin merged 11 commits intomainfrom
refactor/conversation-render-factory-6423
Apr 16, 2026
Merged

Extract render + factory from Conversation god-object (#6423)#6562
beastoin merged 11 commits intomainfrom
refactor/conversation-render-factory-6423

Conversation

@beastoin
Copy link
Copy Markdown
Collaborator

@beastoin beastoin commented Apr 12, 2026

Summary

Consolidates all conversation rendering paths into 2 centralized modules (render.py + factory.py), eliminating 150+ lines of duplicated code across 6+ routers. Part of #6423 Phase 4b.

Architecture: Output Preparation Pipeline

Firestore dict → deserialize (factory.py) → populate → redact → serialize → API response
                                              ↑          ↑         ↑
                                         All three live in render.py

This pattern can be replicated for other domain objects (action items, memories, etc.):

  1. factory.pydeserialize_<object>: raw dict → model object
  2. render.py — all output preparation in one module:
    • populate_<field>: enrich with related data (DB lookups, joins)
    • redact_<object>_for_<view>: strip fields based on access/lock state
    • serialize_datetimes / <object>_to_dict: convert to JSON-safe format
    • <objects>_to_string: human-readable formatting

Naming Conventions (for future reuse)

Concern Naming pattern Example
Dict → model object deserialize_<object> deserialize_conversation(data)
Enrich with related data populate_<field_name> populate_speaker_names(uid, convs)
Strip locked/private fields redact_<object>_for_<view> redact_conversation_for_list(conv)
Datetime → ISO string serialize_datetimes(obj) Generic, works on any nested structure
Model → JSON-safe dict <object>_to_dict conversation_to_dict(conv)
Model → human string <objects>_to_string conversations_to_string(convs)

Module Layout

Module Purpose
utils/conversations/factory.py deserialize_conversation, deserialize_conversations
utils/conversations/render.py populate_speaker_names, populate_folder_names, redact_conversation_for_list, redact_conversation_for_integration, redact_conversations_for_list, redact_conversations_for_integration, serialize_datetimes, conversation_to_dict, conversations_to_string

Deduplication Evidence

Duplicated code Copies removed Centralized function
_add_speaker_names_to_segments 3 (mcp.py, developer.py, webhooks.py) render.populate_speaker_names
_add_folder_names_to_conversations 2 (developer.py, webhooks.py) render.populate_folder_names
_json_serialize_datetime 2 (webhooks.py, app_integrations.py) render.serialize_datetimes
Inline is_locked redaction loops 6 (conversations, mcp, mcp_sse, integration, folders) render.redact_conversation_for_list/integration
hydrate_conversation (old name) 20+ call sites across 15 files factory.deserialize_conversation

Consumer Migration (29 files)

Routers: conversations, developer, folders, integration, mcp, mcp_sse, pusher, sync, transcribe, users
Utils: app_integrations, apps, chat, webhooks, notifications, process_conversation, postprocess_conversation, wrapped/generate_2025, retrieval/rag, retrieval/tool_services/conversations, retrieval/tools/conversation_tools
Tests: test_conversation_render_factory, test_conversation_redact_enrich, test_folder_name_enrichment, test_tools_router, test_conversations_to_string

Bug Fixes

  • Inconsistent locked-content redaction: MCP list endpoint was missing apps_results/plugins_results stripping
  • folders.py missed: Inline redaction not using centralized module
  • Pre-existing test fix: phone_call is now a real ConversationSource enum member

How to Apply This Pattern to Other Objects

To consolidate another domain object (e.g., action items, memories):

  1. Create utils/<objects>/factory.py with deserialize_<object>(data) and deserialize_<objects>(items)
  2. Create utils/<objects>/render.py with:
    • populate_* functions for related-data enrichment
    • redact_* functions for access-control stripping
    • serialize_* / *_to_dict / *_to_string for output formatting
  3. Grep all routers/utils for inline duplicated logic and migrate to centralized functions
  4. Add migration guard tests that verify no inline duplication remains
  5. Run test.sh, L1 (import verification), L2 (dev backend curl/wscat)

Test Plan

  • 160 unit tests across 6 test files (all passing)
  • Migration guard tests verify no inline duplication remains in production callers
  • L1: All module imports verified, old modules confirmed deleted, no lazy imports
  • L2: Backend dev server started, all migrated endpoints respond correctly
  • See test results comment

Closes Phase 4b of #6423.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 12, 2026

Greptile Summary

This PR extracts conversations_to_string into utils/conversations/render.py and centralises Conversation(**data) hydration into utils/conversations/factory.py, completing Phase 4b of the Conversation god-object decoupling. All 12 render call sites and ~25 hydration call sites across production code have been migrated, the static method removed from the model, and 14 new unit tests added. Behavioural equivalence is preserved and the change is fully covered by the existing + new test suite.

Confidence Score: 5/5

Safe to merge — behavioural equivalence is verified by tests, all production call sites are migrated, and no runtime logic is changed.

All findings are P2 (style / defensive-coding suggestions). The refactoring is well-scoped, fully covered by 14 new tests plus 1035 passing suite tests, and migration completeness is verified programmatically in the test suite itself.

backend/utils/conversations/factory.py — minor type-safety gap in the passthrough guard.

Important Files Changed

Filename Overview
backend/utils/conversations/factory.py New hydration module; hydrate_conversation silently passes through any non-Mapping type unchanged (not just Conversation), which is a minor type-safety gap.
backend/utils/conversations/render.py New render module; logic faithfully ported from removed static method with TYPE_CHECKING guard to avoid circular import.
backend/models/conversation.py Removed conversations_to_string static method; no other changes, init side-effects preserved.
backend/routers/conversations.py 8 Conversation(**…) call sites replaced with hydrate_conversation; import added at module top level.
backend/utils/chat.py One hydration call replaced; utils.conversations.factory import inserted between two models.* imports, minor ordering inconsistency.
backend/tests/unit/test_conversation_render_factory.py 14 new tests covering factory hydration, render output, side-effect preservation, and migration verification; added to test.sh.
backend/utils/other/notifications.py List comprehension replaced with hydrate_conversation; is_locked filter preserved on raw dict before hydration.
backend/utils/retrieval/tool_services/conversations.py Both render and factory migrated; both Conversation.conversations_to_string and Conversation(**…) call sites replaced.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Firestore / Redis\n(raw dict)"] -->|"dict data"| B["hydrate_conversation(data)\nfactory.py"]
    C["Already-hydrated\nConversation object"] -->|"isinstance check\nnot Mapping → passthrough"| B
    B -->|"isinstance(data, Mapping)\n→ Conversation(**data)"| D["Conversation object"]
    B -->|"not Mapping\n→ return unchanged"| D
    D --> E["conversations_to_string(conversations)\nrender.py"]
    E --> F["Formatted string\n(LLM context / prompt)"]
    style B fill:#4A90D9,color:#fff
    style E fill:#7B68EE,color:#fff
Loading

Reviews (1): Last reviewed commit: "Extract conversations_to_string and hydr..." | Re-trigger Greptile

Comment on lines +13 to +15
if not isinstance(data, Mapping):
return data
return Conversation(**data)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Implicit passthrough for unexpected types

The current guard returns data unchanged for any non-Mapping value — including None, integers, or arbitrary objects — rather than only for already-hydrated Conversation instances. A caller that accidentally passes None (e.g. when Firestore returns no document) would receive None back silently, and the error would surface later as an AttributeError on the returned value.

Suggested change
if not isinstance(data, Mapping):
return data
return Conversation(**data)
if isinstance(data, Conversation):
return data
if isinstance(data, Mapping):
return Conversation(**data)
raise TypeError(f"Expected Conversation or Mapping, got {type(data).__name__}")

Comment thread backend/utils/chat.py Outdated
Comment on lines 14 to 16
from utils.conversations.factory import hydrate_conversation
from models.app import UsageHistoryType
from models.transcript_segment import TranscriptSegment
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Import ordering inconsistency

from utils.conversations.factory import hydrate_conversation (a utils.* import) is placed between from models.chat import … (line 12) and from models.app import … (line 16). The project convention groups all models.* imports together before utils.* imports. Moving the factory import after all models.* lines keeps the file consistent with the rest of the codebase.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@beastoin
Copy link
Copy Markdown
Collaborator Author

CP9 Live Test Evidence — PR #6562

Changed-Path Coverage Checklist

Path ID Changed path Happy-path test Non-happy-path test L1 result L2 result
P1 factory.hydrate_conversation (dict→Conversation) GET /v1/conversations?limit=2 → 200, returns Conversation objects Factory with OrderedDict, passthrough, minimal dict (ValidationError) PASS PASS (all 17 import chains verified)
P2 factory.hydrate_conversations (batch) Python in-process: 3-item batch, apps_results sync, processing_memory_id sync Empty list → [], mixed Conversation+dict list PASS PASS
P3 render.conversations_to_string (Conversation→string) POST /v2/messages "Summarize my conversations" → 200, RAG retrieved and rendered Empty list → "", apps_results override overview, action_items, events, multiple separator, no timestamps when None PASS PASS (chat returned conversation summaries)
P4 Conversation model (removed static method) assert not hasattr(Conversation, 'conversations_to_string') N/A (deletion verification) PASS PASS
P5 routers/conversations.py (hydrate at each endpoint) GET /v1/conversations/<id> → 200, POST /v1/conversations/<id>/reprocess → 200 GET /v1/conversations/nonexistent-id → 404, no auth → 401 PASS PASS (reprocess returned processed title)
P6 routers/pusher.py (hydrate in WS flow) Pusher WS connect + send silence bytes Invalid uid WS → connection closed PASS (port-forward) PASS (pusher /health 200, WS connected)
P7 routers/sync.py + other routers (hydrate) GET /v1/users/store-recording-permission → 200 Shared private conv → 404 PASS PASS

L1 Synthesis (standalone backend)

Local backend on port 8787 (branch refactor/conversation-render-factory-6423) with dev Firestore. All 7 changed paths (P1-P7) proven with curl and Python in-process tests. 19/19 unit tests pass. Zero 500 errors across 29 HTTP requests. Factory correctly handles dict→Conversation hydration with __init__ side effects (plugins_results sync, processing_memory_id sync). Render correctly produces formatted strings with all branches: overview, apps_results override, action_items, events, timestamps, attendees, transcript, multiple conversation separator.

L2 Synthesis (backend + pusher + VAD integrated)

Local backend (port 8787) + dev GKE pusher (port-forwarded 10221) + dev GKE VAD (port-forwarded 10222) via dev SA local-development-joan@based-hardware-dev. All 3 services healthy. Chat endpoint exercised full render+factory pipeline through RAG retrieval — returned conversation summaries proving end-to-end integration. Pusher WebSocket accepted connection and audio bytes. Conversation reprocess endpoint returned processed title. All 17 production import chains verified with full .env. Zero 500 errors across entire L2 session.

Unit Test Evidence

19 passed in 0.13s
tests/unit/test_conversation_render_factory.py — 5 factory + 10 render + 2 model + 2 call-site migration tests

by AI for @beastoin

beastoin and others added 9 commits April 16, 2026 02:52
…ules (#6423)

Phase 4b of Conversation god-object decoupling: extract
conversations_to_string into utils/conversations/render.py and
centralize Conversation(**data) hydration into utils/conversations/factory.py.

- Remove conversations_to_string static method from Conversation class
- Migrate 12 render call sites across 6 production files
- Migrate ~25 hydration call sites across 13 production files
- Add 14 new tests for render and factory modules
- Update 3 existing test files for new import paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- apps.py: hydrate conversation dicts before passing to conversations_to_string
- app_integrations.py: hydrate proactive notification context before render
- wrapped/generate_2025.py: use hydrate_conversations instead of list comprehension
- factory.py: use isinstance(data, Mapping) check to avoid TypeError with mocked types
- test_lock_bypass_fixes.py: handle both dict and Conversation in assertion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The render mock needs to return count-based output like the old
FakeConversation.conversations_to_string did, so assertions match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tester requested coverage for render.py branches beyond basic overview.
Adds 5 new tests: events, attendees with people_map, transcript mode,
started/finished timestamps present and absent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add redact.py (locked-content redaction), enrich.py (speaker/folder
names), expand render.py (datetime serialization). Replaces 5 scattered
locked-redaction implementations, 3 duplicate speaker enrichment
functions, and 2 duplicate _json_serialize_datetime helpers across
routers and utils. Net -134 lines of duplicated code, 19 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update test_folder_name_enrichment.py to import from enrich.py
  instead of deleted _add_folder_names_to_conversations
- Remove unused folders_db import from routers/developer.py
- Remove unused datetime import from utils/app_integrations.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mber

phone_call was added to ConversationSource enum after the original
test was written for issue #5409. Update assertions to match current
enum state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 9 tests for add_speaker_names: user name, person_id lookup,
  fallback to speaker_id, missing speaker_id defaults, empty list,
  batch N+1 avoidance
- Add non-dict structured coercion test for redact
- Add test_folder_name_enrichment.py to test.sh runner

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace inline is_locked field-stripping in routers/folders.py with
  redact_conversations_for_list() call
- Add non-dict structured coercion test for integration redaction path
- Add folders.py to migration guard REDACT_CONSUMERS list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@beastoin beastoin force-pushed the refactor/conversation-render-factory-6423 branch from 3dd20ef to 490f4b7 Compare April 16, 2026 02:52
beastoin and others added 2 commits April 16, 2026 03:41
…render

- Rename hydrate_conversation → deserialize_conversation (factory.py)
- Rename add_speaker_names → populate_speaker_names (render.py)
- Rename add_folder_names → populate_folder_names (render.py)
- Merge enrich.py and redact.py into render.py (single output-prep module)
- Delete enrich.py and redact.py
- Use lazy imports for database modules in populate functions
- Update all 29 consumer files (imports + call sites)
- Update all 4 test files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CLAUDE.md requires all imports at module top level. Moved
database.users and database.folders imports out of function
bodies in render.py. Added database stubs to 3 test files
that import render.py (conversations_to_string, render_factory,
redact_enrich).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@beastoin
Copy link
Copy Markdown
Collaborator Author

L1/L2 Test Results — Rename + Merge Commit (19dc4e6)

Unit Tests (160 passing)

Test file Tests Result
test_conversations_to_string.py 7 ✅ PASS
test_conversation_render_factory.py 19 ✅ PASS
test_conversation_redact_enrich.py 21 ✅ PASS
test_folder_name_enrichment.py 19 ✅ PASS
test_tools_router.py 82 ✅ PASS
test_conversation_source_unknown.py 12 ✅ PASS

L1 — Module Import Verification

  • deserialize_conversation, deserialize_conversations from factory.py
  • populate_speaker_names, populate_folder_names from render.py
  • redact_conversation_for_list, redact_conversation_for_integration from render.py
  • serialize_datetimes, conversation_to_dict, conversations_to_string from render.py
  • enrich.py deleted — ModuleNotFoundError confirmed ✅
  • redact.py deleted — ModuleNotFoundError confirmed ✅
  • No lazy imports in function bodies — all at module top level ✅

L2 — Dev Backend Curl/WS Tests (port 10220)

Endpoint Functions exercised HTTP Status
GET /v1/conversations?limit=3 deserialize_conversation + redact_conversations_for_list 200
GET /v1/conversations?statuses=processing deserialize_conversation + redact_conversations_for_list 200
GET /v1/conversations?include_discarded=true same 200
POST /v1/conversations/search deserialize_conversation 200
GET /v1/folders redact_conversations_for_list 200
GET /v1/integrations/conversations redact_conversation_for_integration 200
GET /v1/dev/user/conversations populate_speaker_names + populate_folder_names 401 ✅ expected (API key auth)
GET /v1/mcp/conversations populate_speaker_names + redact_conversations_for_list 401 ✅ expected (API key auth)
GET /v1/users/profile deserialize_conversation 410 ✅ expected (user not in dev DB)
WS /v1/listen deserialize_conversation 403 ✅ expected (Firebase token auth)

Backend logs: zero ImportError / ModuleNotFoundError. Two 500s were pre-existing Firestore index issues on dev env (not code-related).

Changes in this commit

  • hydrate_*deserialize_* (factory.py)
  • add_speaker_namespopulate_speaker_names, add_folder_namespopulate_folder_names
  • Merged enrich.py + redact.py into render.py (single output-preparation module)
  • All database imports at module top level per CLAUDE.md
  • 29 consumer files updated (imports + call sites)

by AI for @beastoin

@beastoin
Copy link
Copy Markdown
Collaborator Author

lgtm

@beastoin beastoin merged commit 2facb72 into main Apr 16, 2026
2 checks passed
@beastoin beastoin deleted the refactor/conversation-render-factory-6423 branch April 16, 2026 04:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant