Extract render + factory from Conversation god-object (#6423)#6562
Extract render + factory from Conversation god-object (#6423)#6562
Conversation
Greptile SummaryThis PR extracts Confidence Score: 5/5Safe to merge — behavioural equivalence is verified by tests, all production call sites are migrated, and no runtime logic is changed. All findings are P2 (style / defensive-coding suggestions). The refactoring is well-scoped, fully covered by 14 new tests plus 1035 passing suite tests, and migration completeness is verified programmatically in the test suite itself. backend/utils/conversations/factory.py — minor type-safety gap in the passthrough guard. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Firestore / Redis\n(raw dict)"] -->|"dict data"| B["hydrate_conversation(data)\nfactory.py"]
C["Already-hydrated\nConversation object"] -->|"isinstance check\nnot Mapping → passthrough"| B
B -->|"isinstance(data, Mapping)\n→ Conversation(**data)"| D["Conversation object"]
B -->|"not Mapping\n→ return unchanged"| D
D --> E["conversations_to_string(conversations)\nrender.py"]
E --> F["Formatted string\n(LLM context / prompt)"]
style B fill:#4A90D9,color:#fff
style E fill:#7B68EE,color:#fff
Reviews (1): Last reviewed commit: "Extract conversations_to_string and hydr..." | Re-trigger Greptile |
| if not isinstance(data, Mapping): | ||
| return data | ||
| return Conversation(**data) |
There was a problem hiding this comment.
Implicit passthrough for unexpected types
The current guard returns data unchanged for any non-Mapping value — including None, integers, or arbitrary objects — rather than only for already-hydrated Conversation instances. A caller that accidentally passes None (e.g. when Firestore returns no document) would receive None back silently, and the error would surface later as an AttributeError on the returned value.
| if not isinstance(data, Mapping): | |
| return data | |
| return Conversation(**data) | |
| if isinstance(data, Conversation): | |
| return data | |
| if isinstance(data, Mapping): | |
| return Conversation(**data) | |
| raise TypeError(f"Expected Conversation or Mapping, got {type(data).__name__}") |
| from utils.conversations.factory import hydrate_conversation | ||
| from models.app import UsageHistoryType | ||
| from models.transcript_segment import TranscriptSegment |
There was a problem hiding this comment.
from utils.conversations.factory import hydrate_conversation (a utils.* import) is placed between from models.chat import … (line 12) and from models.app import … (line 16). The project convention groups all models.* imports together before utils.* imports. Moving the factory import after all models.* lines keeps the file consistent with the rest of the codebase.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
CP9 Live Test Evidence — PR #6562Changed-Path Coverage Checklist
L1 Synthesis (standalone backend)Local backend on port 8787 (branch L2 Synthesis (backend + pusher + VAD integrated)Local backend (port 8787) + dev GKE pusher (port-forwarded 10221) + dev GKE VAD (port-forwarded 10222) via dev SA Unit Test Evidenceby AI for @beastoin |
…ules (#6423) Phase 4b of Conversation god-object decoupling: extract conversations_to_string into utils/conversations/render.py and centralize Conversation(**data) hydration into utils/conversations/factory.py. - Remove conversations_to_string static method from Conversation class - Migrate 12 render call sites across 6 production files - Migrate ~25 hydration call sites across 13 production files - Add 14 new tests for render and factory modules - Update 3 existing test files for new import paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- apps.py: hydrate conversation dicts before passing to conversations_to_string - app_integrations.py: hydrate proactive notification context before render - wrapped/generate_2025.py: use hydrate_conversations instead of list comprehension - factory.py: use isinstance(data, Mapping) check to avoid TypeError with mocked types - test_lock_bypass_fixes.py: handle both dict and Conversation in assertion Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The render mock needs to return count-based output like the old FakeConversation.conversations_to_string did, so assertions match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tester requested coverage for render.py branches beyond basic overview. Adds 5 new tests: events, attendees with people_map, transcript mode, started/finished timestamps present and absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add redact.py (locked-content redaction), enrich.py (speaker/folder names), expand render.py (datetime serialization). Replaces 5 scattered locked-redaction implementations, 3 duplicate speaker enrichment functions, and 2 duplicate _json_serialize_datetime helpers across routers and utils. Net -134 lines of duplicated code, 19 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update test_folder_name_enrichment.py to import from enrich.py instead of deleted _add_folder_names_to_conversations - Remove unused folders_db import from routers/developer.py - Remove unused datetime import from utils/app_integrations.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mber phone_call was added to ConversationSource enum after the original test was written for issue #5409. Update assertions to match current enum state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 9 tests for add_speaker_names: user name, person_id lookup, fallback to speaker_id, missing speaker_id defaults, empty list, batch N+1 avoidance - Add non-dict structured coercion test for redact - Add test_folder_name_enrichment.py to test.sh runner Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace inline is_locked field-stripping in routers/folders.py with redact_conversations_for_list() call - Add non-dict structured coercion test for integration redaction path - Add folders.py to migration guard REDACT_CONSUMERS list Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3dd20ef to
490f4b7
Compare
…render - Rename hydrate_conversation → deserialize_conversation (factory.py) - Rename add_speaker_names → populate_speaker_names (render.py) - Rename add_folder_names → populate_folder_names (render.py) - Merge enrich.py and redact.py into render.py (single output-prep module) - Delete enrich.py and redact.py - Use lazy imports for database modules in populate functions - Update all 29 consumer files (imports + call sites) - Update all 4 test files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CLAUDE.md requires all imports at module top level. Moved database.users and database.folders imports out of function bodies in render.py. Added database stubs to 3 test files that import render.py (conversations_to_string, render_factory, redact_enrich). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
L1/L2 Test Results — Rename + Merge Commit (
|
| Test file | Tests | Result |
|---|---|---|
test_conversations_to_string.py |
7 | ✅ PASS |
test_conversation_render_factory.py |
19 | ✅ PASS |
test_conversation_redact_enrich.py |
21 | ✅ PASS |
test_folder_name_enrichment.py |
19 | ✅ PASS |
test_tools_router.py |
82 | ✅ PASS |
test_conversation_source_unknown.py |
12 | ✅ PASS |
L1 — Module Import Verification
deserialize_conversation,deserialize_conversationsfromfactory.py✅populate_speaker_names,populate_folder_namesfromrender.py✅redact_conversation_for_list,redact_conversation_for_integrationfromrender.py✅serialize_datetimes,conversation_to_dict,conversations_to_stringfromrender.py✅enrich.pydeleted —ModuleNotFoundErrorconfirmed ✅redact.pydeleted —ModuleNotFoundErrorconfirmed ✅- No lazy imports in function bodies — all at module top level ✅
L2 — Dev Backend Curl/WS Tests (port 10220)
| Endpoint | Functions exercised | HTTP | Status |
|---|---|---|---|
GET /v1/conversations?limit=3 |
deserialize_conversation + redact_conversations_for_list |
200 | ✅ |
GET /v1/conversations?statuses=processing |
deserialize_conversation + redact_conversations_for_list |
200 | ✅ |
GET /v1/conversations?include_discarded=true |
same | 200 | ✅ |
POST /v1/conversations/search |
deserialize_conversation |
200 | ✅ |
GET /v1/folders |
redact_conversations_for_list |
200 | ✅ |
GET /v1/integrations/conversations |
redact_conversation_for_integration |
200 | ✅ |
GET /v1/dev/user/conversations |
populate_speaker_names + populate_folder_names |
401 | ✅ expected (API key auth) |
GET /v1/mcp/conversations |
populate_speaker_names + redact_conversations_for_list |
401 | ✅ expected (API key auth) |
GET /v1/users/profile |
deserialize_conversation |
410 | ✅ expected (user not in dev DB) |
WS /v1/listen |
deserialize_conversation |
403 | ✅ expected (Firebase token auth) |
Backend logs: zero ImportError / ModuleNotFoundError. Two 500s were pre-existing Firestore index issues on dev env (not code-related).
Changes in this commit
hydrate_*→deserialize_*(factory.py)add_speaker_names→populate_speaker_names,add_folder_names→populate_folder_names- Merged
enrich.py+redact.pyintorender.py(single output-preparation module) - All database imports at module top level per CLAUDE.md
- 29 consumer files updated (imports + call sites)
by AI for @beastoin
|
lgtm |
Summary
Consolidates all conversation rendering paths into 2 centralized modules (
render.py+factory.py), eliminating 150+ lines of duplicated code across 6+ routers. Part of #6423 Phase 4b.Architecture: Output Preparation Pipeline
This pattern can be replicated for other domain objects (action items, memories, etc.):
factory.py—deserialize_<object>: raw dict → model objectrender.py— all output preparation in one module:populate_<field>: enrich with related data (DB lookups, joins)redact_<object>_for_<view>: strip fields based on access/lock stateserialize_datetimes/<object>_to_dict: convert to JSON-safe format<objects>_to_string: human-readable formattingNaming Conventions (for future reuse)
deserialize_<object>deserialize_conversation(data)populate_<field_name>populate_speaker_names(uid, convs)redact_<object>_for_<view>redact_conversation_for_list(conv)serialize_datetimes(obj)<object>_to_dictconversation_to_dict(conv)<objects>_to_stringconversations_to_string(convs)Module Layout
utils/conversations/factory.pydeserialize_conversation,deserialize_conversationsutils/conversations/render.pypopulate_speaker_names,populate_folder_names,redact_conversation_for_list,redact_conversation_for_integration,redact_conversations_for_list,redact_conversations_for_integration,serialize_datetimes,conversation_to_dict,conversations_to_stringDeduplication Evidence
_add_speaker_names_to_segmentsrender.populate_speaker_names_add_folder_names_to_conversationsrender.populate_folder_names_json_serialize_datetimerender.serialize_datetimesis_lockedredaction loopsrender.redact_conversation_for_list/integrationhydrate_conversation(old name)factory.deserialize_conversationConsumer Migration (29 files)
Routers: conversations, developer, folders, integration, mcp, mcp_sse, pusher, sync, transcribe, users
Utils: app_integrations, apps, chat, webhooks, notifications, process_conversation, postprocess_conversation, wrapped/generate_2025, retrieval/rag, retrieval/tool_services/conversations, retrieval/tools/conversation_tools
Tests: test_conversation_render_factory, test_conversation_redact_enrich, test_folder_name_enrichment, test_tools_router, test_conversations_to_string
Bug Fixes
apps_results/plugins_resultsstrippingphone_callis now a realConversationSourceenum memberHow to Apply This Pattern to Other Objects
To consolidate another domain object (e.g., action items, memories):
utils/<objects>/factory.pywithdeserialize_<object>(data)anddeserialize_<objects>(items)utils/<objects>/render.pywith:populate_*functions for related-data enrichmentredact_*functions for access-control strippingserialize_*/*_to_dict/*_to_stringfor output formattingtest.sh, L1 (import verification), L2 (dev backend curl/wscat)Test Plan
Closes Phase 4b of #6423.