fix(evaluation): resolve BM25/Embedding index filename mismatch when using --from-conv/--to-conv #136

Open
Jah-yee wants to merge 4 commits into EverMind-AI:main from Jah-yee:fix/issue-127-bm25-index-filename-mismatch


@Jah-yee Jah-yee commented Mar 18, 2026

Good day,

Problem

When running evaluation with sliced conversation ranges (e.g., --from-conv 234 --to-conv 264), the BM25/Embedding index files were built with actual conversation IDs (e.g., bm25_index_conv_234.pkl) but the retrieval stage was looking for sequential indices (e.g., bm25_index_conv_0.pkl), causing empty retrieval results and incorrect evaluation scores.

Root Cause

In stage3_memory_retrieval.py, the index loading code used sequential loop indices ({i}) directly without checking if conversation_ids was provided in the config. Meanwhile, stage2_index_building.py correctly extracted conversation IDs from config for file naming.

Fix

This PR adds the same logic from stage2_index_building.py to stage3_memory_retrieval.py:

  1. Read conversation_ids from config
  2. Extract the numeric ID from each conversation_id (e.g., "locomo_234" → "234")
  3. Use that numeric ID for loading index files
  4. Fall back to sequential indices for backward compatibility
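
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `extract_numeric_id` and `index_path`, the `index_dir` argument, and the assumption that the numeric ID is the trailing run of digits are all hypothetical. Only the `bm25_index_conv_{id}.pkl` naming pattern and the `conv_id_for_file` variable name come from the description.

```python
import re
from pathlib import Path
from typing import Optional, Sequence


def extract_numeric_id(conversation_id: str) -> Optional[str]:
    """Return the trailing digits of an ID such as "locomo_234", or None.

    Assumption: the numeric part is always the final run of digits.
    """
    match = re.search(r"(\d+)$", conversation_id)
    return match.group(1) if match else None


def index_path(
    index_dir: Path,
    i: int,
    conversation_ids: Optional[Sequence[str]] = None,
) -> Path:
    """Build the BM25 index path for the i-th conversation in a run."""
    # Step 4: default to the sequential loop index (backward compatible).
    conv_id_for_file = str(i)
    # Steps 1-3: prefer the numeric ID taken from the configured conversation_ids.
    if conversation_ids and i < len(conversation_ids):
        numeric = extract_numeric_id(conversation_ids[i])
        if numeric is not None:
            conv_id_for_file = numeric
    return index_dir / f"bm25_index_conv_{conv_id_for_file}.pkl"
```

With `conversation_ids=["locomo_234"]`, position 0 resolves to `bm25_index_conv_234.pkl`; without `conversation_ids`, it resolves to `bm25_index_conv_0.pkl` as before.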

Changes

  • Modified evaluation/src/adapters/evermemos/stage3_memory_retrieval.py to use conv_id_for_file variable for index file naming

Testing

The syntax has been verified with python3 -m py_compile.
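
Since py_compile only confirms that the file parses, a behavioural check could additionally confirm that the filename the retrieval stage looks up is the one the index-building stage wrote. The sketch below is self-contained and hypothetical: `index_name`, `missing_indexes`, and `run_check` are illustrative stand-ins for the shared naming rule and are not functions in the repository.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence, Tuple


def index_name(conversation_id: str, fallback: int) -> str:
    """Name an index file from the numeric suffix, else from the position."""
    suffix = conversation_id.rsplit("_", 1)[-1]
    conv_id_for_file = suffix if suffix.isdigit() else str(fallback)
    return f"bm25_index_conv_{conv_id_for_file}.pkl"


def missing_indexes(index_dir: Path, conversation_ids: Sequence[str]) -> List[str]:
    """List expected index files that are absent from index_dir."""
    names = [index_name(cid, i) for i, cid in enumerate(conversation_ids)]
    return [name for name in names if not (index_dir / name).exists()]


def run_check() -> Tuple[List[str], List[str]]:
    """Simulate a sliced run and compare ID-based and positional lookups."""
    conversation_ids = [f"locomo_{n}" for n in range(234, 237)]
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = Path(tmp)
        # Simulated index building: write empty files named by conversation ID.
        for i, cid in enumerate(conversation_ids):
            (index_dir / index_name(cid, i)).touch()
        # Lookup by conversation ID (the behaviour this PR describes).
        by_id = missing_indexes(index_dir, conversation_ids)
        # Lookup by sequential position (the behaviour before the fix).
        by_position = [
            f"bm25_index_conv_{i}.pkl"
            for i in range(len(conversation_ids))
            if not (index_dir / f"bm25_index_conv_{i}.pkl").exists()
        ]
    return by_id, by_position
```

In this simulation the ID-based lookup finds every file, while the positional lookup misses all three, which mirrors the empty retrieval results described under Problem.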

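Thank you for your dedication; I hope this helps. If there is a problem with my fix or anything that needs discussion, please leave a comment below and I will address it.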

Warmly,
Jah-yee

OpenClaw Assistant and others added 4 commits March 9, 2026 20:23
- Rename stage3_memory_retrivel.py to stage3_memory_retrieval.py (typo fix)
- Replace == None with is None (Python anti-pattern)
- Replace != True with is not True (Python anti-pattern)
- Replace bare except with except Exception
- Remove duplicate 'rrf' entry in docstring
- Remove unused MongoDB init script volume mount from docker-compose.yaml
- Add missing env template setup step in STARTER_KIT.md quick start

Fixes: EverMind-AI#115, EverMind-AI#113, EverMind-AI#107, EverMind-AI#97, EverMind-AI#91, EverMind-AI#90, EverMind-AI#86
…v slicing

When using sliced runs (e.g. --from-conv 234 --to-conv 264), the index
files were being saved with sequential indices (0, 1, 2...) but search
was looking up with global conversation IDs (234, 235, 263), causing
'BM25 index not found' errors.

Changes:
- stage2_index_building.py: Use conversation_ids to name index files with
  extracted numeric IDs (e.g., 'bm25_index_conv_234.pkl')
- evermemos_adapter.py:
  - Pass conversation_ids to stage2 for proper file naming
  - Fix conv_id_to_index mapping to map conversation_id -> extracted
    numeric ID (not sequential index)
  - Update _check_missing_indexes to use proper file naming
  - Save conversation_index_mapping.json for debugging

This ensures index files and search lookups use consistent IDs.
… content

This commit addresses issue EverMind-AI#131 by adding a 'full' query parameter
to the GET /api/v1/memories endpoint. When full=True, the response
includes the complete episode field which is not returned by default
for backward compatibility.

Changes:
- Add 'full' parameter to FetchMemRequest DTO
- Add 'episode' field to EpisodicMemoryModel (optional, returned only when full=True)
- Update find_memories method to accept 'full' parameter
- Update _convert_episodic_memory to conditionally include episode content
- Update memory_manager to pass 'full' parameter to fetch service

This allows external benchmarks and third-party integrations to access
the full episodic memory content for auditing and verification purposes.

See: EverMind-AI#131
…using --from-conv/--to-conv

When running evaluation with sliced conversation ranges (e.g., --from-conv 234 --to-conv 264),
the index files were built with actual conversation IDs (e.g., bm25_index_conv_234.pkl) but
the retrieval stage was looking for sequential indices (e.g., bm25_index_conv_0.pkl),
causing empty retrieval results.

This fix:
- Reads conversation_ids from config (same as stage2_index_building.py)
- Extracts the numeric ID from conversation_id for file naming
- Falls back to sequential indices for backward compatibility

Fixes EverMind-AI#127