LCORE-1350: Refactored utilities for more generic use #1198

Merged
tisnik merged 1 commit into lightspeed-core:main from asimurka:utility_refactoring
Feb 24, 2026

Conversation


@asimurka asimurka commented Feb 23, 2026

Description

This PR refactors the query utility functions. With the upcoming responses endpoint, the utility functions need to accept atomic or responses-compatible types so that they can be reused in the new responses endpoint without additional workarounds.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Rich transcript model and metadata for fuller conversation records.
    • Improved response text extraction across varied content.
    • Automatic model selection and referenced-document deduplication.
  • Bug Fixes

    • More accurate token usage reporting.
    • Safer streaming when responses are missing.
    • Clearer "no model configured" not-found messaging.
  • Refactor

    • Centralized system-prompt and configuration handling.
    • Simplified transcript and response assembly and storage.


coderabbitai bot commented Feb 23, 2026

Walkthrough

Consolidates and renames response text-extraction APIs, adds build_turn_summary and response type aliases, shifts many functions to use a module-level configuration, introduces model-backed Transcript/TranscriptMetadata and transcript helpers, and updates endpoints and tests to the new signatures and flows.

Changes

Cohort / File(s) Summary
Responses core & types
src/utils/responses.py
Replaced OpenAI-prefixed types with Response* aliases; added build_turn_summary, select_model_for_responses, extract_text_from_output_items, extract_text_from_output_item, _extract_text_from_content; changed signatures for prepare_tools, extract_token_usage, parse_referenced_documents, build_tool_call_summary, and get_mcp_tools.
Query / token / transcript logic
src/utils/query.py, src/utils/transcripts.py, src/utils/types.py
Removed per-call AppConfig parameters in favor of module-level configuration; introduced TranscriptMetadata/Transcript models and create_transcript* helpers; updated store_transcript to accept a Transcript; changed store_conversation_into_cache, store_query_results, consume_query_tokens, and validate_model_provider_override signatures and flows; removed select_model_and_provider_id and evaluate_model_hints.
Endpoints & handlers
src/app/endpoints/query.py, src/app/endpoints/streaming_query.py, src/app/endpoints/a2a.py, src/app/endpoints/rlsapi_v1.py
Updated imports to use new response utilities; replaced extract_text_from_response_output_item with extract_text_from_output_item(s); switched to build_turn_summary for response assembly; RBAC validation now passed explicit model and provider; removed passing configuration into token/storage calls; get_mcp_tools now called with no args.
Prompts / configuration usage
src/utils/prompts.py
get_system_prompt now accepts `system_prompt: str | None` and reads settings from the module-level configuration.
Models / errors
src/models/responses.py
NotFoundResponse.__init__ now accepts an optional `resource_id: str | None` argument.
Tests updated
tests/unit/..., tests/integration/...
Updated unit and integration tests to reflect renamed functions, new signatures, global configuration usage, transcript object APIs, and build_turn_summary responses; removed tests for deleted utilities and adjusted mocks to supply usage as objects.
Small endpoint imports
src/app/endpoints/a2a.py
Swapped import/usages from extract_text_from_response_output_item to extract_text_from_output_item (callsite rename only).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • are-ces
  • tisnik
🚥 Pre-merge checks (3 passed)
  • Description Check: Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check: Passed (the title 'Refactored utilities for more generic use' accurately describes the main thrust of the changeset, which refactors query utilities to accept atomic types)
  • Docstring Coverage: Passed (docstring coverage is 99.26%, above the required threshold of 80.00%)




def get_topic_summary_system_prompt(config: AppConfig) -> str:
def get_topic_summary_system_prompt() -> str:
asimurka (author):

Do not pass as an argument

)


def select_model_and_provider_id(
asimurka (author):

Deprecated. Use new function that generalizes for query and responses usage.

return _is_inout_shield(shield) or not is_output_shield(shield)


def evaluate_model_hints(
asimurka (author):

Deprecated.

coderabbitai bot left a comment:

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/utils/prompts.py (1)

11-33: ⚠️ Potential issue | 🟡 Minor

Update the docstring and typing to match the function signature and implementation.

The docstring references config and query_request parameters that don't exist in the function signature. The function accepts only system_prompt, and internally uses configuration (not config). Also, update the type annotation from Optional[str] to str | None per the modern union syntax requirement.

  • Line 11: Change Optional[str] to str | None and remove the unused Optional import
  • Lines 17-19: Update docstring to reference configuration.customization.custom_profile (not config)
  • Line 19: Update to configuration.customization.system_prompt
  • Line 23: Update to configuration.customization.disable_query_system_prompt
  • Line 24: Change "the incoming query_request contains a system_prompt" to clarify the function parameter, not a query request object
  • Lines 28-31: Remove the config (AppConfig): parameter from the docstring (it's not a parameter); update system_prompt parameter description to be concise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/prompts.py` around lines 11 - 33, Update the get_system_prompt
signature and docstring to match the implementation: change the type annotation
from Optional[str] to the modern union syntax str | None and remove the unused
Optional import; in the docstring replace all references to "config" and
"query_request" with the actual runtime variable "configuration" (e.g.,
configuration.customization.custom_profile,
configuration.customization.system_prompt,
configuration.customization.disable_query_system_prompt), clarify that the
function parameter is the incoming per-request system_prompt (not a
query_request object), and remove the nonexistent "config (AppConfig):"
parameter block while making the system_prompt parameter description concise and
accurate.
tests/unit/utils/test_transcripts.py (1)

63-66: ⚠️ Potential issue | 🟡 Minor

Mocking construct_transcripts_path hides a critical path-construction regression.

construct_transcripts_path is fully mocked here, so the test cannot detect that the path will be constructed with a double-hashed user ID in production. Recommend at minimum an assertion that construct_transcripts_path was called with the once-hashed user ID (i.e., _hash_user_id("user123")), not the raw value and not a double-hash:

from utils.transcripts import _hash_user_id   # expose for assertion
...
mock_construct = mocker.patch(
    "utils.transcripts.construct_transcripts_path",
    return_value=mocker.MagicMock(),
)
...
store_transcript(transcript)

expected_hashed = _hash_user_id(user_id)
mock_construct.assert_called_once_with(expected_hashed, conversation_id)

See the critical issue raised on store_transcript in src/utils/transcripts.py for the root cause.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_transcripts.py` around lines 63 - 66, The test
currently fully mocks construct_transcripts_path which hides a regression where
the user_id could be double-hashed; update the test that patches
"utils.transcripts.construct_transcripts_path" to also import and use
_hash_user_id and assert that construct_transcripts_path was called once with
the once-hashed user id and the conversation_id (e.g., compute expected_hashed =
_hash_user_id(user_id) and then call
mock_construct.assert_called_once_with(expected_hashed, conversation_id)),
keeping the patch return as a MagicMock but adding this assertion around
store_transcript to catch double-hash regressions in store_transcript.
src/utils/query.py (1)

236-246: ⚠️ Potential issue | 🟡 Minor

Stale docstring: configuration parameter is still documented but was removed from the signature.

Line 244 in the docstring still lists configuration: Application configuration as an argument, but the function no longer accepts it.

📝 Suggested fix
     Args:
         user_id: The authenticated user ID
         conversation_id: The conversation ID
         model: The model identifier
         started_at: ISO formatted timestamp when the request started
         completed_at: ISO formatted timestamp when the request completed
         summary: Summary of the turn including LLM response and tool calls
         query_request: The original query request
-        configuration: Application configuration
         skip_userid_check: Whether to skip user ID validation
         topic_summary: Optional topic summary for the conversation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/query.py` around lines 236 - 246, The docstring for the function
that documents parameters user_id, conversation_id, model, started_at,
completed_at, summary, query_request, configuration, skip_userid_check, and
topic_summary is stale: remove the `configuration: Application configuration`
entry (or replace it with the actual current parameter if the function now
accepts a differently named config) so the Args section matches the function
signature; update the Args list near the top of the docstring to only include
parameters actually present (e.g., user_id, conversation_id, model, started_at,
completed_at, summary, query_request, skip_userid_check, topic_summary).
🧹 Nitpick comments (3)
tests/unit/utils/test_transcripts.py (1)

104-111: Use distinct values for model_id/query_model and provider_id/query_provider to strengthen field mapping coverage.

model_id and query_model are both "fake-model", and provider_id and query_provider are both "fake-provider". Any accidental transposition of these arguments in create_transcript_metadata would still pass all downstream assertions.

♻️ Suggested fix: use distinct values to disambiguate fields
- model = "fake-model"
- provider = "fake-provider"
+ model = "fake-model"
+ provider = "fake-provider"
+ query_model_override = "query-model"
+ query_provider_override = "query-provider"
  ...
  metadata = create_transcript_metadata(
      user_id=user_id,
      conversation_id=conversation_id,
      model_id=model,
      provider_id=provider,
-     query_provider=query_request.provider,
-     query_model=query_request.model,
+     query_provider=query_provider_override,
+     query_model=query_model_override,
  )
  ...
- assert stored_data["metadata"]["query_provider"] == query_request.provider
- assert stored_data["metadata"]["query_model"] == query_request.model
+ assert stored_data["metadata"]["query_provider"] == query_provider_override
+ assert stored_data["metadata"]["query_model"] == query_model_override
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_transcripts.py` around lines 104 - 111, The test uses
identical values for the model and provider pairs which masks argument
transposition; update the test call to create_transcript_metadata so model_id !=
query_model and provider_id != query_provider (e.g., use distinct strings for
model_id vs query_request.model and for provider_id vs query_request.provider)
to ensure create_transcript_metadata receives and maps each parameter correctly
(refer to the create_transcript_metadata invocation and the
query_request.model/query_request.provider variables to locate and change the
values).
src/utils/responses.py (1)

988-1019: _extract_text_from_content strips each fragment before joining — verify this is intentional.

Each text/refusal part is .strip()-ed before being space-joined. This means leading/trailing whitespace in individual content parts is discarded. If a content part intentionally ends with whitespace (e.g., "sentence one. "), the strip will alter the final output. The current tests happen to match because the join inserts a space, but this could cause subtle differences for content with intentional formatting.

If this is intentional (normalize all whitespace), this is fine. Just flagging for awareness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/responses.py` around lines 988 - 1019, The function
_extract_text_from_content currently calls .strip() on each InputTextPart.text,
OutputTextPart.text, and ContentPartRefusal.refusal which removes intentional
leading/trailing whitespace; remove the .strip() calls so you append the raw
text/refusal values (still checking for truthiness) — e.g., change
text_fragments.append(input_text_part.text.strip()) to
text_fragments.append(input_text_part.text) (and similarly for
output_text_part.text and refusal_part.refusal) while preserving the existing
truthy checks, casts (InputTextPart, OutputTextPart, ContentPartRefusal), and
the final " ".join behavior.
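The whitespace difference under discussion can be reproduced with plain strings:

```python
parts = ["sentence one. ", " sentence two"]

# current behavior: strip each fragment, then space-join
normalized = " ".join(p.strip() for p in parts if p.strip())

# suggested alternative: join raw fragments, preserving inner whitespace
raw = " ".join(p for p in parts if p)

assert normalized == "sentence one. sentence two"
assert raw == "sentence one.   sentence two"  # trailing + join + leading spaces
```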
src/app/endpoints/query.py (1)

258-270: Consider moving deduplicate_referenced_documents to a shared utility module for consistency and maintainability.

This pure utility function fits naturally with parse_referenced_documents in src/utils/responses.py, which already houses response-related utilities. While currently used only in query.py, consolidating related document-handling utilities in a single module improves code organization and makes future reuse easier.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/query.py` around lines 258 - 270, Move the pure utility
function deduplicate_referenced_documents into the same utilities module that
contains parse_referenced_documents (responses.py), keeping its signature and
docstring unchanged; then replace the local definition in query.py with an
import from responses (e.g., from responses import
deduplicate_referenced_documents), update any imports/exports as needed so other
modules can reuse it, and run tests to ensure no import regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/openapi.json`:
- Line 4375: Both the GET and POST operations for the /a2a endpoint use the same
operationId "handle_a2a_jsonrpc_a2a_get", which must be unique; update the
operationId values so they are distinct (e.g., change the POST operationId to
"handle_a2a_jsonrpc_a2a_post" or another unique name). Locate the occurrences of
operationId "handle_a2a_jsonrpc_a2a_get" in the OpenAPI document and rename the
POST one (and any other duplicates) to a unique identifier, ensuring any
references in server/client codegen or tooling are updated to match the new
names.
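With distinct operationIds, the relevant fragment of the OpenAPI document might look like this (illustrative and heavily abridged):

```json
{
  "paths": {
    "/a2a": {
      "get":  { "operationId": "handle_a2a_jsonrpc_a2a_get" },
      "post": { "operationId": "handle_a2a_jsonrpc_a2a_post" }
    }
  }
}
```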

In `@Makefile`:
- Line 36: The Makefile test-e2e invocation uses the `script -q -e -c "uv run
behave ..."` command which leaves a stray "typescript" file; modify the command
used by the make test-e2e target to pass /dev/null as the output file to
`script` (e.g., `script -q -e /dev/null -c "uv run behave ..."`), so the
pseudo-TTY is provided but no transcript is written.

In `@src/models/responses.py`:
- Around line 1756-1763: The constructor NotFoundResponse.__init__ uses
Optional[str] for the resource_id annotation; update the signature to use modern
union syntax (resource_id: str | None) per project Python 3.12+ standards, and
adjust any related type hints inside the class/file accordingly; also remove the
now-unused Optional import from typing if it becomes unused to keep imports
clean.

In `@src/utils/responses.py`:
- Around line 920-943: The build_turn_summary code currently reads
response.usage unguarded leading to AttributeError for mock/odd responses;
change the final token usage extraction to use a safe getattr (e.g., usage =
getattr(response, "usage", None)) and pass that usage into extract_token_usage
(or skip/extract None) so extract_token_usage receives None instead of raising;
update the call around summary.token_usage = extract_token_usage(response.usage,
model) to use the safe variable and ensure extract_token_usage can accept None.

In `@src/utils/transcripts.py`:
- Around line 85-87: store_transcript is double-hashing user IDs because
create_transcript_metadata already stores a hashed ID
(TranscriptMetadata.user_id) and construct_transcripts_path always calls
_hash_user_id again; change the path construction to avoid the second hash by
adding a parameter to construct_transcripts_path (e.g., pre_hashed or
hashed=True) and, in store_transcript, call
construct_transcripts_path(transcript.metadata.user_id, pre_hashed=True) so
construct_transcripts_path skips calling _hash_user_id when the caller supplies
an already-hashed ID; update any callers or tests accordingly to preserve the
single-hash invariant.
- Around line 102-133: Update the type annotations of the new parameters in
create_transcript_metadata to use modern union syntax (str | None) instead of
Optional[str]; change provider_id, query_provider, and query_model annotations
to str | None in the create_transcript_metadata signature and ensure any related
type hints or usages in that function (e.g., when constructing
TranscriptMetadata) remain compatible with the new annotations.

In `@src/utils/types.py`:
- Around line 202-225: Update the TranscriptMetadata and Transcript models to
use modern union syntax and add Attributes sections to their docstrings: replace
Optional[str] with the PEP 604 form (str | None) for provider and
query_provider/query_model in TranscriptMetadata, and update any other Optional
usages in these classes accordingly; then expand the docstrings for both
TranscriptMetadata and Transcript to include a Google-style "Attributes:"
section that documents each field (e.g., provider, model, user_id,
conversation_id, timestamp, metadata, redacted_query, query_is_valid,
llm_response, rag_chunks, truncated, attachments, tool_calls, tool_results) with
brief types/purposes so the docstrings conform to project conventions.

In `@tests/unit/utils/test_responses.py`:
- Around line 67-80: The constructor for MockContentPart (__init__) can leave
self.type unset when part_type, text, and refusal are all falsy; update __init__
so self.type is always assigned (e.g., set self.type = part_type if provided,
elif text -> "output_text", elif refusal -> "refusal", else set self.type =
"unknown" or None) so any code accessing MockContentPart.type will not raise
AttributeError.
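A sketch of the suggested constructor fix, with `.type` assigned on every path:

```python
class MockContentPart:
    """Test double whose .type attribute is always set."""

    def __init__(self, part_type=None, text=None, refusal=None):
        self.text = text
        self.refusal = refusal
        # assign .type unconditionally so attribute access never raises
        if part_type:
            self.type = part_type
        elif text is not None:
            self.type = "output_text"
        elif refusal is not None:
            self.type = "refusal"
        else:
            self.type = "unknown"
```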


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9ba4ab and f754544.

📒 Files selected for processing (19)
  • Makefile
  • docs/openapi.json
  • src/app/endpoints/a2a.py
  • src/app/endpoints/query.py
  • src/app/endpoints/rlsapi_v1.py
  • src/app/endpoints/streaming_query.py
  • src/models/responses.py
  • src/utils/prompts.py
  • src/utils/query.py
  • src/utils/responses.py
  • src/utils/transcripts.py
  • src/utils/types.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_query.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/unit/utils/test_prompts.py
  • tests/unit/utils/test_query.py
  • tests/unit/utils/test_responses.py
  • tests/unit/utils/test_transcripts.py

logger.exception("Error storing transcript: %s", e)
response = InternalServerErrorResponse.generic()
raise HTTPException(**response.model_dump()) from e
logger.info("Storing transcript")
asimurka (author):

Split construction into multiple functions; do not pass the entire request, so it can be reused for the responses endpoint.

raise HTTPException(**response.model_dump()) from e

# Store conversation in cache
cache_entry = CacheEntry(
asimurka (author):

Outside of try block

logger = get_logger(__name__)


def extract_text_from_response_output_item(output_item: Any) -> str:
asimurka (author) commented Feb 23, 2026:

Too complicated and not type-safe. Replaced by a new function.

return extract_text_from_output_items(response.output)


async def prepare_tools(
asimurka (author):

Can now be reused in responses.



async def get_mcp_tools( # pylint: disable=too-many-return-statements,too-many-locals
mcp_servers: list[ModelContextProtocolServer],
asimurka (author):

Import directly from configuration

return documents


def extract_token_usage(
asimurka (author):

The LLS API is already stable, so there is no need for such a defensive approach.

return {"args": arguments_str}


async def select_model_for_responses(
asimurka (author) commented Feb 23, 2026:

New function for model selection when the model is not provided explicitly in the request.

return summary


def extract_text_from_output_items(
asimurka (author):

Semantically equivalent to the previous implementation, but simplified thanks to API stability.

rag_chunks: list[dict],
truncated: bool,
attachments: list[Attachment],
def store_transcript(
asimurka (author):

Use dedicated model for transcript

except (IOError, OSError) as e:
logger.error("Failed to store transcript into %s: %s", transcript_file_path, e)
raise
response = InternalServerErrorResponse.generic()
asimurka (author):

Raise the HTTP error immediately; do not re-raise internal errors.

) -> TranscriptMetadata:
"""Create a TranscriptMetadata BaseModel instance.

logger.info("Transcript successfully stored at: %s", transcript_file_path)
asimurka (author):

Moved right after successful file insertion

token_usage: TokenCounter = Field(default_factory=TokenCounter)


class TranscriptMetadata(BaseModel):
asimurka (author):

The model can be reused by a responses-like structure, where model and provider are concatenated.

tisnik (Contributor) left a comment:

if it changed the behaviour (see changes in openapi.json), it should be done before refactoring. The refactoring itself should not change "external" behaviour.

@asimurka asimurka marked this pull request as draft February 23, 2026 11:31
@asimurka asimurka force-pushed the utility_refactoring branch from f754544 to 64d98ff Compare February 23, 2026 12:14
@asimurka asimurka marked this pull request as ready for review February 23, 2026 12:16
coderabbitai bot left a comment:

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/utils/transcripts.py (1)

44-55: ⚠️ Potential issue | 🟡 Minor

Clarify that hashed_user_id is pre-hashed.

The docstring still implies hashing occurs in this function, which no longer matches the signature.

📝 Suggested docstring tweak
-    The returned path is built from the configured transcripts storage base
-    directory, a filesystem-safe directory derived from a hash of `user_id`,
+    The returned path is built from the configured transcripts storage base
+    directory, a filesystem-safe directory derived from a pre-hashed `user_id`,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/transcripts.py` around lines 44 - 55, The docstring for
construct_transcripts_path incorrectly implies the function performs hashing;
clarify that hashed_user_id is expected to be pre-hashed and not hashed here,
and update parameter description for hashed_user_id to state it is a
filesystem-safe hashed identifier (pre-hashed), and ensure conversation_id
description mentions it is normalized for path use; keep the rest of the
docstring intact and consistent with the function signature.
src/utils/prompts.py (1)

3-35: ⚠️ Potential issue | 🟡 Minor

Fix type hints and docstring references.

Change Optional[str] to str | None per modern Python syntax guidelines. Update docstring references from config to configuration (lines 18, 19, 23), and remove the non-existent config (AppConfig) parameter from the Parameters section (line 30)—the function accepts only system_prompt.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/prompts.py` around lines 3 - 35, Update the type annotation and
docstring in get_system_prompt: change the parameter type from Optional[str] to
the modern union syntax str | None in the function signature, update all
docstring references that mention `config` to `configuration` (lines describing
configuration.customization...), and remove the non-existent `config
(AppConfig)` entry from the Parameters section so the docstring only documents
the actual `system_prompt` parameter; keep references to
constants.DEFAULT_SYSTEM_PROMPT and configuration.customization where present to
preserve resolution precedence.
♻️ Duplicate comments (2)
src/utils/responses.py (1)

920-943: Guard response.usage access in build_turn_summary.

This was previously flagged; the direct access can still raise AttributeError when mocks or partial responses omit .usage.

llama-stack-api OpenAIResponseObject usage field optional
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/responses.py` around lines 920 - 943, The code accesses
response.usage directly in build_turn_summary (where TurnSummary,
extract_token_usage are used), which can raise AttributeError for partial/mocked
responses; change the call to safely read usage (e.g., usage = getattr(response,
"usage", None) or check hasattr(response, "usage")) and pass that usage (which
may be None) into extract_token_usage(response_usage, model) so
extract_token_usage always receives a defined value; update any related
references in the function (summary.token_usage assignment) to use this guarded
variable.
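The guard the comment asks for can be sketched as follows; `TokenCounter` is reduced to a stand-in dataclass here, and the function names mirror the review's references rather than the project's exact signatures:

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class TokenCounter:
    """Stand-in for the project's TokenCounter model."""

    input_tokens: int = 0
    output_tokens: int = 0


def extract_token_usage(usage, model: str) -> TokenCounter:
    """Tolerate usage being None, as the review suggests."""
    if usage is None:
        return TokenCounter()
    return TokenCounter(
        input_tokens=getattr(usage, "input_tokens", 0),
        output_tokens=getattr(usage, "output_tokens", 0),
    )


def build_turn_summary_token_usage(response, model: str) -> TokenCounter:
    # Guard: partial or mocked responses may omit `.usage` entirely,
    # so read it with getattr instead of direct attribute access.
    usage = getattr(response, "usage", None)
    return extract_token_usage(usage, model)
```

With the guard in place, a mock lacking `.usage` yields an empty counter instead of raising `AttributeError`.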
tests/unit/utils/test_responses.py (1)

67-80: MockContentPart.type remains unset when all constructor args are falsy.

No else branch assigns self.type, leaving it undefined if part_type, text, and refusal are all None/falsy. Any call to part.type in that state raises AttributeError.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_responses.py` around lines 67 - 80, The constructor for
MockContentPart (__init__) can leave self.type unset when part_type, text, and
refusal are all falsy; always assign self.type to avoid AttributeError by adding
a final else branch that sets a default value (e.g., None or "unknown") so
self.type is defined for every instance of MockContentPart.
🧹 Nitpick comments (4)
tests/unit/utils/test_responses.py (1)

1088-1104: Rename test_extract_token_usage_with_dict_usage – the test no longer uses a dict.

Both test_extract_token_usage_with_dict_usage (line 1088) and test_extract_token_usage_with_object_usage (line 1107) now use mocker.Mock(). The "dict" name is misleading and the two tests are near-identical. Consider renaming/merging them for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_responses.py` around lines 1088 - 1104, The test named
test_extract_token_usage_with_dict_usage no longer uses a dict and is nearly
identical to test_extract_token_usage_with_object_usage; update the tests to
remove confusion by renaming test_extract_token_usage_with_dict_usage to a more
accurate name (e.g., test_extract_token_usage_with_mock_usage) or merge the two
into a single parametrized test that covers both usage shapes, ensuring you
update any references and keep assertions against extract_token_usage (and the
mocked extract_provider_and_model_from_model_id, metrics.llm_token_sent_total,
metrics.llm_token_received_total, and _increment_llm_call_metric) unchanged.
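One way to merge the two near-identical tests into a single parametrized test, as the comment suggests; this is a sketch — the stub below stands in for the real `extract_token_usage` and its mocked metrics:

```python
from types import SimpleNamespace

import pytest


def extract_token_usage_stub(usage, model: str) -> tuple[int, int]:
    """Stand-in for the real extract_token_usage; returns (sent, received)."""
    if usage is None:
        return (0, 0)
    return (getattr(usage, "input_tokens", 0), getattr(usage, "output_tokens", 0))


@pytest.mark.parametrize(
    "usage,expected",
    [
        (SimpleNamespace(input_tokens=10, output_tokens=20), (10, 20)),
        (None, (0, 0)),
    ],
    ids=["object_usage", "missing_usage"],
)
def test_extract_token_usage(usage, expected):
    # One parametrized test replaces the two near-duplicate cases.
    assert extract_token_usage_stub(usage, "provider/model") == expected
```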
src/app/endpoints/query.py (2)

155-155: Replace list[Any] with list[RAGChunk] and remove the leftover dev comment.

RAGChunk is available from utils.types (already imported in streaming_query.py for the same purpose). The inline comment reads like a personal TODO; it should be removed before merge.

✏️ Suggested fix
-    pre_rag_chunks: list[Any] = []  # use your RAGChunk type (or the upstream one)
+    pre_rag_chunks: list[RAGChunk] = []

Add to imports:

 from utils.types import (
     ResponsesApiParams,
     TurnSummary,
+    RAGChunk,
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/query.py` at line 155, Replace the dev placeholder typing
and comment for the pre_rag_chunks variable by typing it as list[RAGChunk] and
removing the inline TODO/comment; ensure RAGChunk is imported from utils.types
(the project already imports RAGChunk in streaming_query.py so mirror that
import) and update the declaration from "pre_rag_chunks: list[Any] = []  # use
your RAGChunk type (or the upstream one)" to a typed empty list using RAGChunk
with no extra comment.

258-270: Move deduplicate_referenced_documents to utils/query.py to eliminate duplication and complete the docstring.

Two issues:

  1. Code duplication: An equivalent deduplication block already exists inline in streaming_query.py (lines 607-618). Placing this function in utils/query.py would let both endpoints import and share it, removing the duplicate.

  2. Incomplete docstring: The one-line description lacks Args: and Returns: sections required by the Google docstring convention followed in this repo. As per coding guidelines, all functions must have complete docstrings with Parameters, Returns, and Raises sections.

✏️ Suggested docstring fix (if keeping function here for now)
-def deduplicate_referenced_documents(
-    docs: list[ReferencedDocument],
-) -> list[ReferencedDocument]:
-    """Remove duplicate referenced documents based on URL and title."""
+def deduplicate_referenced_documents(
+    docs: list[ReferencedDocument],
+) -> list[ReferencedDocument]:
+    """Remove duplicate referenced documents based on URL and title.
+
+    Args:
+        docs: List of referenced documents to deduplicate.
+
+    Returns:
+        Ordered list of documents with duplicates (same URL + title) removed.
+    """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/query.py` around lines 258 - 270, Move the deduplication
logic into a shared utility and add a full Google-style docstring: extract the
function deduplicate_referenced_documents and place it in the shared utils.query
module so streaming_query's inline dedupe block (the duplicate logic) imports
and uses this function instead of duplicating code; update the function's
docstring to include Args:, Returns:, and Raises: sections describing docs:
list[ReferencedDocument] input, the returned list[ReferencedDocument], and any
exceptions it may raise (if none, document that it does not raise), and update
streaming_query to import and call deduplicate_referenced_documents rather than
using the inline dedupe loop.
src/app/endpoints/streaming_query.py (1)

607-618: Move inline deduplication to a shared utility to avoid code duplication.

This block duplicates the deduplication logic from deduplicate_referenced_documents in src/app/endpoints/query.py. Because that helper lives in an endpoint module rather than utils/, it can't be imported here. Moving it to utils/query.py (or utils/responses.py) would let both endpoints share one implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 607 - 618, The inline
deduplication in the streaming handler (the loop that builds
seen/deduplicated_documents and sets turn_summary.referenced_documents from
turn_summary.pre_rag_documents + tool_based_documents) duplicates logic already
implemented in deduplicate_referenced_documents in src/app/endpoints/query.py;
extract that helper into a shared utils module (e.g., utils/query.py or
utils/responses.py), update the original deduplicate_referenced_documents to
live there and export it, then replace the inline loop with a call to the shared
helper (passing the combined list and assigning the result to
turn_summary.referenced_documents). Ensure function signatures match existing
callers (update imports in both streaming_query.py and query.py) and remove the
duplicated loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/query.py`:
- Around line 91-110: Update the signature of validate_model_provider_override
to use modern union syntax for optional params (change Optional[str] to str |
None for both model and provider), and tighten the override detection so any
explicit model or provider triggers the check (set has_override = provider is
not None or model is not None) instead of relying on "/" in model; then if
has_override and Action.MODEL_OVERRIDE not in authorized_actions, keep raising
HTTPException(**ForbiddenResponse.model_override().model_dump()) to enforce
RBAC.

In `@src/utils/responses.py`:
- Around line 101-120: Update the type annotations in the prepare_tools function
signature to use PEP 604 union syntax instead of Optional[...]: change
vector_store_ids: Optional[list[str]] to vector_store_ids: list[str] | None,
no_tools: Optional[bool] to no_tools: bool | None, mcp_headers:
Optional[McpHeaders] to mcp_headers: McpHeaders | None, and the return
annotation Optional[list[dict[str, Any]]] to list[dict[str, Any]] | None; keep
other symbols (AsyncLlamaStackClient, token: str, Any, dict) unchanged.

In `@src/utils/transcripts.py`:
- Around line 71-96: The docstring and sync I/O in store_transcript are
incorrect: update the docstring (Raises) to reflect that the function raises
HTTPException (wrapping IO/OSError) and then convert store_transcript to async
to avoid blocking async handlers — change its signature to async def
store_transcript(...), perform file writes with an async library (e.g.,
aiofiles) or run blocking I/O in asyncio.to_thread, and keep the same error
handling (catch IOError/OSError, log, create
InternalServerErrorResponse.generic(), and raise HTTPException from the error);
finally update store_query_results to await store_transcript so callers use the
new async API.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f754544 and 64d98ff.

📒 Files selected for processing (18)
  • src/app/endpoints/a2a.py
  • src/app/endpoints/query.py
  • src/app/endpoints/rlsapi_v1.py
  • src/app/endpoints/streaming_query.py
  • src/models/responses.py
  • src/utils/prompts.py
  • src/utils/query.py
  • src/utils/responses.py
  • src/utils/transcripts.py
  • src/utils/types.py
  • tests/integration/endpoints/test_query_integration.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_query.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/unit/utils/test_prompts.py
  • tests/unit/utils/test_query.py
  • tests/unit/utils/test_responses.py
  • tests/unit/utils/test_transcripts.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/app/endpoints/a2a.py
  • src/app/endpoints/rlsapi_v1.py
  • src/utils/types.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_query.py

@asimurka asimurka force-pushed the utility_refactoring branch from 64d98ff to 0611cc1 Compare February 23, 2026 13:59
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/utils/query.py (1)

236-250: ⚠️ Potential issue | 🟡 Minor

Stale docstring: references removed configuration parameter.

Line 244 still documents configuration: Application configuration but this parameter was removed from the function signature.

📝 Proposed fix
     Args:
         user_id: The authenticated user ID
         conversation_id: The conversation ID
         model: The model identifier
         started_at: ISO formatted timestamp when the request started
         completed_at: ISO formatted timestamp when the request completed
         summary: Summary of the turn including LLM response and tool calls
         query_request: The original query request
-        configuration: Application configuration
         skip_userid_check: Whether to skip user ID validation
         topic_summary: Optional topic summary for the conversation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/query.py` around lines 236 - 250, The docstring parameter list is
stale: it still documents `configuration` although that parameter was removed
from the function signature; update the docstring in src/utils/query.py so the
Args section matches the actual signature (remove `configuration: Application
configuration` and any references to it), ensure the remaining parameters
(`user_id`, `conversation_id`, `model`, `started_at`, `completed_at`, `summary`,
`query_request`, `skip_userid_check`, `topic_summary`) are correctly described,
and keep the Raises section accurate for `HTTPException` or other errors.
♻️ Duplicate comments (4)
src/utils/transcripts.py (1)

79-80: Docstring Raises section is inaccurate.

The function catches IOError/OSError and raises HTTPException (line 96), but the docstring still documents IOError, OSError as the raised exceptions.

📝 Proposed fix
     Raises:
-        IOError, OSError: If writing the transcript file to disk fails.
+        HTTPException: If writing the transcript file to disk fails (wraps IOError/OSError).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/transcripts.py` around lines 79 - 80, Update the docstring for the
function in src/utils/transcripts.py that writes the transcript to disk so its
Raises section reflects the actual exception thrown (HTTPException) instead of
listing IOError/OSError; specifically mention that the function raises
fastapi.HTTPException (or HTTPException) when file write fails and include the
relevant status code/context, and remove or replace the outdated IOError/OSError
entries so the docstring matches the behavior where the code catches OS errors
and re-raises an HTTPException (see the raise HTTPException call in the
function).
src/utils/query.py (1)

91-110: Previous review's Optional[str] and override-logic concerns were noted.

The Optional[str] usage on lines 92-93 was flagged in a previous review. The override detection logic at line 107 intentionally allows bare model names without "/" to skip the MODEL_OVERRIDE check, which the docstring explains is by design for the Responses API "provider/model" format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/query.py` around lines 91 - 110, Change the Optional[str] type
annotations on validate_model_provider_override's parameters to modern union
syntax (model: str | None, provider: str | None) for consistency with
authorized_actions, and keep the override detection logic using has_override =
provider is not None or (model is not None and "/" in model) unchanged; ensure
the function still raises HTTPException when has_override is true and
Action.MODEL_OVERRIDE not in authorized_actions (referencing
validate_model_provider_override, model, provider, has_override, and
Action.MODEL_OVERRIDE).
tests/unit/utils/test_responses.py (1)

67-80: ⚠️ Potential issue | 🟡 Minor

Ensure MockContentPart.type is always set.
When part_type, text, and refusal are all falsy, type remains unset and may raise AttributeError in code that expects it.

🛠️ Suggested fix
     def __init__(
         self,
         text: Optional[str] = None,
         refusal: Optional[str] = None,
         part_type: Optional[str] = None,
     ) -> None:
         self.text = text
         self.refusal = refusal
         if part_type:
             self.type = part_type
         elif text:
             self.type = "output_text"
         elif refusal:
             self.type = "refusal"
+        else:
+            self.type = "unknown"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_responses.py` around lines 67 - 80, The constructor for
MockContentPart leaves self.type unset when part_type, text, and refusal are all
falsy; update the __init__ of MockContentPart to always set self.type by adding
a final else that assigns a sensible default (e.g., "empty" or "unknown") so any
code reading MockContentPart.type never raises AttributeError.
src/utils/responses.py (1)

101-107: 🛠️ Refactor suggestion | 🟠 Major

Use PEP 604 unions instead of Optional[...] throughout the file.
The repo guidelines require modern T | None syntax; this file contains 30+ instances of Optional[...] that need refactoring. Python 3.12+ fully supports this syntax.

Examples to update:

  • Line 103-107: prepare_tools function parameters
  • Line 448: extract_token_usage function parameter
  • Plus 25+ additional instances throughout the module
♻️ Suggested refactor pattern
 async def prepare_tools(
     client: AsyncLlamaStackClient,
-    vector_store_ids: Optional[list[str]],
-    no_tools: Optional[bool],
+    vector_store_ids: list[str] | None,
+    no_tools: bool | None,
     token: str,
-    mcp_headers: Optional[McpHeaders] = None,
-) -> Optional[list[dict[str, Any]]]:
+    mcp_headers: McpHeaders | None = None,
+) -> list[dict[str, Any]] | None:

-def extract_token_usage(usage: Optional[ResponseUsage], model: str) -> TokenCounter:
+def extract_token_usage(usage: ResponseUsage | None, model: str) -> TokenCounter:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/responses.py` around lines 101 - 107, The type hints in this module
use legacy Optional[...] forms; replace them with PEP 604 union syntax (T |
None) across the file (e.g., change prepare_tools signature parameters client:
AsyncLlamaStackClient, vector_store_ids: Optional[list[str]], no_tools:
Optional[bool], token: str, mcp_headers: Optional[McpHeaders] to use
AsyncLlamaStackClient, vector_store_ids: list[str] | None, no_tools: bool |
None, mcp_headers: McpHeaders | None) and do the same for extract_token_usage
and the other ~30 occurrences; remove or adjust any unnecessary Optional imports
(or keep typing imports used elsewhere) and ensure all annotations use X | None
consistently while preserving existing generic types like list[str] and
dict[str, Any].
🧹 Nitpick comments (2)
src/utils/prompts.py (1)

3-3: Use str | None instead of Optional[str].

The coding guidelines require modern union syntax. Optional[str] on line 11 should be str | None, and the from typing import Optional import can then be removed.

♻️ Proposed fix
-from typing import Optional
 from fastapi import HTTPException
 ...
-def get_system_prompt(system_prompt: Optional[str]) -> str:
+def get_system_prompt(system_prompt: str | None) -> str:

As per coding guidelines: "Use modern Union syntax str | int for type unions instead of Union[str, int]".

Also applies to: 11-11

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/prompts.py` at line 3, Replace the legacy typing.Optional usage
with modern union syntax: change any type annotations using Optional[str] to str
| None (e.g., function parameters or return annotations that currently use
Optional[str]) and remove the now-unneeded "from typing import Optional" import;
update all occurrences in src/utils/prompts.py so annotations use the pipe union
form.
src/app/endpoints/query.py (1)

258-270: Avoid duplicating deduplicate_referenced_documents.
There’s an identical helper in utils.responses; reusing it avoids drift and keeps behavior consistent.

♻️ Suggested refactor
-from utils.responses import (
-    build_turn_summary,
-    extract_vector_store_ids_from_tools,
-    get_topic_summary,
-    prepare_responses_params,
-)
+from utils.responses import (
+    build_turn_summary,
+    deduplicate_referenced_documents,
+    extract_vector_store_ids_from_tools,
+    get_topic_summary,
+    prepare_responses_params,
+)

- def deduplicate_referenced_documents(
-     docs: list[ReferencedDocument],
- ) -> list[ReferencedDocument]:
-     """Remove duplicate referenced documents based on URL and title."""
-     seen: set[tuple[str | None, str | None]] = set()
-     out: list[ReferencedDocument] = []
-     for d in docs:
-         key = (str(d.doc_url) if d.doc_url else None, d.doc_title)
-         if key in seen:
-             continue
-         seen.add(key)
-         out.append(d)
-     return out
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/query.py` around lines 258 - 270, The
deduplicate_referenced_documents function here duplicates logic from
utils.responses; remove this local copy and import the single implementation
from utils.responses instead. Replace the local function definition with an
import of deduplicate_referenced_documents from utils.responses and update any
local references to call that imported symbol (ensure function signature and
return types match usage in this module). Keep only the shared implementation in
utils.responses to avoid behavioral drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/prompts.py`:
- Around line 28-31: The function docstring in src/utils/prompts.py is stale:
update the parameter list to match the current signature by removing references
to the removed `config (AppConfig)` parameter and replace any remaining
`query_request` mentions with `system_prompt`; specifically edit the docstring
for the function that takes `system_prompt` (and optional other current params)
so the parameter descriptions accurately describe `system_prompt
(Optional[str])` and any other existing args, and remove the obsolete `config`
entry so the docs mirror the code.

In `@src/utils/responses.py`:
- Around line 184-188: The branch that builds model only when both
query_request.model and query_request.provider are present ignores valid
overrides where query_request.model already contains the provider (e.g.,
"provider/model"); update the logic in the response selection flow to first
validate and accept a provider-qualified model string (use
validate_model_provider_override to parse/validate query_request.model) and, if
valid, set model to that value; otherwise fall back to combining
query_request.provider + "/" + query_request.model when both fields are
provided, and only call select_model_for_responses(client, user_conversation)
when no valid override is present (adjust code around the current model
assignment where select_model_for_responses is invoked).

In `@src/utils/transcripts.py`:
- Around line 151-157: The create_transcript function currently hardcodes
Transcript fields query_is_valid=True and truncated=False; update
create_transcript to accept two new parameters (e.g., query_is_valid: bool =
True, truncated: bool = False) or implement detection logic, then pass those
parameter values into the Transcript constructor instead of the hardcoded
literals; modify the create_transcript signature and any callers to provide
appropriate values (or compute them inside create_transcript) and ensure the
returned Transcript uses the query_is_valid and truncated variables rather than
True/False constants.
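The parameterization suggested for `create_transcript` could look like this; the `Transcript` model is reduced to a stand-in with only the fields in question:

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Stand-in for the project's Transcript model (reduced)."""

    query: str
    query_is_valid: bool
    truncated: bool


def create_transcript(
    query: str,
    query_is_valid: bool = True,
    truncated: bool = False,
) -> Transcript:
    # Callers pass real values instead of the previously hardcoded literals,
    # while the defaults preserve the old behaviour.
    return Transcript(query=query, query_is_valid=query_is_valid, truncated=truncated)
```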

---

Outside diff comments:
In `@src/utils/query.py`:
- Around line 236-250: The docstring parameter list is stale: it still documents
`configuration` although that parameter was removed from the function signature;
update the docstring in src/utils/query.py so the Args section matches the
actual signature (remove `configuration: Application configuration` and any
references to it), ensure the remaining parameters (`user_id`,
`conversation_id`, `model`, `started_at`, `completed_at`, `summary`,
`query_request`, `skip_userid_check`, `topic_summary`) are correctly described,
and keep the Raises section accurate for `HTTPException` or other errors.

---

Duplicate comments:
In `@src/utils/query.py`:
- Around line 91-110: Change the Optional[str] type annotations on
validate_model_provider_override's parameters to modern union syntax (model: str
| None, provider: str | None) for consistency with authorized_actions, and keep
the override detection logic using has_override = provider is not None or (model
is not None and "/" in model) unchanged; ensure the function still raises
HTTPException when has_override is true and Action.MODEL_OVERRIDE not in
authorized_actions (referencing validate_model_provider_override, model,
provider, has_override, and Action.MODEL_OVERRIDE).

In `@src/utils/responses.py`:
- Around line 101-107: The type hints in this module use legacy Optional[...]
forms; replace them with PEP 604 union syntax (T | None) across the file (e.g.,
change prepare_tools signature parameters client: AsyncLlamaStackClient,
vector_store_ids: Optional[list[str]], no_tools: Optional[bool], token: str,
mcp_headers: Optional[McpHeaders] to use AsyncLlamaStackClient,
vector_store_ids: list[str] | None, no_tools: bool | None, mcp_headers:
McpHeaders | None) and do the same for extract_token_usage and the other ~30
occurrences; remove or adjust any unnecessary Optional imports (or keep typing
imports used elsewhere) and ensure all annotations use X | None consistently
while preserving existing generic types like list[str] and dict[str, Any].

In `@src/utils/transcripts.py`:
- Around line 79-80: Update the docstring for the function in
src/utils/transcripts.py that writes the transcript to disk so its Raises
section reflects the actual exception thrown (HTTPException) instead of listing
IOError/OSError; specifically mention that the function raises
fastapi.HTTPException (or HTTPException) when file write fails and include the
relevant status code/context, and remove or replace the outdated IOError/OSError
entries so the docstring matches the behavior where the code catches OS errors
and re-raises an HTTPException (see the raise HTTPException call in the
function).

In `@tests/unit/utils/test_responses.py`:
- Around line 67-80: The constructor for MockContentPart leaves self.type unset
when part_type, text, and refusal are all falsy; update the __init__ of
MockContentPart to always set self.type by adding a final else that assigns a
sensible default (e.g., "empty" or "unknown") so any code reading
MockContentPart.type never raises AttributeError.

---

Nitpick comments:
In `@src/app/endpoints/query.py`:
- Around line 258-270: The deduplicate_referenced_documents function here
duplicates logic from utils.responses; remove this local copy and import the
single implementation from utils.responses instead. Replace the local function
definition with an import of deduplicate_referenced_documents from
utils.responses and update any local references to call that imported symbol
(ensure function signature and return types match usage in this module). Keep
only the shared implementation in utils.responses to avoid behavioral drift.
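
A minimal sketch of what the shared helper might look like, assuming documents are deduplicated by URL while preserving first-seen order; the real `ReferencedDocument` model and its field names may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencedDocument:
    doc_url: str
    doc_title: str


def deduplicate_referenced_documents(
    docs: list[ReferencedDocument],
) -> list[ReferencedDocument]:
    """Drop duplicate documents, keeping the first occurrence of each URL."""
    seen: set[str] = set()
    unique: list[ReferencedDocument] = []
    for doc in docs:
        if doc.doc_url not in seen:
            seen.add(doc.doc_url)
            unique.append(doc)
    return unique
```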

In `@src/utils/prompts.py`:
- Line 3: Replace the legacy typing.Optional usage with modern union syntax:
change any type annotations using Optional[str] to str | None (e.g., function
parameters or return annotations that currently use Optional[str]) and remove
the now-unneeded "from typing import Optional" import; update all occurrences in
src/utils/prompts.py so annotations use the pipe union form.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 64d98ff and 0611cc1.

📒 Files selected for processing (18)
  • src/app/endpoints/a2a.py
  • src/app/endpoints/query.py
  • src/app/endpoints/rlsapi_v1.py
  • src/app/endpoints/streaming_query.py
  • src/models/responses.py
  • src/utils/prompts.py
  • src/utils/query.py
  • src/utils/responses.py
  • src/utils/transcripts.py
  • src/utils/types.py
  • tests/integration/endpoints/test_query_integration.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_query.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/unit/utils/test_prompts.py
  • tests/unit/utils/test_query.py
  • tests/unit/utils/test_responses.py
  • tests/unit/utils/test_transcripts.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/models/responses.py
  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • src/utils/types.py

@asimurka asimurka force-pushed the utility_refactoring branch from 0611cc1 to c87c7fd on February 23, 2026 at 14:38
)


def parse_referenced_docs(
@asimurka (author) commented:
Renamed and moved to utils

raise HTTPException(**error_response.model_dump()) from e

# Process OpenAI response format
for output_item in response.output:
@asimurka (author) commented:
Extracted to function - reusable

"""
# these two normalizations are required by Snyk as it detects
# this Path sanitization pattern
hashed_user_id = _hash_user_id(user_id)
@asimurka (author) commented:
Hash only once, then pass by argument

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (4)
src/utils/prompts.py (1)

10-31: ⚠️ Potential issue | 🟡 Minor

Docstring still references removed config and query_request parameters.

The docstring no longer matches the signature and omits the 422 exception behavior. Please update the Parameters text and add a Raises section.

📝 Suggested fix
-    2. The `custom_profile`'s "default" prompt (when present), accessed via
-       `config.customization.custom_profile.get_prompts().get("default")`.
-    3. `config.customization.system_prompt` from application configuration.
+    2. The `custom_profile`'s "default" prompt (when present), accessed via
+       `configuration.customization.custom_profile.get_prompts().get("default")`.
+    3. `configuration.customization.system_prompt` from application configuration.
@@
-    (config.customization.disable_query_system_prompt) and the incoming
-    `query_request` contains a `system_prompt`, an HTTP 422 Unprocessable
+    (configuration.customization.disable_query_system_prompt) and the incoming
+    `system_prompt` is provided, an HTTP 422 Unprocessable
     Entity is raised instructing the client to remove the field.
@@
     Parameters:
         system_prompt (str | None): The incoming system prompt; may contain a
         per-request `system_prompt`.
-        config (AppConfig): Application configuration which may include
-        customization flags, a custom profile, and a default `system_prompt`.
+
+    Raises:
+        HTTPException: If per-request system prompts are disabled and one is provided.
As per coding guidelines: Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/prompts.py` around lines 10 - 31, Update the get_system_prompt
function docstring to match its signature and behavior: remove references to
removed parameters `config` and `query_request`, update the Parameters section
to only document `system_prompt: str | None`, add a Raises section describing
that an HTTPException (422 Unprocessable Entity) is raised when per-request
system prompts are disabled and a `system_prompt` is provided, and add a Returns
section describing the returned system prompt string; keep wording following
Google Python docstring conventions and reference the function name
`get_system_prompt` so reviewers can locate the change.
src/utils/types.py (1)

202-225: ⚠️ Potential issue | 🟡 Minor

Add Attributes sections to the new transcript model docstrings.

The new classes have minimal docstrings but lack required Attributes documentation.

📝 Suggested fix
 class TranscriptMetadata(BaseModel):
-    """Metadata for a transcript entry."""
+    """Metadata for a transcript entry.
+
+    Attributes:
+        provider: LLM provider used for the response (if any).
+        model: Model identifier used for the response.
+        query_provider: Provider explicitly requested (if any).
+        query_model: Model explicitly requested (if any).
+        user_id: User identifier.
+        conversation_id: Conversation identifier.
+        timestamp: ISO 8601 timestamp for the transcript entry.
+    """
@@
 class Transcript(BaseModel):
-    """Model representing a transcript entry to be stored."""
+    """Model representing a transcript entry to be stored.
+
+    Attributes:
+        metadata: Transcript metadata.
+        redacted_query: Redacted user query.
+        query_is_valid: Whether the query passed validation.
+        llm_response: LLM response text.
+        rag_chunks: RAG chunks used.
+        truncated: Whether the response was truncated.
+        attachments: Attachment metadata.
+        tool_calls: Tool calls made.
+        tool_results: Tool results.
+    """
As per coding guidelines: Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/types.py` around lines 202 - 225, Update the docstrings for the
TranscriptMetadata and Transcript classes to include an "Attributes" section
following Google Python docstring conventions: list each field name with its
type and a one-line description (e.g., for TranscriptMetadata: provider: str |
None — provider name; model: str — model identifier; query_provider,
query_model, user_id, conversation_id, timestamp, etc.; and for Transcript:
metadata: TranscriptMetadata — metadata for the entry; redacted_query: str;
query_is_valid: bool; llm_response: str; rag_chunks, truncated, attachments,
tool_calls, tool_results, etc.), ensuring each attribute is documented and the
docstrings remain concise and accurate.
src/utils/transcripts.py (1)

71-96: ⚠️ Potential issue | 🟠 Major

Avoid synchronous file I/O in the async request flow.

store_transcript performs blocking disk writes and is invoked from async endpoints via store_query_results, which can stall the event loop under load. Prefer async def with aiofiles or asyncio.to_thread, and await it from callers.

#!/bin/bash
# Locate store_transcript and its call chain to confirm async contexts.
rg -n "def store_transcript|store_transcript\(" src -g '*.py'
rg -n "def store_query_results|store_query_results\(" src -g '*.py' -C 2
rg -n "^async def" src/app/endpoints -g '*.py'

As per coding guidelines: Use async def for I/O operations and external API calls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/transcripts.py` around lines 71 - 96, store_transcript is
synchronous and performs blocking file I/O; convert it to an async function
(async def store_transcript) and perform the disk write using aiofiles (or run
the existing sync logic in asyncio.to_thread) so callers like
store_query_results can await it without blocking the event loop. Keep the same
error handling semantics: catch IOErrors/OSErrors, log the file path and
exception, construct the same InternalServerErrorResponse and raise
HTTPException from the caught exception. Update all call sites (e.g.,
store_query_results and any direct callers found via rg) to await the new async
store_transcript and ensure imports (aiofiles or asyncio) are added and
typing/annotations adjusted accordingly.
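
The `asyncio.to_thread` variant suggested above can be sketched as follows. The payload shape is hypothetical, and `RuntimeError` stands in for the `fastapi.HTTPException` the real code raises; only the pattern of offloading the blocking write is the point.

```python
import asyncio
import json
from pathlib import Path


def _write_transcript_sync(path: Path, payload: dict) -> None:
    """Blocking write, kept as a plain function so it can run in a worker thread."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload), encoding="utf-8")


async def store_transcript(path: Path, payload: dict) -> None:
    """Persist a transcript without blocking the event loop."""
    try:
        await asyncio.to_thread(_write_transcript_sync, path, payload)
    except OSError as e:
        # The real code wraps this in an HTTPException(500, ...) instead.
        raise RuntimeError(f"Failed to store transcript at {path}") from e
```

Callers such as `store_query_results` would then `await store_transcript(...)` from their async context.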
tests/unit/utils/test_responses.py (1)

67-80: ⚠️ Potential issue | 🟡 Minor

Ensure MockContentPart.type is always initialized.

If part_type, text, and refusal are all falsy, self.type is never set, which can raise AttributeError when production code accesses part.type.

🐛 Proposed fix
         if part_type:
             self.type = part_type
         elif text:
             self.type = "output_text"
         elif refusal:
             self.type = "refusal"
+        else:
+            self.type = "unknown"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/utils/test_responses.py` around lines 67 - 80, The constructor for
MockContentPart (the __init__ method) can leave self.type unset when part_type,
text, and refusal are all falsy; update MockContentPart.__init__ to always
initialize self.type by adding a final else branch that sets self.type to a
sensible default (e.g., None or "unknown") so the attribute always exists even
when no inputs are provided.
🧹 Nitpick comments (1)
src/utils/responses.py (1)

967-987: extract_text_from_output_items joins fragments with a plain space — may produce awkward output for multi-part responses.

If individual output items already contain trailing punctuation or newlines, joining them with " " can produce text like "First sentence. Second paragraph." collapsed into a single line. Consider "\n\n".join(...) or stripping/normalising each fragment before joining.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/responses.py` around lines 967 - 987, The function
extract_text_from_output_items currently joins fragments with a single space
which can collapse sentences and paragraphs; update
extract_text_from_output_items to normalize each fragment (strip
leading/trailing whitespace, collapse internal multiple newlines to a single
newline or spaces as appropriate) and then join fragments with a clearer
separator such as "\n\n" (or choose a separator based on whether fragments
contain newlines), using extract_text_from_output_item to obtain each fragment
before normalization so multi-part responses preserve paragraph boundaries and
avoid awkward single-line output.
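
The normalization the comment describes can be sketched like this, assuming each output item exposes a `text` attribute; the real `extract_text_from_output_item` helper handles more content-part shapes.

```python
def extract_text_from_output_items(output_items: list) -> str:
    """Join per-item text fragments while preserving paragraph boundaries."""
    fragments = []
    for item in output_items:
        text = getattr(item, "text", "") or ""
        normalized = text.strip()  # drop stray leading/trailing whitespace
        if normalized:
            fragments.append(normalized)
    # "\n\n" keeps multi-part responses readable instead of collapsing
    # everything onto a single space-joined line.
    return "\n\n".join(fragments)
```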
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 307-311: The call to get_mcp_tools() is dropping MCP auth headers;
update the call site in rlsapi_v1 (where infer_request is available) to pass the
request's auth token or MCP headers so MCP authorization is preserved (e.g.,
call get_mcp_tools(infer_request.auth.token) or
get_mcp_tools(infer_request.mcp_headers)), and ensure get_mcp_tools still
accepts a missing/None token to preserve current fallback behavior; change only
the call at the location that currently invokes get_mcp_tools() and keep
logging/usage of mcp_tools unchanged.

In `@src/models/responses.py`:
- Around line 1756-1769: In NotFoundResponse.__init__, the cause message when
resource_id is None is hardcoded to "No model is configured"; update the logic
so it only uses that model-specific text when resource == "model" (check the
resource variable), otherwise set a generic fallback like f"{resource.title()}
is not configured" (assign to cause), leaving the existing branch for
resource_id != None intact; adjust the cause variable in the __init__ method
accordingly.

In `@src/utils/responses.py`:
- Around line 184-192: The code currently calls client.models.list() twice when
selecting a fallback model: once inside select_model_for_responses and again
inside check_model_configured; fix by avoiding the redundant second fetch —
either (A) change select_model_for_responses to return both the chosen model id
and the fetched models list and accept/pass that list into
check_model_configured so it can validate without calling client.models.list()
again, or (B) mark the fallback path as already-validated and skip calling
check_model_configured after select_model_for_responses (only call
check_model_configured when model/provider came from query_request); update
references to select_model_for_responses and check_model_configured accordingly
and keep extract_provider_and_model_from_model_id usage for error construction.
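
Option (A) can be sketched as below. The client and its `models_list` method are hypothetical stand-ins for `AsyncLlamaStackClient`; the point is that the model list is fetched once and reused for validation.

```python
import asyncio


async def select_model_for_responses(client) -> tuple[str, list[str]]:
    """Pick a fallback model and return the fetched list for reuse."""
    models = await client.models_list()  # single fetch
    if not models:
        raise ValueError("No model is configured")
    return models[0], models


def check_model_configured(model_id: str, models: list[str]) -> None:
    """Validate against an already-fetched list instead of re-listing."""
    if model_id not in models:
        raise ValueError(f"Model {model_id} is not configured")


class FakeClient:
    async def models_list(self):
        return ["provider/model-a", "provider/model-b"]
```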

---

Duplicate comments:
In `@src/utils/prompts.py`:
- Around line 10-31: Update the get_system_prompt function docstring to match
its signature and behavior: remove references to removed parameters `config` and
`query_request`, update the Parameters section to only document `system_prompt:
str | None`, add a Raises section describing that an HTTPException (422
Unprocessable Entity) is raised when per-request system prompts are disabled and
a `system_prompt` is provided, and add a Returns section describing the returned
system prompt string; keep wording following Google Python docstring conventions
and reference the function name `get_system_prompt` so reviewers can locate the
change.

In `@src/utils/transcripts.py`:
- Around line 71-96: store_transcript is synchronous and performs blocking file
I/O; convert it to an async function (async def store_transcript) and perform
the disk write using aiofiles (or run the existing sync logic in
asyncio.to_thread) so callers like store_query_results can await it without
blocking the event loop. Keep the same error handling semantics: catch
IOErrors/OSErrors, log the file path and exception, construct the same
InternalServerErrorResponse and raise HTTPException from the caught exception.
Update all call sites (e.g., store_query_results and any direct callers found
via rg) to await the new async store_transcript and ensure imports (aiofiles or
asyncio) are added and typing/annotations adjusted accordingly.

In `@src/utils/types.py`:
- Around line 202-225: Update the docstrings for the TranscriptMetadata and
Transcript classes to include an "Attributes" section following Google Python
docstring conventions: list each field name with its type and a one-line
description (e.g., for TranscriptMetadata: provider: str | None — provider name;
model: str — model identifier; query_provider, query_model, user_id,
conversation_id, timestamp, etc.; and for Transcript: metadata:
TranscriptMetadata — metadata for the entry; redacted_query: str;
query_is_valid: bool; llm_response: str; rag_chunks, truncated, attachments,
tool_calls, tool_results, etc.), ensuring each attribute is documented and the
docstrings remain concise and accurate.

In `@tests/unit/utils/test_responses.py`:
- Around line 67-80: The constructor for MockContentPart (the __init__ method)
can leave self.type unset when part_type, text, and refusal are all falsy;
update MockContentPart.__init__ to always initialize self.type by adding a final
else branch that sets self.type to a sensible default (e.g., None or "unknown")
so the attribute always exists even when no inputs are provided.

---

Nitpick comments:
In `@src/utils/responses.py`:
- Around line 967-987: The function extract_text_from_output_items currently
joins fragments with a single space which can collapse sentences and paragraphs;
update extract_text_from_output_items to normalize each fragment (strip
leading/trailing whitespace, collapse internal multiple newlines to a single
newline or spaces as appropriate) and then join fragments with a clearer
separator such as "\n\n" (or choose a separator based on whether fragments
contain newlines), using extract_text_from_output_item to obtain each fragment
before normalization so multi-part responses preserve paragraph boundaries and
avoid awkward single-line output.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0611cc1 and c87c7fd.

📒 Files selected for processing (18)
  • src/app/endpoints/a2a.py
  • src/app/endpoints/query.py
  • src/app/endpoints/rlsapi_v1.py
  • src/app/endpoints/streaming_query.py
  • src/models/responses.py
  • src/utils/prompts.py
  • src/utils/query.py
  • src/utils/responses.py
  • src/utils/transcripts.py
  • src/utils/types.py
  • tests/integration/endpoints/test_query_integration.py
  • tests/unit/app/endpoints/test_a2a.py
  • tests/unit/app/endpoints/test_query.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • tests/unit/utils/test_prompts.py
  • tests/unit/utils/test_query.py
  • tests/unit/utils/test_responses.py
  • tests/unit/utils/test_transcripts.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/app/endpoints/test_query.py

return " ".join(text_fragments)


def deduplicate_referenced_documents(
@asimurka (author) commented:
Replaced from query endpoint


@tisnik tisnik left a comment

Approved

@tisnik tisnik merged commit ec2f65d into lightspeed-core:main Feb 24, 2026
19 of 21 checks passed
