[feat] Extend `loadables` by jp-agenta · Pull Request #3811 · Agenta-AI/agenta

jp-agenta · 2026-02-23T19:36:35Z

PR: Loadables Retrieval Alignment

Summary

This PR aligns loadables retrieval behavior across testsets/queries, removes stale design drift in docs, and makes the traces router a first-class traces API (no tracing-shaped response types in traces endpoints).

Change Inventory

API: Testsets

Fixed _populate_testcases(...) call-site argument binding bugs by switching to keyword arguments.
Added deterministic ID-level windowing support for testset revision retrieval (order, next, limit) when enumerating testcase_ids.
Kept revision retrieval semantics where include_testcases=true returns both testcases and testcase_ids.
Updated /preview/testsets/revisions/retrieve caching policy to cache only when include_testcases=false.
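
The ID-level windowing described above can be illustrated with a minimal sketch. It is not taken from the PR: the function name `window_ids`, the parameter `next_id`, and the reading of `next` as an exclusive cursor (the last ID already seen) are all assumptions made for illustration.

```python
from typing import List, Optional


def window_ids(
    ids: List[str],
    order: str = "ascending",
    next_id: Optional[str] = None,  # assumed: exclusive cursor, last ID already seen
    limit: Optional[int] = None,
) -> List[str]:
    """Return one deterministic window over a list of IDs (hypothetical helper)."""
    descending = order == "descending"
    # Sorting by ID makes the result independent of storage order.
    ordered = sorted(ids, reverse=descending)
    if next_id is not None:
        if descending:
            ordered = [i for i in ordered if i < next_id]
        else:
            ordered = [i for i in ordered if i > next_id]
    if limit is not None:
        ordered = ordered[:limit]
    return ordered


ids = ["c", "a", "b", "d"]
first_page = window_ids(ids, limit=2)
second_page = window_ids(ids, next_id=first_page[-1], limit=2)
```

Under these assumptions, passing the last ID of one page as `next_id` yields the following page with no overlap, regardless of the order in which the IDs were stored.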

API: Queries

Extended query revision population to merge request pagination (next, limit) into stored windowing.
Kept revision retrieval semantics where trace expansion can return both traces and trace_ids.
Updated /preview/queries/revisions/retrieve caching policy to cache only when both include_trace_ids=false and include_traces=false.
Added permission coupling: query revision retrieve with trace expansion requires trace-view permission in addition to query-view permission.

API: Traces Router

Introduced traces-specific DTOs:
- TraceResponse
- TracesResponse
- TracesQueryRequest
Removed formatting from TracesQueryRequest.
Forced /preview/traces/query to always return Agenta trace trees (never spans/opentelemetry formatting).
Removed external TracingQuery request contract from TracesRouter.query_traces; traces endpoint now consumes only TracesQueryRequest.
Added query permission coupling for ref-dereferenced traces query (query_ref, query_variant_ref, query_revision_ref).
Added native traces router fetch handler (GET /preview/traces/{trace_id}) returning TraceResponse.

Docs

Updated docs/designs/loadables/loadables.querying.strategies.md to include:
- include-flag defaults
- conditional caching behavior
- permission coupling notes
- windowing.next terminology
Updated docs/designs/loadables/loadables.initial.specs.md examples from cursor to next.
Removed redundant docs/designs/loadables/loadables.querying.gap-analysis.md after consolidating its useful content into the strategies document.

Behavior Summary

Revision endpoints remain the control surface for ID/item expansion.
Record endpoints (/preview/testcases, /preview/traces) remain record-returning endpoints without extra top-level ID arrays.
Traces router now has a clean traces-only contract and response types.

Validation

Formatting/lint:
- cd api && ruff format && ruff check --fix
Targeted e2e:
- pytest -q oss/tests/pytest/e2e/tracing/test_traces_basics.py oss/tests/pytest/e2e/loadables/test_loadable_strategies.py
- Result: 27 passed
Broader e2e:
- pytest -q oss/tests/pytest/e2e/testsets/test_testsets_basics.py oss/tests/pytest/e2e/testsets/test_testsets_queries.py oss/tests/pytest/e2e/testsets/test_testcases_basics.py oss/tests/pytest/e2e/tracing/test_spans_basics.py oss/tests/pytest/e2e/tracing/test_spans_queries.py
- Result: 17 passed, 3 skipped (existing flaky skips)
Full e2e suite:
- pytest -q oss/tests/pytest/e2e
- Result: 175 passed, 3 skipped

vercel · 2026-02-23T19:36:40Z

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Feb 27, 2026 10:14am

jp-agenta · 2026-02-23T20:08:51Z

@ardaerzin the files that matter most:

docs/designs/loadables/loadables.querying.strategies.md
docs/designs/loadables/loadables.endpoints.specs.md

devin-ai-integration

Devin Review found 3 potential issues.

View 8 additional findings in Devin Review.

api/oss/src/core/queries/service.py

api/oss/src/apis/fastapi/testsets/router.py

devin-ai-integration · 2026-02-23T20:13:13Z

api/oss/src/apis/fastapi/tracing/router.py

+        self.router.add_api_route(
+            "/{trace_id}",
+            self.fetch_trace,
+            methods=["GET"],
+            operation_id="fetch_trace",
+            status_code=status.HTTP_200_OK,
+            response_model=TraceResponse,
+            response_model_exclude_none=True,
+        )


🟡 TracesRouter registers GET / and GET /{trace_id} before POST /query, causing GET /query to be routed to fetch_trace handler

In TracesRouter.__init__, the route /{trace_id} (GET) is registered at line 977 before /query (POST) at line 987. In FastAPI, /{trace_id} is a path-parameter route that matches any single path segment.

Route Shadowing Details

While /query and /ingest are POST endpoints (so they don't conflict with the GET /{trace_id}), there's a subtle issue: if a client mistakenly sends GET /preview/traces/query or GET /preview/traces/ingest, FastAPI will match these against /{trace_id} with trace_id="query" or trace_id="ingest" respectively. The fetch_trace handler will then attempt to parse "query" or "ingest" as a trace UUID, fail, and return a suppressed empty TraceResponse() (due to @suppress_exceptions).

This means route-not-found errors are silently swallowed instead of producing a proper 404 Not Found or 405 Method Not Allowed. More importantly, GET /preview/traces/query silently returns {"count": 0} instead of a 405, which could confuse API consumers.
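
The shadowing described in this finding can be reproduced with a toy first-match route table. This is a simplified model, not Starlette's actual matcher, and `parse_trace_id` is a hypothetical helper showing one possible mitigation (validating the path segment so the handler can reject it explicitly); neither comes from the PR.

```python
import re
import uuid
from typing import Optional

# Toy route table in registration order: (method, path pattern, handler name).
ROUTES = [
    ("GET", re.compile(r"^/(?P<trace_id>[^/]+)$"), "fetch_trace"),
    ("POST", re.compile(r"^/query$"), "query_traces"),
]


def match(method: str, path: str) -> Optional[str]:
    """Return the first handler whose method and pattern both match."""
    for route_method, pattern, handler in ROUTES:
        if route_method == method and pattern.match(path):
            return handler
    return None


def parse_trace_id(raw: str) -> Optional[uuid.UUID]:
    """Return a UUID for a well-formed trace ID, or None otherwise."""
    try:
        return uuid.UUID(raw)
    except ValueError:
        return None


shadowed = match("GET", "/query")    # the path-parameter route captures "query"
intended = match("POST", "/query")   # the POST route is unaffected
```

With a check like `parse_trace_id` in place, a handler could respond with an explicit client error for `"query"` or `"ingest"` rather than an empty success payload.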

github-actions · 2026-02-23T20:22:56Z

Railway Preview Environment


Status	Destroyed (PR converted to draft)

Updated at 2026-02-24T14:18:53.560Z

mmabrouk

Thanks @jp-agenta lgtm

Copilot

Pull request overview

This PR extends the loadables feature by aligning retrieval behavior across testsets and queries, cleaning up the tracing architecture, and introducing clean trace-specific DTOs and endpoints. The changes implement three strategies for retrieving loadable items (by content, by IDs, or by full items with pagination), separate concerns between routers/services/workers, and consolidate tracing type definitions in the SDK.

Changes:

Extended testset and query revision retrieval with include_*_ids and include_* flags for controlled expansion
Introduced dedicated traces router (/preview/traces/*) with trace-specific DTOs (TraceResponse, TracesResponse, TracesQueryRequest)
Refactored tracing utilities from monolithic module into organized subpackage (attributes, parsing, trees, filtering, hashing, simple_traces)
Moved canonical tracing types to SDK (agenta.sdk.models.tracing) and re-exported in API core
Removed the test_observability_trace_lifecycle test
Updated documentation with detailed loadables querying strategies

Reviewed changes

Copilot reviewed 71 out of 74 changed files in this pull request and generated no comments.

File	Description
sdk/agenta/sdk/models/tracing.py	Consolidated tracing type definitions with backward-compatible aliases
api/oss/src/core/tracing/dtos.py	Re-exports SDK tracing types, adds query exceptions
api/oss/src/core/queries/service.py	Added `_populate_traces()` for query revision trace expansion
api/oss/src/apis/fastapi/tracing/models.py	New trace/span-specific request/response models
api/oss/src/core/tracing/utils/*	Split monolithic utils into organized submodules
docs/designs/loadables/*.md	Comprehensive loadables design documentation
Multiple test files	New unit tests for tracing utilities
EE organization/workspace files	Type consolidation into core modules

devin-ai-integration

Devin Review found 2 new potential issues.

View 14 additional findings in Devin Review.

api/oss/src/apis/fastapi/tracing/router.py

devin-ai-integration · 2026-02-27T08:44:40Z

api/oss/src/core/tracing/utils/trees.py

+def get_span_from_trace(trace: Optional[Trace], span_id: str) -> Optional[Span]:
+    if not trace or not trace.spans:
+        return None
+    for span in trace.spans.values():
+        if isinstance(span, list):
+            for item in span:
+                if item and item.span_id == span_id:
+                    return item
+        elif span and span.span_id == span_id:
+            return span
+    return None


🔴 get_span_from_trace only searches top-level spans, missing nested children

The get_span_from_trace function at api/oss/src/core/tracing/utils/trees.py:493-503 only iterates over the top-level trace.spans values. It does not recurse into nested span.spans children. This function is used by SpansRouter.fetch_span at api/oss/src/apis/fastapi/tracing/router.py:970-976 to serve GET /preview/spans/{trace_id}/{span_id}.

Root Cause

The Agenta trace tree format nests child spans inside parent spans via the spans field (e.g., root.spans.child_span). The get_span_from_trace function only checks trace.spans.values() — the top-level entries — and never recurses into span.spans for children:

    def get_span_from_trace(trace, span_id):
        for span in trace.spans.values():  # only top-level
            if isinstance(span, list):
                for item in span:
                    if item and item.span_id == span_id:
                        return item
            elif span and span.span_id == span_id:
                return span
        return None  # child spans are never found

Impact: GET /preview/spans/{trace_id}/{span_id} returns null for any child span (non-root span), even though the span exists in the trace. Only root-level spans can be fetched by this endpoint.

Suggested change

-def get_span_from_trace(trace: Optional[Trace], span_id: str) -> Optional[Span]:
-    if not trace or not trace.spans:
-        return None
-    for span in trace.spans.values():
-        if isinstance(span, list):
-            for item in span:
-                if item and item.span_id == span_id:
-                    return item
-        elif span and span.span_id == span_id:
-            return span
-    return None
+def get_span_from_trace(trace: Optional[Trace], span_id: str) -> Optional[Span]:
+    if not trace or not trace.spans:
+        return None
+
+    def _search(spans_dict):
+        for span in spans_dict.values():
+            if isinstance(span, list):
+                for item in span:
+                    if item and item.span_id == span_id:
+                        return item
+                    if item and hasattr(item, 'spans') and item.spans:
+                        found = _search(item.spans)
+                        if found:
+                            return found
+            else:
+                if span and span.span_id == span_id:
+                    return span
+                if span and hasattr(span, 'spans') and span.spans:
+                    found = _search(span.spans)
+                    if found:
+                        return found
+        return None
+
+    return _search(trace.spans)
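
For comparison, the same nested lookup can be written as an iterative depth-first search, which avoids recursion. The sketch below is self-contained and uses a hypothetical `Span` dataclass standing in for the real model; only the `span_id` and `spans` fields are assumed, matching the shape described in this finding.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Span:
    """Hypothetical stand-in for the real span model."""
    span_id: str
    spans: Optional[Dict[str, Union["Span", List["Span"]]]] = None


def find_span(spans: Optional[dict], span_id: str) -> Optional[Span]:
    """Search a nested spans mapping for span_id using an explicit stack."""
    stack = list(spans.values()) if spans else []
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if isinstance(node, list):
            # A key may map to several sibling spans; visit each of them.
            stack.extend(node)
            continue
        if node.span_id == span_id:
            return node
        if node.spans:
            stack.extend(node.spans.values())
    return None


leaf_a, leaf_b = Span("leaf-a"), Span("leaf-b")
child = Span("child", spans={"calls": [leaf_a, leaf_b]})
root = Span("root", spans={"child": child})
tree = {"root": root}
```

Here `find_span(tree, "leaf-b")` reaches a span two levels below the root, which the top-level-only loop would miss.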

devin-ai-integration

Devin Review found 2 new potential issues.

View 16 additional findings in Devin Review.

devin-ai-integration · 2026-02-27T09:07:03Z

api/oss/src/core/tracing/utils/simple_traces.py

+    }
+
+
+def parse_simple_trace(trace: Optional[OTelTraceTree]) -> Optional[ParsedSimpleTrace]:


🟡 parse_simple_trace receives Trace object but type-annotated and coded for OTelTraceTree (dict), causing AttributeError

After the refactor, fetch_trace returns Optional[Trace] (a Pydantic model with .trace_id and .spans), but parse_simple_trace is annotated as accepting Optional[OTelTraceTree] which is Dict[str, OTelSpansTree]. Internally it calls extract_root_span(trace) which does trace.spans — this works on the Trace model. However, the callers in annotations/service.py and invocations/service.py pass the Trace object directly to parse_simple_trace, while the old code passed an OTelSpansTree (which also had .spans). The type annotation is wrong but the runtime behavior happens to work because both Trace and OTelSpansTree have a .spans attribute.

However, query_traces in annotations/service.py:758 and invocations/service.py:674 returns Traces (a List[Trace]), and the code iterates over it passing each Trace to parse_simple_trace. The old code iterated over list(spans_response.traces.values()) which yielded OTelSpansTree objects. Since Trace extends SpansTree (which is OTelSpansTree), the .spans attribute is present and extract_root_span works correctly at runtime. The type annotation mismatch is misleading but not a runtime bug.

Analysis

This is a type annotation inconsistency rather than a runtime bug. Trace(TraceID, SpansTree) inherits .spans from SpansTree, which is the same as OTelSpansTree. So extract_root_span at simple_traces.py:92-102 will work correctly because trace.spans resolves properly on both types.

Suggested change

-def parse_simple_trace(trace: Optional[OTelTraceTree]) -> Optional[ParsedSimpleTrace]:
+def parse_simple_trace(trace: Optional[Union[OTelTraceTree, "Trace"]]) -> Optional[ParsedSimpleTrace]:
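
An alternative to widening the annotation with a `Union` is structural typing: since the analysis says the function only relies on a `.spans` attribute, a `typing.Protocol` can describe exactly that. The sketch below is illustrative only; `HasSpans`, `first_span`, `LegacyTree`, and `TraceLike` are hypothetical names, not types from the codebase.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Protocol


class HasSpans(Protocol):
    """Anything exposing a `spans` mapping (hypothetical protocol)."""
    spans: Optional[Dict[str, Any]]


@dataclass
class LegacyTree:
    spans: Optional[Dict[str, Any]] = None


@dataclass
class TraceLike:
    trace_id: str
    spans: Optional[Dict[str, Any]] = None


def first_span(tree: Optional[HasSpans]) -> Optional[Any]:
    """Return the first entry of `spans`, or None when there is nothing to read."""
    if tree is None or not tree.spans:
        return None
    return next(iter(tree.spans.values()))


from_legacy = first_span(LegacyTree(spans={"root": "span-1"}))
from_trace = first_span(TraceLike(trace_id="t-1", spans={"root": "span-2"}))
```

Both classes satisfy the protocol without inheriting from it, so a type checker would accept either argument while the annotation still documents what the function actually needs.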

devin-ai-integration · 2026-02-27T09:07:04Z

api/oss/src/apis/fastapi/queries/router.py

+        include_trace_ids_off = (
+            query_revision_retrieve_request.include_trace_ids is not True
        )
+        include_traces_off = query_revision_retrieve_request.include_traces is not True
+        should_cache = include_trace_ids_off and include_traces_off


🟡 Asymmetric caching condition: testset should_cache uses is False while query uses is not True, causing different behavior for None defaults

The testset retrieve caching condition at testsets/router.py:1367-1373 uses is False checks:

    include_testcase_ids_off = (
        testset_revision_retrieve_request.include_testcase_ids is False
    )
    include_testcases_off = (
        testset_revision_retrieve_request.include_testcases is False
    )

The query retrieve caching condition at queries/router.py:965-969 uses is not True checks:

    include_trace_ids_off = (
        query_revision_retrieve_request.include_trace_ids is not True
    )
    include_traces_off = (
        query_revision_retrieve_request.include_traces is not True
    )

Root Cause and Impact

For testsets, is False means caching only happens when the flag is explicitly set to False. When the flag is None (default), None is False evaluates to False, so no caching occurs on default requests. This is correct for testsets since the defaults include data.

For queries, is not True means caching happens when the flag is None (default) or False. None is not True evaluates to True, so caching occurs on default requests. This is correct for queries since the defaults exclude trace data.

While the behavior is intentionally asymmetric to match the different default semantics (testset defaults = include everything; query defaults = include nothing), the code uses two different comparison idioms to express this. This is not a runtime bug but is confusing and fragile — a future maintainer might "fix" one to match the other, breaking caching behavior.
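
One way to express both policies with a single idiom is to resolve each tri-state flag against an explicit default first, then apply the same rule. The sketch below is a hedged illustration, not code from the PR: the helper names are hypothetical, and the defaults (testsets include by default, queries exclude by default) are taken from the description in this finding.

```python
from typing import Optional


def resolve_flag(value: Optional[bool], default: bool) -> bool:
    """Turn a tri-state flag (None / True / False) into a concrete bool."""
    return default if value is None else value


def should_cache_testset_revision(
    include_testcase_ids: Optional[bool] = None,
    include_testcases: Optional[bool] = None,
) -> bool:
    # Assumed default: testset retrieval includes IDs and items unless disabled.
    ids = resolve_flag(include_testcase_ids, default=True)
    items = resolve_flag(include_testcases, default=True)
    return not ids and not items


def should_cache_query_revision(
    include_trace_ids: Optional[bool] = None,
    include_traces: Optional[bool] = None,
) -> bool:
    # Assumed default: query retrieval excludes trace IDs and traces unless enabled.
    ids = resolve_flag(include_trace_ids, default=False)
    items = resolve_flag(include_traces, default=False)
    return not ids and not items
```

This yields the same results as the `is False` and `is not True` checks for every combination of `None`, `True`, and `False`, while keeping the differing defaults visible in one place.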

devin-ai-integration

Devin Review found 1 new potential issue.

View 18 additional findings in Devin Review.

devin-ai-integration · 2026-02-27T10:21:56Z

api/oss/src/core/tracing/service.py

+            return specs_body or []
+        if not specs_body:
+            return specs_params or []
+        return []


🟡 merge_specs silently discards user-provided analytics specs when both params and body are provided

When both specs_params and specs_body are truthy (i.e., the user provides custom metric specs in both query parameters and the request body), merge_specs returns an empty list []. The caller merge_analytics then replaces this empty list with hardcoded defaults, silently ignoring the user's custom spec selections.

Root Cause and Impact

The logic at api/oss/src/core/tracing/service.py:316-322 handles the "both provided" case by returning []:

    if not specs_params:
        return specs_body or []
    if not specs_body:
        return specs_params or []
    return []  # Both provided → empty

Then in merge_analytics at line 339: if not specs: specs = cls.default_analytics_specs() — the empty list triggers the default fallback.

This is inconsistent with merge_queries which prefers body over params when both are present. The expected behavior should be return specs_body (body wins) instead of return [].

Impact: Any client providing analytics metric specs via both query params and request body will have both sets of specs silently ignored and replaced with system defaults. The /preview/tracing/analytics/query endpoint is affected.

Suggested change

-return []
+return specs_body or specs_params or []
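
The effect of the suggested one-liner can be checked in isolation. The sketch below is a standalone approximation: the function bodies and the placeholder default `"default-spec"` are assumptions for illustration, not the service's real implementation or defaults.

```python
from typing import List, Optional


def merge_specs(
    specs_params: Optional[List[str]],
    specs_body: Optional[List[str]],
) -> List[str]:
    """Body wins when both are provided; otherwise use whichever is present."""
    return specs_body or specs_params or []


def with_defaults(specs: List[str]) -> List[str]:
    """Fall back to a placeholder default only when nothing was provided."""
    return specs if specs else ["default-spec"]  # placeholder, not the real defaults


both = with_defaults(merge_specs(["from-params"], ["from-body"]))
only_params = with_defaults(merge_specs(["from-params"], None))
neither = with_defaults(merge_specs(None, None))
```

With this ordering, the default fallback is reached only when neither source supplied specs, so user-provided values are no longer replaced.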

jp-agenta added 6 commits February 22, 2026 13:32

Add initial specs

b02c5ce

Clean up Testsets/Testcases

4597b83

Clean up Queries/Traces

b2580af

cleaning up strategies docs

0d9a295

Add gap analysis and tests

224938b

Fixing loadables tests

87193e0

jp-agenta requested a review from ardaerzin February 23, 2026 19:36

jp-agenta added 3 commits February 23, 2026 21:03

quick CR

ced1a5d

Merge branch 'main' into feat/extend-loadables-in-api

fe3c7c7

Add PR.md

3715bc1

jp-agenta requested a review from mmabrouk February 23, 2026 20:06

jp-agenta marked this pull request as ready for review February 23, 2026 20:07

dosubot bot added the size:XXL, Backend, and documentation labels Feb 23, 2026

vercel bot deployed to Preview February 23, 2026 20:08 View deployment

devin-ai-integration bot reviewed Feb 23, 2026

mmabrouk approved these changes Feb 23, 2026

dosubot bot added the lgtm label Feb 23, 2026

jp-agenta changed the title from "[chore] Extend loadables in api" to "[feat] Extend loadables in api" Feb 24, 2026

jp-agenta marked this pull request as draft February 24, 2026 14:18

jp-agenta added 2 commits February 25, 2026 14:35

some clean-up

62b282a

more clean-up

91d059c

vercel bot deployed to Preview February 25, 2026 13:41 View deployment

cleanup ongoing

b5a8e81

vercel bot deployed to Preview February 25, 2026 17:05 View deployment

major clean up

4c98a29

vercel bot deployed to Preview February 26, 2026 13:59 View deployment

Add initial CRs

f243126

vercel bot deployed to Preview February 26, 2026 14:49 View deployment

Merge branch 'main' into feat/extend-loadables-in-api

a1f26dc

vercel bot deployed to Preview February 26, 2026 15:01 View deployment

second pass after merging main

22ca4b6

vercel bot deployed to Preview February 26, 2026 15:08 View deployment

jp-agenta changed the title from "[feat] Extend loadables in api" to "[feat] Extend loadables" Feb 26, 2026

ruff format

23b68c7

vercel bot deployed to Preview February 26, 2026 15:34 View deployment

junaway marked this pull request as ready for review February 27, 2026 08:35

Copilot AI review requested due to automatic review settings February 27, 2026 08:35

Copilot started reviewing on behalf of junaway February 27, 2026 08:36 View session

dosubot bot added the feature label Feb 27, 2026

Copilot AI reviewed Feb 27, 2026

devin-ai-integration bot reviewed Feb 27, 2026

consolidate

cb5efab

vercel bot deployed to Preview February 27, 2026 08:50 View deployment

devin-ai-integration bot reviewed Feb 27, 2026

jp-agenta added 2 commits February 27, 2026 10:20

cleanup CR

2a29170

Fix P0

46009c9

vercel bot deployed to Preview February 27, 2026 10:14 View deployment

devin-ai-integration bot reviewed Feb 27, 2026

3 participants
