Skip to content

Spec: Queryable Artifacts & Conversations V0#2961

Open
tim-inkeep wants to merge 49 commits intomainfrom
feature/queryable-artifacts-conversations
Open

Spec: Queryable Artifacts & Conversations V0#2961
tim-inkeep wants to merge 49 commits intomainfrom
feature/queryable-artifacts-conversations

Conversation

@tim-inkeep
Copy link
Copy Markdown
Contributor

@tim-inkeep tim-inkeep commented Apr 1, 2026

Summary

Simplified spec for queryable artifacts and conversations, replacing the previous full-text-search approach with a dramatically simpler V0.

  • List + get endpoints for artifacts and conversations with filters
  • "Search" = list returning id, name/title, summary — no FTS indexes, no ranking
  • Unified metadata generation module for both artifacts and conversations
  • Conversation title + summary auto-generation (artifacts already have this)

What's in this spec

1. Schema Changes

  • Add summary column to conversations table (nullable text)
  • No changes to ledgerArtifacts (already has name, description, summary)

2. Unified Metadata Generation Module

Extract artifact metadata generation from AgentSession.processArtifact() (~500 lines) into a shared module following the compressor pattern:

BaseMetadataGenerator (abstract)
├── ArtifactMetadataGenerator   — extracted from AgentSession, zero behavior change
└── ConversationMetadataGenerator — NEW: generates title + summary
  • Base class handles: model resolution chain, retry with backoff, telemetry, fire-and-forget pattern
  • Artifact generator: same prompt, schema, uniqueness check, fallback — just extracted
  • Conversation generator: triggered every assistant message, LLM decides whether to update via shouldUpdate boolean

3. API Endpoints

Domain Endpoint Status
Run GET /v1/conversations Updated: +agentId filter, +summary in response
Run GET /v1/artifacts New: list with conversationId, agentId filters
Run GET /v1/artifacts/{id} New: get single artifact
Manage GET /projects/:projectId/conversations Updated: +agentId filter, +summary
Manage GET /projects/:projectId/artifacts New: list with userId, agentId, conversationId filters
Manage GET /projects/:projectId/artifacts/{id} New: get single artifact

Authorization: Run API = end-user sees own only (scoped via JWT sub). Manage API = admin sees all, optional userId/agentId/conversationId filters.

4. DAL Changes

  • listConversations: add agentId filter
  • listLedgerArtifacts: new paginated list with userId/agentId via inner join to conversations
  • getLedgerArtifactById: new single artifact retrieval

5. SDK Methods

  • client.conversations.list() / .get()
  • client.artifacts.list() / .get()

What's NOT in this spec (deferred)

  • Full-text search indexes (tsvector, GIN)
  • Search ranking (RRF, recency weighting)
  • Platform tools for agent self-search
  • MCP adapter for search tools
  • Semantic/vector search (pgvector)
  • Windowed message search within conversations

Phasing

  1. Schema + DAL — summary column, agentId filter, artifact list/get DAL
  2. Metadata generation module — extract + unify artifact/conversation metadata generation
  3. API endpoints — artifact list/get on both domains, conversation list updates
  4. SDK — client methods

🤖 Generated with Claude Code

tim-inkeep and others added 30 commits March 27, 2026 16:57
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
The changesets/action pushes commits using the default GITHUB_TOKEN
credential, which GitHub ignores for triggering downstream workflows.
This left the Version Packages PR (#2881) stuck with required checks
(ci, Cypress E2E, Create Agents E2E) permanently waiting.

Configures the git remote URL with the inkeep-internal-ci App token
before changesets/action runs — same pattern applied to ci.yml and
auto-format.yml in #2871.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): configure git remote with App token in release workflow

The changesets/action pushes commits using the default GITHUB_TOKEN
credential, which GitHub ignores for triggering downstream workflows.
This left the Version Packages PR (#2881) stuck with required checks
(ci, Cypress E2E, Create Agents E2E) permanently waiting.

Configures the git remote URL with the inkeep-internal-ci App token
before changesets/action runs — same pattern applied to ci.yml and
auto-format.yml in #2871.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf(ci): skip container init for changeset PRs by extracting check job

Service containers (Doltgres, Postgres) in cypress-e2e and
create-agents-e2e were initialized before the changeset check ran,
wasting ~30s of ubuntu-32gb runner time on every changeset PR.

Extracts the changeset check into a lightweight job on ubuntu-latest
that runs first. The heavy jobs now depend on it via `needs:` and
skip entirely (including container init) when it's a changeset PR.

Also removes all redundant step-level `if: changeset-check` guards
since the job-level `if` already gates everything.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(ci): extract changeset check into reusable composite action

The changeset detection logic (~60 lines of shell) was duplicated
across ci.yml and cypress.yml. Extracts it into a shared composite
action at .github/composite-actions/changeset-check/action.yml.

Both workflows now do a sparse checkout (just the composite action
directory) and delegate to the shared action, keeping the changeset
detection logic in one place.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e-driven scheduler (#2823)

* data model + sync mechanism

* implementation

* style: auto-format with biome

* claude comments and changeset

* fix

* spec

* implementation

* remove spec

* Remove spec

* fix

* claude comments

* fix list of tag

* chore: update OpenAPI snapshot

* fix

* fix

* lint

* typecheck

* drop deployment id

* unneeded workflow endpoint

* fixes

* wip

* fixes

* fixes

* changeset

* comments

* style: auto-format with biome

* chore: update OpenAPI snapshot

* comments and tests

* typecheck

* tests

* fix

* fix

* chore: update OpenAPI snapshot

* comments

* fix

* fix

* style: auto-format with biome

* chore: update OpenAPI snapshot

* fix

* chore: update OpenAPI snapshot

* fix broken  rebase

* chore: update OpenAPI snapshot

* comments and fix migration

* lint

* delete

* drop row

* remove scheduled triggers from sdk

* style: auto-format with biome

* chore: update OpenAPI snapshot

* fix changeset

Implement a scheduler workflow that includes a centralized trigger dispatch and a deploy restart endpoint.

* lint

* fix cascade

* rebase

* frozen lockfile

* schema fix

* lock file

* tests

* migrations

* tests

* style: auto-format with biome

* tests

* Fix test

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Making agent timeout more noticeable in trace conversation view

* typed
* omit falsy models in agent form

* Apply suggestion from @dimaMachina

* Add changeset for agents-manage-ui patch

Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>

* fix typecheck

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>
* updated costs UI

* updated again with typecheck

* reverted weird change

* added changeset
* fix: prevent delegation metadata api key leaks

* style: auto-format with biome

* feedback

* chore: add changeset for delegation metadata api key leak fix

Co-authored-by: miles-kt-inkeep <miles-kt-inkeep@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: miles-kt-inkeep <miles.kamingthanassi@inkeep.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: miles-kt-inkeep <miles-kt-inkeep@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: miles-kt-inkeep <135626743+miles-kt-inkeep@users.noreply.github.com>
…CP credentials (#2912)

* fix: set initiatedBy for app credential auth to resolve user-scoped MCP credentials

The app credential auth paths (both authenticated and anonymous web_client) were
setting endUserId but not initiatedBy in the execution context metadata.
getUserIdFromContext checks initiatedBy first, so user-scoped MCP tools (like
Linear OAuth) couldn't resolve credentials — the userId was undefined, causing
the user-scoped credential lookup to be skipped entirely.

This is a follow-up to #2899 which fixed the playground auth path but missed the
app credential paths.

Made-with: Cursor

* chore: add changeset

Made-with: Cursor
* fix: return 403 Forbidden with actionable message for origin validation failures

When a web_client app's origin doesn't match allowedDomains, the error
response was a misleading 401 "Invalid Token" (rendered as internal_server_error).
Now throws createApiError with code 'forbidden' and message 'Origin not allowed
for this app', matching the existing pattern in the anonymous session route.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: auto-format with biome

* add changeset for origin validation error fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* ci: sync preview runtime vars from template overrides

* ci: retry flaky Railway preview operations

* ci: throttle Railway preview mutations

* ci: reduce Railway preview polling

* ci: move preview Railway automation to GraphQL

* ci: surface Railway GraphQL helper errors

* ci: add jitter to preview teardown polling
)

* fix(manage): add `wasm-unsafe-eval` to CSP for Monaco WebAssembly

Monaco Editor (via @shikijs/monaco) uses WebAssembly for syntax
highlighting, which is blocked in production by the current CSP.
Adding the targeted `wasm-unsafe-eval` directive permits WASM
compilation without opening the door to JavaScript eval().

Fixes PILOT-INKEEP-COM-21

Made-with: Cursor

* Add changeset for CSP wasm-unsafe-eval fix

Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>
The changeset-check optimization from #2902 skips the heavy ci,
Cypress E2E, and Create Agents E2E jobs on changeset PRs. But those
job names are required branch-protection checks, so skipping them
leaves the Version Packages PR in UNSTABLE state.

Add lightweight gate jobs that always run on ubuntu-latest and carry
the required check names. They pass unless the real job failed,
satisfying branch protection without spinning up expensive runners.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Adds fonts.googleapis.com to CSP

* Update agents-manage-ui/src/proxy.ts

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* removes duplicate

font-csp

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* skill generator

* polish skill generator

* skills tests

* upd

* upd

* generation.test wip

* add generation.test

* tree node

* skill page

* skill loader

* skill loader refactor

* skill loader

* move skills sidebar to layout

* use pure monaco-editor component since we can have different file extension

* add shadcn context menu component

* format context menu

* skills files and edit pages

* dry

* update layout

* add docs

* add a changeset

* redirect to first skill

* skill files utils

* skill selector

* upd treenode

* skill files

* skill file editor

* delete skill confirmation

* add skill files actions

* skills data

* rm

* up skills route

* upd

* upd

* better project error message on dev

* types

* skill files

* skill loader

* format

* project test

* entities

* project full tests

* upd introspect

* upd cliiii

* nested skills tests

* remove edit page

* remove edit page

* update files page

* upd

* upd file editor

* add SkillFileInsertSchema

* superRefine

* add transform

* rm some cases in superRefine

* use pipe

* use pipe

* upd skill loader

* validation skills

* upd

* rm

* upd

* data access tests

* skills db changes

* add

* skill files

* upd

* upd

* upd skill update

* SkillUpdateSchema has required files

* upd skills manage

* upd

* upd layout and page

* style: auto-format with biome

* move empty state comp to page

* upd schemas

* update schemas

* move to with-sidebar

* polish

* upd

* upd skill generator

* Make webhooks docs user friendly (#2752)

* shaping a2a webhooks page

* moved triggers to visual builder

* vb webhooks wip

* numbered TOC steps

* added step circles

* indented toc steps more

* added newsletter signup to docs

* added share feedback button

* moved newsletter subscribe route to agent docs

* subscribe confirm polish

* improved spacing

* improved spacing

* added high quality images

* added verification step

* Sync lockfile after rebase

* Use tag reference for pullfrog action instead of pinned SHA (#2757)

The action is actively under development, so referencing the v0 tag
allows picking up updates automatically.

https://claude.ai/code/session_01QZyvEs97scVf1ahTG8C1rV

Co-authored-by: Claude <noreply@anthropic.com>

* ci: provision PR preview environments in Railway (#2681)

* ci: add preview env diagnostics

* ci: probe preview env schema before deploy

* ci: probe preview env schema before deploy

* ci: harden preview api env defaults

* ci: attach git metadata to preview deploys

* ci: harden preview workflow operations

* ci: broaden preview log redaction

* ci: extract preview workflow scripts

* ci: harden preview script extraction

* fix(ci): correct Playwright cache restore-key prefix mismatch (#2760)

The restore-keys used `${{ runner.os }}-playwright-` but primary keys
used `playwright-${{ runner.os }}-`, so the prefix never matched on
cache miss, forcing a full browser download (~8.5 min) instead of a
cache restore (~13 sec).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): replace full git clone with shallow checkout in CI job (#2761)

Remove fetch-depth: 0 from the ci job's checkout step, which cloned the
entire git history (1.5-5 min overhead). Only the OpenAPI change detection
step needs the base branch ref, so fetch it on-demand with --depth=1.

Also switches the diff from three-dot merge-base syntax to a two-dot
pathspec-filtered diff against the fetched base ref, which works correctly
with shallow clones.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* format

* rm migration

* add new migrations

* validation for skill is ok

* move empty state to page

* delete skill

* delete skill revalidate path

* move skills schemas to own file

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* more typecheck fixes

* more typecheck fixes

* fix

* fix isRequired

* f1x

* move skill sidebar

* refactor skill sidebar

* add collapse file tree button

* upd

* upd

* upd

* deleteSkillFile

* upd

* deleteSkillFile

* fileId

* fileId

* upd schemas

* DeleteSkillFileConfirmation

* updateSkillFile

* rm simplematter from sdk

* Get Skill File

* getSkillFileById

* add new skill file page

* update skill file editor

* format

* Create Skill File

* upd

* createSkillFileAction

* createSkillFileById

* fix: Make OpenTelemetry startup idempotent (#2684)

* fix: Make OpenTelemetry startup idempotent

* fix: Re-export defaultSDK and cache NodeSDK instance on globalThis

Restores the export on defaultSDK to avoid breaking the
create-agents-template subpath import. Moves the new NodeSDK()
construction behind a globalThis guard (getOrCreateSDK) so
repeated Vite HMR module evaluations reuse the same instance instead
of leaking fresh SDK objects.

Co-authored-by: mike-inkeep <mike-inkeep@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(template): use idempotent startOpenTelemetrySDK() in instrumentation

* fix: guard all OTel singletons behind globalThis for HMR idempotency

- Cache otlpExporter, batchProcessor, resource, instrumentations,
  spanProcessors, contextManager, and propagator on globalThis via
  Symbol keys and getOrCreate* helpers so HMR re-evaluation reuses
  existing instances instead of leaking new ones
- Make OtelGlobal type strict with per-key types, eliminating the
  loose `boolean | NodeSDK` union and the `as NodeSDK` cast
- Add logger.debug in the MetricReader catch block to distinguish
  clean idempotency from error-recovery idempotency
- Remove defaultSDK export (now module-private) since all consumers
  use startOpenTelemetrySDK() instead

* Fix type errors

* Simplify to just suppress the error since it's not an issue in prod, only local

* Limit to dev mode

* Add changeset for OTel HMR idempotency fix

Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: mike-inkeep <mike-inkeep@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: inkeep[bot] <257615677+inkeep[bot]@users.noreply.github.com>
Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>

* Fix scheduled trigger invocations being skipped (#2777)

* Fix scheduled trigger invocations being skipped when trigger is edited without changing the next execution time

* claude comments

* adding app id (#2779)

* Update pullfrog to latest SHA and add daily dependabot group for high-frequency deps (#2780)

* Update pullfrog to latest SHA and add daily dependabot group for high-frequency deps

- Update pullfrog from v0.0.178 SHA to v0.0.181 SHA (30d68e5) to stay on
  commit-pinned references for security (action has write permissions + 9 API keys)
- Split dependabot github-actions config into a "high-frequency" group for
  pullfrog with daily schedule, so SHA pins get updated automatically
- This supersedes PR #2757's approach of moving to mutable tag references

https://claude.ai/code/session_01D3ZGYHG8VhsZwqjjXqy2Ap

* Split dependabot github-actions into daily (pullfrog) and monthly (rest)

Separate into two ecosystem entries so pullfrog gets daily SHA updates
while other GitHub Actions stay on a monthly cadence.

https://claude.ai/code/session_01D3ZGYHG8VhsZwqjjXqy2Ap

* Fix invalid dependabot config: merge duplicate github-actions entries

Dependabot disallows duplicate ecosystem+directory pairs. Use a single
entry with two groups instead: high-frequency (pullfrog) and github-actions
(everything else via exclude-patterns).

https://claude.ai/code/session_01D3ZGYHG8VhsZwqjjXqy2Ap

---------

Co-authored-by: Claude <noreply@anthropic.com>

* ci: seed preview auth in PR previews (#2775)

* ci: bootstrap preview auth

* ci: require secure preview auth config

* ci: recover preview auth runtime vars

* ci: install railway in preview bootstrap

* ci: provision preview db tcp proxies

* ci: proxy preview spicedb bootstrap

* ci: harden preview retry and error logging

---------

Co-authored-by: Andrew Mikofalvy <5668128+amikofalvy@users.noreply.github.com>

* Fix scopes placeholder to show correct Nango format (#2784)

* Fix misleading scopes placeholder in credential form

The Nango API validates scopes against a strict comma-separated pattern
with no spaces. Updated placeholder and help text to show the correct
format and prevent 400 errors when users enter multiple scopes.

Made-with: Cursor

* Add changeset for scopes placeholder fix

Made-with: Cursor

* fix(manage-ui): fix URL validation bypass and permission guard in credential provider setup (#2776)

* fix(manage-ui): fix URL validation bypass and permission guard in credential provider setup

Reorder Zod schema construction so custom validators (e.g. URL protocol
allowlist) are chained after required/optional base schema instead of
being overwritten. Move all React hooks above the canEdit early-return
guard to satisfy Rules of Hooks, with canEdit checks inside hook bodies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(manage-ui): add server-side URL protocol validation in buildCredentialsPayload

Validate app_link against HTTP/HTTPS allowlist in the server action to
prevent bypassing client-side form validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update agents-manage-ui/src/components/credentials/views/generic-auth-form.tsx

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* fix err

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* feat(pdf): Support PDF attachments (#2709)

* feat(pdf): Support PDF attachments

* Add tests and other review feedback

* Fix doc

* More renaming and cleanup

* refactor: extract Vercel content part schemas to types/chat.ts for reuse

Move inline Zod schemas from chatDataStream.ts and message-parts.ts into
types/chat.ts as shared, exported schemas. This eliminates duplicate
definitions and makes schema management easier.

Co-authored-by: Andrew Mikofalvy <amikofalvy@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Andrew Mikofalvy <amikofalvy@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Composio connected account ID pinning (#2783)

* feat: Composio connected account ID pinning

Pin connected_account_id to Composio MCP URLs to prevent cross-project
credential leakage. Implements "both or none" policy — user_id and
connected_account_id are injected together or not at all.

- Add ComposioCredentialStore for credential lifecycle management
- Update AgentMcpManager and discoverToolsFromServer with pinning logic
- Mark Composio tools without connectedAccountId as needs_auth
- Add generic disconnect credential UI (works for all credential types)
- Store authScheme in credential retrievalParams for display
- Update OAuth login flow to create credential references post-connect
- Add unit tests for new credential store, composio client, and pinning

Made-with: Cursor

* feedback

* fix test

* Version Packages (#2778)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add back link to projects sidebar, add org settings link to user drop… (#2787)

* Add back link to projects sidebar, add org settings link to user dropdown, adjust sidebar highlight color in dark mode

* Apply suggestion from @claude[bot]

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* Fix bad claude formatting

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* fix: return FileUIPart-compliant file parts from /run conversations endpoint (#2782)

* fix: return Vercel AI SDK FileUIPart-compliant file parts from /run conversations endpoint

- Resolve blob:// URIs to proxy HTTP URLs via resolveMessagesListBlobUris()
- Reshape file parts from { data, metadata.mimeType } to { url, mediaType, filename? }
- Matches Vercel AI SDK FileUIPart spec for useChat() compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Skip malformed file parts

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Mike Rashkovsky <mike.r@inkeep.com>

* fix: treat load_skill as internal tool to suppress false-positive Sentry errors (#2756)

* fix: provide relationshipId for load_skill tool calls in graph events

* fix: treat load_skill as internal tool, suppress chat/graph streaming events

* fix for fetch trace (#2791)

* fix for fetch trace

* fix for fetch trace

* Fix empty breadcrumb on `/[tenantId]/profile` page and replace prop-drilled permission flags (`readOnly`, `canEdit`, `canUse`) with direct hook call `useProjectPermissionsQuery()` (#2792)

* upd

* upd

* format

* format

* format

* format

* format

* format

* format

* format

* format

* fix review

* fix breadcrumb on profile page

* Apply suggestions from code review

Co-authored-by: Dimitri POSTOLOV <dmytropostolov@gmail.com>

* Update agents-manage-ui/src/lib/api/projects.ts

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* Update agents-manage-ui/src/app/[tenantId]/profile/layout.tsx

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* style: auto-format with biome

* fix review

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(manage-ui): fix user-scoped MCP credential card not refreshing after connect/disconnect (#2794)

Fetch user-scoped credential server-side in page.tsx (matching the
project-scoped pattern) instead of via a client-side React Query hook.
This ensures router.refresh() after OAuth connect or credential delete
re-fetches the credential data, so the "Your Connection" card updates
without a manual page refresh.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* reuse `useProjectsQuery` instead of `fetchProjectsAction` in `useEffect` (#2793)

* reuse `useProjectsQuery` instead of `fetchProjectsAction` in `useEffect`

* format

* upd

* fix lint

* Create little-hounds-battle.md

* upd

* upd skill file editor

* polish skill editor like in github

* remove canEdit

* upd

* move skill metadata under collapsible advanced section

* reuse DeleteConfirmation

* upd skill file editor

* upd skill file editor2

* add useInitialCollapsedSidebar

* add useInitialCollapsedSidebar

* upd skill file editor

* rm

* // Avoid including metadata in the frontmatter when it's null

* fetchSkillFile and createSkillFile

* refactor skill breadcrumb

* format

* polish

* upd

* skills integration tests

* fix validation tests

* update skill form

* upd api skills in manage ui

* upd entities

* partial

* fix skill loader test

* chore: update OpenAPI snapshot

* polish skill file editor

* upd core skills tests

* upd core skills tests

* add SkillCreateDataSchema

* update skills data manage

* remove redundant

* lint

* lint

* typecheck

* typecheck

* typecheck

* knip

* lint

* rollback skill modals

* make modal opens in skill selector

* fix typecheck

* this should fix cypress

* fix sdk tests

* split permissions call

* add folder feature

* findNodeByPath

* SkillDirectoryBrowser

* upd

* polish

* fix

* fix edge case metadata validation

* fix

* fix cli test

* format

* upd

* upd

* upd

* chore: update OpenAPI snapshot

* fix skill generator

* add button group

* connect submit logic with extension select

* polish

* update skill generator tests

* update generation test

* polish skill generator

* format

* format

* fixes for tests

* typecheck

* fix review

* format

* new migration

* upd

* rm migration

* add migrations

* fix migration and add

* rm outdated

* Apply suggestions from code review

Co-authored-by: pullfrog[bot] <226033991+pullfrog[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* polish

* Add detailed changeset for nested skill files feature

Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>

* fix typecheck

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gaurav Varma <gaurav.varma@hey.com>
Co-authored-by: Andrew Mikofalvy <5668128+amikofalvy@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Varun Varahabhotla <vnv-varun@users.noreply.github.com>
Co-authored-by: mike-inkeep <mike.r@inkeep.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: mike-inkeep <mike-inkeep@users.noreply.github.com>
Co-authored-by: inkeep[bot] <257615677+inkeep[bot]@users.noreply.github.com>
Co-authored-by: Dimitri POSTOLOV <dimaMachina@users.noreply.github.com>
Co-authored-by: shagun-singh-inkeep <shagun.singh@inkeep.com>
Co-authored-by: omar-inkeep <omar@inkeep.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Andrew Mikofalvy <amikofalvy@users.noreply.github.com>
Co-authored-by: inkeep-internal-ci[bot] <259778081+inkeep-internal-ci[bot]@users.noreply.github.com>
Co-authored-by: sarah <129242944+sarah-inkeep@users.noreply.github.com>
Co-authored-by: Abraham <anubra266@gmail.com>
Co-authored-by: pullfrog[bot] <226033991+pullfrog[bot]@users.noreply.github.com>
* durable-exec-docs

* durable-exec-docs

* client execution mode

* client execution mode
* Bump agents-ui, add example for authed users in widget

* Fix example

* Fix style inconsistency

* Pass aichatsettings to component in react example

* Reorder props for consistency:
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 2, 2026 5:35pm
agents-docs Ready Ready Preview, Comment Apr 2, 2026 5:35pm
agents-manage-ui Ready Ready Preview, Comment Apr 2, 2026 5:35pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 1, 2026

⚠️ No Changeset found

Latest commit: b213c07

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@tim-inkeep tim-inkeep marked this pull request as ready for review April 1, 2026 21:00
@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog bot commented Apr 1, 2026

TL;DR — Adds a design spec for queryable artifacts and conversations (V0). This replaces the earlier full-text-search approach with a dramatically simpler model: list + get endpoints with basic filters, a unified metadata generation module, and auto-generated conversation titles and summaries.

Key changes

  • Add V0 spec for queryable artifacts and conversations — Defines schema changes (summary column on conversations, new agentId index), a unified BaseMetadataGenerator extraction from AgentSession.processArtifact(), new artifact list/get endpoints on both Run and Manage domains, agentId filtering and summary on conversation list endpoints, a RuntimeClient SDK class, authorization model, and a four-phase implementation plan.

Summary | 1 file | 49 commits | base: mainfeature/queryable-artifacts-conversations


Unified metadata generation module via BaseMetadataGenerator

Before: Artifact metadata generation (~500 lines) lived as a private method processArtifact() tightly coupled inside AgentSession. Conversations had no auto-generated titles or summaries.
After: A shared BaseMetadataGenerator<TResult, TContext> abstract class handles model resolution, retry with backoff, telemetry, and shared prompt helpers (formatConversationHistory, truncateForModel). ArtifactMetadataGenerator extracts the existing logic with zero behavior change, and a new ConversationMetadataGenerator fires on every assistant message — gated by a shouldGenerate() hook that skips the LLM call entirely for conversations with < 3 user messages.

The base class follows the existing compressor pattern (BaseCompressor → specific implementations) and introduces a TContext generic so subclasses define their own context shapes. Prompts are fully owned by subclasses — the base only provides reusable helpers. Conversation metadata generation is fire-and-forget with error logging, never blocking the user response.

How does the LLM decide whether to update conversation metadata?

The schema uses nullable title and summary fields — null means "current value is fine," saving output tokens. The prompt provides the current title and summary alongside the last 10 messages, and instructs the model to return null for fields that still accurately capture the conversation's topic. This replaces the earlier shouldUpdate boolean approach with per-field granularity. The prompt explicitly biases toward stability — only updating when the conversation meaningfully shifts topic.

How is the cost kept low?

Three layers: (1) conversations with < 3 user messages skip the LLM call entirely and use a cheap deterministic fallback that always produces both title and summary, (2) 3+ message conversations use the summarizer model (cheapest available) with a ~500 token prompt biased toward returning null, (3) most LLM calls return null for both fields (no DB write needed).

How are pre-existing conversations handled?

A one-time backfill migration job (pnpm db:backfill-conversation-metadata) queries conversations where title IS NULL OR summary IS NULL, fetches first 3 messages for each, and applies the same deterministic fallback as saveFallback(). No LLM calls — deterministic only. The read-time title fallback is preserved as a safety net for any conversations the backfill hasn't reached yet.

SPEC.md


New artifact endpoints on Run and Manage domains

Before: No API endpoints existed to list or retrieve individual artifacts.
After: GET /v1/artifacts and GET /v1/artifacts/{id} on the Run domain (end-user, scoped via JWT sub); GET /projects/:projectId/artifacts and GET /projects/:projectId/artifacts/{id} on the Manage domain (admin, with optional userId/agentId/conversationId/toolName filters).

Artifact filtering by userId and agentId uses an always-present inner join from ledgerArtifacts.contextIdconversations.id (normalizes Drizzle return types via getTableColumns(ledgerArtifacts)). The toolName filter uses JSONB access (metadata->>'toolName'). The list response exposes description (not summary) since artifact descriptions are typically concise enough to be useful untruncated. getLedgerArtifactById handles the composite PK issue by returning the most recent row by createdAt, with ownership verified in a single query via optional userId join.

SPEC.md


Conversation list updates, SDK client, and phased rollout

Before: Conversation list endpoints had no agentId filter and no summary field. No SDK client existed for querying conversations or artifacts programmatically.
After: Both Run and Manage conversation list endpoints gain an optional agentId query param and return summary (nullable). A new RuntimeClient class in the SDK provides listConversations, getConversation, listArtifacts, and getArtifact methods following the EvaluationClient pattern.

The spec defines a four-phase implementation order: (1) schema + DAL + backfill, (2) metadata generation module extraction, (3) API endpoints, (4) SDK methods. A consistency model section documents the eventual-consistency semantics — every conversation has non-null title and summary after its first completed turn, with brief staleness windows during in-flight generation. Open questions cover composite PK disambiguation (OQ1, deferred to V1), internal message accessibility for agents (OQ2), and within-entity search strategy (OQ3, code mode vs FTS vs ILIKE).

SPEC.md

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(6) Total Issues | Risk: Medium

🟠⚠️ Major (4) 🟠⚠️

🟠 1) SPEC.md:347 Run API artifact get-by-id missing user ownership check

Issue: The spec for GET /v1/artifacts/{artifactId} states "Verify artifact's conversation belongs to end-user" but the proposed getLedgerArtifactById DAL function does NOT include a userId parameter or join to conversations. It only takes scopes (tenantId/projectId) and artifactId, leaving the authorization check unspecified.

Why: Without an explicit user ownership check at the DAL level, implementations could accidentally return artifacts from conversations the user does not own. This is a cross-user data exposure risk within the same project. The existing conversation get pattern shows the correct approach: fetch first, then verify ownership.

Fix: Either add userId param to getLedgerArtifactById with a join to conversations (defense-in-depth), OR explicitly document that route handlers MUST verify conversation.userId === endUserId after fetching.

Refs:

🟠 2) SPEC.md:454 DAL conditional join could leak cross-user data

Issue: The listLedgerArtifacts DAL conditionally joins to conversations only when needsJoin is true (userId or agentId provided). For Run API, userId should ALWAYS come from JWT sub, but the spec doesn't mark it as required in the DAL signature, allowing accidental omission.

Why: If Run API implementation omits userId, artifacts from ALL users in the project would be returned. The DAL signature permits this — making it a latent security footgun.

Fix: Create separate listLedgerArtifactsForUser DAL requiring userId, OR add runtime assertion, OR document the requirement prominently.

Refs:

🟠 3) SPEC.md:515 Artifact ID retrieval non-deterministic due to composite PK

Issue: getLedgerArtifactById uses .limit(1) without ordering when multiple artifacts share the same id across different taskIds. The "first match" is database-ordering-dependent, making results non-deterministic.

Why: API consumers could receive inconsistent results across calls. The risk section acknowledges this but doesn't specify V0 mitigation.

Fix: Add .orderBy(desc(ledgerArtifacts.createdAt)) to ensure deterministic results (most recent first). Document this behavior in acceptance criteria.

Refs:

🟠 4) SPEC.md:521-535 SDK section underspecifies client architecture

Issue: The SDK proposes client.conversations.list() and client.artifacts.list() but: (1) no client object exists in current SDK, (2) artifacts namespace could collide with artifactComponent() builder, (3) unclear if SDK targets Run API or Manage API.

Why: SDK consumers need a clear mental model. The current SDK is builder-focused (agent(), project()), not HTTP-client focused. This introduces a new pattern without clarifying the architecture.

Fix: Name the client explicitly, consider runtimeArtifacts to avoid confusion with artifact component configuration, and specify which API domain(s) the SDK wraps.

Refs:

Inline Comments:

  • 🟠 Major: SPEC.md:347 Run API get-by-id ownership check
  • 🟠 Major: SPEC.md:454 DAL conditional join security
  • 🟠 Major: SPEC.md:515 Artifact ID non-deterministic retrieval
  • 🟠 Major: SPEC.md:535 SDK architecture unclear

🟡 Minor (2) 🟡

🟡 1) SPEC.md:277 Fire-and-forget creates eventual consistency gap

Issue: The fire-and-forget metadata generation means list endpoints may return stale/null summary immediately after turn completion. The prompt also always generates title/summary even when shouldUpdate: false is expected.

Why: Downstream consumers (SDK, platform tools) may see stale data. Cost implications if most calls return shouldUpdate: false but still generate full output.

Fix: Document the consistency model in API specs. Consider two-phase prompt: decide first, generate if needed.

🟡 2) SPEC.md:332-337 contextId vs conversationId vocabulary mismatch

Issue: API returns conversationId but underlying schema uses contextId. Filter params and response fields use conversationId while internal code uses contextId.

Why: Developers debugging may encounter both terms. Consistent vocabulary reduces cognitive load.

Fix: Add note confirming the API layer maps contextId (database) → conversationId (API) for clarity.

Inline Comments:

  • 🟡 Minor: SPEC.md:277 Fire-and-forget consistency gap

💭 Consider (2) 💭

💭 1) SPEC.md:67 Module directory naming

Issue: Spec proposes metadata/ directory but peer modules use more specific naming (compression/, artifacts/, context/).
Fix: Consider metadata-generation/ or document why the generic name is intentional.

💭 2) SPEC.md:384 Path parameter notation inconsistency

Issue: Spec mixes :projectId (Express-style) with {id} (OpenAPI-style).
Fix: Use consistent notation throughout.

🧹 While You're Here (1) 🧹

🧹 1) scope Pagination response shape inconsistency (pre-existing)

Issue: Run domain uses { page, limit, total, pages } but Manage domain uses { page, limit, total, hasMore }. The spec proposes following the Run pattern for new artifact endpoints, but this perpetuates the split.

Why: The canonical ListResponseSchema uses pages, but manage conversations uses hasMore. New endpoints should follow a single pattern.

Fix: Decide which pagination pattern is canonical and align existing endpoints.

Refs:


💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-structured spec with clear decisions, phasing, and acceptance criteria. The major findings are all addressable with spec clarifications rather than fundamental design changes. The security concerns (user ownership checks, DAL conditional joins) should be explicitly addressed in the spec to guide implementation. The SDK architecture needs more detail before implementation. Overall, good V0 scope that sets up well for future enhancements.

Discarded (12)
Location Issue Reason Discarded
SPEC.md:29-46 Schema migration safety Assessed as safe — adding nullable column is standard Postgres operation
SPEC.md:308-314 Response schema addition Backward compatible — adding nullable field is additive
SPEC.md:310 Query param addition Backward compatible — optional param doesn't break existing clients
SPEC.md:357 Route path conflicts No conflicts — /artifacts distinct from /artifact-components
SPEC.md:441 DAL function naming Follows existing pattern correctly
SPEC.md:94 BaseMetadataGenerator vs BaseCompressor pattern Intentional divergence — different purposes, spec could clarify wording
SPEC.md:186 Generator context includes db client Acceptable asymmetry between artifact (service) and conversation (DAL) paths
SPEC.md:21 LLM cost on every turn Risk acknowledged in spec, mitigations mentioned, monitoring via telemetry
SPEC.md:267 Fire-and-forget behavioral change Not breaking — matches existing artifact pattern, fallback preserved
SPEC.md:527 SDK namespace additions Additive, no collision with existing exports
SPEC.md:316 Run API missing userId param Intentional — userId from JWT, not query param
SPEC.md:58 Compressor pattern reference Accurate reference, no change needed
Reviewers (5)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-product 5 1 0 0 1 0 3
pr-review-architecture 8 2 1 0 1 0 4
pr-review-security-iam 5 2 0 0 2 0 1
pr-review-breaking-changes 7 0 0 0 0 0 7
pr-review-consistency 8 1 1 1 0 0 5
Total 33 6 2 1 4 0 20

Note: Breaking changes review found no breaking changes — all additions are backward compatible. Consistency review identified pre-existing pagination inconsistency surfaced during this review.


```
GET /v1/artifacts/{artifactId}
Auth: API key + JWT. Verify artifact's conversation belongs to end-user.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Run API artifact get-by-id missing explicit user ownership check

Issue: The spec for GET /v1/artifacts/{artifactId} states "Verify artifact's conversation belongs to end-user" but the proposed getLedgerArtifactById DAL function (Section 6.3) does NOT include a userId parameter or join to conversations. It only takes scopes (tenantId/projectId) and artifactId.

Why: Without an explicit user ownership check at the DAL level, the implementation could accidentally return artifacts from conversations the user does not own. This is a cross-user data exposure risk within the same project.

Fix: Update the spec to require one of:

  1. Add userId param to getLedgerArtifactById and join to conversations table with userId filter (defense-in-depth), OR
  2. Explicitly document that the route handler MUST fetch the artifact, then verify its conversation's userId === endUserId, returning 404 if not

Refs:

)
.limit(1);

return result[0];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Artifact ID retrieval non-deterministic due to composite PK

Issue: The spec acknowledges ledgerArtifacts has a 4-part composite PK (tenantId, projectId, id, taskId), and getLedgerArtifactById returns "first match by id + project scope." The proposed implementation uses .limit(1) without ordering, meaning which artifact is returned when multiple exist with the same id but different taskIds is undefined.

Why: API consumers calling GET /v1/artifacts/{artifactId} could receive inconsistent results. The "first match" depends on database query ordering, which is non-deterministic without an explicit ORDER BY.

Fix: Consider:

  1. Specify ordering: .orderBy(desc(ledgerArtifacts.createdAt)) to ensure most recent artifact is returned
  2. Include taskId in the artifact list response so clients can disambiguate
  3. Document in acceptance criteria: "When multiple artifacts share an ID, the most recently created is returned"

Refs:

client.artifacts.get(artifactId)
```

Thin HTTP wrappers over the manage API endpoints.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: SDK section underspecifies client architecture

Issue: The SDK section proposes client.conversations.list() and client.artifacts.list() as "thin HTTP wrappers over the manage API endpoints." However:

  1. No client object currently exists in the SDK — it exposes builder functions like agent(), project()
  2. The namespace artifacts could collide with the existing artifactComponent() builder (which manages artifact type definitions, not runtime instances)
  3. The spec says "manage API" but Run API also has artifact endpoints

Why: SDK consumers need a clear mental model. Introducing client.artifacts when artifactComponent() already exists for artifact schemas creates ambiguity.

Fix: Clarify:

  1. Name the client explicitly (e.g., RuntimeDataClient or extend existing patterns)
  2. Consider naming client.runtimeArtifacts.list() to avoid confusion with artifact component configuration
  3. Document whether SDK targets Run API or Manage API (or both)

const limit = Math.min(params.pagination?.limit || 20, 200);
const offset = (page - 1) * limit;

const needsJoin = !!params.userId || !!params.agentId;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Conditional join in DAL could leak cross-user data if userId omitted

Issue: The listLedgerArtifacts DAL conditionally joins to conversations only when needsJoin is true (userId or agentId provided). If neither filter is provided, artifacts from ALL users in the project would be returned. For Run API, userId should ALWAYS be provided from JWT sub, but the spec doesn't mark it as required.

Why: If a Run API implementation accidentally omits the userId parameter, cross-user data leakage occurs. The DAL signature allows this omission.

Fix: Either:

  1. Create a separate listLedgerArtifactsForUser DAL function that requires userId and always joins
  2. Add explicit documentation that Run API route handlers MUST always pass userId: endUserId
  3. Add a runtime assertion in the DAL when called from Run API context

Refs:

// agents-api/src/domains/run/handlers/executionHandler.ts
// After turn completes (where triggerConversationEvaluation is already called fire-and-forget)

void conversationMetadataGenerator.generate({
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Fire-and-forget creates eventual consistency gap — consider documenting

Issue: The spec proposes generating conversation metadata using fire-and-forget pattern. While this matches existing artifact metadata behavior, downstream consumers (SDK, platform tools, UI) making list/get calls immediately after a chat response may not see updated metadata.

Why: This is acceptable for display purposes but the consistency model should be documented. Additionally, the prompt always generates title/summary even when shouldUpdate: false is expected — consider a two-phase prompt to reduce cost.

Fix:

  1. Add a note in Section 5.1/5.2 that summary fields may be null or stale immediately after turn completion
  2. Consider restructuring the prompt to decide first, generate if needed (reduce token output in common case)
  3. Track metadata generation costs separately in telemetry to monitor impact

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thorough spec — well-structured, phased cleanly, and grounded in real codebase patterns. A few issues surfaced during validation against the code.

High priority: The getLedgerArtifactById design has a real correctness risk due to the 4-part composite PK, and the BaseMetadataGenerator generic signature has an unresolved TContext type. Both are straightforward to fix but should be addressed before implementation begins.

Medium priority: Missing index on conversations.agentId will cause sequential scans on every agentId-filtered query. The ConversationMetadataGenerator prompt always asks the LLM to produce title+summary even when shouldUpdate will be false — wasteful output tokens. And the first-turn behavior needs clarification (no existing title → LLM always gets shouldUpdate: true?).

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow run | Using Claude Opus𝕏

executionContext: ExecutionContext;
}

export abstract class BaseMetadataGenerator<TResult> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing type parameter: BaseMetadataGenerator<TResult> declares TResult but the generate(context: TContext) method on line 113 and abstract methods on lines 126/129/130 reference TContext, which is never declared as a type parameter on the class. This should be BaseMetadataGenerator<TResult, TContext>, or TContext should be a concrete type defined elsewhere.

This is a spec-level typo but it would block the implementer — worth fixing before handing off.

)
.limit(1);

return result[0];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLedgerArtifactById can return ambiguous results. The ledgerArtifacts PK is (tenantId, projectId, id, taskId). This function filters only on (scopes + id), so it can match multiple rows with different taskId values. The spec acknowledges this in Risk #5 (line 660) but defers disambiguation to V1.

The existing getLedgerArtifacts DAL function has the same issue when filtering by artifactId without taskId — it returns all matches as an array.

Suggested resolution: either (a) return the most recently created artifact with that id (add orderBy(desc(createdAt)).limit(1)) and document this, or (b) require the caller to pass taskId as a disambiguator, which the API endpoint could accept as an optional query param. Option (a) is simpler and likely correct for V0 since task-level duplication is rare in practice.

async (params: {
scopes: ProjectScopeConfig;
userId?: string;
agentId?: string; // NEW
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No index on conversations.agentId. The conversations table has no index on agentId (confirmed — only a PK on (tenantId, projectId, id)). Both the listConversations DAL update and the listLedgerArtifacts join will filter on agentId, which means full table scans within the tenant+project scope.

Consider adding an index in the Phase 1 migration:

CREATE INDEX conversations_agent_id_idx ON conversations (tenant_id, project_id, agent_id);

A composite index with tenant+project prefix will serve both the direct filter and the join path efficiently.

return z.object({
title: z.string().describe('Concise conversation title, max 80 chars'),
summary: z.string().describe('Brief conversation summary, max 200 chars'),
shouldUpdate: z.boolean().describe(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldUpdate wastes output tokens. The LLM always generates title, summary, AND shouldUpdate. When shouldUpdate is false, the title and summary are discarded. This means every turn that doesn't need an update still pays for generating those strings.

Consider a two-phase approach: have the LLM return shouldUpdate first (or as a standalone boolean-only call), then conditionally generate title+summary. Alternatively, flip the schema — return title and summary as nullable, where null means "no update needed." This saves output tokens on the majority of calls where nothing changes.

}

async processAndSave(result, ctx: ConversationMetadataContext): Promise<void> {
if (!result.shouldUpdate) return; // LLM decided no update needed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First-turn behavior is underspecified. When existingTitle and existingSummary are both None, the prompt asks the LLM to "set shouldUpdate to false and return them unchanged" if the existing values still capture the conversation. But there are no existing values — None/None clearly doesn't "capture" anything.

The LLM will likely infer the right behavior, but it would be cleaner to add explicit guidance in the prompt: "If there is no current title or summary, always set shouldUpdate to true."

}

// Build base query — conditionally join conversations for userId/agentId filtering
let baseQuery = db.select().from(ledgerArtifacts);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drizzle type mismatch with conditional join. The let baseQuery = db.select().from(ledgerArtifacts) query returns LedgerArtifactSelect[]. After .innerJoin(conversations, ...), the return type changes to { ledger_artifacts: ..., conversations: ... }[]. Reassigning baseQuery with let won't compile in Drizzle because the types are structurally different.

The implementation will need to either: (a) always join and filter to avoid the conditional, or (b) use two separate query builders with a shared where-conditions array, or (c) use db.select({ ...getTableColumns(ledgerArtifacts) }) to normalize the output shape. Worth noting in the spec so the implementer doesn't hit this wall.

summary: string | null,
type: string,
conversationId: string, // = contextId
toolName: string | null, // from metadata
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toolName extraction unclear. ArtifactListItem includes toolName: string | null described as "from metadata". The ledgerArtifacts table has a metadata jsonb column — is toolName a known key in this JSON? Or does it come from some other source?

The existing schema has toolCallId (varchar) on the table itself, but no toolName column. If toolName needs to be extracted from the jsonb metadata field, the spec should document the expected shape of that JSON, or note that a join to the tool definitions table might be needed.


| Migration | Content |
|-----------|---------|
| `XXXX_add_conversation_summary.sql` | `ALTER TABLE conversations ADD COLUMN summary text;` |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migration scope. The migration section only lists ALTER TABLE conversations ADD COLUMN summary text. Consider also including the conversations_agent_id_idx index (per the comment on line 420) in the same migration, since both are Phase 1 changes and rolling them into one migration is cleaner.

interface ConversationMetadataContext {
existingTitle: string | null;
existingSummary: string | null;
recentMessages: FormattedMessage[]; // last 10 messages
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recentMessages: FormattedMessage[] // last 10 messages — is 10 the right window? For long conversations this is fine, but for the first 2-3 turns it means the LLM sees the full conversation anyway. Worth noting whether 10 is a hard cap or whether it should be min(messageCount, 10) (which it effectively is, but making it explicit avoids confusion during implementation).

client.artifacts.get(artifactId)
```

Thin HTTP wrappers over the manage API endpoints.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SDK wraps manage API — is that intentional for client.conversations.list()? The spec says SDK methods are "thin HTTP wrappers over the manage API endpoints." But client.conversations.list() includes userId as a filter param, which the manage API supports. If the SDK is used by end-users (not just admins), they'd need the run API endpoints instead. Clarify the intended audience — if admin-only, this is fine; if end-user-facing, the SDK should wrap the run API.

@github-actions github-actions bot deleted a comment from claude bot Apr 1, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

This is a delta review scoped to changes since the prior automated review (commit 58dd9ced27f3).

Prior Feedback Status

All major and minor issues from the previous reviews have been addressed in this update:

Prior Issue Status Resolution
🟠 Run API get-by-id missing user ownership check Fixed Added userId param to getLedgerArtifactById with join to conversations (line 644)
🟠 DAL conditional join security concern Fixed Added explicit "Security note" documentation + changed to always-join approach (line 568-572)
🟠 Artifact ID non-deterministic due to composite PK Fixed Added .orderBy(desc(ledgerArtifacts.createdAt)) (line 667) + documented in OQ1
🟠 SDK section underspecified Fixed Renamed to RuntimeClient, follows EvaluationClient pattern, clear API targeting (line 678+)
🟡 Fire-and-forget consistency gap Fixed Added Section 5.4 "Consistency Model" documenting expected behavior (line 528-534)
Missing TContext type parameter Fixed Changed to BaseMetadataGenerator<TResult, TContext> (line 105)
Missing index on conversations.agentId Fixed Added composite index (line 46-55)
shouldUpdate wastes output tokens Fixed Changed schema to nullable title/summary (line 295-305)
First-turn behavior underspecified Fixed Added explicit guidance in prompt (line 314)
Drizzle type mismatch with conditional join Fixed Changed to always-join with implementation note (line 570-571)
toolName extraction unclear Fixed Added comment explaining JSONB access (line 458)
Migration scope incomplete Fixed Migration now includes both column and index (line 772)

Key Improvements in Delta

  1. Cost optimization: Added shouldGenerate() gate — conversations with < 3 user messages skip the LLM call entirely and use cheap fallback (line 288-292). This addresses the majority of traffic.

  2. Token efficiency: Changed from shouldUpdate: boolean + always-generate to nullable fields — LLM returns null when no update needed, saving output tokens (line 295-305).

  3. Defense-in-depth security: DAL functions now support user ownership verification in a single query via optional userId param with join (line 644, 653-662).

  4. Implementation guidance: Added Drizzle-specific notes about conditional joins and type normalization (line 570-571).

  5. Vocabulary clarity: Added explicit mapping between API conversationId and DB contextId (line 572).

Review Assessment

The spec is now well-structured with clear decisions, phasing, acceptance criteria, and explicit risk mitigations. All security concerns have been addressed with defense-in-depth approaches. The implementation guidance is thorough enough to avoid common pitfalls.


✅ APPROVE

Summary: Excellent iteration on prior feedback. All major security and correctness concerns have been addressed. The cost optimization via shouldGenerate() gate is a thoughtful addition. The spec is ready for implementation.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator-delta 12 0 0 0 0 0 0
Total 12 0 0 0 0 0 0

Note: This delta review verified that all prior findings were addressed. No new issues identified.

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

This is a delta review scoped to changes since the prior automated review (commit b9fbc277d620). The delta consists of 1 commit with +110/-56 line changes.

Delta Changes Reviewed

Change Assessment
D5 reworded — "every conversation always has title and summary" guarantee ✅ Clarification, strengthens the contract
D8 added — Backfill decision locked ✅ Formalizes deferred capability
saveFallback() expanded — now generates both title AND summary ✅ Addresses "no null metadata" guarantee
Section 4.7 (Backfill Strategy) added ✅ Comprehensive, covers graceful degradation
OQ2, OQ3 added — internal messages, retrieval approach ✅ Explicitly captures deferred scope
Deferred table expanded — "Likely approach" column ✅ Useful implementation guidance
Consistency Model rewritten ✅ Much clearer guarantees
Backfill risk mitigation added ✅ Appropriate coverage

Prior Feedback Status

All issues from previous reviews remain addressed. This delta only adds clarifications and improvements — no regressions.

Review Assessment

The delta strengthens the spec without introducing new concerns:

  1. "No null metadata" guarantee is now explicit and backed by the expanded saveFallback() implementation that always produces both title and summary.

  2. Backfill strategy is sound: paginated, deterministic-only (no LLM cost), and preserves the read-time fallback as a safety net.

  3. Open questions are well-scoped — deferring internal message visibility and retrieval approach to V1 is pragmatic for V0.

  4. Consistency model now clearly documents the brief staleness window and backfill coverage.


✅ APPROVE

Summary: Clean delta that improves spec clarity without introducing new issues. The "every conversation always has metadata" guarantee is now well-specified with deterministic fallback for short conversations and LLM generation for longer ones. Ready for implementation.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator-delta 8 0 0 0 0 0 0
Total 8 0 0 0 0 0 0

Note: This delta review verified all changes are clarifications/improvements. No new issues identified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants