fix: Improve BigQuery Agent Analytics plugin reliability and code quality#4491
fix: Improve BigQuery Agent Analytics plugin reliability and code quality#4491caohy1988 wants to merge 5 commits intogoogle:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
Response from ADK Triaging Agent Hello @caohy1988, thank you for creating this PR! Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). You can visit https://cla.developers.google.com/ to see your current agreements or to sign a new one. Also, this PR is a bug fix. Could you please associate a GitHub issue with this PR? If there is no existing issue, could you please create one? In addition, could you please provide logs or a screenshot after the fix is applied? This information will help reviewers to review your PR more efficiently. Thanks! |
401fb43 to
d9347d7
Compare
| if TYPE_CHECKING: | ||
| from ..agents.invocation_context import InvocationContext | ||
|
|
||
|
|
There was a problem hiding this comment.
remove unnecessary empty line
d9347d7 to
35d6028
Compare
| state_delta: dict[str, Any], | ||
| invocation_context: InvocationContext, | ||
| event: "Event", | ||
| **kwargs, |
There was a problem hiding this comment.
can you please help me understand why have **kwargs here? ADK doesn't set addtional arguments when running the callback
There was a problem hiding this comment.
Good catch — the **kwargs was added as a forward-compatibility guard in case ADK's callback signatures evolve, but you're right that the framework currently passes only the named parameters declared in BasePlugin. Removed **kwargs from all 12 callback signatures to match BasePlugin exactly. Also fixed on_model_error_callback to include the llm_request parameter that the base class declares.
…lity This CL fixes several bugs in the BigQuery Agent Analytics plugin and refactors the internal data-passing pattern for better type safety and maintainability. - **Stale Loop State Validation:** Use `loop.is_closed()` — a public, reliable API — to detect and clean up stale asyncio loop states in `_batch_processor_prop`, `_get_loop_state`, and `flush`. The previous approach used `asyncio.Queue._loop` which is `None` on Python 3.10+, causing the check to always treat states as stale. - **Quota Project ID Fallback:** Remove the `or project_id` fallback when setting `quota_project_id` on `BigQueryWriteAsyncClient`. This fixes Workload Identity Federation flows where the federated identity lacks `serviceusage.services.use` on the quota project. - **Kwargs Passthrough:** Pass `**kwargs` through to `_log_event` in all callbacks. Previously only model callbacks forwarded them, causing custom attributes (e.g. `customer_id`) to silently drop for agent, tool, run, and error events. - **State Delta Logging:** Replace the dead `on_state_change_callback` (never invoked by the framework) with `on_event_callback`, which is already dispatched by the runner for every event. Remove duplicate `STATE_DELTA` logging from `after_tool_callback`. - **EventData Dataclass:** Replace the `**kwargs`-as-data-bus pattern in `_log_event` with an explicit `EventData` dataclass. This makes the interface self-documenting, catches typos at construction time, and eliminates shared dict mutation across `_resolve_span_ids`, `_extract_latency`, and `_enrich_attributes`. All 12 callback call sites now construct typed `EventData` instances. - **Multi-Subagent Tool Logging Tests:** Add `TestMultiSubagentToolLogging` (6 tests) verifying that tool events are correctly attributed to subagents in multi-turn, multi-agent scenarios. Total tests: 111 (up from 60). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
35d6028 to
2f66ad4
Compare
2f66ad4 to
bbf648b
Compare
Remove intermediate `BigQueryLoggerConfig = bigquery_agent_analytics_plugin.BigQueryLoggerConfig` alias and use the fully qualified reference directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bbf648b to
e8a80aa
Compare
|
I've tested the latest of BigQuery Plugin in ADK, and I didn't catch how log_session_metadata works. I added state to my session, nothing is logged. I activated and then deactivated the log_session_metadata option in the plugins; however, no changes occurred in the data logged into BigQuery. So, I delved into the code. the SESSION object from the callback_context does not contain "metadata" attribute. Have you confused it with the "state" attribute? Or did I miss something? |
…istent metadata attribute Session has no `metadata` attribute (Pydantic model with `extra='forbid'`), and CallbackContext has no public `session` attribute. This made `log_session_metadata` a complete no-op. Fix by accessing `callback_context._invocation_context.session` and logging session_id, app_name, user_id, and the session state dict (which holds user-set metadata like thread_id, customer_id, etc.). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Good catch —
Fixed in commit ed1d1a7 — {
"session_metadata": {
"session_id": "...",
"app_name": "...",
"user_id": "...",
"state": {"thread_id": "gchat-123", "customer_id": "cust-42"}
}
}The |
|
content_json, content_parts, parser_truncated = await self.parser.parse( |
self.parser is typed as Optional[HybridContentParser] but was accessed without a null check at lines 2046-2048. While _ensure_started() should initialize it before _log_event runs, the type checker cannot prove this and a partial _lazy_setup failure could leave parser as None. Add an explicit guard that logs a warning and returns early. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Good catch — In normal flow, Fixed in commit 6e5046b — added an explicit null guard that logs a warning and returns early: if not self.parser:
logger.warning("Parser not initialized; skipping event %s.", event_type)
return |
|
ERROR: Found 5 breaking change(s) in API surface. |
|
Thanks for running the API surface check. All 5 breaking changes are now fixed in commit 0b143e0: Items 1-4 ( Our refactoring replaced the original Fixed by restoring the original pattern: class-level Item 5 ( This was intentionally replaced with However, to avoid breaking any subclasses that may override it, we've added back a deprecated stub that logs a warning: async def on_state_change_callback(self, *, callback_context, state_delta):
"""Deprecated: use on_event_callback instead."""
logger.warning("on_state_change_callback is deprecated and never called by the framework...") |
…ent, write_stream, and on_state_change_callback Restore class-level `= None` attributes and `__getattribute__` override for batch_processor, write_client, and write_stream to preserve the public API surface. The properties were renamed to `_*_prop` (private) with the `__getattribute__` routing access to them. Add deprecated `on_state_change_callback` stub for backward compat. This method is never invoked by the framework but retaining it prevents breaking subclasses that may override it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0b143e0 to
aa2c196
Compare
| Running: uv run contributing/samples/hello_world/main.pyLog setup complete: /tmpfs/tmp/agents_log/agent.20260217_111202.log
|
|
This integration test failure is not caused by our PR. Our changes only touch 2 files:
The failing test ( This happens because the This appears to be a pre-existing flaky test — the |
This CL fixes several bugs in the BigQuery Agent Analytics plugin and refactors the internal data-passing pattern for better type safety and maintainability.
loop.is_closed()— a public, reliable API — to detect and clean up stale asyncio loop states in_batch_processor_prop,_get_loop_state, andflush. The previous approach usedasyncio.Queue._loopwhich isNoneon Python 3.10+, causing the check to always treat states as stale.or project_idfallback when settingquota_project_idonBigQueryWriteAsyncClient. This fixes Workload Identity Federation flows where the federated identity lacksserviceusage.services.useon the quota project.**kwargsthrough to_log_eventin all callbacks. Previously only model callbacks forwarded them, causing custom attributes (e.g.customer_id) to silently drop for agent, tool, run, and error events.on_state_change_callback(never invoked by the framework) withon_event_callback, which is already dispatched by the runner for every event. Remove duplicateSTATE_DELTAlogging fromafter_tool_callback.**kwargs-as-data-bus pattern in_log_eventwith an explicitEventDatadataclass. This makes the interface self-documenting, catches typos at construction time, and eliminates shared dict mutation across_resolve_span_ids,_extract_latency, and_enrich_attributes. All 12 callback call sites now construct typedEventDatainstances.TestMultiSubagentToolLogging(6 tests) verifying that tool events are correctly attributed to subagents in multi-turn, multi-agent scenarios. Total tests: 111 (up from 60).