fix: Improve BigQuery Agent Analytics plugin reliability and code quality by caohy1988 · Pull Request #4491 · google/adk-python

caohy1988 · 2026-02-14T07:41:21Z

This CL fixes several bugs in the BigQuery Agent Analytics plugin and refactors the internal data-passing pattern for better type safety and maintainability.

Stale Loop State Validation: Use loop.is_closed() — a public, reliable API — to detect and clean up stale asyncio loop states in _batch_processor_prop, _get_loop_state, and flush. The previous approach used asyncio.Queue._loop which is None on Python 3.10+, causing the check to always treat states as stale.
Quota Project ID Fallback: Remove the or project_id fallback when setting quota_project_id on BigQueryWriteAsyncClient. This fixes Workload Identity Federation flows where the federated identity lacks serviceusage.services.use on the quota project.
Kwargs Passthrough: Pass **kwargs through to _log_event in all callbacks. Previously only model callbacks forwarded them, causing custom attributes (e.g. customer_id) to silently drop for agent, tool, run, and error events.
State Delta Logging: Replace the dead on_state_change_callback (never invoked by the framework) with on_event_callback, which is already dispatched by the runner for every event. Remove duplicate STATE_DELTA logging from after_tool_callback.
EventData Dataclass: Replace the **kwargs-as-data-bus pattern in _log_event with an explicit EventData dataclass. This makes the interface self-documenting, catches typos at construction time, and eliminates shared dict mutation across _resolve_span_ids, _extract_latency, and _enrich_attributes. All 12 callback call sites now construct typed EventData instances.
Multi-Subagent Tool Logging Tests: Add TestMultiSubagentToolLogging (6 tests) verifying that tool events are correctly attributed to subagents in multi-turn, multi-agent scenarios. Total tests: 111 (up from 60).

google-cla · 2026-02-14T07:41:38Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

adk-bot · 2026-02-14T07:42:18Z

Response from ADK Triaging Agent

Hello @caohy1988, thank you for creating this PR!

Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). You can visit https://cla.developers.google.com/ to see your current agreements or to sign a new one.

Also, this PR is a bug fix. Could you please associate a GitHub issue with this PR? If there is no existing issue, could you please create one?

In addition, could you please provide logs or a screenshot after the fix is applied? This information will help reviewers to review your PR more efficiently. Thanks!

caohy1988 · 2026-02-14T17:51:38Z

src/google/adk/plugins/bigquery_agent_analytics_plugin.py

 if TYPE_CHECKING:
  from ..agents.invocation_context import InvocationContext

+


remove unnecessary empty line

caohy1988 · 2026-02-14T18:19:31Z

src/google/adk/plugins/bigquery_agent_analytics_plugin.py

-      state_delta: dict[str, Any],
+      invocation_context: InvocationContext,
+      event: "Event",
      **kwargs,


can you please help me understand why have **kwargs here? ADK doesn't set addtional arguments when running the callback

Good catch — the **kwargs was added as a forward-compatibility guard in case ADK's callback signatures evolve, but you're right that the framework currently passes only the named parameters declared in BasePlugin. Removed **kwargs from all 12 callback signatures to match BasePlugin exactly. Also fixed on_model_error_callback to include the llm_request parameter that the base class declares.

…lity This CL fixes several bugs in the BigQuery Agent Analytics plugin and refactors the internal data-passing pattern for better type safety and maintainability. - **Stale Loop State Validation:** Use `loop.is_closed()` — a public, reliable API — to detect and clean up stale asyncio loop states in `_batch_processor_prop`, `_get_loop_state`, and `flush`. The previous approach used `asyncio.Queue._loop` which is `None` on Python 3.10+, causing the check to always treat states as stale. - **Quota Project ID Fallback:** Remove the `or project_id` fallback when setting `quota_project_id` on `BigQueryWriteAsyncClient`. This fixes Workload Identity Federation flows where the federated identity lacks `serviceusage.services.use` on the quota project. - **Kwargs Passthrough:** Pass `**kwargs` through to `_log_event` in all callbacks. Previously only model callbacks forwarded them, causing custom attributes (e.g. `customer_id`) to silently drop for agent, tool, run, and error events. - **State Delta Logging:** Replace the dead `on_state_change_callback` (never invoked by the framework) with `on_event_callback`, which is already dispatched by the runner for every event. Remove duplicate `STATE_DELTA` logging from `after_tool_callback`. - **EventData Dataclass:** Replace the `**kwargs`-as-data-bus pattern in `_log_event` with an explicit `EventData` dataclass. This makes the interface self-documenting, catches typos at construction time, and eliminates shared dict mutation across `_resolve_span_ids`, `_extract_latency`, and `_enrich_attributes`. All 12 callback call sites now construct typed `EventData` instances. - **Multi-Subagent Tool Logging Tests:** Add `TestMultiSubagentToolLogging` (6 tests) verifying that tool events are correctly attributed to subagents in multi-turn, multi-agent scenarios. Total tests: 111 (up from 60). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove intermediate `BigQueryLoggerConfig = bigquery_agent_analytics_plugin.BigQueryLoggerConfig` alias and use the fully qualified reference directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

haiyuan-eng-google · 2026-02-17T15:51:21Z

I've tested the latest of BigQuery Plugin in ADK, and I didn't catch how log_session_metadata works.

I added state to my session, nothing is logged. I activated and then deactivated the log_session_metadata option in the plugins; however, no changes occurred in the data logged into BigQuery.

So, I delved into the code. the SESSION object from the callback_context does not contain "metadata" attribute.

Have you confused it with the "state" attribute? Or did I miss something?

…istent metadata attribute Session has no `metadata` attribute (Pydantic model with `extra='forbid'`), and CallbackContext has no public `session` attribute. This made `log_session_metadata` a complete no-op. Fix by accessing `callback_context._invocation_context.session` and logging session_id, app_name, user_id, and the session state dict (which holds user-set metadata like thread_id, customer_id, etc.). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

caohy1988 · 2026-02-17T15:55:31Z

Good catch — log_session_metadata was indeed completely broken (dead code). Two issues:

CallbackContext (which is Context) doesn't expose a public .session attribute, so hasattr(callback_context, "session") was always False
Even if it did reach Session, there's no metadata attribute — Session is a Pydantic model with extra='forbid' and only has id, app_name, user_id, state, events, last_update_time

Fixed in commit ed1d1a7 — log_session_metadata now accesses callback_context._invocation_context.session and logs:

{
  "session_metadata": {
    "session_id": "...",
    "app_name": "...",
    "user_id": "...",
    "state": {"thread_id": "gchat-123", "customer_id": "cust-42"}
  }
}

The state dict is where users store metadata like gchat thread-id, customer_id, etc. — so this is the correct field to capture.

haiyuan-eng-google · 2026-02-17T16:00:06Z

content_json, content_parts, parser_truncated = await self.parser.parse(
raw_content
) 'HybridContentParser | None' has no attribute 'parse'

self.parser is typed as Optional[HybridContentParser] but was accessed without a null check at lines 2046-2048. While _ensure_started() should initialize it before _log_event runs, the type checker cannot prove this and a partial _lazy_setup failure could leave parser as None. Add an explicit guard that logs a warning and returns early. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

caohy1988 · 2026-02-17T16:02:59Z

Good catch — self.parser is typed as Optional[HybridContentParser] (initialized to None in __init__) but was used without a null check in _log_event at lines 2046-2048.

In normal flow, _ensure_started() → _lazy_setup() creates the parser before _log_event runs, so it wouldn't be None at runtime. But the type checker can't prove this, and if _lazy_setup partially fails (e.g. schema creation succeeds but parser creation throws), self.parser could remain None.

Fixed in commit 6e5046b — added an explicit null guard that logs a warning and returns early:

if not self.parser:
    logger.warning("Parser not initialized; skipping event %s.", event_type)
    return

haiyuan-eng-google · 2026-02-17T16:05:42Z

ERROR: Found 5 breaking change(s) in API surface.
src/google/adk/plugins/bigquery_agent_analytics_plugin.py:1596: BigQueryAgentAnalyticsPlugin.batch_processor: Attribute value was changed: None -> unset
src/google/adk/plugins/bigquery_agent_analytics_plugin.py:1608: BigQueryAgentAnalyticsPlugin.write_client: Attribute value was changed: None -> unset
src/google/adk/plugins/bigquery_agent_analytics_plugin.py:1619: BigQueryAgentAnalyticsPlugin.write_stream: Attribute value was changed: None -> unset
/tmp/tmp.EFKMo1rzPc/src/google/adk/plugins/bigquery_agent_analytics_plugin.py:1629: BigQueryAgentAnalyticsPlugin.getattribute: Public object was removed
/tmp/tmp.EFKMo1rzPc/src/google/adk/plugins/bigquery_agent_analytics_plugin.py:2172: BigQueryAgentAnalyticsPlugin.on_state_change_callback: Public object was removed

caohy1988 · 2026-02-17T16:09:19Z

Thanks for running the API surface check. All 5 breaking changes are now fixed in commit 0b143e0:

Items 1-4 (batch_processor, write_client, write_stream value change + __getattribute__ removal):

Our refactoring replaced the original __getattribute__ + class-level = None masking pattern with clean @property decorators. While functionally equivalent, the API checker saw the class-level attributes change from None to unset (property descriptor).

Fixed by restoring the original pattern: class-level = None attributes + __getattribute__ routing to private _*_prop properties. The stale loop cleanup logic from our fix is preserved inside _batch_processor_prop.

Item 5 (on_state_change_callback removed):

This was intentionally replaced with on_event_callback because on_state_change_callback is dead code — it's not declared in BasePlugin, not dispatched by PluginManager, and not invoked by Runner. State deltas are now captured via on_event_callback which the framework actually calls.

However, to avoid breaking any subclasses that may override it, we've added back a deprecated stub that logs a warning:

async def on_state_change_callback(self, *, callback_context, state_delta):
    """Deprecated: use on_event_callback instead."""
    logger.warning("on_state_change_callback is deprecated and never called by the framework...")

…ent, write_stream, and on_state_change_callback Restore class-level `= None` attributes and `__getattribute__` override for batch_processor, write_client, and write_stream to preserve the public API surface. The properties were renamed to `_*_prop` (private) with the `__getattribute__` routing access to them. Add deprecated `on_state_change_callback` stub for backward compat. This method is never invoked by the framework but retaining it prevents breaking subclasses that may override it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

haiyuan-eng-google · 2026-02-17T16:14:28Z

| Running: uv run contributing/samples/hello_world/main.py

Log setup complete: /tmpfs/tmp/agents_log/agent.20260217_111202.log
To access latest log: tail -F /tmpfs/tmp/agents_log/agent.latest.log
Start time: 1771344722.3483453

** User says: {'parts': [{'text': 'Hi'}], 'role': 'user'}
** hello_world_agent: Hello! I'm a hello world agent that can roll an 8-sided die and check prime numbers. How can I help you today?

** User says: {'parts': [{'text': 'Roll a die with 100 sides'}], 'role': 'user'}
Traceback (most recent call last):
File "/tmpfs/src/piper/google3/third_party/py/google/adk/open_source_workspace/contributing/samples/hello_world/main.py", line 103, in
asyncio.run(main())
File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/tmpfs/src/piper/google3/third_party/py/google/adk/open_source_workspace/contributing/samples/hello_world/main.py", line 86, in main
await check_rolls_in_state(1)
File "/tmpfs/src/piper/google3/third_party/py/google/adk/open_source_workspace/contributing/samples/hello_world/main.py", line 77, in check_rolls_in_state
assert len(session.state['rolls']) == rolls_size
~~~~~~~~~~~~~^^^^^^^^^
KeyError: 'rolls'

caohy1988 · 2026-02-17T16:17:06Z

This integration test failure is not caused by our PR. Our changes only touch 2 files:

src/google/adk/plugins/bigquery_agent_analytics_plugin.py
tests/unittests/plugins/test_bigquery_agent_analytics_plugin.py

The failing test (contributing/samples/hello_world/main.py) does not use the BQ analytics plugin at all — it creates an InMemoryRunner with no plugins. The failure is:

assert len(session.state['rolls']) == rolls_size
KeyError: 'rolls'

This happens because the roll_die tool writes to tool_context.state['rolls'], but only if the LLM actually calls the tool. Since LLM behavior is non-deterministic, the model may not always invoke roll_die when prompted "Roll a die with 100 sides", leaving the rolls key absent from session state.

This appears to be a pre-existing flaky test — the main branch also shows test failures/cancellations in CI. Our PR has zero impact on the runner, session state, tool execution, or agent logic.

adk-bot added the services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc label Feb 14, 2026

caohy1988 force-pushed the fix/loop-state-validation branch 2 times, most recently from 401fb43 to d9347d7 Compare February 14, 2026 17:49

caohy1988 changed the title ~~fix: use loop.is_closed() for stale loop state validation in BigQuery plugin~~ fix: Improve BigQuery Agent Analytics plugin reliability and code quality Feb 14, 2026

caohy1988 commented Feb 14, 2026

View reviewed changes

caohy1988 force-pushed the fix/loop-state-validation branch from d9347d7 to 35d6028 Compare February 14, 2026 17:53

haiyuan-eng-google self-assigned this Feb 14, 2026

haiyuan-eng-google self-requested a review February 14, 2026 17:54

caohy1988 commented Feb 14, 2026

View reviewed changes

caohy1988 force-pushed the fix/loop-state-validation branch from 35d6028 to 2f66ad4 Compare February 14, 2026 18:38

caohy1988 mentioned this pull request Feb 17, 2026

BigQueryAgentAnalyticsPlugin: Pass **kwargs to _log_event for all callbacks (not just model callbacks) #4330

Open

caohy1988 force-pushed the fix/loop-state-validation branch from 2f66ad4 to bbf648b Compare February 17, 2026 15:47

cleanup: Inline BigQueryLoggerConfig references in tests

e8a80aa

Remove intermediate `BigQueryLoggerConfig = bigquery_agent_analytics_plugin.BigQueryLoggerConfig` alias and use the fully qualified reference directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

caohy1988 force-pushed the fix/loop-state-validation branch from bbf648b to e8a80aa Compare February 17, 2026 15:49

caohy1988 force-pushed the fix/loop-state-validation branch from 0b143e0 to aa2c196 Compare February 17, 2026 16:12

haiyuan-eng-google closed this Feb 17, 2026

		if TYPE_CHECKING:
		from ..agents.invocation_context import InvocationContext

Conversation

caohy1988 commented Feb 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

google-cla bot commented Feb 14, 2026

Uh oh!

adk-bot commented Feb 14, 2026

Uh oh!

caohy1988 Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

caohy1988 Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

caohy1988 Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

haiyuan-eng-google commented Feb 17, 2026

Uh oh!

caohy1988 commented Feb 17, 2026

Uh oh!

haiyuan-eng-google commented Feb 17, 2026

Uh oh!

caohy1988 commented Feb 17, 2026

Uh oh!

haiyuan-eng-google commented Feb 17, 2026

Uh oh!

caohy1988 commented Feb 17, 2026

Uh oh!

haiyuan-eng-google commented Feb 17, 2026

| Running: uv run contributing/samples/hello_world/main.py

Log setup complete: /tmpfs/tmp/agents_log/agent.20260217_111202.log To access latest log: tail -F /tmpfs/tmp/agents_log/agent.latest.log Start time: 1771344722.3483453

Uh oh!

caohy1988 commented Feb 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

caohy1988 commented Feb 14, 2026 •

edited

Loading

Log setup complete: /tmpfs/tmp/agents_log/agent.20260217_111202.log
To access latest log: tail -F /tmpfs/tmp/agents_log/agent.latest.log
Start time: 1771344722.3483453