Skip to content

DRILL-8529: Caching QueryPlan Results#3023

Open
vdegans wants to merge 10 commits intoapache:masterfrom
vdegans:rebased-cache
Open

DRILL-8529: Caching QueryPlan Results#3023
vdegans wants to merge 10 commits intoapache:masterfrom
vdegans:rebased-cache

Conversation

@vdegans
Copy link

@vdegans vdegans commented Sep 11, 2025

DRILL-8529: Caching QueryPlan Results

Description

Implements a caching mechanism for query plans and transformations, to shorten the prepare phase.

Documentation

The cache behavior can be customized via drill-override.conf

planner {
  query {
    cache {
      max_entries_amount: 100       # Maximum number of cached query plans (default: 100)
      plan_cache_ttl_minutes: 300   # Time-to-live for cached query plans in minutes (default: 300)
    }
  }
  transform {
    cache {
      max_entries_amount: 100       # Maximum number of cached transform plans (default: 100)
      plan_cache_ttl_minutes: 300   # Time-to-live for cached transform plans in minutes (default: 300)
    }
  }
}
  • max_entries_amount: limits the number of cached plans. Older entries are evicted when the limit is reached.
  • plan_cache_ttl_minutes: sets the lifetime of cached plans. Expired entries are recomputed on next use.

At runtime, caching can also be toggled with:
planner.cache.enable (true = enabled, false = disabled)

Testing

  • Manual testing shows reduced query planning time for repeated large query plans.
  • Automated tests are being added to verify correctness and cache eviction behavior.

@cgivre cgivre added doc-impacting PRs that affect the documentation performance PRs that Improve Performance labels Sep 11, 2025
@cgivre
Copy link
Contributor

cgivre commented Sep 11, 2025

@vdegans Wow! This is an impressive first contribution to Drill! Before we start review, would you please do a clean rebase on master? I'm sure you didn't mean to pull in all those old versions.

@vdegans vdegans force-pushed the rebased-cache branch 2 times, most recently from 5ac8803 to c49c581 Compare September 11, 2025 13:18
@vdegans
Copy link
Author

vdegans commented Sep 11, 2025

@cgivre Thanks! I fixed the rebase.

@cgivre
Copy link
Contributor

cgivre commented Sep 11, 2025

@cgivre Thanks! I fixed the rebase.

Thanks. Before I start review I had a few questions:

  1. Is there ever a case where someone might want different cache settings for different storage plugins?
  2. Or.. is there a situation where a user might want to disable caching entirely for certain plugins but not others?

What I'm getting at here is would it make sense to have global settings which you already have, but then also give the user the ability to set custom settings for specific plugins if they wanted to do so. I genuinely don't know if that is worth the effort or not. I could imagine this being more of an issue with data where the schema could change--MongoDB or JSON files for instance--and queries like SELECT * might bring back different data every time you run them.

@vdegans
Copy link
Author

vdegans commented Sep 11, 2025

Thanks, that’s a great point. I agree that giving users control over caching at the per-plugin level makes sense. Some plugins, especially ones where the underlying data might change frequently, like MongoDB or JSON files, could benefit from having caching disabled or customized independently from the global settings. I think supporting per-plugin cache configuration would give users the flexibility to optimize caching behavior for their specific use cases and improve the overall user experience.

@cgivre
Copy link
Contributor

cgivre commented Sep 18, 2025

@vdegans Hi Vincent, Any update?

@vdegans
Copy link
Author

vdegans commented Sep 19, 2025

Hi @cgivre, I was thinking about the suggestion for a setting to enable/disable caching per plugin and I got stuck on the idea of when to cache and when not to.
I think the best solution right now is to disable caching all together when one of the used plugins is set to disabled for caching, since I am not sure how a partially cached query plan would work (if it could even work).

I didn't get much time to look at the code yet, but I would like to hear your thoughts about the settings per plugin.

@cgivre
Copy link
Contributor

cgivre commented Sep 22, 2025

Hi @cgivre, I was thinking about the suggestion for a setting to enable/disable caching per plugin and I got stuck on the idea of when to cache and when not to. I think the best solution right now is to disable caching all together when one of the used plugins is set to disabled for caching, since I am not sure how a partially cached query plan would work (if it could even work).

I didn't get much time to look at the code yet, but I would like to hear your thoughts about the settings per plugin.

The whole idea of partially cached query plans is extremely tricky. I think but could be wrong but there may have been some work on that from the Calcite team at one point.

In any event, my suggestion would be to start simple. Let's get all the unit tests to pass and just start with simple caching. IE: exact query match. Once that's done and merged, we can iterate and find improvements.

@vdegans vdegans marked this pull request as ready for review October 9, 2025 11:46
@cgivre
Copy link
Contributor

cgivre commented Oct 9, 2025

@vdegans You should rebase on current master. I think that will solve the size limit issue you're running into.

@vdegans
Copy link
Author

vdegans commented Oct 9, 2025

I think caffeine might cause this, should I add caffeine to the exclude list?

@cgivre
Copy link
Contributor

cgivre commented Oct 9, 2025

I think caffeine might cause this, should I add caffeine to the exclude list?

You can either exclude caffeine or bump up the max size. Either is fine.

@vdegans
Copy link
Author

vdegans commented Oct 9, 2025

Locally this seems to have fixed the issue

@cgivre
Copy link
Contributor

cgivre commented Feb 4, 2026

@vdegans I have some time now and can assist with this. Could you please rebase on current master and once you've done that, I'll see how I can help.

@vdegans
Copy link
Author

vdegans commented Feb 4, 2026

Thank you for your time, I have just rebased the branch.

@cgivre
Copy link
Contributor

cgivre commented Feb 4, 2026

@vdegans Can you take a look at my fork (https://github.com/cgivre/drill/tree/rebased-cache)? I think these changes will help get the CI/CD to pass.

…ogic

- Fix CustomCacheManager static initializer that crashed tests during class load
  by using lazy initialization with double-checked locking
- Consolidate caching logic: remove duplicate ConcurrentHashMap cache from
  DrillSqlWorker and redundant caching in Foreman, use CustomCacheManager only
- Clean up verbose debug logging (remove logger.info calls, printStackTrace)
- Add clearCaches() and reset() methods for testing support
- Only cache SELECT queries (not DDL/DML statements)
Copy link
Contributor

@cgivre cgivre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this contribution. I have some minor nits, and some questions:

  1. Is there ever a situation where the underlying data would change and it would affect the query plan such that Drill isn't returning the correct data?
  2. If possible, I'd really like to see a flag added to the metadata that is returned which would indicate that the query plan was from cache.
  3. Do you anticipate any security issues from using cached query plans? For instance, let's say that we have user translation enabled and user 1 executes a query against a MySQL database. User 2 then tries the same query, but does not have the same access. Would user 2 be able to access the data?

}

public static void logCacheStats() {
ensureInitialized();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These stats would be really useful to expose in the UI. However, I'd suggest adding that in another PR. Let's get this merged first!

if (planCacheEnabled) {
CustomCacheManager.putTransformedPlan(key, output);
logger.debug("Cached transform result for phase: {}", phase);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would be possible to include a flag in the results that would indicate that the query plan was retrieved from cache?

@SuppressWarnings("deprecation")
final OptionDefinition[] definitions = new OptionDefinition[]{
new OptionDefinition(PlannerSettings.PLAN_CACHE),
// new OptionDefinition(PlannerSettings.PLAN_CACHE_TTL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove commented out lines.

* Moves query to RUNNING state.
*/
private void startQueryProcessing() {
logger.info("Starting query processing");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make debug. Drill already emits a lot of logs.

logger.info("Query text for query with id {} issued by {}: {}", queryIdString,
queryContext.getQueryUserName(), sql);
runSQL(sql);
logger.info("RunSQL is executed within {}", new Date().getTime() - start);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to debug.

@cgivre cgivre changed the title Caching QueryPlan Results DRILL-8529: Caching QueryPlan Results Feb 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-impacting PRs that affect the documentation performance PRs that Improve Performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants