add setting to define filename pattern for part exports#1512
add setting to define filename pattern for part exports#1512arthurpassos wants to merge 6 commits intoantalya-25.8from
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8163fb0b93
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| try | ||
| { | ||
| const auto filename = buildDestinationFilename(manifest, destination_storage_id, local_context); |
There was a problem hiding this comment.
Use source table macros when building export filename
The new filename expansion passes destination_storage_id into buildDestinationFilename, so the {shard}/{replica} lookup checks the destination database instead of the source table’s database. When exporting from a DatabaseReplicated source to a regular S3 table (the common path), macro_info.shard and macro_info.replica stay unset and expansion falls back to server-config macros, which can raise NO_ELEMENTS_IN_CONFIG or inject wrong values. This means the newly added pattern feature does not work for the intended sharded/replicated source scenario.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
oh, port + ai completion suggestion glitch, fixed
| manifest.parquet_parallel_encoding = json->getValue<bool>("parquet_parallel_encoding"); | ||
| manifest.max_bytes_per_file = json->getValue<size_t>("max_bytes_per_file"); | ||
| manifest.max_rows_per_file = json->getValue<size_t>("max_rows_per_file"); | ||
| manifest.filename_pattern = json->getValue<String>("filename_pattern"); |
There was a problem hiding this comment.
Preserve backward compatibility for manifest deserialization
fromJsonString now unconditionally reads filename_pattern, but older metadata.json export manifests do not contain this key. On mixed-version or restarted clusters with pre-existing export tasks, json->getValue<String>("filename_pattern") will throw during task loading and prevent those exports from being processed. This field needs an optional read with a default (matching {part_name}_{checksum}) like other backward-compatible manifest fields.
Useful? React with 👍 / 👎.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add setting to define filename pattern for part exports - helps with sharding
Documentation entry for user-facing changes
...
CI/CD Options
Exclude tests:
Regression jobs to run: