
Optimize Parquet block index updates#7669

Open
SungJin1212 wants to merge 2 commits into master from optimize-bucket-index-parquet-marker-reads


@SungJin1212

Member

What this PR does:
The bucket index updater currently re-reads every block's parquet converter marker from object storage on each update cycle, issuing one GET call per parquet block per cycle even when nothing has changed.
This PR skips the marker re-read when the previous index already holds a parquet entry with a valid version for that block, reducing repeated GET calls.
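
The skip logic described above could look roughly like the following. This is a minimal, self-contained sketch and not the code in this PR: every identifier here (`Block`, `ParquetEntry`, `markerReader`, `updateParquetEntries`, `currentParquetVersion`, `errMarkerNotFound`) is hypothetical, and the "valid version" check is assumed to be a simple comparison against a minimum version.

```go
package main

import (
	"errors"
	"fmt"
)

// All names in this sketch are hypothetical and exist for illustration only.

// currentParquetVersion is the assumed minimum marker version considered valid.
const currentParquetVersion = 1

// ParquetEntry is the parquet information kept per block in the index.
type ParquetEntry struct {
	Version int
}

// Block is a simplified bucket index entry.
type Block struct {
	ID      string
	Parquet *ParquetEntry // nil when no parquet marker is known
}

var errMarkerNotFound = errors.New("parquet marker not found")

// markerReader stands in for one GET call against object storage.
type markerReader func(blockID string) (*ParquetEntry, error)

// updateParquetEntries builds the new index entries for blockIDs. It reuses the
// entry from the previous index when that entry has a valid version, and only
// calls read (one GET) otherwise. It also returns the number of GETs issued.
func updateParquetEntries(prev map[string]*Block, blockIDs []string, read markerReader) (map[string]*Block, int, error) {
	out := make(map[string]*Block, len(blockIDs))
	gets := 0
	for _, id := range blockIDs {
		// Fast path: a valid entry in the previous index means no GET is needed.
		if old, ok := prev[id]; ok && old.Parquet != nil && old.Parquet.Version >= currentParquetVersion {
			out[id] = &Block{ID: id, Parquet: old.Parquet}
			continue
		}

		// Slow path: the marker is unknown or outdated, so read it from storage.
		gets++
		entry, err := read(id)
		if errors.Is(err, errMarkerNotFound) {
			out[id] = &Block{ID: id} // block has not been converted yet
			continue
		}
		if err != nil {
			return nil, gets, fmt.Errorf("read parquet marker for block %s: %w", id, err)
		}
		out[id] = &Block{ID: id, Parquet: entry}
	}
	return out, gets, nil
}

func main() {
	// Block "a" already has a valid entry in the previous index; "b" and "c" do not.
	prev := map[string]*Block{
		"a": {ID: "a", Parquet: &ParquetEntry{Version: currentParquetVersion}},
	}
	read := func(id string) (*ParquetEntry, error) {
		if id == "b" {
			return &ParquetEntry{Version: currentParquetVersion}, nil
		}
		return nil, errMarkerNotFound
	}

	_, gets, err := updateParquetEntries(prev, []string{"a", "b", "c"}, read)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GET calls issued:", gets)
}
```

In this sketch, blocks without a marker are still probed on every cycle; only blocks whose previous entry is valid take the fast path, which matches the behavior described above.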

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
