Skip to content

fix: avoid panic parsing non-ASCII runtime config values#23316

Open
ByteBaker wants to merge 1 commit into
apache:mainfrom
ByteBaker:fix/set-df-runtime-non-ascii-panic
Open

fix: avoid panic parsing non-ASCII runtime config values#23316
ByteBaker wants to merge 1 commit into
apache:mainfrom
ByteBaker:fix/set-df-runtime-non-ascii-panic

Conversation

@ByteBaker

Copy link
Copy Markdown

Which issue does this PR close?

Rationale for this change

Presence of a non-ascii character in the value while setting any datafusion runtime variable was panicking.

What changes are included in this PR?

Better handling/iteration on character-boundary, and emits plan error now instead of panicking in case of such characters.

Are these changes tested?

Yes. UTs have been updated to include such cases.

Are there any user-facing changes?

No.

@github-actions github-actions Bot added the core Core DataFusion crate label Jul 3, 2026
@ByteBaker

Copy link
Copy Markdown
Author

Hi @alamb for your review please.

@comphead comphead left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ByteBaker can we please explore if runtime config can be part of SessionConfig? In SessionConfig there is already a framework that enforces type checking.

Edit: runtime configs looks looks finite according to set_runtime_variable, so should fit into main configs rather than having separate config set

@ByteBaker

ByteBaker commented Jul 4, 2026

Copy link
Copy Markdown
Author

@comphead yes that'd be a lot more cleaner/robust. However, that would also be a significant change in user-facing ergonomics.

My suggestion would be to keep this PR to ensure the panic is resolved, and separately start a discussion around doing that, since that's a larger change surface (and the question if we should remove all the runtime variables from the SET xyz semantics).


Or are you suggesting to only use those for internal validation of the values, w/o actually modifying the user facing API? In which case I agree w/ your position. However, this PR can still remain a tiny fix, and your suggestion could become a more broader refactor, which we should track via a different issue altogether IMO.

@alamb

alamb commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Thanks @ByteBaker can we please explore if runtime config can be part of SessionConfig? In SessionConfig there is already a framework that enforces type checking.

I think one difference was that the RuntimeConfig is potentially shared across multiple SessionConfigs (and maybe can also be adjusted during plan execution)

I agree trying to consolidate RuntimeConfig and SessioNConfig should be considered as an independent project

@alamb alamb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @ByteBaker

I had one testing question, but otherwise 👍

for duration in duration.split_inclusive(&['m', 's']) {
let (number, unit) = duration.split_at(duration.len() - 1);
let (unit_start, unit) =
duration.char_indices().next_back().ok_or_else(|| {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this check have a test?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The tests for empty value and non-ASCII value are covered by test_parse_duration. The ok_or_else() branch itself is defensive since next_back() can be None only for an empty string, which is rejected by the earlier duration.trim().is_empty() guard before the loop.

@ByteBaker ByteBaker force-pushed the fix/set-df-runtime-non-ascii-panic branch from 93cb49b to 332a8da Compare July 4, 2026 11:00
@alamb alamb enabled auto-merge July 4, 2026 11:03
@alamb alamb added this pull request to the merge queue Jul 4, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

panic on SET datafusion.runtime.* when value ends with non-ASCII byte (split_at char boundary)

3 participants