fix: avoid panic parsing non-ASCII runtime config values#23316
fix: avoid panic parsing non-ASCII runtime config values#23316ByteBaker wants to merge 1 commit into
Conversation
|
Hi @alamb for your review please. |
There was a problem hiding this comment.
Thanks @ByteBaker can we please explore if runtime config can be part of SessionConfig? In SessionConfig there is already a framework that enforces type checking.
Edit: runtime configs looks looks finite according to set_runtime_variable, so should fit into main configs rather than having separate config set
|
@comphead yes that'd be a lot more cleaner/robust. However, that would also be a significant change in user-facing ergonomics. My suggestion would be to keep this PR to ensure the panic is resolved, and separately start a discussion around doing that, since that's a larger change surface (and the question if we should remove all the runtime variables from the Or are you suggesting to only use those for internal validation of the values, w/o actually modifying the user facing API? In which case I agree w/ your position. However, this PR can still remain a tiny fix, and your suggestion could become a more broader refactor, which we should track via a different issue altogether IMO. |
I think one difference was that the RuntimeConfig is potentially shared across multiple SessionConfigs (and maybe can also be adjusted during plan execution) I agree trying to consolidate RuntimeConfig and SessioNConfig should be considered as an independent project |
alamb
left a comment
There was a problem hiding this comment.
Looks good to me -- thank you @ByteBaker
I had one testing question, but otherwise 👍
| for duration in duration.split_inclusive(&['m', 's']) { | ||
| let (number, unit) = duration.split_at(duration.len() - 1); | ||
| let (unit_start, unit) = | ||
| duration.char_indices().next_back().ok_or_else(|| { |
There was a problem hiding this comment.
Yes. The tests for empty value and non-ASCII value are covered by test_parse_duration. The ok_or_else() branch itself is defensive since next_back() can be None only for an empty string, which is rejected by the earlier duration.trim().is_empty() guard before the loop.
93cb49b to
332a8da
Compare
Which issue does this PR close?
Rationale for this change
Presence of a non-ascii character in the value while setting any datafusion runtime variable was panicking.
What changes are included in this PR?
Better handling/iteration on character-boundary, and emits plan error now instead of panicking in case of such characters.
Are these changes tested?
Yes. UTs have been updated to include such cases.
Are there any user-facing changes?
No.