Bug description
string.Template compiles its substitution regex lazily and caches it on the class. Template._compile_pattern() reads the current pattern via cls.__dict__.get('pattern', _TemplatePattern). On the free-threaded build (--disable-gil), two concurrent first uses of a template can race: once one thread has compiled the pattern and stored the resulting re.Pattern back on the class, a second thread that entered _compile_pattern() reads that already-compiled re.Pattern and falls through to
re.compile(pattern, cls.flags | re.VERBOSE)
re.compile() rejects a flags argument when it is handed an already-compiled pattern, so the second thread raises
ValueError: cannot process flags argument with a compiled pattern
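The re.compile() behaviour described above can be checked on its own, without string.Template or threads. The following is a minimal sketch; the pattern text and the variable names are illustrative and are not taken from the string module.

```python
import re

# An illustrative pattern; any compiled pattern behaves the same way.
compiled = re.compile(r"\$(?P<named>[a-z]+)")

# Handing a compiled pattern back without flags returns the same object.
same = re.compile(compiled)

# Handing it back together with a non-zero flags argument is rejected.
try:
    re.compile(compiled, re.IGNORECASE | re.VERBOSE)
except ValueError as exc:
    message = str(exc)
else:
    message = None

print(same is compiled)
print(message)
```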
Only the first concurrent use of the base Template class is exposed. Subclasses compile their pattern eagerly in __init_subclass__, so they never reach _compile_pattern() concurrently on first use.
Reproduction
On a free-threaded build, starting several threads that each perform a first Template(...).substitute(...) reliably triggers the ValueError. In a test that raced 16 threads in a fresh process on their first use of string.Template, the ValueError occurred in 40 of 40 processes; with the fix below applied, it occurred in 0 of 40.
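A reproduction along these lines might look like the sketch below. It is not the exact script behind the numbers above: the barrier, the thread count constant, and the error list are assumptions made for illustration. It needs to run in a fresh process so that no earlier code has already compiled Template.pattern, and on a build with the GIL enabled, or on a version without the lazy compilation, it is expected to report no errors.

```python
import threading
from string import Template

N_THREADS = 16  # matches the thread count quoted above
barrier = threading.Barrier(N_THREADS)
errors = []  # ValueErrors observed by the worker threads


def first_use():
    # Release all threads at once so their first uses overlap.
    barrier.wait()
    try:
        Template("$x").substitute(x="ok")
    except ValueError as exc:
        errors.append(exc)


threads = [threading.Thread(target=first_use) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(errors)} of {N_THREADS} threads raised ValueError")
```

Running the script repeatedly, each time in a new interpreter process, gives a per-process hit rate comparable to the one reported here.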
There is a deterministic, single-threaded form of the same bug: a subclass that supplies an already-compiled pattern raises the same ValueError at class-definition time, via __init_subclass__ -> _compile_pattern() -> re.compile(compiled, flags):
import re
from string import Template

class T(Template):
    pattern = re.compile(r'\$(?:(?P<escaped>\$)|(?P<named>[a-z]+)|\{(?P<braced>[a-z]+)\}|(?P<invalid>))')
# ValueError: cannot process flags argument with a compiled pattern
Fix
Return the already-compiled pattern from _compile_pattern() when the stored pattern is already an re.Pattern, instead of trying to recompile it. This is idempotent and lock-free (the racing threads produce equivalent compiled patterns; last writer wins). It also fixes the deterministic subclass case above.
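The shape of the guard can be sketched with a standalone helper. This is not the actual patch: compile_pattern_once is a hypothetical stand-in for the tail of _compile_pattern(), written only to show the early return.

```python
import re


def compile_pattern_once(pattern, flags):
    """Hypothetical stand-in for the end of Template._compile_pattern()."""
    # If another thread (or a subclass author) already supplied a compiled
    # pattern, return it as-is instead of calling re.compile() with flags.
    if isinstance(pattern, re.Pattern):
        return pattern
    return re.compile(pattern, flags | re.VERBOSE)


# The first call compiles from source text; a second call that sees the
# cached result returns it unchanged instead of raising ValueError.
first = compile_pattern_once(r"\$(?P<named>[a-z]+)", re.IGNORECASE)
second = compile_pattern_once(first, re.IGNORECASE)
print(second is first)
```

Because the guard only reads the stored value and returns it, it needs no lock: a thread that loses the race simply picks up the winner's compiled pattern.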
Linked PRs