fix: correct suffix boundary lookup for prefixed last names (#100) by derek73 · Pull Request #179 · derek73/python-nameparser

derek73 · 2026-06-29T20:43:05Z

Summary

Fixes #100 ("Strange parsing of name w lastname prefix and title before and after"): "dr Vincent van Gogh dr" produced a corrupted middle name (" dr Vincent van") because pieces.index(stop_at) searched from position 0 and matched the leading "dr" (a title that is also a suffix acronym) instead of the trailing one.
One-line fix: passes i + 1 as the start argument, pieces.index(stop_at, i + 1), which makes the lookup consistent with the sibling next_prefix lookup just above it, which was already correct.
Adds a regression guard for #108 ("MemoryError for a name with a lot of prefixes"): many repeated prefixes must not exhaust memory. That blow-up was already fixed by a prior refactor; the test ensures it cannot silently come back.
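
The lookup difference described above can be illustrated with a plain list. This is a minimal sketch, not the library's code: the names pieces, i, and stop_at mirror the description, and the token list is a hand-written assumption about how the name is split.

```python
# Hand-written token list (an assumption); not produced by nameparser itself.
pieces = ["dr", "Vincent", "van", "Gogh", "dr"]
i = pieces.index("van")  # position of the prefix being joined
stop_at = "dr"           # the suffix token that follows the prefix

# list.index(value) searches from position 0, so it finds the leading "dr".
buggy_stop = pieces.index(stop_at)

# list.index(value, start) searches only from `start` onward,
# so it finds the trailing "dr".
fixed_stop = pieces.index(stop_at, i + 1)

print(buggy_stop, pieces[i:buggy_stop])  # 0 []
print(fixed_stop, pieces[i:fixed_stop])  # 4 ['van', 'Gogh']
```

With the unconstrained lookup the stop position lands before the prefix, so the slice between them is empty. With the start argument the slice covers exactly the tokens that belong to the last name.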

Test Plan

test_title_before_and_after_prefixed_last_name: asserts the agreed output for #100 ("Strange parsing of name w lastname prefix and title before and after"), namely title="dr", first="Vincent", middle="", last="van Gogh", suffix="dr"
test_many_repeated_prefixes_does_not_blow_up — parses "Jan van der … Berg" (30× prefix) without hanging or raising
Full suite: 821 passed, 22 xfailed, 0 failed (up from 817 baseline)
No regressions to test_prefix_is_first_name (Van Johnson), test_portuguese_prefixes, test_portuguese_dos, test_prefix_before_two_part_last_name_with_acronym_suffix
mypy and ruff clean

Out of scope

Issues #121 and #132 were evaluated and excluded — they are irreducible ambiguities that collide with real names, not corruption bugs.

🤖 Generated with Claude Code

The prefix-joining loop located the suffix stop boundary with a value-based pieces.index() that searched from position 0. When a token value repeated (a trailing title that is also a suffix acronym, e.g. the second 'dr' in 'dr Vincent van Gogh dr'), it matched the leading occurrence, producing an empty slice that duplicated pieces and corrupted the middle name. Constrain the lookup to start at i + 1, consistent with the sibling next_prefix lookup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
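
How an empty slice can end up duplicating tokens is easier to see with a toy splice. The join_prefix helper below is hypothetical and much simpler than the real loop in nameparser; it only assumes that the joined token replaces pieces[i:stop] and that the remainder is taken from pieces[stop:].

```python
def join_prefix(pieces, i, stop):
    """Replace pieces[i:stop] with one space-joined token (hypothetical helper)."""
    return pieces[:i] + [" ".join(pieces[i:stop])] + pieces[stop:]


pieces = ["dr", "Vincent", "van", "Gogh", "dr"]
i = 2  # index of the prefix "van"

# Unconstrained lookup: stop == 0, so pieces[i:stop] is empty and
# pieces[stop:] re-appends the whole list after the first i tokens.
bad = join_prefix(pieces, i, pieces.index("dr"))

# Constrained lookup: stop == 4, so only "van" and "Gogh" are joined.
good = join_prefix(pieces, i, pieces.index("dr", i + 1))

print(bad)   # ['dr', 'Vincent', '', 'dr', 'Vincent', 'van', 'Gogh', 'dr']
print(good)  # ['dr', 'Vincent', 'van Gogh', 'dr']
```

In this sketch the bad result contains "dr" and "Vincent" twice, which matches the kind of corruption reported in #100, where extra tokens leaked into the middle name.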

- Fix inline comment in join_on_conjunctions: clarify that filter() finds the value in pieces[i+1:] but index() searches from 0 by default, and drop the misleading "title" framing (the token only needs to satisfy is_suffix, not is_title)
- Add test for two-word prefix collision ("van der"), which has a different loop iteration count than the single-word case
- Add test with a genuine middle name alongside the repeated token, since the pre-fix bug corrupted the middle field specifically
- Add @pytest.mark.timeout(2) to the #108 guard so the timeout is enforced locally and in CI, not just by CI job limits
- Assert hn.last contains "Berg" in the #108 guard to catch silent last-name corruption
- Add pytest-timeout dev dependency
- Resolve pre-existing stash conflict in docs/resources.rst (keep upstream)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

derek73 and others added 3 commits June 29, 2026 13:40

test: guard against #108 exponential blow-up on repeated prefixes

7eb356d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

derek73 self-assigned this Jun 29, 2026

derek73 added this to the v1.3.0 milestone Jun 29, 2026

derek73 merged commit 8cb62a9 into master Jun 29, 2026
8 checks passed

derek73 deleted the fix/prefix-suffix-lookup-issue-100 branch June 29, 2026 21:04

derek73 mentioned this pull request Jun 29, 2026

Strange parsing of name w lastname prefix and title before and after #100

Closed
