fix(git): normalize git_log schema across filtered and unfiltered branches #4470
Open
Sagargupta16 wants to merge 1 commit into
…nches

git_log emitted two different string shapes depending on whether start_timestamp or end_timestamp was passed:

- Unfiltered branch used commit.hexsha!r / commit.author!r, producing repr()-quoted values with leading single quotes and Actor angle brackets, and included the full commit body via commit.message.
- Filtered branch used git log --format=%H%n%an%n%ad%n%s%n, producing raw unquoted values, and dropped the commit body entirely (%s is subject only).

Downstream parsers that split on "\n" and ":" have to special-case which branch produced the entry. Reported in modelcontextprotocol#4469.

Collapse both branches onto repo.iter_commits() with since/until kwargs, so there is exactly one code path. Drop the repr() formatting so both branches emit raw values (bare commit hash, author name/email, full message).

Add a regression test asserting the two branches produce the same key order and neither emits repr artefacts.

Closes modelcontextprotocol#4469
51255b1 to bd67660
Summary
Closes #4469.
`git_log` returned two different string shapes depending on whether `start_timestamp`/`end_timestamp` was passed:
Before (unfiltered branch, server.py:187-197): used `commit.hexsha!r` / `commit.author!r`, which produced repr-quoted values:
```
Commit: 'a1b2c3d4e5f6...'
Author: <git.Actor "Name ">
Date: 2026-07-01 12:00:00+00:00
Message: 'subject\n\nbody\n'
```
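The quoting comes from the `!r` conversion, which applies `repr()` to the value before it is formatted. A minimal illustration with plain strings follows; the values are placeholders chosen for this sketch, not output captured from the server.

```python
# Placeholder values for illustration only.
hexsha = "a1b2c3d4"
message = "subject\n\nbody\n"

# `!r` calls repr(), so strings gain quotes and newlines are escaped.
with_repr = f"Commit: {hexsha!r}\nMessage: {message!r}"

# Without `!r`, str() is used and the raw value is emitted unchanged.
without_repr = f"Commit: {hexsha}\nMessage: {message}"

print(with_repr)
print(without_repr)
```

The first string carries quoted, escaped values; the second carries the bare hash and a message with real newlines.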
Before (filtered branch, server.py:180-185): used `git log --format=%H%n%an%n%ad%n%s%n` and split the output, producing raw values but losing the commit body (`%s` is subject-only):
```
Commit: a1b2c3d4e5f6...
Author: Name
Date: Wed Jul 1 12:00:00 2026 +0000
Message: subject
```
Downstream parsers that split on `\n` and `:` had to special-case which branch produced the entry.
Fix
Collapse both branches onto a single `repo.iter_commits(max_count=..., since=..., until=...)` call, so there is exactly one code path — schema drift is impossible by construction. Also drop the `!r` formatting so both filtered and unfiltered entries emit raw values (bare commit hash, plain author `Name `, full `commit.message` including body).
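A rough sketch of what a single code path can look like, not the actual diff. The helper names `build_log_kwargs` and `format_commit`, the default `max_count`, and the exact author line format are assumptions made for illustration; `repo` is expected to behave like a GitPython `Repo`, whose `iter_commits()` forwards `max_count`, `since`, and `until` to `git rev-list`.

```python
def build_log_kwargs(max_count, start_timestamp=None, end_timestamp=None):
    """Build keyword arguments for one repo.iter_commits() call (hypothetical helper)."""
    kwargs = {"max_count": max_count}
    if start_timestamp:
        kwargs["since"] = start_timestamp
    if end_timestamp:
        kwargs["until"] = end_timestamp
    return kwargs


def format_commit(commit):
    """Format one commit with raw values, i.e. no `!r` conversions."""
    # Assumption: the author is rendered as "name <email>".
    author = f"{commit.author.name} <{commit.author.email}>"
    return (
        f"Commit: {commit.hexsha}\n"
        f"Author: {author}\n"
        f"Date: {commit.authored_datetime}\n"
        f"Message: {commit.message}\n"
    )


def git_log(repo, max_count=10, start_timestamp=None, end_timestamp=None):
    """Return formatted entries; filtered and unfiltered calls share this path."""
    kwargs = build_log_kwargs(max_count, start_timestamp, end_timestamp)
    return [format_commit(commit) for commit in repo.iter_commits(**kwargs)]
```

Because the filter arguments only change the keyword arguments and never the formatting function, both kinds of call produce entries with the same fields.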
The flag-injection defense (`start_timestamp.startswith("-")`) is preserved and now applies uniformly whether or not the fast path is taken.
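The guard can be pictured as a small check applied to each timestamp before it reaches git. This is a sketch only: the function name `reject_flag_like` and the choice to raise `ValueError` are assumptions, and the server's real error handling may differ.

```python
def reject_flag_like(value, name):
    """Raise ValueError if a user-supplied value could be read as a git option."""
    if value is not None and value.startswith("-"):
        raise ValueError(f"{name} must not start with '-': {value!r}")
    return value


# Ordinary timestamps pass through unchanged.
start = reject_flag_like("2026-01-01", "start_timestamp")

# A value shaped like an option is refused before any git call is made.
try:
    reject_flag_like("--output=/tmp/x", "start_timestamp")
except ValueError as exc:
    print(exc)
```

Running the check ahead of the branch-independent `iter_commits()` call is what lets it cover every request in the same way.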
Tests
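Added `test_git_log_schema_matches_across_filter_branches`, which asserts that the filtered and unfiltered branches produce the same key order and that neither emits repr artefacts.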
All existing `git_log` tests pass; the unrelated `test_validate_repo_path_symlink_escape` failure on Windows is pre-existing on upstream (verified via stash).
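The two properties such a regression test is concerned with (same key order, no repr artefacts) can be illustrated on plain strings. This is not the test from the PR; `entry_keys`, `has_repr_artefacts`, and the sample entries are hypothetical.

```python
def entry_keys(entry):
    """Return the field names of a 'Key: value' entry, in order."""
    keys = []
    for line in entry.split("\n"):
        key, sep, _ = line.partition(": ")
        if not sep:
            continue
        keys.append(key)
        if key == "Message":
            break  # everything after this line belongs to the commit body
    return keys


def has_repr_artefacts(entry):
    """Detect repr-style values: a leading quote or a '<git.Actor' prefix."""
    for line in entry.split("\n"):
        _, sep, value = line.partition(": ")
        if sep and (value.startswith("'") or value.startswith("<git.Actor")):
            return True
    return False


# Hand-written sample entries standing in for the two kinds of call.
unfiltered = "Commit: a1b2\nAuthor: Name\nDate: 2026-07-01 12:00:00+00:00\nMessage: subject\n\nbody\n"
filtered = "Commit: c3d4\nAuthor: Name\nDate: 2026-07-02 09:30:00+00:00\nMessage: other\n"

assert entry_keys(unfiltered) == entry_keys(filtered)
assert not has_repr_artefacts(unfiltered)
assert not has_repr_artefacts(filtered)
```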
Compatibility note
The
Date:line for the unfiltered branch now usescommit.authored_datetime(already the case before), and the filtered branch's format string is gone. If any external consumer was pattern-matching on the raw git-format date (`Wed Jul 1 12:00:00 2026 +0000`), they now get the ISO-style datetime for both branches, which is the consistent shape #4469 asks for.
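For a consumer that still holds dates in the old git format, one possible conversion to the ISO-style shape uses only the standard library. The helper name `git_date_to_iso_style` is hypothetical, and the sketch assumes English day and month names (the default C locale).

```python
from datetime import datetime

# Matches git's default date layout, e.g. "Wed Jul 1 12:00:00 2026 +0000".
GIT_DEFAULT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def git_date_to_iso_style(raw):
    """Convert a git default-format date into str(datetime) form."""
    return str(datetime.strptime(raw, GIT_DEFAULT_DATE_FORMAT))


print(git_date_to_iso_style("Wed Jul 1 12:00:00 2026 +0000"))
# prints: 2026-07-01 12:00:00+00:00
```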