
Validate Mcp-Param-* headers server-side on the 2026-07-28 HTTP path (SEP-2243)#3033

Merged
maxisbey merged 5 commits into
mainfrom
mcp-param-validation
Jun 30, 2026

@maxisbey

Contributor

Implements the server half of SEP-2243 custom headers for the 2026-07-28 Streamable HTTP path: each tools/call request has its Mcp-Param-* headers validated against the request body before dispatch, and mismatches are rejected with HTTP 400 and JSON-RPC error -32020 (HeaderMismatch). The client half (mirroring, sentinel codec, annotation validation) already shipped; this closes the remaining server-side gap and removes the http-custom-header-server-validation entries from both conformance expected-failures baselines.

Refs #2900, #2715.

Motivation and Context

The draft transport spec requires it (Server Validation): "Any server that processes the message body MUST validate that encoded header values, after decoding if Base64-encoded, match the corresponding values in the request body. Servers MUST reject requests with a 400 Bad Request HTTP status and JSON-RPC error code -32020 (HeaderMismatch) if any validation fails." Until now the modern entry validated MCP-Protocol-Version/Mcp-Method/Mcp-Name only; the http-custom-header-server-validation conformance scenario ran 3 accept-pass / 6 reject-fail and was baselined as a known failure.
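
As a rough illustration of the rejection shape the quoted requirement describes, the sketch below builds an HTTP 400 with a JSON-RPC error body carrying -32020. The function name, the `detail` message text, and the tuple return shape are assumptions made for this example, not the SDK's actual helpers.

```python
import json

HEADER_MISMATCH = -32020  # JSON-RPC code quoted in the spec text above


def header_mismatch_response(request_id, detail):
    """Return (status, headers, body) for a rejected request.

    Hypothetical helper: the real entry writes its rejection through its
    own machinery; this only shows the wire shape.
    """
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": HEADER_MISMATCH, "message": detail},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = [(b"content-type", b"application/json")]
    return 400, headers, body


status, headers, body = header_mismatch_response(7, "Mcp-Param-Region does not match body")
```

The rejection is plain application/json rather than an SSE event, which is why validation has to run before any stream is committed.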

Design

Schema source = the server's own registered tools/list handler. The validator needs the called tool's inputSchema to know which Mcp-Param-* headers are recognized (the header name comes from the x-mcp-header annotation value, not the property name, so without the schema there is nothing to compare). Rather than adding a registry hook or configuration, the entry resolves the schema by dispatching an internal tools/list through the normal serve_one path with the caller's own envelope. Properties of this approach:

  • Nothing to configure on MCPServer or on low-level servers — anyone serving tools/list gets validation automatically.
  • A visibility-scoped catalog (per-caller tool filtering) validates against exactly what this caller was advertised; a tool hidden from the caller was never advertised, so its headers are unrecognized and ignored.
  • Validation is skipped — never failing the call — when: no tools/list handler is registered (an undiscoverable catalog has no recognized headers), the tool isn't in the listing (dispatch owns unknown-tool), the handler raises (logged at exception level — a deliberate availability-over-strictness call: a server broken here is equally broken for real discovery, and the skip is just the pre-feature status quo for that request), the pagination walk exceeds a 100-page cap or repeats a cursor (logged), or the call has no arguments and no Mcp-Param-* headers (no declaration can be violated in either direction).
  • The cost is an internal listing per validated tools/call: middleware, lifespans, and expensive/paginated tools/list handlers see extra invocations. Documented in docs/migration.md; optimizable later behind the same surface (e.g. a registry fast-path for the built-in handler) without API change.
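
The skip conditions listed above can be sketched as a bounded pagination walk. This is a minimal sketch under stated assumptions: `list_page` is a hypothetical coroutine standing in for the internal tools/list dispatch, pages are plain dicts shaped like a tools/list result, and `PAGE_CAP` is an illustrative name for the 100-page cap.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)
PAGE_CAP = 100  # cap described above; the constant name is illustrative


async def find_input_schema(list_page, tool_name, page_cap=PAGE_CAP):
    """Return the tool's inputSchema, or None to signal "skip validation".

    `list_page(cursor)` is a hypothetical coroutine returning
    {"tools": [...], "nextCursor": str | None}.
    """
    cursor = None
    seen = set()
    for _ in range(page_cap):
        try:
            page = await list_page(cursor)
        except Exception:
            # Fail open: a broken listing must not fail the call itself.
            logger.exception("tools/list failed; skipping Mcp-Param validation")
            return None
        for tool in page.get("tools", []):
            if tool.get("name") == tool_name:
                return tool.get("inputSchema")  # stop at the first match
        cursor = page.get("nextCursor")
        if cursor is None:
            return None  # tool not advertised; dispatch owns unknown-tool
        if cursor in seen:
            logger.warning("tools/list repeated cursor %r; skipping validation", cursor)
            return None
        seen.add(cursor)
    logger.warning("tools/list exceeded %d pages; skipping validation", page_cap)
    return None


async def _demo_page(cursor):
    # Single-page catalog used only for the demonstration call below.
    return {"tools": [{"name": "echo", "inputSchema": {"type": "object"}}], "nextCursor": None}


demo_schema = asyncio.run(find_input_schema(_demo_page, "echo"))
```

Every exit other than a found tool returns None, so the caller has a single "skip" signal regardless of which condition triggered it.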

Validation semantics (pure, exported mcp.shared.inbound.validate_mcp_param_headers, sharing the schema walker and scalar rendering with the client emit side so the two halves cannot drift):

  • Presence per the spec's scenario table: argument present → header must be present and match after sentinel decoding; argument null/absent → no header is expected. A header present for an absent/null argument is rejected: the spec's table doesn't pin this cell, but its purpose clause ("a load balancer routing on the header value while the MCP server executes based on the body value") is exactly this case. The Go SDK rejects too, while the TypeScript SDK skips.
  • A recognized header supplied more than once is rejected: intermediaries typically read the first copy while a last-wins lookup would validate the second, recreating the divergence the gate exists to prevent.
  • Strict canonical base64 in the =?base64?...?= sentinel: wrong padding, non-alphabet characters, non-zero trailing bits, or invalid UTF-8 are malformed (the conformance suite mandates strict rejection; Python's default lenient b64decode would wrongly accept two of its reject cases). This now applies to Mcp-Name decoding symmetrically.
  • Integer-typed declarations compare numerically (42 matches 42.0, the spec's SHOULD) behind a canonical-decimal gate (no 1e2), exactly (no float round-trip), and in both directions (an integral-float body value like 42.0 matches).
  • Unrecognized Mcp-Param-* headers are forwarded-and-ignored per the spec; a header claiming a non-primitive body value is a mismatch, while a non-primitive value without a header is left to params validation at dispatch.
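
To make the strict-canonical rule concrete, here is a minimal sketch of a sentinel decoder that rejects the four malformed cases named above. It is not the SDK's `decode_header_value`; the function name is hypothetical, and passing non-sentinel values through unchanged is an assumption of this example.

```python
import base64
import binascii

_PREFIX = "=?base64?"
_SUFFIX = "?="


def decode_sentinel(value):
    """Decode a header value; raise ValueError for a malformed sentinel payload."""
    is_sentinel = (
        value.startswith(_PREFIX)
        and value.endswith(_SUFFIX)
        and len(value) >= len(_PREFIX) + len(_SUFFIX)
    )
    if not is_sentinel:
        return value  # assumption: plain values pass through unchanged
    payload = value[len(_PREFIX):-len(_SUFFIX)]
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("malformed base64 in sentinel") from exc
    # Canonical check: re-encoding must reproduce the input exactly, which
    # catches non-zero trailing bits and irregular padding.
    if base64.b64encode(raw).decode("ascii") != payload:
        raise ValueError("non-canonical base64 in sentinel")
    # UnicodeDecodeError is a ValueError subclass, so invalid UTF-8 also
    # surfaces as ValueError.
    return raw.decode("utf-8")
```

The re-encode comparison is one simple way to get canonical strictness on top of the standard library; a lenient `b64decode` alone would accept `aGl=` as "hi" even though its trailing bits are non-zero.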

How Has This Been Tested?

  • Conformance (pinned referee conformance@4944b268, locally against the everything-server): http-custom-header-server-validation 9/9 (was 3/9); http-header-validation 13/13 (unchanged — guards the shared decoder hardening); server-stateless 25/28 with exactly the 3 pre-existing known failures.
  • Live wire probes against the running everything-server: accept (plain, sentinel, mixed-case header name, undeclared-header-ignored, both-sides-empty) and reject (mismatch, missing header, bad padding, orphan header) all return the documented status/code/message; a legacy 2025 initialize on the same endpoint is untouched.
  • ./scripts/test: 4750 passed, 100% branch coverage, strict-no-cover clean. New unit matrix in tests/shared/test_inbound.py mirrors every conformance leg plus the edge semantics (duplicates, huge-digit headers, integral floats, nested paths, invalid-annotation schemas); entry-level wire tests in tests/server/test_streamable_http_modern.py cover skip paths (no list handler, unlisted tool, raising handler, cursor cycle, page cap), pagination, SSE-mode 400-as-JSON, and envelope isolation of the internal listing.

Breaking Changes

Spec-mandated behavior change on the 2026-07-28 path only (documented in docs/migration.md): a client that sends an annotated argument without its mirroring header — e.g. one that calls a tool it never listed — is now rejected with 400/-32020 instead of silently served. The spec's recovery is to re-list and retry (a client-side SHOULD this SDK does not implement yet; the TypeScript SDK does). Base64-sentinel decoding, including for Mcp-Name, is now strictly canonical. Legacy (≤2025-11-25) traffic is unaffected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

The orphan-header cell (header present, argument absent) is genuinely unpinned by the spec's table and the SDKs split on it (Go rejects, TypeScript skips); this PR takes the reject posture with the purpose-clause rationale above. May be worth a small spec clarification upstream.
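
The four presence cells, including the orphan-header cell this PR resolves as a rejection, can be written down as a small decision function. This is an illustrative sketch only: the names are hypothetical, and `None` stands for an absent header or an absent/null argument.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    COMPARE = "compare"  # both present: values must still match after decoding
    REJECT = "reject"


def presence_verdict(argument, header):
    """Classify one declared header against its body argument."""
    if argument is not None and header is not None:
        return Verdict.COMPARE
    if argument is not None:
        return Verdict.REJECT  # annotated argument sent without its header
    if header is not None:
        return Verdict.REJECT  # orphan header: the cell the spec table leaves open
    return Verdict.ACCEPT  # neither side present
```

An SDK taking the skip posture would return `Verdict.ACCEPT` in the orphan branch instead, which is the only cell where the two postures differ.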

AI Disclaimer

…(SEP-2243)

Servers that process the request body MUST validate that Mcp-Param-*
headers match the corresponding x-mcp-header-annotated body arguments,
rejecting mismatches with HTTP 400 and -32020 (HeaderMismatch). The
client half shipped earlier; this adds the server half:

- mcp.shared.inbound grows a pure, exported validate_mcp_param_headers
  built on the same walker and scalar rendering the client emit side
  uses, so the two halves of the mirror contract cannot drift. Presence
  rules follow the spec's scenario table; recognized headers supplied
  more than once are rejected (first-wins consumers vs last-wins
  validation would otherwise disagree); integer values compare
  numerically behind a canonical-decimal gate, exactly and in both
  directions; unrecognized headers stay ignored.
- decode_header_value now requires canonical base64 in the sentinel
  (bad padding, stray characters, non-zero trailing bits, or invalid
  UTF-8 are malformed), which the conformance suite mandates for
  Mcp-Param-* and which now applies to Mcp-Name symmetrically.
- The modern HTTP entry validates tools/call pre-dispatch, resolving
  the called tool's inputSchema through the server's own registered
  tools/list handler via the normal serve_one path with the caller's
  envelope - so a visibility-scoped catalog validates exactly what
  this caller was advertised, with nothing to configure on MCPServer
  or lowlevel servers. Validation is skipped (never failing the
  call) when no tools/list handler is registered, the tool is not
  advertised, the handler raises (logged), pagination exceeds a page
  cap or cycles, or the call has no arguments and no Mcp-Param-*
  headers.
- Remove http-custom-header-server-validation from both conformance
  expected-failures baselines; the scenario passes 9/9 against the
  everything-server, with http-header-validation and server-stateless
  unchanged.
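
The integer comparison described in this commit message can be sketched as follows. This is a hedged illustration, not `validate_mcp_param_headers`: the function name is hypothetical, and accepting an all-zero fraction such as `42.0` on the header side is an assumption about what "canonical decimal" permits. A later commit in this PR relaxes the outright rejection of non-canonical headers into a rendered-string comparison, which this sketch does not model.

```python
import re

# Assumption: canonical decimal = optional minus, no leading zeros,
# optional all-zero fraction. No exponent, sign prefix, or whitespace.
_CANONICAL = re.compile(r"-?(0|[1-9][0-9]*)(\.0+)?")


def integer_header_matches(header, body_value):
    """Compare a header string with a JSON number declared as integer."""
    if isinstance(body_value, bool) or not isinstance(body_value, (int, float)):
        return False  # bool is an int subclass in Python; exclude it explicitly
    if _CANONICAL.fullmatch(header) is None:
        return False  # e.g. "1e2", "042", "+5"
    if isinstance(body_value, float):
        if not body_value.is_integer():
            return False
        body_value = int(body_value)  # exact for integral floats
    try:
        # Parse as int rather than float so large values are compared exactly.
        header_int = int(header.split(".", 1)[0])
    except ValueError:
        return False  # digit string beyond the interpreter's int-parsing limit
    return header_int == body_value
```

Parsing through `int` instead of `float` is what keeps the comparison exact: `9007199254740993` survives as an integer but would be rounded by a float round-trip.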
@maxisbey maxisbey marked this pull request as ready for review June 30, 2026 16:43
@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

📚 Documentation preview

Preview https://pr-3033.mcp-python-docs.pages.dev
Deployment https://19084f94.mcp-python-docs.pages.dev
Commit e6d58ce
Triggered by @maxisbey
Updated 2026-06-30 20:03:42 UTC

@maxisbey maxisbey marked this pull request as draft June 30, 2026 16:47
@maxisbey maxisbey marked this pull request as ready for review June 30, 2026 16:48
Comment thread src/mcp/shared/inbound.py Outdated
Comment thread docs/migration.md Outdated
maxisbey added 3 commits June 30, 2026 19:43
…ate routing headers, parse-limit 500s

- A value's header rendering (or its absence) is now a single shared fact
  (_render_header_scalar returns None for non-primitives and integers
  beyond the int-to-str digit limit): the client omits such headers, the
  validator treats a header claiming such a value as a mismatch, and a
  huge-integer body can no longer raise out of the public validator.
- A non-canonical-decimal header for an integer declaration falls back to
  the rendered comparison instead of rejecting outright, so the client's
  own scientific-notation rendering of large integral floats (str(1e16)
  == '1e+16') round-trips against this server.
- Duplicated MCP-Protocol-Version / Mcp-Method / Mcp-Name raw header
  lines are rejected before classification (find_duplicated_routing_header)
  - the same first-wins/last-wins divergence the Mcp-Param duplicate
  rejection closes, where it matters most.
- The synthetic listing's fail-open boundary now covers the result scan
  (a middleware short-circuiting tools/list with a mis-shaped result is a
  logged skip, not a 500), and a mis-shaped envelope failing the listing's
  surface validation logs at debug as client fault instead of an
  error-level traceback blaming the tools/list handler.
- json.loads failures are caught as ValueError (an integer literal beyond
  the digit limit raises the bare parent, not JSONDecodeError), keeping
  unparseable bodies at 400 + PARSE_ERROR.
- migration.md: handler-raise skip is logged as an error, not a warning;
  document the omitted-unrenderable-value and duplicate-line rules.
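
The duplicate-line rule above operates on raw header lines, before any first-wins or last-wins lookup collapses them. A minimal sketch, assuming ASGI-style headers (a list of `(name, value)` byte pairs); the function below is a hypothetical stand-in and does not reproduce the signature of `find_duplicated_routing_header`.

```python
from collections import Counter

ROUTING_HEADERS = (b"mcp-protocol-version", b"mcp-method", b"mcp-name")


def find_duplicated_header(raw_headers, names=ROUTING_HEADERS):
    """Return the first name in `names` that appears on more than one raw line."""
    # Header names are case-insensitive, so normalise before counting.
    counts = Counter(name.lower() for name, _ in raw_headers)
    for name in names:
        if counts[name] > 1:
            return name
    return None


example = [
    (b"content-type", b"application/json"),
    (b"Mcp-Method", b"tools/call"),
    (b"mcp-method", b"tools/list"),
]
duplicated = find_duplicated_header(example)
```

Counting lines rather than reading a merged mapping matters because a dict-style lookup would already have hidden the second copy.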
…dler faults in the schema listing

json.loads raises RecursionError, not ValueError, for deeply nested
bodies - widen the parse guard so they stay 400 + PARSE_ERROR.

The synthetic tools/list now surface-validates its params before
dispatch: a mis-shaped envelope is caught up front (debug, client
fault), so a ValidationError escaping serve_one can only be
handler-origin and gets the error-level log the fail-open skip
promises.
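
The widened parse guard from these two commits can be sketched as below. The helper name and its tuple return shape are assumptions for illustration; -32700 is the standard JSON-RPC 2.0 parse-error code.

```python
import json

PARSE_ERROR = -32700  # standard JSON-RPC 2.0 parse-error code


def parse_body(raw):
    """Return (obj, None) on success or (None, PARSE_ERROR) on any parse failure."""
    try:
        return json.loads(raw), None
    except (ValueError, RecursionError):
        # ValueError covers json.JSONDecodeError (its subclass), undecodable
        # bytes, and over-long integer literals on interpreters that enforce
        # a digit limit. RecursionError covers deeply nested bodies.
        return None, PARSE_ERROR
```

Catching only `json.JSONDecodeError` would let the other two failure modes escape as unhandled exceptions, which is how an unparseable body could turn into a 500 instead of a 400.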
Comment on lines +395 to 399
mcp_param_rejection = await _mcp_param_rejection(app, request, req, verdict, lifespan_state)
if mcp_param_rejection is not None:
    await _write_rejection(mcp_param_rejection, req.id, scope, receive, send)
    return

Contributor

🟡 In SSE mode (json_response=False), the new pre-dispatch Mcp-Param validation phase runs before the SSE deferral/keepalive machinery, so a 2026-07-28 tools/call writes no bytes to the wire while the internal tools/list schema walk runs (up to 100 paginated serve_one round trips). A deployment whose tools/list handler is slower than the upstream proxy's idle-read timeout previously worked (the keepalive committed within 15s of dispatch) but would now have every validated tools/call reset before dispatch — consider bounding the schema-resolving walk with a timeout that degrades to the existing logged fail-open skip.

Extended reasoning...

What happens. _mcp_param_rejection is awaited in handle_modern_request (src/mcp/server/_streamable_http_modern.py:395-399) before the SSE branch is reached — before send_ch/run_handler are constructed and before the anyio.move_on_after(_SSE_PING_INTERVAL) deferral windows exist. For a 2026-07-28 tools/call with non-empty arguments or any Mcp-Param-* header (the gate only skips when both are absent — it does not depend on the tool actually carrying x-mcp-header annotations), the entry awaits _tool_input_schema, which dispatches an internal tools/list through serve_one for up to _MCP_PARAM_LIST_PAGE_CAP = 100 pages, each through middleware and the user handler with a fresh Connection. Nothing is written to the wire while that walk runs.

Why this is a coverage regression rather than just extra cost. The module's own docstring states the SSE deferral exists so that "a handler that runs silent past the window commits SSE so the keepalive ping can keep the connection open behind a proxy idle-read timeout" — i.e. the design explicitly bounds the silent window at _SSE_PING_INTERVAL (15s) precisely because slow work behind proxy idle timeouts is an acknowledged deployment reality. The new validation phase sits entirely outside that guarantee: the maximum silent time before the first byte grows from ~15s to (full listing duration + 15s). docs/migration.md documents that "middleware and an expensive or paginated tools/list handler see extra invocations" — the cost — but not the loss of keepalive coverage.

Concrete walkthrough. 1) An SSE-mode (default json_response=False) deployment sits behind a proxy with a 60s idle-read timeout, and its tools/list handler walks a slow paginated catalog backend taking ~90s. 2) Pre-PR: a tools/call is dispatched immediately; within 15s the entry either replies or commits text/event-stream and starts : ping keepalives, so the proxy never sees 60s of silence — the call succeeds. 3) Post-PR: the same tools/call (it has arguments, so the gate fires) first runs the internal tools/list walk; the connection is byte-silent for ~90s; the proxy resets it at 60s; the request never reaches dispatch. Every validated tools/call to that deployment now fails the same way, and the failure is a connection reset rather than a diagnosable JSON-RPC error.

Why it is narrow. The trigger requires SSE mode plus a tools/list handler (or paginated catalog walk) slower than the proxy idle-read timeout — typically 30-60s, which is unusual; most catalogs list from memory in milliseconds, and the walk stops as soon as the called tool is found. JSON-mode deployments were never protected by a keepalive, so they are unchanged. The placement before SSE is also partly forced by the spec: a HEADER_MISMATCH rejection MUST be a plain application/json 400, which cannot be honored after SSE has committed, so simply moving the validation under the keepalive machinery is not a drop-in fix.

Suggested fix. Bound the schema-resolving walk with a wall-clock timeout (e.g. wrap the _tool_input_schema call in anyio.move_on_after(...)) that degrades to the existing logged fail-open skip — the same availability-over-strictness posture already taken for a raising handler, cursor cycle, or page cap. Alternatively, document the limitation in docs/migration.md alongside the extra-invocations note, or use it as motivation for the registry fast-path the PR description already anticipates.
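
A minimal sketch of the suggested bound, with loud assumptions: it uses the standard library's `asyncio.wait_for` as a stand-in for `anyio.move_on_after` so the example has no third-party dependency, and `resolve` is a hypothetical zero-argument coroutine function standing in for the `_tool_input_schema` call. The timeout value is illustrative, not a recommendation.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def resolve_schema_bounded(resolve, timeout_s):
    """Await `resolve()`; return None (skip validation) if it overruns."""
    try:
        return await asyncio.wait_for(resolve(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Same fail-open posture as a raising handler, cursor cycle, or page cap.
        logger.warning("schema listing exceeded %.1fs; skipping validation", timeout_s)
        return None


async def _quick_listing():
    # Stand-in for a catalog that lists from memory.
    return {"type": "object"}


bounded_schema = asyncio.run(resolve_schema_bounded(_quick_listing, 5.0))
```

Choosing a bound below the proxy's idle-read timeout would keep the silent window finite, at the price of skipping validation for the slowest catalogs.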

@maxisbey maxisbey merged commit 48ef569 into main Jun 30, 2026
40 checks passed
@maxisbey maxisbey deleted the mcp-param-validation branch June 30, 2026 20:39