Redact query/fragment from URLs sent to model and logs#45
Redact query/fragment from URLs sent to model and logs#45taarikashenafi wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
⚠️ Not ready to approve
Current logging/model context can still leak sensitive query/fragment data (e.g., page URL not redacted and raw Error objects may still contain full URLs), and the new redaction paths in alt-text-quality lack targeted test coverage.
Pull request overview
This PR introduces a shared URL-redaction helper to prevent URL query strings/fragments (e.g., signed CDN tokens) from being sent to the model context or written into CI logs, and applies it to the alt-text-quality rule’s link context and error logging.
Changes:
- Add
redactUrl()utility to strip query strings and fragments with a safe fallback for non-parseable URLs. - Update
alt-text-qualityto redact linkhrefvalues and the image URL interpolated intoconsole.errormessages. - Add unit tests for
redactUrl().
File summaries
| File | Description |
|---|---|
src/utils/redact-url.ts |
Adds the reusable URL redaction helper. |
src/rules/alt-text-quality.ts |
Uses redactUrl() for link href context and error-log URL interpolation. |
tests/unit/redact-url.test.ts |
Adds unit coverage for redactUrl() behavior. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 4
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } catch (err) { | ||
| console.error(`[alt-text-quality] failed to load ${resolved}:`, err) | ||
| console.error(`[alt-text-quality] failed to load ${redactUrl(resolved)}:`, err) | ||
| continue | ||
| } |
| } catch (err) { | ||
| console.error(`[alt-text-quality] judge failed for ${resolved}:`, err) | ||
| console.error(`[alt-text-quality] judge failed for ${redactUrl(resolved)}:`, err) | ||
| continue | ||
| } |
| if (image.sectionHeading) parts.push(`Nearest heading above the image: ${JSON.stringify(image.sectionHeading)}`) | ||
| parts.push(`Image HTML: ${sanitizeImageHtml(image.outerHTML)}`) | ||
| if (image.inLink) parts.push(`The image is inside a link with href="${image.inLink.href}".`) | ||
| if (image.inLink) parts.push(`The image is inside a link with href="${redactUrl(image.inLink.href)}".`) | ||
| if (image.inButton) parts.push('The image is inside a button (or role="button" element).') |
| const parts: string[] = [`Page URL: ${pageUrl}`] | ||
| if (image.pageTitle) parts.push(`Page title: ${JSON.stringify(image.pageTitle)}`) | ||
| if (image.sectionHeading) parts.push(`Nearest heading above the image: ${JSON.stringify(image.sectionHeading)}`) | ||
| parts.push(`Image HTML: ${sanitizeImageHtml(image.outerHTML)}`) | ||
| if (image.inLink) parts.push(`The image is inside a link with href="${image.inLink.href}".`) | ||
| if (image.inLink) parts.push(`The image is inside a link with href="${redactUrl(image.inLink.href)}".`) |
9b5b167 to
5ae4bc1
Compare
This function helps in sanitizing URLs by removing sensitive query strings and fragments.
Updated error logging to redact URLs for improved privacy.
5125e11 to
5bfdaa3
Compare
| }) | ||
| } catch (err) { | ||
| console.error(`[alt-text-quality] judge failed for ${resolved}:`, err) | ||
| const msg = (err instanceof Error ? err.message : String(err)).replace(/([?#])[^\s]*/g, '$1…') |
There was a problem hiding this comment.
I suggest leaving a comment here about why this regex is here, we probably won't recall down the line :)
There was a problem hiding this comment.
even better, I wonder if using redactUrl directly on the String cast would work? 🤔
Follow-up to #38.
Builds on the bot's href redaction catch and extends it: pulls the href plus the two console.error image-URL logs in evaluate() (which leak the same querystring data into CI logs) into one reusable redactUrl() helper. Keeps the original link/button wording, the rewording in the bot's inline suggestion is a separate concern.