Skip to content

Implement copilot quality alt#38

Open
kzhou314 wants to merge 21 commits into
mainfrom
implement-copilot-quality-alt
Open

Implement copilot quality alt#38
kzhou314 wants to merge 21 commits into
mainfrom
implement-copilot-quality-alt

Conversation

@kzhou314

@kzhou314 kzhou314 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Closes github/accessibility#10635

Adds an opt-in, model-backed alt-text-quality rule that judges whether an image's alt text actually describes the image — a gap the deterministic rules can't reach.

Design & rationale: ADR

GitHub Advanced Security started work on behalf of kzhou314 June 18, 2026 23:19 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 18, 2026 23:21
GitHub Advanced Security started work on behalf of kzhou314 June 18, 2026 23:39 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 18, 2026 23:40
GitHub Advanced Security started work on behalf of kzhou314 June 19, 2026 00:01 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 19, 2026 00:03
GitHub Advanced Security started work on behalf of kzhou314 June 19, 2026 00:29 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 19, 2026 00:31
@kzhou314 kzhou314 marked this pull request as ready for review June 19, 2026 00:38
@kzhou314 kzhou314 requested a review from a team as a code owner June 19, 2026 00:38

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

There are correctness and reliability issues in new code paths (notably link-context wording in prompts and loadImageAsDataUrl handling of data: URLs / timeouts) that should be addressed before merging.

Pull request overview

Adds a new opt-in, model-backed alt-text-quality rule to evaluate whether an image’s alt text actually describes the image (using GitHub Models, optionally enriched by an Azure AI Vision pre-pass), and extends image extraction to capture additional context for higher-quality judging.

Changes:

  • Introduces the async, opt-in alt-text-quality rule plus a new judge layer (Copilot judge, optional Azure-augmented judge, caching).
  • Enhances extractImages and ImageRecord with intrinsic dimensions and page/section/prose context used by the model-backed rule.
  • Adds an offline probe harness + fixtures and updates docs/schema/tests to cover the new rule and async rule evaluation.
File summaries
File Description
tsconfig.json Includes scripts/**/*.ts in typechecking.
tests/utils/helpers.ts Adds ImageRecord defaults for new fields; adds sync-only rule evaluation helper.
tests/unit/vague-alt-text.test.ts Updates typing to account for async-capable Rule.evaluate.
tests/unit/missing-alt-text.test.ts Updates typing to account for async-capable Rule.evaluate.
tests/unit/judges-caching.test.ts Adds unit tests for judge/vision caching behavior.
tests/unit/alt-text-quality.test.ts Adds unit tests for the new alt-text-quality rule behavior.
tests/fixtures/alt-quality/cases.json Adds offline probe fixture cases for expected verdicts.
tests/fixtures/alt-quality/cases-github.json Adds GitHub-specific probe fixture cases.
tests/extract.test.ts Adds tests for capturing pageTitle and sectionHeading.
tests/example-site.test.ts Skips opt-in rules when asserting expected findings in the example site.
src/utils/load-image-data-url.ts Adds helper to load images as data URLs for the rule and probe.
src/utils/fetch-with-retry.ts Adds fetch wrapper with timeout and bounded retry for transient failures.
src/types.ts Extends ImageRecord; makes Rule.evaluate async-capable.
src/rules/index.ts Registers alt-text-quality in the append-only rules list.
src/rules/alt-text-quality.ts Implements the async, opt-in model-backed rule orchestrator.
src/judges/types.ts Defines judge interfaces/types shared across judge implementations.
src/judges/prompt.ts Adds the system prompt and strict JSON schema for structured verdicts.
src/judges/index.ts Adds createJudge() with mode selection and wiring (copilot vs azure-augmented + caching).
src/judges/copilot-judge.ts Implements the GitHub Models-backed judge call.
src/judges/caching.ts Adds content-hash caches for verdicts and Azure vision analysis.
src/judges/azure-vision-api-client.ts Implements Azure AI Vision Image Analysis client.
src/judges/azure-augmented-judge.ts Decorates the Copilot judge with optional Azure pre-analysis context.
src/extract.ts Enhances extraction with nearby text, figcaption, link/button context, page title, and section heading.
scripts/probe-alt-quality.ts Adds offline probe CLI for scoring fixture cases against the judge.
schema/config.schema.json Adds alt-text-quality to config schema and updates schema description.
README.md Documents the new opt-in rule, configuration, and required secrets.
package.json Adds probe script and tsx dev dependency.
package-lock.json Locks new dependencies (tsx, esbuild, etc.).
index.ts Awaits rule evaluation to support async rules.
.gitignore Ignores .env (used by the probe).
.github/workflows/scan-static-sites.yml Adds job-level env wiring for model/vision credentials.

Copilot's findings

  • Files reviewed: 28/31 changed files
  • Comments generated: 4

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/types.ts
Comment thread src/rules/alt-text-quality.ts Outdated
Comment thread src/utils/load-image-data-url.ts
Comment thread src/utils/load-image-data-url.ts
GitHub Advanced Security started work on behalf of kzhou314 June 19, 2026 00:55 View session
@kzhou314 kzhou314 requested a review from Copilot June 19, 2026 00:56
GitHub Advanced Security finished work on behalf of kzhou314 June 19, 2026 00:57

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

It introduces a model call pathway that currently forwards raw outerHTML into prompts (risking sensitive URL/query leakage) and includes a few correctness/perf/test-safety issues that should be addressed before approval.

Copilot's findings
  • Files reviewed: 29/32 changed files
  • Comments generated: 6

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread tests/unit/vague-alt-text.test.ts Outdated
Comment thread tests/unit/missing-alt-text.test.ts Outdated
Comment thread src/extract.ts Outdated
Comment thread src/extract.ts Outdated
Comment thread src/rules/alt-text-quality.ts
Comment thread src/judges/azure-vision-api-client.ts Outdated
GitHub Advanced Security started work on behalf of kzhou314 June 19, 2026 01:25 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 19, 2026 01:27
Copilot AI review requested due to automatic review settings June 19, 2026 18:37
GitHub Advanced Security started work on behalf of kzhou314 June 19, 2026 18:38 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 19, 2026 18:40

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

The model-backed rule currently risks leaking sensitive URL data (e.g., link hrefs and logged image URLs with querystrings) into model context and CI logs, which should be redacted before merging.

Copilot's findings
  • Files reviewed: 29/32 changed files
  • Comments generated: 5

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment on lines +28 to +42
function normalizeDataUrl(dataUrl: string): string {
const comma = dataUrl.indexOf(',')
if (comma === -1) return dataUrl
const meta = dataUrl.slice('data:'.length, comma)
if (/;base64/i.test(meta)) return dataUrl
const payload = dataUrl.slice(comma + 1)
let decoded: string
try {
decoded = decodeURIComponent(payload)
} catch {
decoded = payload
}
const mime = meta.split(';')[0] || 'text/plain'
return `data:${mime};base64,${Buffer.from(decoded, 'utf8').toString('base64')}`
}
Comment on lines +50 to +51
if (image.inLink) parts.push(`The image is inside a link with href="${image.inLink.href}".`)
if (image.inButton) parts.push('The image is inside a button (or role="button" element).')

@taarikashenafi taarikashenafi Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna add a small redactUrl() helper and apply it here + the console.error lines in evaluate() as they all leak the same querystring stuff into CI logs. The ADR calls this out too. Opening a PR (#45) so we don't end up doing this in two places.

Comment on lines +104 to +107
} catch (err) {
console.error(`[alt-text-quality] failed to load ${resolved}:`, err)
continue
}
Comment on lines +118 to +121
} catch (err) {
console.error(`[alt-text-quality] judge failed for ${resolved}:`, err)
continue
}
Comment on lines +16 to +20
export async function fetchWithRetry(
url: string | URL,
init: RequestInit,
options: FetchRetryOptions = {},
): Promise<Response> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this one's worth doing before merge, it's in the path of every model and image call.

@taarikashenafi taarikashenafi self-requested a review June 23, 2026 21:30

@taarikashenafi taarikashenafi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw on doing the other stretch goals in this branch, might be easier to get this one in first and then branch the next thing off main? this one's already pretty big and the rules registry seems set up for adding one rule at a time, so keeping them split might make each easier to review since Joyce hasn't' started on this yet.

Comment on lines +50 to +51
if (image.inLink) parts.push(`The image is inside a link with href="${image.inLink.href}".`)
if (image.inButton) parts.push('The image is inside a button (or role="button" element).')

@taarikashenafi taarikashenafi Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna add a small redactUrl() helper and apply it here + the console.error lines in evaluate() as they all leak the same querystring stuff into CI logs. The ADR calls this out too. Opening a PR (#45) so we don't end up doing this in two places.

@kzhou314

Copy link
Copy Markdown
Contributor Author

btw on doing the other stretch goals in this branch, might be easier to get this one in first and then branch the next thing off main? this one's already pretty big and the rules registry seems set up for adding one rule at a time, so keeping them split might make each easier to review since Joyce hasn't' started on this yet.

Yeah for sure, good call.

@kzhou314 kzhou314 force-pushed the implement-copilot-quality-alt branch from 9b5b167 to 5ae4bc1 Compare June 24, 2026 22:11
GitHub Advanced Security started work on behalf of kzhou314 June 24, 2026 22:11 View session
GitHub Advanced Security finished work on behalf of kzhou314 June 24, 2026 22:13
Comment thread schema/config.schema.json
"rules": {
"type": "object",
"description": "Per-rule enable/disable overrides. Rules not listed keep their default behavior (today, every rule defaults to enabled).",
"description": "Per-rule enable/disable overrides. Rules not listed keep their default behavior (every rule defaults to enabled except alt-text-quality, which is opt-in).",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "Per-rule enable/disable overrides. Rules not listed keep their default behavior (every rule defaults to enabled except alt-text-quality, which is opt-in).",
"description": "Per-rule enable/disable overrides. Rules not listed keep their default behavior (enabled with the exception of alt-text-quality).",

Comment thread schema/config.schema.json
},
"alt-text-quality": {
"type": "boolean",
"description": "Opt-in, model-backed. Sends each image and its alt text to a vision model (GitHub Models) to flag quality issues such as redundant prefixes, vague or context-missing descriptions, and medium-announcing link text. Disabled by default; requires GITHUB_MODELS_TOKEN and incurs per-image API cost."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "medium-announcing link text" mean in this context?

// GITHUB_MODELS_TOKEN=<pat-with-models:read> npm run probe
//
// Optional env:
// PROBE_MODEL — model id, default "openai/gpt-4o"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we specify here this must be a vision-enabled model?

console.log(` reason: ${verdict.reasoning}`)
console.log('')
} catch (err) {
console.log('ERROR')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to console.error here to draw attention, or are these relatively common transient errors we might get per-case?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it also could be helpful to log the case object's alt value for more human-friendly debugging.

Comment thread README.md

### Alt-text quality (model-backed, opt-in)

The five rules above are deterministic pattern matches. `alt-text-quality` goes further: it sends each image and its alt text to a vision model, which judges whether the alt text actually and sufficiently describes the image. This catches plausible-looking but wrong or incomplete alt text — for example `alt="a person"` on a photo of a named individual.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The five rules above are deterministic pattern matches. `alt-text-quality` goes further: it sends each image and its alt text to a vision model, which judges whether the alt text actually and sufficiently describes the image. This catches plausible-looking but wrong or incomplete alt text — for example `alt="a person"` on a photo of a named individual.
The five rules above are deterministic pattern matches. `alt-text-quality` goes further: it sends each image, its alt text, and surrounding context in the DOM to a vision model, which judges whether the alt text actually and sufficiently describes the image. This catches plausible-looking but wrong or incomplete alt text — for example `alt="a person"` on a photo of an individual named in surrounding text.


//
// Run:
// GITHUB_MODELS_TOKEN=<pat-with-models:read> npm run probe

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, it would be good to have this documenting comment also note you can use GH_TOKEN as an alternative.

//
// Optional env:
// PROBE_MODEL — model id, default "openai/gpt-4o"
// PROBE_CASES — path to a cases.json

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// PROBE_CASES — path to a cases.json
// PROBE_CASES — path to a cases.json file

// ALT_TEXT_JUDGE_MODE — force "copilot" or "azure-augmented". When unset,
// auto-selects azure-augmented if AZURE_VISION_* are set.
// PROBE_MIN_INTERVAL_MS — minimum ms between cases (rate-limit pacing).
// Set to 3500 for Azure F0's 20-calls/min ceiling.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link to documentation? or was this discovered via trial-and-error?

}

async function main(): Promise<void> {
const here = dirname(fileURLToPath(import.meta.url))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const here = dirname(fileURLToPath(import.meta.url))
const currentDirectory = dirname(fileURLToPath(import.meta.url))

? resolve(process.cwd(), process.env['PROBE_CASES'])
: resolve(here, '..', 'tests', 'fixtures', 'alt-quality', 'cases.json')

const cases = JSON.parse(await readFile(casesPath, 'utf8')) as ProbeCase[]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to gracefully error log and exit if you encounter an invalid JSON file; if someone tries to edit test cases and accidentally leaves out a comma or something similar, could happen!

console.log(`Model: ${model}`)
console.log(`Cases: ${casesPath}`)
console.log(`Total: ${cases.length}`)
if (minIntervalMs > 0) console.log(`Pacing: ${minIntervalMs}ms minimum between cases`)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (minIntervalMs > 0) console.log(`Pacing: ${minIntervalMs}ms minimum between cases`)
if (minIntervalMs > 0) console.log(`Pacing: ${minIntervalMs}ms minimum between cases\n`)

console.log(`Cases: ${casesPath}`)
console.log(`Total: ${cases.length}`)
if (minIntervalMs > 0) console.log(`Pacing: ${minIntervalMs}ms minimum between cases`)
console.log('')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, is this even affecting output?

minImageDimension?: number
}

const DEFAULT_TAG_CONFIDENCE = 0.7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just checking, range for this is 0-1, yes? I'm a bit rusty :)

const DEFAULT_TAG_CONFIDENCE = 0.7
const DEFAULT_MAX_TAGS = 8
// Azure's documented hard minimum: images must be greater than 50x50 px.
const DEFAULT_MIN_IMAGE_DIMENSION = 50

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this is both minimum height and minimum width?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants