
Bug: Search result templates apply |safe to striptags output #3046

Description

@gpshead

Describe the bug

(drafted by claude, reviewed by me -gpshead)

Four search include templates render the indexed description with |safe:

  • templates/search/includes/jobs.job.html:6
  • templates/search/includes/events.event.html:28
  • templates/search/includes/events.calendar.html:2
  • templates/search/includes/downloads.release.html:4
<p>{{ result.description|safe }}</p>
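A quick way to confirm the list above stays complete (and to catch regressions later) is to scan the template tree for the pattern. This is only a sketch: the helper name `find_unsafe_descriptions` and the regex are hypothetical, not part of the codebase, and the regex only catches the simple `{{ ….description|safe }}` form shown above.

```python
import re
from pathlib import Path

# Matches {{ <anything>.description|safe }} with optional whitespace.
PATTERN = re.compile(r"\{\{\s*[\w.]*description\s*\|\s*safe\s*\}\}")


def find_unsafe_descriptions(template_dir):
    """Return (relative_path, line_number) pairs where a description is piped through |safe."""
    root = Path(template_dir)
    hits = []
    for path in sorted(root.rglob("*.html")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

Run against templates/search/includes, this would be expected to report the four locations listed above until the templates are changed.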

The indexed value is produced by striptags(truncatewords_html(..., 50)) in the corresponding prepare_description methods:

  • apps/jobs/search_indexes.py:104 (obj.description.rendered — user-submitted job markup)
  • apps/events/search_indexes.py:68 (obj.description.rendered — event descriptions, which can originate from imported external calendar feeds)
  • apps/events/search_indexes.py:31 (obj.description, calendar model)
  • apps/pages/search_indexes.py:30 and apps/downloads/search_indexes.py:46 (staff-authored content)

Django's docs warn explicitly: "striptags doesn't give any guarantee about its output being HTML safe, particularly with non valid HTML input. So NEVER apply the safe filter to a striptags output."

To be fair about actual risk (and why this is a public issue rather than a GHSA), I tested the pinned Django (5.2.11) rather than just citing the docs: the classic nested-tag bypass (<sc<script>ript>) is neutralized by strip_tags' strip-until-stable loop, HTML entities stay encoded, and truncatewords_html runs before striptags (the safe order). No working bypass today. However, literal < and > characters that don't parse as tags do pass through (strip_tags("a < b and c > d") returns it unchanged), so the pattern's safety ultimately rests on Python's HTMLParser and every browser tokenizing the same input identically, a parser-differential class with historical precedent. Since the indexed value is fed by user-submitted and externally imported content, that's a fragile invariant to bet on.
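The pass-through behaviour can be reproduced without Django using a stdlib-only stand-in. This is a simplified sketch of an HTMLParser-based stripper, not Django's implementation (Django's strip_tags additionally repeats the pass until the output is stable, as noted above); the names `TagStripper` and `strip_once` are hypothetical.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Keeps text and still-encoded entities; drops whatever the parser treats as a tag."""

    def __init__(self):
        # convert_charrefs=False so entities are reported separately, not decoded.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_once(value):
    """Run a single stripping pass over value and return the surviving text."""
    parser = TagStripper()
    parser.feed(value)
    parser.close()
    return "".join(parser.parts)


print(strip_once("<b>Tom &amp; Jerry</b>"))  # Tom &amp; Jerry
print(strip_once("a < b and c > d"))         # a < b and c > d
```

The second call shows the point at issue: the output still contains raw < and >, so whether it is harmless depends on how the browser tokenizes it once |safe disables escaping.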

One wrinkle: |safe is currently load-bearing for display — strip_tags leaves entities encoded (&amp; stays &amp;), so simply dropping |safe would double-escape them and render visible &amp; in search results. One possible fix is to make the indexed value genuinely plain text at index time and let autoescaping do its job:

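import html
from django.template.defaultfilters import truncatewords_html
from django.utils.html import strip_tags

return html.unescape(strip_tags(truncatewords_html(obj.description.rendered, 50)))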

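in the five prepare_description methods, then drop |safe from the four templates. The two changes need to land together: once the indexed value is unescaped, any template that still applies |safe would render an encoded &lt;script&gt; from the source as live markup.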
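The three rendering paths can be compared with the standard library alone. This sketch uses html.escape as a stand-in for template autoescaping and a made-up sample string; it illustrates the reasoning above and is not code from the repository.

```python
import html


def autoescape(value):
    # Stand-in for template autoescaping: encodes &, <, > and quotes.
    return html.escape(value)


# What strip_tags leaves in the index today: entities still encoded.
indexed_today = "Tom &amp; Jerry"

# 1. Current templates: |safe emits the indexed value verbatim.
current = indexed_today
# 2. Dropping |safe alone: the already-encoded entity is escaped again.
dropped_only = autoescape(indexed_today)
# 3. Proposed: unescape at index time, then let autoescaping encode once.
proposed = autoescape(html.unescape(indexed_today))

print(current)       # Tom &amp; Jerry
print(dropped_only)  # Tom &amp;amp; Jerry
print(proposed)      # Tom &amp; Jerry

# Encoded markup becomes real characters after unescape, so it is only
# inert when autoescaping (not |safe) renders it.
encoded_tag = "&lt;script&gt;alert(1)&lt;/script&gt;"
plain = html.unescape(encoded_tag)
print(plain)              # <script>alert(1)</script>
print(autoescape(plain))  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Paths 1 and 3 send the same bytes to the browser for ordinary text, which is why the fix should not change how results display; the difference is that path 3 no longer depends on strip_tags output being safe.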

Labels: bug
