@Crozzers
Contributor

This PR fixes #637.

The issue came down to the extra running before the italics and bold stage. It would attempt to ignore <strong> syntax and then process strict <em>s and middle-word <em>s. The problem is that the syntax for strongs and ems is very similar, and crafting a regex that can tell them apart is tough.

The way this extra worked previously was to process valid <em> syntax and then hash anything that looks like <em> syntax but isn't quite valid.

The new approach is simply to find any _ or * character in the middle of a word and hash it. This way, the regular italics and bold stage doesn't have to worry about them and we can keep the regexes simple.
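
For readers skimming the thread, here is a rough sketch of the idea rather than the code in this PR. The pattern, the function name and the `placeholder_for` callback are all hypothetical, and treating "middle of a word" as "a letter or digit on both sides" is an assumption.

```python
import re

# Hypothetical pattern: a run of `_` or `*` with a letter or digit directly
# on each side. `[^\W_]` means "a word character other than underscore".
MIDDLE_WORD_MARKER = re.compile(r'(?<=[^\W_])[_*]+(?=[^\W_])')


def protect_middle_word_markers(text, placeholder_for):
    """Swap every mid-word `_`/`*` run for an opaque placeholder.

    `placeholder_for` is a caller-supplied function (hypothetical) that
    maps the matched characters to the text that should replace them.
    """
    return MIDDLE_WORD_MARKER.sub(lambda m: placeholder_for(m.group(0)), text)


# Markers at word boundaries are left alone for the later em/strong stage.
protected = protect_middle_word_markers(
    "this_text_here and *real em*", lambda chars: "@"
)
```

Once the mid-word characters are replaced, a later emphasis stage only ever sees markers that sit at word boundaries, which is why its regexes can stay simple.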

The hash we use is basically the same as the one in self._escape_table, except we prefix the extra's name to the input text to prevent interference with escaped/hashed chars from other stages.
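
A minimal sketch of why the prefix matters, assuming a SHA-256 based helper. `hash_text`, `EXTRA_NAME`, both tables and `restore` are illustrative names, not the library's own; the extra's name is taken from the linked issue title.

```python
import hashlib

# Assumed name of the extra, taken from the linked issue title.
EXTRA_NAME = "middle-word-em"


def hash_text(text):
    # Hypothetical helper, not the library's hashing function.
    return "hash-" + hashlib.sha256(text.encode("utf-8")).hexdigest()


# Keys another stage might produce for the same characters.
plain_table = {ch: hash_text(ch) for ch in "_*"}
# Keys for this extra: same hash, but the input is prefixed with its name.
extra_table = {ch: hash_text(EXTRA_NAME + ch) for ch in "_*"}


def restore(text):
    """Put back only the characters that this extra hashed."""
    for ch, key in extra_table.items():
        text = text.replace(key, ch)
    return text
```

Because the two tables never share a key, restoring this extra's placeholders cannot accidentally unescape a character that a different stage hashed on purpose.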

@nicholasserra nicholasserra merged commit f44849c into trentm:master Sep 29, 2025
15 checks passed
@nicholasserra
Collaborator

LGTM Thanks!

Development

Successfully merging this pull request may close these issues.

Disabling middle-word-em causes bugs

2 participants