Member

@dpvc dpvc commented Sep 4, 2025

This PR changes the handling of tags to make three separate mtext nodes: one for the left delimiter, one for the tag number, and one for the right delimiter.

The formatTag() and formatRef() functions now take either a string or a triple of strings to use for the tag format. If it is just a string, then an opening parenthesis, bracket, or brace at the beginning and a closing one at the end are split off to make the triple; otherwise, the string is used as is (and only a single mtext will be created).
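
A minimal sketch of the splitting described above, for illustration only. The helper name `splitTagFormat` is hypothetical and not part of the PR; the regular expression is the one that appears in the diff further down. Note that the pattern does not require the closing delimiter to match the opening one.

```typescript
// Hypothetical helper (name not from the PR): turn a tag format into
// either a [left, tag, right] triple or a single-element array.
function splitTagFormat(format: string | string[]): string[] {
  if (Array.isArray(format)) {
    return format; // already split by the caller
  }
  // An opening (, [ or { at the start and a closing ), ] or } at the end.
  const match = format.match(/^(\(|\[|\{)(.*)(\}|\]|\))$/);
  return match ? match.slice(1) : [format];
}

console.log(splitTagFormat('(1)'));        // [ '(', '1', ')' ]
console.log(splitTagFormat('$\\bullet$')); // [ '$\\bullet$' ]
```

With a string such as `'(1)'` this yields three parts, while a format without outer delimiters stays as one part and so would produce a single mtext.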

The tests are updated to take these changes into account.

@dpvc dpvc requested a review from zorkow September 4, 2025 15:04
@dpvc dpvc added this to the v4.0.1 milestone Sep 4, 2025
codecov bot commented Sep 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.71%. Comparing base (efcdd46) to head (c97db48).
⚠️ Report is 2 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1345      +/-   ##
===========================================
- Coverage    86.72%   86.71%   -0.01%     
===========================================
  Files          337      337              
  Lines        84145    84147       +2     
  Branches      4769     4771       +2     
===========================================
- Hits         72971    72968       -3     
- Misses       11151    11156       +5     
  Partials        23       23              

Member

@zorkow zorkow left a comment

I would suggest always treating tags as a triple of strings [left, tag, right]. If one does not need the fences, one can leave them empty. This would avoid all the case analysis for Array and would also simplify the types.
Is there a reason why we would not want that?

* @param {string} tag The tag name (e.g., 1).
* @param {string} tagId The unique id for that tag (e.g., mjx-eqn:1).
- * @param {string} tagFormat The formatted tag (e.g., "(1)").
+ * @param {string|string[]} tagFormat The formatted tag (e.g., "$\bullet$" or ['(','1',')']).
Member

Would it not be easier to always have this as a [string, string, string] return type and then use empty strings if no fence is used? That would allow for (left) tags like 1) to be written as ['', '1', ')'].

Member Author

> That would allow for (left) tags like 1) to be written as ['', '1', ')']

That is allowed already, or (as you are suggesting filtering out the blank entries) it could be ['1', ')']. See my comment below.

- return format
- ? `${left}${format}{${tag}}${right}`
- : `${left}${tag}${right}`;
+ return format ? [left, `${format}{${tag}}`, right] : [left, tag, right];
Member

How about:

Suggested change
return format ? [left, `${format}{${tag}}`, right] : [left, tag, right];
return [left, `format ? ${format}{${tag}} : tag`, right];

Member Author

The backticks are in the wrong place, but otherwise a good idea.
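
For illustration, one way the suggestion might look with the template literal wrapped around only the formatted tag. This is a sketch, not the committed code: the standalone function `formatTagParts` and its parameter list are hypothetical, since in the PR `left`, `right`, and `format` come from the surrounding method.

```typescript
// Hypothetical standalone version of the suggested one-liner.
// Only the middle entry depends on whether a format macro is given.
function formatTagParts(
  left: string,
  tag: string,
  right: string,
  format: string
): [string, string, string] {
  return [left, format ? `${format}{${tag}}` : tag, right];
}

console.log(formatTagParts('(', '1', ')', ''));         // [ '(', '1', ')' ]
console.log(formatTagParts('(', '1', ')', '\\textbf')); // [ '(', '\\textbf{1}', ')' ]
```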

'node',
'mrow',
- ParseUtil.internalMath(parser, tag),
+ ParseUtil.internalMath(parser, Array.isArray(tag) ? tag.join('') : tag),
Member

This could then be simplified to just tag.join('').

: format.match(/^(\(|\[|\{)(.*)(\}|\]|\))$/)?.slice(1) || [format];
const mml = new TexParser(
- '\\text{' + this.currentTag.tagFormat + '}',
+ tag.map((part) => `\\text{${part}}`).join(''),
Member

This would need to filter out the empty strings.

Member Author

If SRE is going to use the three separate pieces to isolate the tag from the left- and right-hand parts, then don't we need to keep blank ones as empty mtext elements so that the middle one is always the tag? Otherwise, how could SRE tell whether a two-mtext pair was left-plus-tag or tag-plus-right?

Here we either get a single mtext (no question about what is the tag) or a triple of mtext nodes (middle is tag), so no ambiguity.

I could turn the [format] into ['', format, ''] so it is always three. But this is only what will happen in TeX input. I'm wondering if there shouldn't also be something in the MathML input to split up a tag, as is being done here.
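
A small sketch of the ambiguity discussed above. The helper names `toTextMacros` and `dropEmpty` are hypothetical; `toTextMacros` mirrors the `tag.map(...)` line from the diff and only builds the TeX string, without making any claim about what the parser does with it.

```typescript
// Hypothetical helper mirroring the tag.map(...) line in the diff:
// wrap every part in its own \text{...}.
function toTextMacros(parts: string[]): string {
  return parts.map((part) => `\\text{${part}}`).join('');
}

// Keeping empty entries: the tag is always the middle of three parts.
const rightOnly = ['', '1', ')'];
const leftOnly = ['(', '1', ''];
console.log(toTextMacros(rightOnly)); // \text{}\text{1}\text{)}
console.log(toTextMacros(leftOnly));  // \text{(}\text{1}\text{}

// Filtering empty entries: both collapse to two parts, and the tag
// sits at index 0 in one case and at index 1 in the other.
const dropEmpty = (parts: string[]): string[] =>
  parts.filter((p) => p !== '');
console.log(dropEmpty(rightOnly)); // [ '1', ')' ]
console.log(dropEmpty(leftOnly));  // [ '(', '1' ]
```

After filtering, a consumer that sees two parts cannot tell from position alone which one is the tag, which is the concern raised in this comment.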

const format = this.currentTag.tagFormat;
const tag = Array.isArray(format)
? format
: format.match(/^(\(|\[|\{)(.*)(\}|\]|\))$/)?.slice(1) || [format];
Member

This could be simplified if we always assume a triple.

Member Author

dpvc commented Oct 10, 2025

> Is there a reason why we would not want that?

That would be a potential breaking change, which we aren't supposed to make in minor versions. So this is for backward compatibility, mainly with the tagformat extension, which currently has the tag format as a string, not an array of strings. This allows either one.

I suppose the breaking up of the string could be moved to the tagformat extension rather than here, if you like that better. But it will need to be done somewhere, so I put it at the lowest level so that in case there are other subclasses of Tag that are in use, they will still work. Probably not much chance of that, however, so moving the complications into tagformat is probably sufficient. Would you prefer that?
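
To make the trade-off concrete, here is a sketch of a single normalization step that accepts the legacy string form and always hands back a triple, wherever it ends up living. Everything here is an assumption for illustration: the name `toTagTriple`, the `TagTriple` type, and the choice to join arrays of any other length into the middle slot are not taken from the PR.

```typescript
// Hypothetical type and helper, not part of the PR.
type TagTriple = [string, string, string];

// Accept the legacy string form or an array, and always return
// [left, tag, right], using empty strings when there is no fence.
function toTagTriple(format: string | string[]): TagTriple {
  if (Array.isArray(format)) {
    if (format.length === 3) {
      return [format[0], format[1], format[2]];
    }
    // Assumption: any other array is treated as an unfenced tag.
    return ['', format.join(''), ''];
  }
  // Same delimiter pattern as in the diff.
  const match = format.match(/^(\(|\[|\{)(.*)(\}|\]|\))$/);
  return match ? [match[1], match[2], match[3]] : ['', format, ''];
}

console.log(toTagTriple('(1)'));          // [ '(', '1', ')' ]
console.log(toTagTriple('1'));            // [ '', '1', '' ]
console.log(toTagTriple(['', '1', ')'])); // [ '', '1', ')' ]
```

With a step like this at one boundary, downstream code could assume three parts and use `join('')` directly, while string-valued formats from existing configurations would keep working.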