Unicode lookalikes / confusables break integer value parsing #78

Open
JohannStudanskiEnscape opened this issue Jan 7, 2025 · 2 comments

@JohannStudanskiEnscape

We recently ran into a problem where a third party updated the plist files that we parse with Claunia, and a minus sign in an integer value was written as U+2212 MINUS SIGN instead of the required ASCII hyphen-minus (U+002D, 0x2D). This led to an uncaught exception in long.Parse().
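For illustration, the failure can be reproduced outside of Claunia. This is a minimal sketch, assuming the invariant culture (whose negative sign is the ASCII hyphen-minus) and using TryParse so that the failed parse shows up as a flag rather than an exception. The variable names are made up for this example.

```csharp
using System;
using System.Globalization;

// U+2212 MINUS SIGN renders like '-' but is a different code point.
string fromPlist = "\u2212" + "42";
string asciiText = "-42";

// TryParse reports failure as a bool instead of throwing FormatException.
bool okUnicode = long.TryParse(fromPlist, NumberStyles.Integer,
    CultureInfo.InvariantCulture, out long parsedUnicode);
bool okAscii = long.TryParse(asciiText, NumberStyles.Integer,
    CultureInfo.InvariantCulture, out long parsedAscii);

Console.WriteLine($"U+2212 input parsed: {okUnicode}");
Console.WriteLine($"U+002D input parsed: {okAscii}, value: {parsedAscii}");
```

long.Parse() without an explicit format provider uses the current culture, so the exact behavior may differ on machines whose culture defines a different negative sign.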

However, because Claunia offers no way to plug in preprocessors for values or anything similar, the only workaround we found that lets Claunia parse the plist file despite the error was to hot-patch the Claunia code directly. That of course means we can no longer rely on the NuGet packages, for example.

It would be nice if the currently hardcoded value parsers in XMLPropertyListParser.ParseObject() could be overridden with an external, registered implementation, or if there was at least a way to plug in a value preprocessor into each of the parsers.
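To make the request more concrete, here is a rough sketch of what a value preprocessor hook could look like. Everything in it is hypothetical: the `hooks` registry, `Register`, and `Preprocess` are names invented for this example and are not part of Claunia. The idea is only that the parser would pass the raw element text through a registered function before handing it to the hardcoded value parser.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical registry: plist element name -> function that cleans the raw text.
var hooks = new Dictionary<string, Func<string, string>>();

// Hypothetical registration call a library user would make once at startup.
void Register(string elementName, Func<string, string> hook) => hooks[elementName] = hook;

// Hypothetical call site inside the parser: text passes through unchanged
// when no hook is registered for the element.
string Preprocess(string elementName, string rawText) =>
    hooks.TryGetValue(elementName, out var hook) ? hook(rawText) : rawText;

// Caller side: map U+2212 MINUS SIGN to the ASCII hyphen-minus for <integer> values.
Register("integer", text => text.Replace('\u2212', '-'));

// Parser side: clean the value, then parse it as before.
string raw = "\u2212" + "42";
string cleaned = Preprocess("integer", raw);
long value = long.Parse(cleaned, CultureInfo.InvariantCulture);
Console.WriteLine(value);
```

With a hook like this, the replacement tables would live in the calling application rather than in the library, so the stock NuGet package could stay unmodified.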

The quick in-line workaround in our case looked as follows:

...
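        // Maps a set of lookalike ("confusable") characters onto one canonical character.
        // ApplyTo uses Enumerable.Aggregate, so the file needs "using System.Linq;".
        private record ConfusableReplacement
        {
            public char[] Confusables { get; init; } = [];
            public char Replacement { get; init; }

            // Replaces every occurrence of each confusable with the canonical character.
            public string ApplyTo(string input) =>
                Confusables.Aggregate(input, (current, confusable) => current.Replace(confusable, Replacement));
        }

        // Each list also contains the ASCII character itself, which is a harmless no-op replacement.
        private static readonly ConfusableReplacement MinusConfusables = new() { Confusables = ['\x002D', '\x02D7', '\x06D4', '\x2010', '\x2011', '\x2012', '\x2013', '\x2043', '\x2212', '\x2796', '\x2CBA', '\xFE58'], Replacement = '-' };
        private static readonly ConfusableReplacement DotConfusables = new() { Confusables = ['\x002E', '\x0701', '\x0702', '\x2024', '\xA4F8'], Replacement = '.' };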

        public NSNumber(string text, int type)
        {
...
            text = MinusConfusables.ApplyTo(text);
            text = DotConfusables.ApplyTo(text);
...

but of course this is not a workable long-term solution.


Of course, the actual error lies outside of Claunia.

Since it is a real-world failure scenario that we encountered, we still decided to file it as an issue so you can decide whether it warrants a change in Claunia main.

@claunia
Owner

claunia commented Jan 8, 2025

Hi,

Right now I cannot spend the time needed to fix that issue properly.

However if you implement it in a PR, I will review and approve it.

Thanks.

@JohannStudanskiEnscape
Author

Pull request is open for review at #82
