[BUG] Japanese text is misidentified as URL. #390
Comments
The ideographic full stop and the fullwidth period are not treated as a dot in IDNA2008, whereas IDNA2003 does treat them as one. The current recommendation is IDNA2008. linkedin/URL-Detector appears to base its IDN handling on IDNA2003, which may be the cause of this issue.
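To make the difference above concrete, here is a minimal Python sketch. It is not the URL-Detector implementation, and the function names are hypothetical. It assumes the separator set from RFC 3490 (IDNA2003), which requires U+3002, U+FF0E and U+FF61 to be recognized as dots alongside U+002E, and contrasts it with splitting on the ASCII dot only.

```python
import re

# Characters that IDNA2003 (RFC 3490, section 3.1) requires to be recognized as dots:
# U+002E full stop, U+3002 ideographic full stop,
# U+FF0E fullwidth full stop, U+FF61 halfwidth ideographic full stop.
IDNA2003_DOT_PATTERN = re.compile("[.\u3002\uff0e\uff61]")


def split_labels_idna2003(host: str) -> list:
    """Split a host into labels, treating all IDNA2003 separators as dots (hypothetical helper)."""
    return IDNA2003_DOT_PATTERN.split(host)


def split_labels_ascii_dot_only(host: str) -> list:
    """Split a host into labels on the ASCII dot only (hypothetical helper)."""
    return host.split(".")


# A Japanese word, an ideographic full stop, then something that looks like a TLD.
sample = "テスト\u3002com"
print(split_labels_idna2003(sample))        # two labels, so it resembles a domain name
print(split_labels_ascii_dot_only(sample))  # a single label, so it does not
```

With the IDNA2003 separator set, ordinary Japanese punctuation produces what looks like a two-label host, which is consistent with the misidentification reported in this issue.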
@vitorpamplona Many users in the East Asian region have been waiting for this fix for a long time.
We are waiting for the library to be able to support these additional characters. Until that library is fixed, there is not much we can do :(
Interesting. That is a different "bug". We have a separate procedure to linkify all
And now that urls can have any Unicode character, I am not really sure how to solve it. Because the
I confirmed this issue is fixed in #491. Thanks.
Describe the bug
Some Japanese text is unexpectedly misidentified as a URL.
To Reproduce
Steps to reproduce the behavior:
。 (\u3002)
Expected behavior
The text should be displayed as normal text, not as a link.
Device (please complete the following information):
This is related to linkedin/URL-Detector: linkedin/URL-Detector#39
URL-Detector handles 。 as a dot. Strictly speaking this is not a bug, because IDN allows 。 to be used as a dot. However, as a result, Japanese text is often misidentified as a URL.
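One possible client-side mitigation, sketched under assumptions and not taken from this project or from URL-Detector, is to post-filter the candidates a detector returns and drop any that contain a non-ASCII dot. The function names and the sample list are hypothetical. Note the simplification: the check looks at the whole candidate string, so a genuine URL with 。 in its path would also be dropped.

```python
# Non-ASCII characters that IDNA2003 treats as dots:
# U+3002 ideographic full stop, U+FF0E fullwidth full stop,
# U+FF61 halfwidth ideographic full stop.
NON_ASCII_DOTS = ("\u3002", "\uff0e", "\uff61")


def has_non_ascii_dot(candidate: str) -> bool:
    """Return True if the candidate contains any non-ASCII dot (hypothetical helper)."""
    return any(dot in candidate for dot in NON_ASCII_DOTS)


def keep_ascii_dot_candidates(candidates):
    """Keep only candidates that contain no non-ASCII dot (hypothetical helper)."""
    return [c for c in candidates if not has_non_ascii_dot(c)]


# Hypothetical detector output: two real links and one Japanese sentence fragment.
detected = [
    "example.com",
    "です\u3002また",
    "https://github.com/linkedin/URL-Detector",
]
print(keep_ascii_dot_candidates(detected))
```

This trades away support for hosts written with 。 as a separator, which IDN permits, in exchange for not linkifying ordinary Japanese sentences.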