Conversion problems for some input characters, and a question about how to correct them #1

Open
ivansetiawantky opened this issue Aug 2, 2024 · 6 comments

Comments

@ivansetiawantky

Thank you for creating such a good Brazilian Portuguese text to IPA converter.
I found some problems in the phonemes transformed by xphonebr:
(1) Some phonemes corresponding to "r" in the input text disappear, e.g.:
caderneta [kadeˈnetə ], engordei [ẽgoˈdey ], formar [foˈmax].
(2) Pronunciation problems with "x", e.g.:
exame [eˈksãmɪ ] => [ks] might be [z]?
lixo [ˈliksʊ ] => [ks] might be [ʃ]?
(3) Other problems:
eu [ˈiʊ ] => [i] might be [e]?
homem [ˈmẽ ] => initial [ɔ] or [o] is lacking?
posso [ˈposʊ] (<poder) => [o] might be [ɔ]?
Do you plan to revise the checkpoint to fix the known problems (plus the ones shown above)? Or would it be easy for me to correct these problems myself, for example by fine-tuning?
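
For anyone who wants to re-check these cases after a new checkpoint or a fine-tuning run, here is a minimal sketch. It only records the outputs quoted in this comment and reports which words still produce exactly the same transcription. The `phonemize` argument is a hypothetical stand-in for whatever function wraps the converter; it is not the actual xphonebr API.

```python
from typing import Callable, Dict, List

# Transcriptions reported in this issue (word -> output seen at the time).
REPORTED: Dict[str, str] = {
    "caderneta": "kadeˈnetə",
    "engordei": "ẽgoˈdey",
    "formar": "foˈmax",
    "exame": "eˈksãmɪ",
    "lixo": "ˈliksʊ",
    "eu": "ˈiʊ",
    "homem": "ˈmẽ",
    "posso": "ˈposʊ",
}


def normalize(ipa: str) -> str:
    """Drop surrounding whitespace and square brackets from a transcription."""
    return ipa.strip().strip("[]").strip()


def still_reproduces(
    phonemize: Callable[[str], str],  # hypothetical wrapper around the converter
    reported: Dict[str, str] = REPORTED,
) -> List[str]:
    """Return the words whose current output equals the reported output."""
    return [
        word
        for word, old_output in reported.items()
        if normalize(phonemize(word)) == normalize(old_output)
    ]


if __name__ == "__main__":
    # Stub converter for demonstration only: pretends "lixo" has changed.
    def stub(word: str) -> str:
        return "[ˈliʃʊ ]" if word == "lixo" else f"[{REPORTED[word]} ]"

    print(still_reproduces(stub))
```

An unchanged output means the reported behaviour is still present. A changed output only means the transcription is different, not that it is now correct, so it still needs a manual check.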

@traderpedroso
Owner

Yes, I found the issue. I was using a word list with a "Paulista" accent. I am now creating other lists to cover different regions of Brazil, such as the South, North, and Northeast. This will allow you to train your model for a specific region. It took a while to understand the logic of the phonemes, since there weren't any Brazilian Portuguese projects to guide me. Despite that, I trained the StyleTTS 2 model with these phonemes, and the results turned out really good! I'm a bit short on time these days, but I promise to bring improvements in the coming days. Thank you for your feedback, and feel free to submit a pull request if you have one.

@ivansetiawantky
Author

Thank you for your kind response. I am glad that you are aware of these issues. I will wait for the improvements!
Unfortunately, I do not know how to train the model, so no pull request in the meantime.

@traderpedroso
Owner

In my next commit, I will make sure to add a simplified fine-tuning procedure, specifically for improving phoneme pronunciations in regional variations. Thank you for your feedback.

@traderpedroso
Owner

There is now a fine-tuning explanation in the README. Thank you!

@ivansetiawantky
Author

Thank you very much for your explanation. We will try your method. Thank you again.

@traderpedroso
Owner

Any PRs are welcome. Thank you!
