~/.cache/yadt directory is not created, so the font is downloaded repeatedly and translation is slowed down #584

Open
gansui opened this issue Feb 8, 2025 · 4 comments
Labels
enhancement (New feature or request), Low priority

gansui commented Feb 8, 2025

In the yadt module, the ~/.cache/yadt directory is never created. I browsed the code, and I think the reason is in yadt/main.py, where the code looks like this:

    if __name__ == "__main__":
        main()

Since the program is started by running pdf2zh, yadt/main.py is not executed directly, so main() is never actually called, and therefore create_cache_folder() is never called either. As a result, the ~/.cache/yadt directory is not created. Each time pdf2zh starts, the font file has to be downloaded again, which slows the run down.
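The guard behaviour described above can be reproduced in isolation. This is only a sketch: the module source is a stand-in written for illustration, not the real yadt/main.py, and the `created` flag merely stands in for whatever create_cache_folder() does. It uses the standard `runpy.run_path` to execute the same file once as a script and once under an ordinary module name.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for a module that only does its setup inside the
# `if __name__ == "__main__"` guard.
MODULE_SOURCE = '''
created = False

def main():
    global created
    created = True  # stands in for create_cache_folder()

if __name__ == "__main__":
    main()
'''


def run_as(run_name: str) -> bool:
    """Execute the stand-in module under `run_name`; report whether main() ran."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "main.py"
        path.write_text(MODULE_SOURCE)
        module_globals = runpy.run_path(str(path), run_name=run_name)
    return module_globals["created"]


if __name__ == "__main__":
    print(run_as("__main__"))   # executed as a script: the guard fires
    print(run_as("yadt.main"))  # executed as an imported module: it does not
```

Any setup that must happen regardless of the entry point therefore needs to live outside the guard, or in a function that callers invoke explicitly.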

I modified some code. My approach is to put the check for the ~/.cache/yadt directory in get_cache_file_path(). This is just a suggestion.

Since the code is simple, I won't paste it.
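For readers who want to see the shape of that suggestion, here is a minimal sketch. The function name get_cache_file_path() comes from the issue, but its signature, the `CACHE_FOLDER` constant, and the `cache_folder` parameter are assumptions made for illustration; the actual yadt code may look different.

```python
from pathlib import Path

# Assumed default location; the real constant in yadt may be named differently.
CACHE_FOLDER = Path.home() / ".cache" / "yadt"


def get_cache_file_path(filename: str, cache_folder: Path = CACHE_FOLDER) -> Path:
    """Return the path of a cached file, creating the cache folder on demand."""
    # parents=True creates ~/.cache as well if it is missing;
    # exist_ok=True makes repeated calls harmless.
    cache_folder.mkdir(parents=True, exist_ok=True)
    return cache_folder / filename
```

With this layout the directory exists by the time any caller writes to the returned path, no matter which entry point started the program.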

Collaborator

awwaawwa commented Feb 9, 2025

  1. Font re-downloading is a pdf2zh issue, not a yadt issue. If ~/.cache/yadt is not created, yadt will crash directly instead of repeatedly downloading fonts.

  2. Putting the ~/.cache/yadt directory detection code in get_cache_file_path() is a good idea. I fixed it in funstory-ai/BabelDOC@9905989.

  3. The real reason for repeatedly downloading fonts is that pdf2zh creates a new temporary folder each time and puts the fonts in it. See https://github.com/Byaidu/PDFMathTranslate/blob/8bbc7adf145fbebd9dcf2ecadee8a0d67e1ef952/pdf2zh/high_level.py#L402C1-L403C1.
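Point 3 can be illustrated with a small simulation. This is a sketch under stated assumptions, not pdf2zh's code: `fetch_font`, `count_downloads`, and the fake downloader are hypothetical names, and whether pdf2zh checks for an existing file at all is not stated in the thread. The sketch only shows that a folder created fresh on every run can never contain a previous download, while a persistent folder can.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def fetch_font(target_dir: Path, name: str, download: Callable[[], bytes]) -> Path:
    """Download the font into target_dir only if it is not already there."""
    target_dir.mkdir(parents=True, exist_ok=True)
    font_path = target_dir / name
    if not font_path.exists():
        font_path.write_bytes(download())
    return font_path


def count_downloads(runs: int, persistent_dir: Optional[Path]) -> int:
    """Simulate `runs` program starts and return how many downloads happened."""
    calls = 0

    def fake_download() -> bytes:
        nonlocal calls
        calls += 1
        return b"font-bytes"

    for _ in range(runs):
        if persistent_dir is None:
            # New temporary folder per run: the existence check can never hit.
            with tempfile.TemporaryDirectory() as tmp:
                fetch_font(Path(tmp), "font.ttf", fake_download)
        else:
            # Same folder on every run: only the first run downloads.
            fetch_font(persistent_dir, "font.ttf", fake_download)
    return calls
```

Calling `count_downloads(3, None)` triggers one download per simulated run, whereas passing a persistent directory triggers a single download for all three.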

Collaborator

awwaawwa commented Feb 9, 2025

After careful consideration, I rolled back the changes to yadt.

YADT requires calling the yadt.high_level.init function for initialization before running. This function will definitely create a cache path. Therefore, this path will definitely exist afterwards. If this path disappears later, it's considered undefined behavior. https://github.com/funstory-ai/yadt/blob/1fccbef00a75df30f87c84c0bcc6dec7287bbd8c/yadt/high_level.py#L449

You might notice that the current pdf2zh doesn't call this function. That is indeed a bug; I didn't notice it because this path already exists on my computer. However, it only causes the program to crash, and only when you specify the yadt backend, so pdf2zh's original translation path is not affected. I will submit a PR to pdf2zh later to fix this issue.

high_level.init is a recently introduced change, so pdf2zh has not been synchronized with the modification yet.

update: #589 will fix it.
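The contract described in this comment (an explicit init call creates the cache path, and everything else assumes it exists) might look roughly like the sketch below. The names `init` and `get_cache_file_path` echo the thread, but the signatures and the `cache_folder` parameter are assumptions for illustration, not the actual yadt.high_level API.

```python
from pathlib import Path


def init(cache_folder: Path) -> None:
    """One-time setup that callers are required to run before anything else."""
    cache_folder.mkdir(parents=True, exist_ok=True)


def get_cache_file_path(filename: str, cache_folder: Path) -> Path:
    """Return a path inside the cache folder without checking that it exists."""
    # No mkdir here: this function relies on init() having been called.
    return cache_folder / filename


def save_font(data: bytes, cache_folder: Path) -> Path:
    """Write font bytes into the cache; fails if the folder was never created."""
    path = get_cache_file_path("font.ttf", cache_folder)
    path.write_bytes(data)  # raises FileNotFoundError when init() was skipped
    return path
```

Under this design a caller that skips init() gets an immediate FileNotFoundError rather than a silent re-download, which matches the crash behaviour described earlier in the thread.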

Collaborator

awwaawwa commented Feb 9, 2025

I haven't decided how the program should behave when ~/.cache/yadt is suddenly deleted during runtime. Feel free to discuss at funstory-ai/BabelDOC#70

Collaborator

awwaawwa commented Feb 12, 2025

Since the new backend BabelDOC doesn't have this issue, and given the proposal in #586, I don't plan to work on this issue, so I'll mark it as low priority.
