Translation has no effect #358

Closed
wx-11 opened this issue Dec 27, 2024 · 18 comments

Comments

@wx-11
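Contributor

wx-11 commented Dec 27, 2024

Problem description

Translation has no effect on either the test site or my self-hosted service, whether I use Google, Bing, or an OpenAI key. The GUI also shows no preview.
image

Test document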

033-Teacher efficacy_ its meaning and measure.pdf

@wx-11
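Contributor Author

wx-11 commented Dec 27, 2024

This seems a bit odd. Doesn't translating a document require OCR? Do all of these translation services have OCR built in?
I'm not very technical and haven't read the source, so I'm asking how it works out of curiosity.
Also, OpenAI has OCR built in, yet for some reason it doesn't work either.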

@Byaidu
Owner

Byaidu commented Dec 27, 2024

Translating a document does not require OCR.

@wx-11
Contributor Author

wx-11 commented Dec 27, 2024

Here are the logs. Nothing looks abnormal to me.

pdf2zh-1  | * Running on local URL:  http://0.0.0.0:7860
pdf2zh-1  | 
pdf2zh-1  | To create a public link, set `share=True` in `launch()`.
pdf2zh-1  | Files before translation: ['033-Teacher efficacy_ its meaning and measure.pdf']
pdf2zh-1  | {'files': ['pdf2zh_files/033-Teacher efficacy_ its meaning and measure.pdf'], 'pages': None, 'lang_in': 'en', 'lang_out': 'zh', 'service': 'google', 'output': PosixPath('pdf2zh_files'), 'thread': 4, 'callback': <function translate_file.<locals>.progress_bar at 0x7fefb0385d00>}
100%|██████████| 48/48 [00:30<00:00,  1.60it/s]
pdf2zh-1  | Files after translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | Files before translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | {'files': ['pdf2zh_files/033-Teacher efficacy_ its meaning and measure.pdf'], 'pages': None, 'lang_in': 'en', 'lang_out': 'zh', 'service': 'openai', 'output': PosixPath('pdf2zh_files'), 'thread': 4, 'callback': <function translate_file.<locals>.progress_bar at 0x7fefb0687b00>}
100%|██████████| 48/48 [00:36<00:00,  1.33it/s]
pdf2zh-1  | Files after translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | Files before translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | {'files': ['pdf2zh_files/033-Teacher efficacy_ its meaning and measure.pdf'], 'pages': None, 'lang_in': 'en', 'lang_out': 'zh', 'service': 'google', 'output': PosixPath('pdf2zh_files'), 'thread': 4, 'callback': <function translate_file.<locals>.progress_bar at 0x7fefb04d47c0>}
100%|██████████| 48/48 [00:30<00:00,  1.58it/s]
pdf2zh-1  | Files after translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | Files before translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | {'files': ['pdf2zh_files/033-Teacher efficacy_ its meaning and measure.pdf'], 'pages': None, 'lang_in': 'en', 'lang_out': 'zh', 'service': 'bing', 'output': PosixPath('pdf2zh_files'), 'thread': 4, 'callback': <function translate_file.<locals>.progress_bar at 0x7fefb0687ec0>}
100%|██████████| 48/48 [00:35<00:00,  1.35it/s]
pdf2zh-1  | Files after translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | Files before translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']
pdf2zh-1  | {'files': ['pdf2zh_files/033-Teacher efficacy_ its meaning and measure.pdf'], 'pages': None, 'lang_in': 'en', 'lang_out': 'zh', 'service': 'bing', 'output': PosixPath('pdf2zh_files'), 'thread': 4, 'callback': <function translate_file.<locals>.progress_bar at 0x7fefb02cefc0>}
100%|██████████| 48/48 [00:32<00:00,  1.47it/s]
pdf2zh-1  | Files after translation: ['033-Teacher efficacy_ its meaning and measure.pdf', '033-Teacher efficacy_ its meaning and measure-dual.pdf', '033-Teacher efficacy_ its meaning and measure-mono.pdf']

@wx-11
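Contributor Author

wx-11 commented Dec 27, 2024

> Translating a document does not require OCR.

Oh, OK, thanks. I've seen PDFs where the text can't be copied and OCR is the only option, so I assumed the text here was embedded as images.

So if OCR isn't used, does that mean image-based documents can't be translated? Are there plans to add an optional OCR setting later (used only with translation engines that support OCR, such as OpenAI)?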

@hellofinch
Contributor

#19
OCR WIP

@wx-11
Contributor Author

wx-11 commented Dec 27, 2024

OK, thanks. Can you reproduce the problem with my PDF?

> #19 OCR WIP

@hellofinch
Contributor

Yes. The later pages are all scans, and those indeed can't be handled for now.
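
A scanned page carries no text layer, so text extraction returns almost nothing for it. A minimal sketch of that check, assuming `page_texts` already holds the per-page output of whatever PDF text extractor is at hand; the function name, the sample strings, and the `min_chars` threshold are illustrative and not part of pdf2zh:

```python
from typing import List


def classify_pages(page_texts: List[str], min_chars: int = 20) -> List[str]:
    """Label each page 'text' or 'scanned' based on its extracted text.

    min_chars is an arbitrary illustrative threshold (an assumption),
    not a value taken from any real tool.
    """
    labels = []
    for text in page_texts:
        visible = "".join(text.split())  # drop every whitespace character
        labels.append("text" if len(visible) >= min_chars else "scanned")
    return labels


# Hypothetical extraction results: one page with a text layer, two without.
pages = ["A title page with a real, selectable text layer.", "", " \n "]
print(classify_pages(pages))  # ['text', 'scanned', 'scanned']
```

Pages labelled `scanned` would need OCR before any translation engine could see their content.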

@wx-11
Contributor Author

wx-11 commented Dec 27, 2024

Oh, but I don't get a single word translated. Could something be wrong with my deployment? The logs show no errors, though. How should I capture logs?

@hellofinch
Contributor

To be precise, I get no output here either, but it is clear that the body of the paper on the later pages is all scanned.

@wx-11
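Contributor Author

wx-11 commented Dec 27, 2024

The earlier pages aren't scans, yet nothing happens there either, and the preview doesn't render. Using Google on the demo site doesn't work either.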

@Byaidu
Owner

Byaidu commented Dec 27, 2024

image

It's obviously a scan at a glance. Don't bother trying.

@hellofinch
Contributor

The first page is not a scan, so it is indeed strange that nothing came out.
image

@hellofinch
Contributor

The later pages are all scans. Is there something on the first page that gets special-cased?

Byaidu closed this as completed Dec 27, 2024

@wx-11
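Contributor Author

wx-11 commented Dec 27, 2024

Found the problem. I captured the network traffic, and no request is sent to the translation API at all. In other words, the first page is not a scan, yet it isn't being treated as text to translate either.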
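
An empty packet capture is what one would expect if the backend is only invoked once per extracted paragraph: with nothing extracted, nothing is sent. A minimal sketch of that behavior, assuming a per-paragraph call pattern; `translate_paragraphs` and `CountingBackend` are hypothetical names, not pdf2zh's code:

```python
from typing import Callable, List


def translate_paragraphs(paragraphs: List[str],
                         translate: Callable[[str], str]) -> List[str]:
    """Call the backend once per non-empty paragraph (assumed pattern)."""
    return [translate(p) for p in paragraphs if p.strip()]


class CountingBackend:
    """Stand-in for a real translation API; it only counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> str:
        self.calls += 1
        return f"[zh] {text}"


backend = CountingBackend()
translate_paragraphs([], backend)  # no body text was extracted
print(backend.calls)  # 0
```

Under this assumption the progress bar can still reach 100% and output files can still be written, while the translation service never sees a request.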

@Byaidu
Owner

Byaidu commented Dec 27, 2024

That's right, because the first section has no body text at all...

@wx-11
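Contributor Author

wx-11 commented Dec 27, 2024

May I ask why the issue was closed? The text on the first two pages was not translated. Is this a known issue?
image

> Because the first section has no body text at all

Is there a configuration variable to turn off the body-text-only behavior? My understanding is that anything that can be translated should be translated. Why aren't the non-body parts processed?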

@Byaidu
Owner

Byaidu commented Dec 27, 2024

No. Formulas and decorative elements obviously should not be processed, and that is why the non-body parts are left alone.
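
One way to picture this rule is a filter over layout regions, where only regions classified as body text reach the translator. The sketch below assumes such a classification step exists; the `Region` type and the label names in `SKIP_LABELS` are hypothetical and not taken from the project:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for regions that are left untranslated.
SKIP_LABELS = {"formula", "figure", "table", "header", "footer"}


@dataclass
class Region:
    label: str  # class assigned by some layout-analysis step (assumed)
    text: str   # text extracted from the region


def translatable(regions: List[Region]) -> List[Region]:
    """Keep only non-empty regions whose label is not in SKIP_LABELS."""
    return [r for r in regions
            if r.label not in SKIP_LABELS and r.text.strip()]


page = [
    Region("header", "Journal name and volume"),
    Region("formula", "E = mc^2"),
    Region("body", "A paragraph of running text."),
]
print([r.label for r in translatable(page)])  # ['body']
```

If every region on a page falls into a skipped class, the filtered list is empty and the page passes through unchanged.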

@wx-11
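Contributor Author

wx-11 commented Dec 27, 2024

OK, understood. I meant no offense. I was only asking from the standpoint of reporting a bug, with no intention of blaming anyone, so please don't misunderstand. I'm very grateful for open-source projects and just wanted to know the reason. I won't bother you further.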


3 participants