
Unable to load TSL model #27

Closed
mrzhnex opened this issue Apr 27, 2024 · 5 comments


mrzhnex commented Apr 27, 2024

I'm not sure exactly what to report, and I may be the only one getting this error.
I have tried different settings, source languages, and models. The model keeps downloading to the directory, but when it is time to actually load it, an error is thrown. What kind of permissions does this app need?

Windows 10, with a symlinked .ocr_translate directory.
Currently on the CPU version (I get the same error with the GPU version as well).

2024-04-27 17:39:30,023 - INFO - django.server:basehttp - "GET /get_active_options/ HTTP/1.1" 200 64
2024-04-27 17:39:30,049 - INFO - django.server:basehttp - "GET / HTTP/1.1" 200 3776
2024-04-27 17:39:40,772 - INFO - ocr.general:views - SET LANG: {'lang_src': 'ja', 'lang_dst': 'en'}
2024-04-27 17:39:40,775 - INFO - django.server:basehttp - "POST /set_lang/ HTTP/1.1" 200 2
2024-04-27 17:39:40,798 - INFO - django.server:basehttp - "GET /get_active_options/ HTTP/1.1" 200 64
2024-04-27 17:39:40,829 - INFO - django.server:basehttp - "GET / HTTP/1.1" 200 3926
2024-04-27 17:39:51,882 - INFO - django.server:basehttp - "GET /get_active_options/ HTTP/1.1" 200 64
2024-04-27 17:39:51,893 - INFO - django.server:basehttp - "GET / HTTP/1.1" 200 3926
2024-04-27 17:39:54,027 - INFO - ocr.general:views - SET LANG: {'lang_src': 'ja', 'lang_dst': 'en'}
2024-04-27 17:39:54,030 - INFO - django.server:basehttp - "POST /set_lang/ HTTP/1.1" 200 2
2024-04-27 17:39:54,037 - INFO - django.server:basehttp - "GET /get_active_options/ HTTP/1.1" 200 64
2024-04-27 17:39:54,047 - INFO - django.server:basehttp - "GET / HTTP/1.1" 200 3926
2024-04-27 17:39:59,675 - INFO - ocr.general:views - LOAD MODELS: {'box_model_id': 'easyocr', 'ocr_model_id': 'tesseract', 'tsl_model_id': 'facebook/m2m100_1.2B'}
2024-04-27 17:39:59,676 - INFO - ocr.general:box - Loading BOX model: easyocr
2024-04-27 17:40:02,111 - INFO - plugin:plugin - Loading BOX model: easyocr
Using CPU. Note: This module is much faster with a GPU.
2024-04-27 17:40:05,808 - INFO - ocr.general:ocr - Loading OCR model: tesseract
2024-04-27 17:40:05,819 - INFO - ocr.general:tsl - Loading TSL model: facebook/m2m100_1.2B
2024-04-27 17:40:06,495 - INFO - plugin:plugin - Loading TSL model: facebook/m2m100_1.2B
2024-04-27 17:53:03,187 - ERROR - ocr.general:views - Failed to load models: Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not corrupted.
2024-04-27 17:53:05,137 - INFO - django.server:basehttp - - Broken pipe from ('127.0.0.1', 60097)

Please tell me if you need any other information.

P.S. Using the latest release from 17.12.2023 (same error with the version from 29.10.2023).

@Crivella Crivella self-assigned this Apr 27, 2024
Crivella (Owner) commented

This seems to be related to

but in both cases I do not think the root cause is clear (the error you see is the server reporting an exception raised inside transformers).

The server does not require any particular permissions, just normal read/write access, which you should already have since the database was created successfully.

A few troubleshooting steps I would suggest that might solve the problem or give us more clues:

  • Try deleting .ocr_translate and starting fresh without a symlink (out of curiosity, did you use the command prompt/PowerShell to create the symlink? Creating a shortcut will not do the trick.)
  • Try loading a smaller model like staka/fugumt (see also above, in case this is a disk-space or RAM problem)
  • You can also load the OCR/TSL models individually (e.g. load only the TSL model and see if it succeeds or gives the same error)
  • Since this is related to HuggingFace models, I would be curious whether you are able to load an OCR model like kha-white.

For reference, I just tried downloading staka/fugumt-ja-en, and you should see something like:

config.json: 100%|████████████████████████████████████████████████████████████████████████| 1.03k/1.03k [00:00<?, ?B/s]
pytorch_model.bin: 100%|████████████████████████████████████████████████████████████| 121M/121M [00:01<00:00, 76.5MB/s]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████| 289/289 [00:00<?, ?B/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████| 42.0/42.0 [00:00<?, ?B/s]
source.spm: 100%|███████████████████████████████████████████████████████████████████| 797k/797k [00:00<00:00, 2.23MB/s]
vocab.json: 100%|███████████████████████████████████████████████████████████████████| 861k/861k [00:00<00:00, 2.37MB/s]
special_tokens_map.json: 100%|██████████████████████████████████████████████████████████████| 74.0/74.0 [00:00<?, ?B/s]


mrzhnex commented Apr 27, 2024

[screenshot attached]

Yes, I used PowerShell to create the symlink.
First I tried a random model (from those presented in the browser extension), but got an error.
Then I tried the others, getting error after error, until my disk space ran out.
So I switched to D:/, created the symlink, and tried the rest of the models.

This screenshot is from a fresh start (no symlink), taken just now. Maybe there is a hint in "could not find image processor class", but I don't really think so...
Could this be something about CUDA or Python in general?

Crivella (Owner) commented

Python itself should not access the CUDA API directly; that is usually done by C/C++ code under the hood.
The "... processor class ..." message has nothing to do with it.
Interestingly, loading the VED model did not give you any error; only the SEQ2SEQ one seems to.

I am not sure what the problem could be here; it might require some digging through the transformers library.

As another possible workaround, could you also try the manual download approach?

  • Under your .ocr_translate, create a folder named staka
  • Under staka, create a folder named fugumt-ja-en
  • Into this folder, download all the JSON files from https://huggingface.co/staka/fugumt-ja-en/tree/main
  • (This could also be done with Git+LFS if you are familiar with it)
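As a rough illustration of the steps above (the base directory is an assumption; on a real system the model files would still need to be downloaded into the folder by hand or via git-lfs), the expected layout can be created with a few lines of Python:

```python
from pathlib import Path

# Assumed base directory; on Windows this typically lives under the
# user profile, e.g. C:\Users\<name>\.ocr_translate
base = Path.home() / ".ocr_translate"

# Create the nested model folder described in the steps above
model_dir = base / "staka" / "fugumt-ja-en"
model_dir.mkdir(parents=True, exist_ok=True)

print(model_dir)
```

The downloaded files from the HuggingFace page would then be placed inside `model_dir`.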

One more thing: for running the code, could you try doing it from PowerShell after setting the following environment variables? The information on whether the code is being loaded from manually stored files is shown as a debug message:

$env:DJANGO_DEBUG="true"
$env:DJANGO_LOG_LEVEL="DEBUG"

and then either type the path to the EXE, or, depending on your terminal, drag and drop the file onto it and it will fill in the file path automatically.

You could also try playing with the TRANSFORMERS_CACHE environment variable to tell transformers where to store its files. For more details, see the docs and where the variable is used in the plugin enabling HuggingFace models: https://github.com/Crivella/ocr_translate-hugging_face/blob/d6ae9d8f0b6f48b201bfc2ff74a8383909c7a680/ocr_translate_hugging_face/plugin.py#L115
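As a minimal sketch of how such an environment variable is typically resolved (the fallback path here is hypothetical, not the plugin's actual default), the logic usually looks like this:

```python
import os
from pathlib import Path

def resolve_model_cache() -> Path:
    """Return the directory where model files should be stored.

    TRANSFORMERS_CACHE, if set, wins; otherwise fall back to a
    default under the user's home (hypothetical default here).
    """
    env = os.environ.get("TRANSFORMERS_CACHE")
    if env:
        return Path(env)
    return Path.home() / ".ocr_translate" / "models"

# With the variable set, models would land on the other drive
os.environ["TRANSFORMERS_CACHE"] = r"D:\hf_cache"
print(resolve_model_cache())  # D:\hf_cache
```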


mrzhnex commented Apr 28, 2024

Found the source of the problem: it was Cyrillic characters in the user name. I changed the name to Latin (a hell of a job, I would say).

Now it works like a charm, like clockwork.
Possible upgrades for future versions:

  1. Add support for non-Latin characters in paths.
  2. Add the possibility to change the default folder (user/.ocr_translate); I believe there is something for this in the config or startup arguments, but I could not find it. Please tell me if it exists.

I am curious: why did the application manage to access /user/.ocr_translate with Cyrillic characters in the username, but throw an error only when loading the translation model? Maybe different load/access methods?
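One plausible mechanism (a hedged sketch, not a confirmed diagnosis of this bug): some native-backed loaders encode file paths with a codec that cannot represent Cyrillic characters, while plain Python file access handles them fine. The path and loader below are hypothetical:

```python
def load_vocab_narrow(path: str) -> bytes:
    # Hypothetical loader mimicking native code that encodes the
    # path as ASCII bytes before opening it; this raises on any
    # non-Latin characters in the path.
    raw = path.encode("ascii")
    with open(raw, "rb") as fh:
        return fh.read()

cyrillic_path = r"C:\Users\Пользователь\.ocr_translate\vocab.json"
try:
    load_vocab_narrow(cyrillic_path)
    outcome = "loaded"
except UnicodeEncodeError:
    # Fails before the file is even opened, purely due to the path
    outcome = "path encoding failed"
print(outcome)  # path encoding failed
```

A loader that keeps the path as a Unicode string (or encodes it as UTF-8 with proper OS support) would not hit this failure, which would explain why only one code path breaks.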

In any case, the problem is gone, at least for me, and you are now aware it can happen.
Should I close the issue? I suppose I should.

Thank you for your support. Great work overall!

@mrzhnex mrzhnex closed this as completed Apr 28, 2024
Crivella (Owner) commented

Nice that you were able to figure it out.

It is already possible to control where models are stored using the TRANSFORMERS_CACHE environment variable; see here for more details.

I am not 100% sure (I should investigate), but I think the problem with the non-Latin characters is inside the transformers library, since the problem is with the models and my code was able to create the database.
I might open an issue or PR with them in case I can pinpoint where things are going wrong.

Thanks, and I hope you will enjoy the tool ;)
