The main change is the introduction of a plugin manager that installs plugins and their dependencies on demand.
This makes the release versions (both the Windows EXE and the Docker image) much smaller, and allows users to decide which functionalities they want to use.
IMPORTANT
From version 0.6 onward, Python (3.10 or 3.11) and pip need to be installed on the system.
See more below in the Changes section.
- Windows: https://www.python.org/downloads/windows/
  - NOTE: make sure to check the box that says "Add Python to PATH" so that pip can be found by the server script without having to make any assumptions
- Linux: Use your package manager (e.g. sudo apt install python3 python3-pip)
While I think everything should work fine for the most part, I assume there are edge cases I have not considered
that could make the way I've handled the plugin manager not work for everyone.
Feel free to report such issues and I'll try to fix them as soon as possible.
(If too many problems arise I might consider redesigning the plugin manager itself.)
Changes
- Removed the frozen executable from the release files in favor of an Automatic1111-style batch script
  - Even with the plugin manager, installing dependencies that require actual compilation by invoking pip from within the frozen executable caused problems that were non-trivial to fix.
    For this reason I decided to axe the PyInstaller frozen EXE altogether and go with a batch script that will:
    - Allow users to more easily set environment variables (a few of the most relevant ones are already set as empty in the script)
    - Create or reuse a virtual environment in a folder venv in the same directory as the script
    - Install the minimum required packages in it to run the server
    - Run the server
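
The steps performed by the batch script can be sketched in Python roughly as follows. This is only an illustration under assumptions, not the actual script: the helper names (`venv_python`, `bootstrap_commands`, `bootstrap`) and the `min_packages` argument are hypothetical, and the real list of minimum packages is not stated here. Only the `venv` folder name and the `run_server.py` script name come from these notes.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(venv_dir: Path, windows: bool = sys.platform == "win32") -> Path:
    """Path of the Python interpreter inside a virtual environment."""
    if windows:
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def bootstrap_commands(script_dir: Path, min_packages: list[str],
                       windows: bool = sys.platform == "win32") -> list[list[str]]:
    """Commands to run once the venv exists: install packages, then start the server."""
    py = str(venv_python(script_dir / "venv", windows))
    return [
        [py, "-m", "pip", "install", *min_packages],  # hypothetical package list
        [py, str(script_dir / "run_server.py")],
    ]


def bootstrap(script_dir: Path, min_packages: list[str]) -> None:
    """Create or reuse the venv next to the script, then install and run."""
    venv_dir = script_dir / "venv"
    if not venv_python(venv_dir).exists():
        # Create the environment only on the first run; later runs reuse it.
        venv.EnvBuilder(with_pip=True).create(venv_dir)
    for cmd in bootstrap_commands(script_dir, min_packages):
        subprocess.run(cmd, check=True)
```

Keeping the command construction separate from the execution makes it easy to inspect what would be run before anything is installed.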
- Added a plugin manager to install/uninstall plugins on demand
  - The installed plugins can be controlled via the new version of the Firefox extension or directly using the manage_plugins/ endpoint.
  - The plugins will be installed under $OCT_BASE_DIR/plugins, which by default will be under your user profile (e.g. C:\Users\<USERNAME>\.ocr_translate on Windows).
    If you have trouble with space under C:\ consider setting the OCT_BASE_DIR environment variable to a different location.
  - The plugin data is stored in a JSON file inside the project (plugins_data.json).
  - Version/Scope/Extras of a package to be installed can be controlled via environment variables OCT_PKG_<package_name(uppercase)>_[VERSION|SCOPE|EXTRAS]
    (e.g. to change torch to version A.B.C you would set OCT_PKG_TORCH_VERSION="A.B.C").
    If the package name contains a - it should be replaced with _min_ in the package name.
  - Removed env variable AUTOCREATE_VALIDATED_MODELS and the related server initialization.
    Models are now created/activated or deactivated via the plugin manager when the respective plugin is installed/uninstalled.
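
As an illustration of the OCT_PKG_ naming rule, the variable name for a given package could be derived as in the sketch below. The helper names `pkg_env_var` and `pkg_override` are hypothetical, and so is the package name `some-package`. The notes do not say whether the `_min_` marker is itself uppercased, so this sketch assumes it is kept exactly as written (lowercase).

```python
import os


def pkg_env_var(package: str, field: str) -> str:
    """Name of the environment variable overriding one field of a package."""
    if field not in ("VERSION", "SCOPE", "EXTRAS"):
        raise ValueError(f"unknown field: {field}")
    # Assumption: uppercase the name first, then swap each '-' for a literal '_min_'.
    name = package.upper().replace("-", "_min_")
    return f"OCT_PKG_{name}_{field}"


def pkg_override(package: str, field: str, default=None, environ=os.environ):
    """Value of the override if the variable is set, otherwise `default`."""
    return environ.get(pkg_env_var(package, field), default)


print(pkg_env_var("torch", "VERSION"))         # OCT_PKG_TORCH_VERSION
print(pkg_env_var("some-package", "EXTRAS"))   # OCT_PKG_SOME_min_PACKAGE_EXTRAS
```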
- Streamlined docker image to also use the run_server.py script for initialization.
- Added plugin for ollama (https://github.com/ollama/ollama) for translation using LLMs
  - Note ollama needs to be run/installed separately and the plugin will just make calls to the server.
  - Use the OCT_OLLAMA_ENDPOINT environment variable to specify the endpoint of the ollama server (see the plugin page for more details)
- Added plugin for PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) (Box and OCR) (seems to work very well with Chinese).
  - The default versions of paddlepaddle installed by the plugin_manager (2.5.2 on Linux and 2.6.1 on Windows) might not work on every system, as there can be underlying failures in the C++ code that the plugin uses.
    The version installed can be controlled using the environment variable OCT_PKG_PADDLEPADDLE_VERSION.
- Added possibility to specify extra DJANGO_ALLOWED_HOSTS and a server bind address via environment variables. (Fixes #30)
- Manual model is not implemented as an entrypoint anymore (will work also without recreating models).
- OCR models can now use a tokenizer and a processor from different models.
- Added caching of the languages and allowed box/ocr/tsl models for faster response times on the handshake endpoint.
- New endpoint run_tsl_xua made to work with XUnity.AutoTranslator (https://github.com/bbepis/XUnity.AutoTranslator)
- Improved API return codes
Migrating from an older version
As usual, the database will be upgraded automatically to the new version.
For safety, it is suggested to make a copy of it (by default under %USERPROFILE%/.ocr_translate) in case you need to downgrade.
Already downloaded models can be reused, but the new structure is slightly different. Before, you would have something like:
- %USERPROFILE%/.ocr_translate/
  - <huggingface_models>
  - .easyocr/
    - <easyocr_models>
  - tesseract/
    - <tesseract_models>
Now by default you will have:
- %USERPROFILE%/.ocr_translate/ (or whatever OCT_BASE_DIR is set to)
  - models/
    - huggingface/ (or whatever TRANSFORMERS_CACHE is set to)
      - <huggingface_models>
    - easyocr/ (or whatever EASYOCR_PREFIX is set to)
      - <easyocr_models>
    - tesseract/ (or whatever TESSERACT_PREFIX is set to)
      - <tesseract_models>
    - paddleocr/ (or whatever PADDLEOCR_PREFIX is set to)
      - <paddleocr_models>
You can move them manually to mimic the new structure, or delete them and let the server re-download them.
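
If you prefer to script the move, a minimal sketch could look like the following. It is not an official migration tool: the function name `migrate_models` is hypothetical, and it assumes that the old Hugging Face models sit directly in the base folder as `models--*` directories (the usual Hugging Face cache naming). Check your own folder before running anything like this, and back up first.

```python
import shutil
from pathlib import Path


def migrate_models(base: Path) -> list[tuple[Path, Path]]:
    """Move pre-0.6 model folders into base/models/; returns the (old, new) pairs moved."""
    models = base / "models"
    moves = [
        (base / ".easyocr", models / "easyocr"),
        (base / "tesseract", models / "tesseract"),
    ]
    # Assumption: Hugging Face cache entries are the 'models--*' directories in base.
    moves += [(p, models / "huggingface" / p.name) for p in sorted(base.glob("models--*"))]

    done = []
    for old, new in moves:
        # Skip anything missing or already migrated, so the function can be re-run safely.
        if old.is_dir() and not new.exists():
            new.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(old), str(new))
            done.append((old, new))
    return done
```

Calling `migrate_models(Path.home() / ".ocr_translate")` would apply this to the default location. If OCT_BASE_DIR or any of the per-plugin prefix variables are set, the target folders differ and the paths need adjusting.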
Plugins will be stored under OCT_BASE_DIR/plugins (OCT_BASE_DIR defaults to %USERPROFILE%/.ocr_translate):
- OCT_BASE_DIR/ (defaults to %USERPROFILE%/.ocr_translate)
  - plugins.json (list of installed plugins)
  - plugins/
    - <plugin_data>/
      The installed Python packages, divided by scope depending on whether they are meant to be used for CPU/GPU/BOTH

The plugins folder can grow to several GB when installing torch (needed by huggingface and easyocr) for GPU, so make sure you have enough space.
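
To check the available space before installing the larger plugins, something like the following sketch can be used. The helper names `base_dir` and `free_gib` are hypothetical, and the default location is assumed to be `.ocr_translate` under the user's home folder, as described above.

```python
import os
import shutil
from pathlib import Path


def base_dir(environ=os.environ) -> Path:
    """OCT_BASE_DIR if set, otherwise the assumed default under the user profile."""
    return Path(environ.get("OCT_BASE_DIR") or Path.home() / ".ocr_translate")


def free_gib(path: Path) -> float:
    """Free space in GiB on the drive holding `path` (or its nearest existing parent)."""
    while not path.exists() and path != path.parent:
        path = path.parent
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    target = base_dir() / "plugins"
    print(f"{free_gib(target):.1f} GiB free for {target}")
```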