Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for OCR tool data missing in the imports #224

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions crewai_tools/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
MultiOnTool,
MySQLSearchTool,
NL2SQLTool,
OCRTool,
PatronusEvalTool,
PatronusLocalEvaluatorTool,
PatronusPredefinedCriteriaEvalTool,
Expand Down
1 change: 1 addition & 0 deletions crewai_tools/tools/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
from .multion_tool.multion_tool import MultiOnTool
from .mysql_search_tool.mysql_search_tool import MySQLSearchTool
from .nl2sql.nl2sql_tool import NL2SQLTool
from .ocr_tool.ocr_tool import OCRTool
from .patronus_eval_tool import (
PatronusEvalTool,
PatronusLocalEvaluatorTool,
Expand Down
6 changes: 5 additions & 1 deletion crewai_tools/tools/ocr_tool/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,9 @@
This tool performs Optical Character Recognition (OCR) on images using supported LLMs. It can extract text from both local image files and images available via URLs. The tool leverages the LLM's vision capabilities to provide accurate text extraction from images.

## Installation
Install the crewai_tools package

Install the crewai_tools package:

```shell
pip install 'crewai[tools]'
```
Expand All @@ -14,6 +16,7 @@ pip install 'crewai[tools]'

Any LLM that supports the `vision` feature should work. It must accept image_url as a user message.
The tool has been tested with:

- OpenAI's `gpt-4o`
- Gemini's `gemini/gemini-1.5-pro`

Expand All @@ -38,5 +41,6 @@ def researcher(self) -> Agent:
```

The tool accepts either a local file path or a URL to the image:

- For local files, provide the absolute or relative path
- For remote images, provide the complete URL starting with 'http' or 'https'