Update Argilla integration for v2.x SDK #2915

Merged
23 commits merged on Oct 24, 2024
7bf8f08
update argilla version
sdiazlor Aug 7, 2024
42a7ce5
Update flavor
sdiazlor Aug 7, 2024
83e8006
update the ArgillaAnnotator
sdiazlor Aug 7, 2024
3248a21
Update docs
sdiazlor Aug 7, 2024
a67b4dc
fix formatting
sdiazlor Aug 7, 2024
423a162
Merge branch 'develop' into update-integration-argilla-2.0
sdiazlor Aug 7, 2024
446be37
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Aug 8, 2024
d942c7f
Update docs/book/component-guide/annotators/argilla.md
sdiazlor Aug 8, 2024
177be8e
Update docs/book/component-guide/annotators/argilla.md
sdiazlor Aug 8, 2024
5471431
update type hinting
sdiazlor Aug 8, 2024
90aadbd
update paragraph
sdiazlor Aug 8, 2024
43e5cf1
Merge branch 'develop' into update-integration-argilla-2.0
sdiazlor Aug 8, 2024
b0f5ee1
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Aug 8, 2024
1b783b4
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Aug 9, 2024
7c19830
fix: add deprecation validator
sdiazlor Aug 9, 2024
a4d4a8a
fix: use logger
sdiazlor Aug 9, 2024
acb2b8a
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Aug 9, 2024
be92aac
Add argilla to list of ignored integrations
strickvl Aug 9, 2024
53fad2a
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Oct 14, 2024
a5081b2
Update registration command
AlexejPenner Oct 23, 2024
5276538
Merge branch 'develop' into update-integration-argilla-2.0
AlexejPenner Oct 23, 2024
3d405da
Merge branch 'develop' into update-integration-argilla-2.0
strickvl Oct 23, 2024
1c4c505
Merge branch 'develop' into update-integration-argilla-2.0
AlexejPenner Oct 24, 2024
Binary file modified docs/book/.gitbook/assets/argilla_annotator.png
2 changes: 1 addition & 1 deletion docs/book/component-guide/annotators/annotators.md
@@ -55,7 +55,7 @@ The core parts of the annotation workflow include:
### List of available annotators

For production use cases, some more flavors can be found in specific `integrations` modules. In terms of annotators,
-ZenML features integrations with `label_studio` and `pigeon`.
+ZenML features integrations with the following tools.

| Annotator | Flavor | Integration | Notes |
|-----------------------------------------|----------------|----------------|----------------------------------------------------------------------|
20 changes: 7 additions & 13 deletions docs/book/component-guide/annotators/argilla.md
@@ -4,12 +4,7 @@ description: Annotating data using Argilla.

# Argilla

-[Argilla](https://github.com/argilla-io/argilla) is an open-source data curation
-platform designed to enhance the development of both small and large language
-models (LLMs) and NLP tasks in general. It enables users to build robust
-language models through faster data curation using both human and machine
-feedback, providing support for each step in the MLOps cycle, from data labeling
-to model monitoring.
+[Argilla](https://github.com/argilla-io/argilla) is a collaboration tool for AI engineers and domain experts who need to build high-quality datasets for their projects. It enables users to build robust language models through faster data curation using both human and machine feedback, providing support for each step in the MLOps cycle, from data labeling to model monitoring.

![Argilla Annotator](../../.gitbook/assets/argilla_annotator.png)

@@ -31,7 +26,7 @@ of Argilla as well as a deployed instance of Argilla. There is an easy way to
deploy Argilla as a [Hugging Face
Space](https://huggingface.co/docs/hub/spaces-sdks-docker-argilla), for
instance, which is documented in the [Argilla
-documentation](https://docs.argilla.io/en/latest/getting_started/installation/deployments/huggingface-spaces.html).
+documentation](https://docs.argilla.io/latest/getting_started/quickstart/).

### How to deploy it?

@@ -59,16 +54,16 @@ zenml secret create argilla_secrets --api_key="<your_argilla_api_key>"
Then register your annotator with ZenML:

```shell
-zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets
+zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets --port=6900
```

When using a deployed instance of Argilla, the instance URL must be specified
without any trailing `/` at the end. If you are using a Hugging Face Spaces
instance and its visibility is set to private, you must also set the
-`extra_headers` parameter which would include a Hugging Face token. For example:
+`headers` parameter which would include a Hugging Face token. For example:

```shell
-zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets --instance_url="https://[your-owner-name]-[your_space_name].hf.space" --extra_headers="{"Authorization": f"Bearer {<your_hugging_face_token>}"}"
+zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets --instance_url="https://[your-owner-name]-[your_space_name].hf.space" --headers='{"Authorization": "Bearer {[your_hugging_face_token]}"}'
```

Finally, add all these components to a stack and set it as your active stack.
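
The registration step in this hunk has two easy-to-miss requirements: the instance URL must not end in `/`, and `--headers` takes a JSON object. The Python sketch below assembles the command string under those rules. The `build_register_command` helper, the example URL, and the token are hypothetical, and the sketch assumes the header value has the usual `Bearer <token>` shape.

```python
import json
import shlex


def build_register_command(instance_url, hf_token, secret_name="argilla_secrets"):
    """Return a `zenml annotator register` command as a shell-safe string.

    Hypothetical helper: it only formats the flags shown in the diff above.
    """
    # The docs require the instance URL without a trailing slash.
    url = instance_url.rstrip("/")
    # Serialize the headers as JSON so the quoting is always valid.
    headers = json.dumps({"Authorization": f"Bearer {hf_token}"})
    args = [
        "zenml", "annotator", "register", "argilla",
        "--flavor", "argilla",
        f"--authentication_secret={secret_name}",
        f"--instance_url={url}",
        f"--headers={headers}",
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


# Example values are placeholders, not real credentials.
command = build_register_command("https://owner-space.hf.space/", "hf_example_token")
print(command)
```

`shlex.join` single-quotes the whole `--headers` argument, so the JSON reaches the CLI intact; for the shell this is equivalent to the single-quoted form used in the updated command.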
@@ -95,9 +90,8 @@ functionality via the ZenML SDK.

You can access information about the datasets you're using with the `zenml
annotator dataset list`. To work on annotation for a particular dataset, you can
-run `zenml annotator dataset annotate <dataset_name>`. What follows is an
-overview of some key components to the Argilla integration and how it can be
-used.
+run `zenml annotator dataset annotate <dataset_name>`. This will open the Argilla
+web interface for you to start annotating the dataset.

#### Argilla Annotator Stack Component

5 changes: 1 addition & 4 deletions docs/mocked_libs.json
@@ -229,10 +229,7 @@
"xgboost",
"argilla",
"argilla.client",
-"argilla.client.client",
-"argilla.client.sdk",
-"argilla.client.sdk.commons",
-"argilla.client.sdk.commons.errors",
+"argilla._exceptions._api",
"peewee",
"prodigy",
"prodigy.components",
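
The entries changed in `docs/mocked_libs.json` track the module paths the integration imports from Argilla. This diff does not show how ZenML consumes the file, so the Python sketch below only illustrates the general technique of stubbing module names in `sys.modules` so that imports resolve without the real package. The inline `MOCKED_LIBS` list, the `install_mocks` helper, and the `SomeApiError` name are assumptions made for illustration.

```python
import sys
from unittest import mock

# Assumption: a subset of the names from docs/mocked_libs.json, inlined here
# so the sketch runs on its own.
MOCKED_LIBS = ["argilla", "argilla.client", "argilla._exceptions._api"]


def install_mocks(names):
    """Register a MagicMock under each dotted module name (hypothetical helper)."""
    for name in names:
        sys.modules[name] = mock.MagicMock(name=name)


install_mocks(MOCKED_LIBS)

# With the stubs in place this import resolves even if argilla is not installed.
# `SomeApiError` is a hypothetical attribute: a MagicMock hands back a stub
# for any attribute name.
from argilla._exceptions._api import SomeApiError

print(type(SomeApiError).__name__)
```

Because the import system looks up the full dotted name in `sys.modules` first, each dotted path that is imported directly needs its own entry, which is consistent with the list naming `argilla._exceptions._api` explicitly.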
2 changes: 1 addition & 1 deletion scripts/install-zenml-dev.sh
@@ -36,7 +36,7 @@ install_integrations() {
# figure out the python version
python_version=$(python -c "import sys; print('.'.join(map(str, sys.version_info[:2])))")

-ignore_integrations="feast label_studio bentoml seldon pycaret skypilot_aws skypilot_gcp skypilot_azure pigeon prodigy"
+ignore_integrations="feast label_studio bentoml seldon pycaret skypilot_aws skypilot_gcp skypilot_azure pigeon prodigy argilla"

# Ignore tensorflow and deepchecks only on Python 3.12
if [ "$python_version" = "3.12" ]; then
2 changes: 1 addition & 1 deletion src/zenml/integrations/argilla/__init__.py
@@ -26,7 +26,7 @@ class ArgillaIntegration(Integration):

NAME = ARGILLA
REQUIREMENTS = [
-"argilla>=1.20.0,<2",
+"argilla>=2.0.0",
]

@classmethod
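
The requirement in `ArgillaIntegration` moves from `argilla>=1.20.0,<2` to `argilla>=2.0.0`. The two ranges do not overlap, so any 1.x install stops qualifying. The Python sketch below checks that, assuming plain `X.Y.Z` release strings. It simplifies real version-specifier matching (no pre-releases or local versions), the helper names are hypothetical, and the sample versions are arbitrary.

```python
def parse_release(text):
    """Turn 'X.Y.Z' into a 3-tuple of ints, padding missing parts with 0."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def satisfies_old_pin(version):
    """Old requirement from the diff: argilla>=1.20.0,<2"""
    return (1, 20, 0) <= parse_release(version) < (2, 0, 0)


def satisfies_new_pin(version):
    """New requirement from the diff: argilla>=2.0.0"""
    return parse_release(version) >= (2, 0, 0)


# Print, for each sample version, whether it meets the old and the new pin.
for candidate in ["1.19.0", "1.29.1", "2.0.0", "2.3"]:
    print(candidate, satisfies_old_pin(candidate), satisfies_new_pin(candidate))
```

For anything beyond simple release numbers, a dedicated specifier parser is the safer choice than tuple comparison.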