Question: How can we efficiently retrieve existing annotation data by searching based on key and value? #32

tenzin3 · 2024-10-08T04:22:44Z

# If the annotation data already exists, reuse it; otherwise create a new entry with a new ID.
prepared_ann_data = []
for k, v in ann_data.items():
    try:
        ann_datas = list(ann_store.data(set=ann_dataset.id(), key=k, value=v))
        prepared_ann_data.append(ann_datas[0])
    except:  # noqa
        prepared_ann_data.append(
            {"id": get_uuid(), "set": ann_dataset.id(), "key": k, "value": v}
        )

ann_store.annotate(target=text_selector, data=prepared_ann_data, id=get_uuid())

In ann_data, we have annotation data that we want to associate with an annotation. We aim to avoid creating a new annotation data entry with a new ID if it already exists. If annotation data with the same key and value is already present, we want to link it to the incoming annotation instead of duplicating it. The current code works, but I wanted to know if there's a better solution using the STAM API.

Apparently, if the key doesn't exist in the annotation data set, it throws an error.
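To make the intended get-or-create behaviour concrete, here is a minimal sketch that does not use STAM at all. A plain dictionary stands in for the annotation data set, and `DataIndex` and `get_or_create` are hypothetical names chosen for illustration, not part of the STAM API. It assumes the values are hashable.

```python
import uuid


class DataIndex:
    """Hypothetical in-memory stand-in for an annotation data set (not STAM)."""

    def __init__(self, set_id):
        self.set_id = set_id
        self._by_key_value = {}  # maps (key, value) -> annotation data dict

    def get_or_create(self, key, value):
        """Return the existing entry for (key, value), or create one with a new ID."""
        try:
            return self._by_key_value[(key, value)]
        except KeyError:
            entry = {
                "id": str(uuid.uuid4()),
                "set": self.set_id,
                "key": key,
                "value": value,
            }
            self._by_key_value[(key, value)] = entry
            return entry


index = DataIndex("my_set")
first = index.get_or_create("pos", "noun")
second = index.get_or_create("pos", "noun")  # same key and value: entry is reused
third = index.get_or_create("pos", "verb")   # different value: new entry, new ID
```

Catching only `KeyError` here keeps unrelated failures visible, which a bare `except` would hide.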


proycon · 2024-10-09T09:57:22Z

STAM will already do something similar internally, assigning a new random ID for the annotation data if it is new, and reusing the existing one if not, so you can just pass something like:

ann_store.annotate(target=text_selector, data=[
  {
     "set": ann_dataset.id(), "key": k, "value": v
  },
  {
     "set": ann_dataset.id(), "key": k2, "value": v2
  },
], id=get_uuid())

Note that I omitted the AnnotationData ID here, which means an ID will be assigned automatically. STAM assigns a random 21-character nanoid rather than a UUID, as that takes less space; see https://crates.io/crates/nanoid.
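Applied to the `ann_data` dictionary from the question, the list of ID-less entries could be built in one step. This is only a sketch: `build_data_payload` is a hypothetical helper, and the set ID and key/value pairs are made-up sample values.

```python
def build_data_payload(set_id, ann_data):
    """Build annotation data dicts without explicit IDs from a key/value mapping."""
    return [{"set": set_id, "key": k, "value": v} for k, v in ann_data.items()]


payload = build_data_payload("my_set", {"pos": "noun", "lang": "bo"})

# The result would then be passed in the same way as in the call above, e.g.:
# ann_store.annotate(target=text_selector, data=payload, id=get_uuid())
```

Since no `"id"` field is present, assigning or reusing the ID is left to the store, as described above.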

If you really do want to assign the AnnotationData ID explicitly, then the method you used is okay, but the lookup inside the try block can be made slightly faster:

prepared_ann_data.append(next(ann_store.data(set=ann_dataset.id(), key=k, value=v, limit=1)))
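The gain comes from not materialising every match just to take the first one. The sketch below illustrates this with a hypothetical generator, `search`, standing in for a lazy lookup; it is not the STAM implementation, and the `produced` list exists only to count how many results were generated.

```python
def search(items, key, value, produced):
    """Hypothetical lazy search; records each yielded match in `produced`."""
    for item in items:
        if item["key"] == key and item["value"] == value:
            produced.append(item)
            yield item


items = [{"key": "pos", "value": "noun", "n": i} for i in range(1000)]

eager_log = []
first_eager = list(search(items, "pos", "noun", eager_log))[0]  # builds all 1000 matches

lazy_log = []
first_lazy = next(search(items, "pos", "noun", lazy_log))  # stops after the first match

# A default avoids StopIteration when the iterator is empty.
missing = next(search(items, "pos", "verb", []), None)
```

The default argument to `next` only covers an empty result. If the lookup call itself raises for an unknown key, as reported in the question, that error still needs to be handled by the surrounding `try` block.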

proycon self-assigned this Oct 9, 2024

proycon added the question Further information is requested label Oct 9, 2024
