feat: Export brush label masks with matching file base name #254

PhillipRDI · 2024-06-21T20:23:34Z

…nd masks with base filenames that match images

Previously, the brush data in the exported ZIP file was effectively unusable: the mask filenames gave no way to determine which mask corresponded to which original uploaded image.

The proposed change exports two folders: images and masks.
The images folder contains the original images, with the filename changes applied by Label Studio. The masks folder contains masks named IMAGEBASE-CLASS-SEQUENCE.png, where the original image is IMAGEBASE.IMAGEEXTENSION.
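
As an illustration of the naming scheme, here is a minimal sketch. The helper name `mask_filename` is hypothetical and is not part of the SDK; it only shows how a mask name can be derived from an exported image name.

```python
import os


def mask_filename(image_name: str, label: str, sequence: int) -> str:
    """Build IMAGEBASE-CLASS-SEQUENCE.png from an exported image name.

    Hypothetical helper for illustration only; not an SDK function.
    """
    # Drop only the final extension so IMAGEBASE matches the image file.
    image_base, _extension = os.path.splitext(image_name)
    return f"{image_base}-{label}-{sequence}.png"


print(mask_filename("c2825893-NathonFamous.png", "Hotdog", 0))
# c2825893-NathonFamous-Hotdog-0.png
```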

You could potentially work around this issue with multiple exports: (1) Brush Labels to PNG, (2) JSON, and (3) YOLO. You would have to copy the images from the YOLO export and then write a JSON parser to work out which original image maps to which annotation in the JSON file. This seemed overly complicated and inefficient, since the JSON already contains the encoded masks.
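
For a sense of what the JSON-parsing step of that workaround involves, here is a rough sketch. The structure of `export` (tasks with `data`, `annotations`, and `result` entries) is an assumption about the JSON export layout, trimmed to the fields used here, and `brush_results_by_image` is a hypothetical helper.

```python
import os

# Assumed, heavily trimmed shape of a JSON export (illustrative data only).
export = [
    {
        "id": 1,
        "data": {"image": "/data/upload/2/c2825893-NathonFamous.png"},
        "annotations": [
            {
                "result": [
                    {"type": "brushlabels", "from_name": "tag",
                     "value": {"brushlabels": ["Hotdog"]}},
                    {"type": "rectanglelabels", "from_name": "tag3",
                     "value": {"rectanglelabels": ["Hotdog"]}},
                ]
            }
        ],
    }
]


def brush_results_by_image(tasks, image_key="image"):
    """Map each image file name to the label lists of its brush results."""
    mapping = {}
    for task in tasks:
        image_name = os.path.basename(task["data"][image_key])
        for annotation in task.get("annotations", []):
            for result in annotation.get("result", []):
                # Keep brush results only; other region types are skipped.
                if result.get("type") == "brushlabels":
                    labels = result["value"]["brushlabels"]
                    mapping.setdefault(image_name, []).append(labels)
    return mapping


print(brush_results_by_image(export))
```

Even with this mapping in hand, the masks would still need to be decoded and matched to the PNG export, which is the extra work the proposed change avoids.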

Added a fix to the brush label export feature so that filenames with multiple periods are properly exported.
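
To show why multiple periods matter, the sketch below compares three ways of deriving the base name. `base_naive` is a hypothetical stand-in for a first-period split and is not taken from the PR; `base_join` mirrors the expression shown in the diff; `base_splitext` is a standard-library alternative offered only as a suggestion.

```python
import os


def base_naive(image_name: str) -> str:
    # Hypothetical: keeps only the text before the FIRST period.
    return image_name.split(".")[0]


def base_join(image_name: str) -> str:
    # Mirrors the diff: drop the last period-separated part.
    return ".".join(image_name.split(".")[0:-1])


def base_splitext(image_name: str) -> str:
    # Standard-library alternative: strip only the final extension.
    return os.path.splitext(image_name)[0]


name = "scan.v2.final.png"
print(base_naive(name))     # scan
print(base_join(name))      # scan.v2.final
print(base_splitext(name))  # scan.v2.final
```

One difference worth noting: for a name with no period at all, `base_join` returns an empty string, while `os.path.splitext` returns the name unchanged.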

makseq · 2024-08-05T22:22:33Z

src/label_studio_sdk/converter/brush.py

@@ -116,7 +116,7 @@ def decode_from_annotation(from_name, results):
        width = result["original_width"]
        height = result["original_height"]
        labels = result[key] if key in result else ["no_label"]
-        name = from_name + "-" + "-".join(labels)


If you have multiple brushlabel control tags, your brushlabels will contain only one of them.

I included an example in the next comment, using the attached labeling interface (XML below). I was able to define two brushlabel classes, and the exported masks contained both classes. The first 3 exported image masks were of type "Hotdog" and the last 4 were of type "Not Hotdog". I annotated with keypoints, rectangles, and brush labels, and all were exported as masks with the proper class name.

I'm not sure whether this is what you mean by multiple brushlabel control tags, though.

<View>
  <Image name="image" value="$image" zoom="true"/>
  <Header value="Keypoint Labels"/>
  <KeyPointLabels name="tag2" toName="image" smart="true">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </KeyPointLabels>
  <Header value="Rectangle Labels"/>
  <RectangleLabels name="tag3" toName="image" smart="true">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </RectangleLabels>
  <Header value="Brush Labels"/>
  <BrushLabels name="tag" toName="image">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </BrushLabels>
</View>

makseq · 2024-08-05T22:23:26Z

src/label_studio_sdk/converter/brush.py

-        x for x in email if x.isalnum() or x == "@" or x == "."
-    )  # sanitize filename
+    layers = decode_from_annotation(results)
+    image_base = ".".join(image_name.split('.')[0:-1])


If you have multiple annotations per task, only the last one will be exported; the others will be overwritten.

I have used this PR's version of label-studio-sdk to label many images that contain multiple annotations per task, and it has worked across hundreds of images. By the way, the Segment Anything integration with Label Studio is a game changer for labeling images efficiently in a production environment.

To verify that merges haven't broken anything, I pulled the latest label-studio and label-studio-sdk from the PR repo and confirmed that the functionality is preserved with all updates. I created a simple project for the demonstration to avoid sharing any proprietary images (see attached). The "decode_from_annotation" function keeps a sequence counter in "counters[name]"; at the end of the function the counter value is appended to the filename and then incremented. This produces a unique filename for each of the multiple annotations in a single task.

In the attached example, all images are stored in a separate folder. If you look at the filename for the example image it starts with a base of "c2825893-NathonFamous". Every mask of class "Hotdog" will then be named according to "c2825893-NathonFamous-Hotdog-X.png" where X is the sequence number.

Example export image in the "images" folder of an export:

Image "c2825893-NathonFamous.png":

Example export masks in the "masks" folder of an export (each mask filename has a unique sequence name per class):

Mask Image "c2825893-NathonFamous-Hotdog-0.png":

Mask Image "c2825893-NathonFamous-Hotdog-1.png":

Mask Image "c2825893-NathonFamous-Hotdog-2.png":

Mask Image "c2825893-NathonFamous-Not Hotdog-0.png":

Mask Image "c2825893-NathonFamous-Not Hotdog-1.png":

Mask Image "c2825893-NathonFamous-Not Hotdog-2.png":

Mask Image "c2825893-NathonFamous-Not Hotdog-3.png":
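
The filenames listed above are consistent with a per-name sequence counter. The following is a minimal sketch of that idea, not the actual body of "decode_from_annotation"; `assign_mask_names` is a hypothetical helper, and keeping one counter dictionary per image is an assumption.

```python
from collections import defaultdict


def assign_mask_names(image_base, labels):
    """Give each mask a unique name of the form IMAGEBASE-CLASS-SEQUENCE.png."""
    counters = defaultdict(int)  # one running counter per "base-class" name
    names = []
    for label in labels:
        name = f"{image_base}-{label}"
        names.append(f"{name}-{counters[name]}.png")
        counters[name] += 1  # the next mask of this class gets the next number
    return names


for mask_name in assign_mask_names(
    "c2825893-NathonFamous", ["Hotdog", "Hotdog", "Not Hotdog", "Hotdog"]
):
    print(mask_name)
```

Names stay unique only while the counter is shared; if it were reset for every annotation of the same task, later masks could reuse earlier filenames.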

makseq · 2024-08-05T22:24:00Z

src/label_studio_sdk/converter/brush.py

-            item["annotation_id"],
-            item["completed_by"],
-            from_name,
+            os.path.basename(item["input"]["image"]),


What if the image field has a different name in task.data?

Sorry, I'm not following what you mean. When I inspect the dictionary in item["input"], I get the following:
item["input"] = {'image': '/data/upload/2/c2825893-NathonFamous.png'}

Is there a better place to pull the image name from? I'm having trouble tracing back where the input data is defined.
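
One possible reading of the question is that the key in the task data follows the Image tag's value attribute in the labeling config (value="$image" in the XML above), so a project could use a key other than "image". Under that assumption, the key could be looked up from the config instead of being hardcoded. The helper `image_data_key` and the key "photo_url" are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def image_data_key(label_config):
    """Return the task-data key referenced by the first <Image> tag, if any."""
    root = ET.fromstring(label_config)
    image_tag = root.find(".//Image")
    if image_tag is None:
        return None
    value = image_tag.get("value", "")
    # value="$image" refers to the "image" key of the task data.
    return value[1:] if value.startswith("$") else None


# Hypothetical config whose image field is not named "image".
config = '<View><Image name="img" value="$photo_url"/></View>'
task_data = {"photo_url": "/data/upload/2/c2825893-NathonFamous.png"}

key = image_data_key(config)
print(key)                               # photo_url
print(os.path.basename(task_data[key]))  # c2825893-NathonFamous.png
```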

makseq · 2024-08-05T22:24:42Z

src/label_studio_sdk/converter/converter.py

@@ -735,6 +743,90 @@ def add_image(images, width, height, image_id, image_path):
                indent=2,
            )

+    def convert_to_brush(


We have a separate file for brush functions - brush.py - please move it there.

Moved convert_to_brush to brush.py in the PR repo in commit bd7bbba.

github-actions bot added the title needs formatting label Jun 21, 2024

PhillipRDI changed the title ~~Exporting brush labels in both PNG and Numpy formats exports images a…~~ feat: Export brush label masks with matching file base name Jun 23, 2024

github-actions bot added feat and removed title needs formatting labels Jun 23, 2024

PhillipRDI added 3 commits June 27, 2024 08:41

Merge branch 'HumanSignal:master' into master

53ac094

feat: Export brush labels with matching file base name (updated)

48693c4

Added a fix to the brush label export feature so that filenames with multiple periods are properly exported.