feat: Export brush label masks with matching file base name #254

Open
wants to merge 8 commits into
base: master

PhillipRDI

…nd masks with base filenames that match images

Previously, the brush data in the exported ZIP file was effectively unusable: the mask filenames gave no way to determine which mask corresponded to which uploaded image.

The proposed change exports two folders: images and masks.
The images folder contains the original images, with the filename changes applied by Label Studio. The masks folder contains masks named IMAGEBASE-CLASS-SEQUENCE.png, where the original image is IMAGEBASE.IMAGEXTENSION.
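A minimal sketch of the naming scheme described above, and of how a consumer of the export might map a mask back to its image. The helper names `mask_filename` and `match_mask_to_image` are hypothetical and are not part of the PR; using `os.path.splitext` to strip the extension is also an assumption made for this sketch.

```python
import os


def mask_filename(image_name, label, sequence):
    """Build IMAGEBASE-CLASS-SEQUENCE.png from an image filename (hypothetical helper)."""
    image_base, _ext = os.path.splitext(image_name)
    return f"{image_base}-{label}-{sequence}.png"


def match_mask_to_image(mask_name, image_names):
    """Return (image_name, label, sequence) for a mask, or None if nothing matches.

    The image base may itself contain hyphens, so the mask name is matched
    against the known image bases instead of being split on "-".
    """
    mask_stem, _ext = os.path.splitext(mask_name)
    # Try the longest base first so that "a-b" is preferred over "a".
    by_base_length = sorted(
        image_names, key=lambda n: len(os.path.splitext(n)[0]), reverse=True
    )
    for image_name in by_base_length:
        prefix = os.path.splitext(image_name)[0] + "-"
        if not mask_stem.startswith(prefix):
            continue
        parts = mask_stem[len(prefix):].rsplit("-", 1)
        if len(parts) == 2 and parts[1].isdigit():
            return image_name, parts[0], int(parts[1])
    return None


if __name__ == "__main__":
    name = mask_filename("c2825893-NathonFamous.png", "Not Hotdog", 3)
    print(name)
    print(match_mask_to_image(name, ["c2825893-NathonFamous.png"]))
```

Matching against the list of exported image names avoids guessing where the base ends, which matters because both the base and the class name can contain hyphens or spaces.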

You could work around this with multiple exports: (1) Brush Labels to PNG, (2) JSON, and (3) YOLO. You would have to copy the images from the YOLO export and then write a JSON parser to work out which original image maps to which annotation in the JSON file. This seemed overly complicated and inefficient, since the JSON already contains the encoded masks.
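To give a sense of what the JSON-parser step in that workaround involves, here is a rough sketch. The shape of `SAMPLE_EXPORT` (a list of tasks with `data`, `annotations`, and `result` entries) is an assumption about the JSON export, the `rle` values are placeholders, and `brush_regions_by_image` is a hypothetical helper; the sketch only pairs images with brush regions and does not decode the masks.

```python
import json
import os

# Assumed, heavily trimmed shape of a JSON export; the rle values are placeholders.
SAMPLE_EXPORT = """
[
  {
    "id": 1,
    "data": {"image": "/data/upload/2/c2825893-NathonFamous.png"},
    "annotations": [
      {
        "id": 11,
        "result": [
          {"type": "brushlabels", "from_name": "tag",
           "value": {"format": "rle", "rle": [0, 1, 2], "brushlabels": ["Hotdog"]}},
          {"type": "rectanglelabels", "from_name": "tag3",
           "value": {"rectanglelabels": ["Hotdog"]}}
        ]
      },
      {
        "id": 12,
        "result": [
          {"type": "brushlabels", "from_name": "tag",
           "value": {"format": "rle", "rle": [0, 1, 2], "brushlabels": ["Not Hotdog"]}}
        ]
      }
    ]
  }
]
"""


def brush_regions_by_image(tasks, image_key="image"):
    """Map each image basename to a list of (annotation_id, labels) for brush regions."""
    mapping = {}
    for task in tasks:
        image_name = os.path.basename(task["data"][image_key])
        for annotation in task.get("annotations", []):
            for region in annotation.get("result", []):
                if region.get("type") != "brushlabels":
                    continue  # skip rectangles, keypoints, and so on
                labels = region["value"].get("brushlabels", ["no_label"])
                mapping.setdefault(image_name, []).append((annotation["id"], labels))
    return mapping


if __name__ == "__main__":
    print(brush_regions_by_image(json.loads(SAMPLE_EXPORT)))
```

Even in this reduced form, the user still has to decode each mask and fetch the images from a separate export, which is the overhead the PR aims to remove.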

@PhillipRDI PhillipRDI changed the title Exporting brush labels in both PNG and Numpy formats exports images a… feat: Export brush label masks with matching file base name Jun 23, 2024
@@ -116,7 +116,7 @@ def decode_from_annotation(from_name, results):
width = result["original_width"]
height = result["original_height"]
labels = result[key] if key in result else ["no_label"]
name = from_name + "-" + "-".join(labels)
Member

If you have multiple BrushLabels control tags, your brushlabels will contain only one of them.

Author

@PhillipRDI PhillipRDI Aug 6, 2024

I included an example in the next comment, using the labeling interface attached below (XML). I defined two brush label classes, and the exported masks contained both. The first 3 exported masks were of type "Hotdog" and the last 4 were of type "Not Hotdog". I've annotated with keypoints, rectangles, and brush labels, and all were exported as masks with the proper class name.

I'm not sure whether this is what you mean by multiple brushlabel control tags, though.

<View>
  <Image name="image" value="$image" zoom="true"/>
  <Header value="Keypoint Labels"/>
  <KeyPointLabels name="tag2" toName="image" smart="true">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </KeyPointLabels>
  <Header value="Rectangle Labels"/>
  <RectangleLabels name="tag3" toName="image" smart="true">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </RectangleLabels>
  <Header value="Brush Labels"/>
  <BrushLabels name="tag" toName="image">
    <Label value="Not Hotdog" smart="true" background="#00FF00" showInline="true"/>
    <Label value="Hotdog" smart="true" background="#FF0000" showInline="true"/>
  </BrushLabels>
</View>

x for x in email if x.isalnum() or x == "@" or x == "."
) # sanitize filename
layers = decode_from_annotation(results)
image_base = ".".join(image_name.split('.')[0:-1])
Member

If you have multiple annotations per task, only the last one will be exported; the others will be overwritten.

Author

@PhillipRDI PhillipRDI Aug 6, 2024

I have used the label-studio-sdk PR to label many images with multiple annotations per task, and it has worked across hundreds of images. By the way, the Segment Anything integration with Label Studio is a game changer for efficiently labeling images in a production environment.

To verify that merges haven't broken anything, I pulled the latest label-studio and label-studio-sdk from the PR repo and confirmed that the functionality is preserved with all updates. I created a simple project for the demonstration to avoid sharing proprietary images (see attached). The "decode_from_annotation" function includes a sequence counter, "counters[name]", that is incremented and appended to the filename at the end of the function. This creates a unique filename for each of multiple annotations in a single task.

In the attached example, all images are stored in a separate folder. The filename of the example image has the base "c2825893-NathonFamous". Every mask of class "Hotdog" is then named "c2825893-NathonFamous-Hotdog-X.png", where X is the sequence number.
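The per-name sequence counter can be sketched roughly as follows. This is an illustration of the idea only, not the code in "decode_from_annotation"; the function `unique_mask_names` and its arguments are hypothetical.

```python
from collections import defaultdict


def unique_mask_names(image_base, region_labels):
    """Return one mask filename per region, numbering repeats of the same name.

    Hypothetical illustration: `region_labels` holds the class of each brush
    region drawn on one image, in the order the regions are exported.
    """
    counters = defaultdict(int)  # name -> next sequence number
    filenames = []
    for label in region_labels:
        name = f"{image_base}-{label}"
        filenames.append(f"{name}-{counters[name]}.png")
        counters[name] += 1  # the next region with this name gets a new suffix
    return filenames


if __name__ == "__main__":
    for filename in unique_mask_names(
        "c2825893-NathonFamous", ["Hotdog", "Hotdog", "Not Hotdog", "Hotdog"]
    ):
        print(filename)
```

In this sketch a single counter dictionary covers every region belonging to one image. If the counter were instead reset for each annotation, two annotations on the same image could produce the same filename, which is the overwrite scenario raised in the review.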

Example export image in the "images" folder of an export:

Image "c2825893-NathonFamous.png"

Example export masks in the "masks" folder of an export (each mask filename has a unique sequence number per class):

Mask Image "c2825893-NathonFamous-Hotdog-0.png"
Mask Image "c2825893-NathonFamous-Hotdog-1.png"
Mask Image "c2825893-NathonFamous-Hotdog-2.png"
Mask Image "c2825893-NathonFamous-Not Hotdog-0.png"
Mask Image "c2825893-NathonFamous-Not Hotdog-1.png"
Mask Image "c2825893-NathonFamous-Not Hotdog-2.png"
Mask Image "c2825893-NathonFamous-Not Hotdog-3.png"

item["annotation_id"],
item["completed_by"],
from_name,
os.path.basename(item["input"]["image"]),
Member

What if the image field has a different name in task.data?

Author

Sorry, I'm not following what you mean. When I inspect the dictionary in item["input"], I get the following:
item["input"] = {'image': '/data/upload/2/c2825893-NathonFamous.png'}

Is there a better place to pull the image name from? I'm having trouble tracing back where the input data is defined.
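One possible way to avoid hard-coding the "image" key is to read the data field name from the labeling config, where the Image tag declares it as value="$image". The sketch below is only an assumption about how that could be done; `image_data_keys` is a hypothetical helper and not an existing converter function.

```python
import os
import xml.etree.ElementTree as ET


def image_data_keys(label_config_xml):
    """Return the task data keys referenced by <Image value="$key"> tags."""
    root = ET.fromstring(label_config_xml)
    keys = []
    for tag in root.iter("Image"):
        value = tag.get("value", "")
        if value.startswith("$"):
            keys.append(value[1:])  # "$image" -> "image"
    return keys


if __name__ == "__main__":
    config = '<View><Image name="image" value="$photo_url" zoom="true"/></View>'
    # Same shape as item["input"] in the comment above, with a different key.
    item_input = {"photo_url": "/data/upload/2/c2825893-NathonFamous.png"}
    for key in image_data_keys(config):
        print(key, os.path.basename(item_input[key]))
```

With a lookup like this, a project whose image column is named, say, photo_url would still resolve the filename, whereas item["input"]["image"] would raise a KeyError.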

@@ -735,6 +743,90 @@ def add_image(images, width, height, image_id, image_path):
indent=2,
)

def convert_to_brush(
Member

We have a separate file for brush functions - brush.py - please move it there.

Author

Moved convert_to_brush to brush.py in the PR repo with commit bd7bbba.

2 participants