
how to prepare the ground truth with partial labels in training set #20

JunMa11 opened this issue Aug 7, 2022 · 3 comments


JunMa11 commented Aug 7, 2022

Dear @silvandeleemput ,

Thanks for the awesome work.

I want to train a three-organ segmentation model on the following partially labeled datasets:

  • dataset 1: organ 1 and organ 2 are labeled
  • dataset 2: organ 2 and organ 3 are labeled
  • dataset 3: organ 3 is labeled

The documentation mentions that:

Unlabeled segmentation data should be marked with a value of -1, inside the labels/ ground truth segmentation maps.

Does it mean that all the background voxels should be marked with the value -1?

For example, in dataset 1, I would keep the organ 1 and organ 2 labels and set all the remaining voxels to -1. Is that right?

@silvandeleemput silvandeleemput added the question Further information is requested label Aug 18, 2022
@silvandeleemput silvandeleemput self-assigned this Aug 18, 2022
@silvandeleemput (Member)

Hi, @JunMa11. I am currently on vacation until the beginning of October, so don't expect quick replies.

Regarding your question: it depends on your data. In general, it is best to use as much labeled training data as possible, even with the partially labeled trainer. If you have background voxels that you know are unambiguously background (in your case, these might be non-organ voxels), set them to 0. If there are unlabeled voxels that might be background but could also be some other organ, consider setting them to -1.
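As a toy illustration of this rule (the array and its values are made up for this thread, not taken from nnU-Net), a ground-truth map for dataset 1, where only organs 1 and 2 are annotated, might look like this:

```python
import numpy as np

# Toy 1D "volume" for dataset 1: organs 1 and 2 are annotated, organ 3 is not.
#  0 = certainly background (e.g. air outside the patient)
#  1 = organ 1, 2 = organ 2 (trusted annotations)
# -1 = unknown: could be background, could be the unannotated organ 3
gt = np.array([0, 0, 1, 1, 2, 2, -1, -1, 0], dtype=np.int8)
```

The trainer can then ignore the -1 voxels in the loss while still learning from the certain 0 / 1 / 2 voxels.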

To help you further I would need to know a little bit more about the datasets you want to use:

  • What kind of organs are you interested in?
  • How much (in percentage) of the datasets are annotated/labeled?
  • Do organs 1, 2 & 3 appear in all the datasets, but are they just labeled in some and not in others?
  • Do you have voxels in your datasets that are really not organ tissue and should be labeled as other/background?

If you can answer these questions I can better help you set up the trainer and the training data.


JunMa11 commented Jan 17, 2023

Hi @silvandeleemput

Happy new year! Hope you enjoyed the holidays:)

I'm extremely sorry for the late response. My partial-label learning project was suspended for the past few months, but it has now restarted.

Here are answers to your questions:

  • What kind of organs are you interested in?

I'm working on head organ segmentation (34 organs) and I have a partially labeled dataset with 2000 3D CT images.

  • How much (in percentage) of the datasets are annotated/labeled?

The number of annotations for each organ ranges from 50 to 1500.

  • Do organs 1, 2 & 3 appear in all the datasets, but are they just labeled in some and not in others?

All organs appear in the images, but only a subset of the images have annotations for each organ.

  • Do you have voxels in your datasets that are really not organ tissue and should be labeled as other/background?

I think the zero-image-intensity regions can be labeled as background.

During my previous experiments, I set all non-labeled voxels to -1. The training went well, but the segmentation results were poor. Based on your guidance, for the new experiments I should set the zero-image-intensity voxels to 0 and only set the unlabeled non-zero-intensity regions to -1. Am I right?

Any comments are highly appreciated:)

@silvandeleemput (Member)

Given the information you provided, you might want to try one of the following trainers:

  • nnUNetTrainerV2Sparse (I would recommend trying this one first since, if I understood correctly, the dataset annotations can be quite sparse in some cases)
  • nnUNetTrainerV2SparseNormalSampling

During my previous experiments, I set all non-labeled voxels to -1. The training went well, but the segmentation results were poor. Based on your guidance, for the new experiments I should set the zero-image-intensity voxels to 0 and only set the unlabeled non-zero-intensity regions to -1. Am I right?

That's correct. If you know a voxel is background, label it 0; if it is non-annotated tissue of an unknown class, set it to -1. The less uncertainty, the better training should work.
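That recipe could be sketched as a small helper (the function name and the zero-intensity background criterion are assumptions based on this thread, not part of the nnU-Net API):

```python
import numpy as np

def prepare_partial_labels(image, organ_mask):
    """Hypothetical helper: build a ground-truth map for the partial-label trainer.

    image      -- CT volume (float array)
    organ_mask -- integer map with the annotated organs (1..K), 0 elsewhere
    """
    # Start from "unknown" everywhere.
    gt = np.full(image.shape, -1, dtype=np.int16)
    # Zero-intensity voxels are assumed to be certain background.
    gt[image == 0] = 0
    # Trusted organ annotations override everything else.
    annotated = organ_mask > 0
    gt[annotated] = organ_mask[annotated]
    return gt
```

Note that the organ annotations are written last, so an annotated voxel keeps its label even if its intensity happens to be zero.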
