-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Method to fill holes in segmentations #2678
Comments
Hi @Spenhouet , Thanks for your sharing! I think that's an interesting idea. Thanks. |
Hi @Nic-Ma, my suggestion for the transformation class The issue with the morphology implementations in scipy is that they all only work for binary data. |
Thanks for your analysis. Thanks. |
Okay, I see a way this could work:
But this might be even slower than my implementation suggested above. @Nic-Ma Did you have another implementation with I can open a pull request with my suggested implementation but as I stated it is unacceptably slow. I did really hope that someone already done something like this or is way more knowledgeable than me and knows how to implemented this with much better performance. |
looks like this approach is promising https://stackoverflow.com/a/66496234, would be nice to have a 3D version of it in MONAI with pytorch cuda extensions... or if you are looking for offline preprocessing, the gpu-based apis might be helpful |
@wyli Good catch. I take back my statement with respect to kornia. I did not work with it before and just now tried their dilation and erosion implementations and it seems they can not handle 3D data. I did yet only work with the binary_dilation and binary_erosion functions of scipy which have the limitation of not working for multiple classes.
Yes, we are. Thanks for the hint. I will take a look. I'm happy to provide a PR for this feature but I would like to implement something that is performant first. |
Okay, I'm still stuck on the The I will also take a look at the other implementations @wyli linked. |
Hi @Spenhouet , In my simple mind, I think maybe we can:
Here is how we filter out the largest connected component region: Thanks. |
I checked out the other linked implementations and they also only seem to work for binary data. One could implement the same as I did in my first post with a GPU accelerated version but I did hope that this could actually be implemented more efficient (not just faster via hardware resources). I could not completely follow the Given the following simple 2D binary image: [ 1, 1, 1 ]
[ 1, 0, 1 ]
[ 1, 1, 1 ] The mask is: [ 0, 0, 0 ]
[ 0, 1, 0 ]
[ 0, 0, 0 ] The image which will be dilated with a border is: [ 1, 1, 1, 1, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 1, 1, 1, 1 ] This now repeats a dilation step growing the border area. The mask is used to stop the growth. This stops when there is no more growth. Then it takes the dilation mask to define the final shape. In the binary case I guess that is very similar to the idea of using the What do you mean with step 3. and 4.? Maybe I'm not yet fully getting your idea. The remove_small_holes implementation @wyli linked to might be similar to what you are proposing? This also works on binary images.
This is still not without issues. For example the following would work with the edge based growing approach but not with the largest connected component approach: [ 1, 1, 1, 1, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 0, 0, 0, 1 ]
[ 1, 1, 1, 1, 1 ] The larges connected component here is actually the hole to fill. So except for further understanding the existing / proposed solutions, I still have no clue if this could be implemented in a non binary way. For now I might just open a PR for my original implementation. |
Hi @Spenhouet , Actually, I didn't get a chance to think clearly about how to remove the Thanks. |
There's ways of implementing the morphology operators but the solution mentioned relies on unfold which is 2D only which is the same problem Kornia has with its operators. We would need 3D specific definitions of dilation, erosion, etc. to implement at a hole filling operation. |
I did already look into the methods posted by @wyli but sadly they do not solve the problem. We want to fill all holes which are completely outlined by a single class which is not the background. The connected component solution as provided by skimage.morphology.area_closing does not take open shapes (holes which are not completely enclosed by a single class) into account (@Nic-Ma maybe this is what you refer to as open holes?). This solution also does not work for arbitrary hole sizes. We could theoretically find all connected components from the background class and then determine all neighbors per component so filter for all connected components which only have a single class as neighbor excluding background. @ericspod The binary_dilation and binary_erosion implementation of scipy does work for ndimages. The only issue is that we need to perform this per class (binary) and thereby multiply the execution time with the number of classes. This is currently the only implementation which actually works correct. It is just slow. |
@Nic-Ma I now implemented a correct version with connected component labeling: def fill_holes8(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
connectivity = 1
footprint = generate_binary_structure(img.ndim, connectivity)
background_mask = img == 0
components, num_components = label(background_mask, structure=footprint)
filled_holes = np.zeros_like(img)
for component_label in range(1, num_components+1):
component_mask = components == component_label
# Pad with -1 to detect edge voxels
component_neighborhood = np.pad(img, 1, constant_values=-1)[binary_dilation(np.pad(component_mask, 1), structure=footprint)]
neighbor_labels = np.unique(component_neighborhood)
if len(neighbor_labels) == 2 and -1 not in neighbor_labels:
neighbor_label = neighbor_labels[1]
filled_holes[component_mask] = neighbor_label
return img + filled_holes At least for my random sampled data this is extremely slower but the performance here depends on the number of background components. In real world data this most likely is way less. This code can probably be optimized much further but I doubt that it will get more performant than fill_holes7. |
Hi @Spenhouet , Sounds great, could you please help contribute a PR? I think it's good to have it as initial version, and may optimize perf later. Thanks. |
@Nic-Ma Which solution would you prefer for a pull request? |
@Spenhouet , I think your solution8 looks good. Thanks. |
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@outlook.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@outlook.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
I would choose whichever correct implementation is fastest for a PR, that would be helpful. This would be quite useful for other operations especially post-processing, but if the speed is poor we may need our own implementation of a 3D unfold or max/min filters. |
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Instead of benchmarking with random data, I did load one of our brain segmentations (256 x 256 x 256) and did run both methods (
So even for real world data, growing the background from the edge for every label is the "faster" solution (less slow). There is also the option to go into one more dimension by one-hot encoding but this will blow up memory usage too much. That is why I did not consider that solution. I did now implement the PR with the |
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
* Add transform to fill holes and to filter (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * Change name of label filter class (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * Change fill holes to growing logic (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * Fix missing entry in min_tests (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * Fix review comments (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * Remove batch dim and add one-hot handling (#2678) Signed-off-by: Sebastian Penhouet <sebastian.penhouet@airamed.de> * [MONAI] python code formatting Signed-off-by: monai-bot <monai.miccai2019@gmail.com> Co-authored-by: Sebastian Penhouet <sebastian.penhouet@airamed.de>
Is your feature request related to a problem? Please describe.
We have 3D segmentation masks. Every segmented shape is not supposed to have holes within its borders.
Any wholes might be considered potential artifacts.
Especially for our training data we would like to close such holes.
For an example, the following matrix:
Should result in
The only filled holes are the 1 and 3 in the middle slice.
The 2 shape is open to the side and the 4 is open to the back.
The 0 between the classes should stay untouched.
Describe the solution you'd like
MONAI is missing a "fill_holes" function / transformation.
scipy has an implementation which works for 3D data but only for binary images (scipy.ndimage.morphology.binary_fill_holes). This requires an iteration over all labels of the image which makes this already slow method unacceptably slow.
I wish that MONAI would implement such a function.
Describe alternatives you've considered
I opened this feature request on the scipy project: scipy/scipy#14504
And this stackoverflow question: https://stackoverflow.com/questions/68608749/performant-way-to-fill-holes-for-categorical-data
I implemented 7 versions using the existing scipy.ndimage.morphology.binary_fill_holes function (or its implementation) and
numpy
. Here the two best versions so far:In MONAI this could be implemented something like this:
I measured the performance the following way (matching my real world data distribution):
For my first implementations I got the following execution times (t=100):
This is very slow.
The last implementation fill_holes7 is only 1.27 times faster than fill_holes3.
I really hope there is a more performant way of doing this.
The text was updated successfully, but these errors were encountered: