Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DO NMS before or after keypoints are dectected, what's the differences? #112

Open
Vincentqyw opened this issue Apr 1, 2022 · 7 comments

Comments

@Vincentqyw
Copy link

Hi, I notice that there are slightly differences between Magicleap SuperPoint implementation by Daniel and this repo.
In this repo, nms (defined by a max_pooling kernel) is performed on heatmap/scores matrix (https://github.com/magicleap/SuperGluePretrainedNetwork/blob/master/models/superpoint.py#L167), but in Daniel's version, nms is perform on keypoints locations(https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/demo_superpoint.py#L258). I can't figure out why you make this revision and what's the differences of the final keypoint localtion? Thanks for your reply:)
@skydes

@sarlinpe
Copy link
Contributor

sarlinpe commented Apr 1, 2022

Hi @Vincentqyw, good question! The two approaches produce nearly-identical results but the implementation of this repo:

  • is much faster: it can run on GPU (it works on a torch.Tensor) and has near-constant time (instead of iterating over the keypoints)
  • can be batched if images have the same size.

The whole forward pass of SuperPoint is actually batched, so we used this implementation of SuperPoint in our training pipeline, detecting & describing data-augmented images on the fly.

@Vincentqyw
Copy link
Author

Excellent! Thx again for your prompt reply!

@javierttgg
Copy link

javierttgg commented Jun 26, 2022

Hi @skydes, if I may ask something related, why should points that are not initially detected as local maxima in max_mask = scores == max_pool(scores) be added in max_mask = max_mask | (new_max_mask & (~supp_mask))?:

zeros = torch.zeros_like(scores)
max_mask = scores == max_pool(scores)
for _ in range(2):
supp_mask = max_pool(max_mask.float()) > 0
supp_scores = torch.where(supp_mask, zeros, scores)
new_max_mask = supp_scores == max_pool(supp_scores)
max_mask = max_mask | (new_max_mask & (~supp_mask))
return torch.where(max_mask, scores, zeros)

I'm struggling to see the intuition behind this decision. I believe this is also a difference with respect to the Daniel's Superpoint implementation.
Thanks in advance! :)

@sarlinpe
Copy link
Contributor

The initialization of L56 is too conservative as it doesn't account for chains - consider a point A that suppresses B, which itself suppresses C. Daniel's original implementation would retain A and C if they are sufficiently far away from each other, while L56 would retain A only. By removing suppressed pixels supp_mask in supp_scores and propagating scores for a few iterations, we recover points like C. This is a very good approximation and much faster than the original iterative implementation.

@javierttgg
Copy link

Understood, thanks a lot @skydes

@javierttgg
Copy link

javierttgg commented Jul 29, 2022

Sorry to bother again @skydes, I tried a little bit the code and I detected a possible undesired result.

This way of performing NMS can produce false positives. See for for instance this 1d example with a nms radius of 2:

image

In this case the false positive is 4.

A more complete example with nms_radius = 4:

image

[Code for reproduction]
import torch
import matplotlib.pyplot as plt


def simple_nms(scores, nms_radius: int):
    """ Fast Non-maximum suppression to remove nearby points """
    assert(nms_radius >= 0)

    def max_pool(x):
        return torch.nn.functional.max_pool2d(
            x, kernel_size=nms_radius*2+1, stride=1, padding=nms_radius)

    zeros = torch.zeros_like(scores)
    max_mask = scores == max_pool(scores)
    for _ in range(2):
        supp_mask = max_pool(max_mask.float()) > 0
        supp_scores = torch.where(supp_mask, zeros, scores)
        new_max_mask = supp_scores == max_pool(supp_scores)
        max_mask = max_mask | (new_max_mask & (~supp_mask))
    return torch.where(max_mask, scores, zeros)


if __name__ == "__main__":
    
    nms_radius = 4
    scores = torch.tensor(
        [[[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [.7, .8, .9, .8, .7, .6, .5, .4, .3, .2, .1],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]]]
        )
    
    scores = torch.tensor(
        [[[[.3, .4, .5, .4, .3, .2, .1, .0, .0, .0, .0],
           [.4, .5, .6, .5, .4, .3, .2, .1, .0, .0, .0],
           [.5, .6, .7, .6, .5, .4, .3, .2, .1, .0, .0],
           [.6, .7, .8, .7, .6, .5, .4, .3, .2, .1, .0],
           [.7, .8, .9, .8, .7, .6, .5, .4, .3, .2, .1],
           [.6, .7, .8, .7, .6, .5, .4, .3, .2, .1, .0],
           [.5, .6, .7, .6, .5, .4, .3, .2, .1, .0, .0],
           [.4, .5, .6, .5, .4, .3, .2, .1, .0, .0, .0],
           [.3, .4, .5, .4, .3, .2, .1, .0, .0, .0, .0]]]]
        )
    
    nms = simple_nms(scores, nms_radius)
    
    fig, axes = plt.subplots(ncols=3)
    titles = ['scores', 'expected', 'actual result']
    
    axes[0].imshow(
        scores.numpy()[0,0], 
        cmap='jet', 
        vmin=0., vmax=0.9)
    
    axes[1].imshow(
        (scores == 0.9).float()[0,0].numpy(), 
        cmap='Greens', 
        vmin=0, vmax=1)
    
    axes[2].imshow(
        (nms > 0).float()[0,0].numpy(), 
        cmap='Greens', 
        vmin=0, vmax=1)
    
    for axi, title in zip(axes, titles): axi.set(title=title)
    fig.tight_layout()

I believe this issue can be corrected easily by checking with a nms_radius=1 if the new detected keypoints are local maxima i.e.:

def simple_nms(scores, nms_radius: int):
    """ Fast Non-maximum suppression to remove nearby points """
    assert(nms_radius >= 0)

    def max_pool(x):
        return torch.nn.functional.max_pool2d(
            x, kernel_size=nms_radius*2+1, stride=1, padding=nms_radius)
    
    # ❕ added function  
    def max_pool_r1(x):
        return torch.nn.functional.max_pool2d(
            x, kernel_size=3, stride=1, padding=1)

    zeros = torch.zeros_like(scores)
    max_mask = scores == max_pool(scores)
    max_mask_r1 = scores == max_pool_r1(scores) # ❕ added line
    for _ in range(2):
        supp_mask = max_pool(max_mask.float()) > 0
        supp_scores = torch.where(supp_mask, zeros, scores)
        new_max_mask = supp_scores == max_pool(supp_scores)
        max_mask = max_mask | (new_max_mask & (~supp_mask) & max_mask_r1) # ❕ modified line
    return torch.where(max_mask, scores, zeros)

New result:

image

As a sidenote, I feel that this issue is not very likely to happen or very important (the proof is that SuperGlue works great :) ). Nevertheless this may help a bit (if deemed useful I can submit a PR).

On the other hand, I feel that it may not be sensible to introduce this change with the actual weights of SuperGlue. After this change, the detected keypoint patterns of Superpoint may change a bit. So, at least, it should be tested first.

@sarlinpe
Copy link
Contributor

Good catch, thank you for the detailed analysis and minimal example with code. This should ideally be fixed, but I indeed expect this to not make much difference. I'll run some evaluations when I find the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants