Thanks for your prompt answer to my last question, but I have another one here which is not related to the code.
As mentioned in the paper and the code, the ref images are [t-n, t-n+1, ..., t-1, t+1, ..., t+n], but what if I use [t-n, ..., t-1] only? What do you suppose the impact on the result would be? Thank you.
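To make the two layouts concrete, here is a minimal sketch; the helper names are hypothetical and not taken from the repository's code:

```python
# Hypothetical helpers illustrating the two reference-frame layouts.
def symmetric_refs(t, n):
    """Refs on both sides of the target: [t-n, ..., t-1, t+1, ..., t+n]."""
    return list(range(t - n, t)) + list(range(t + 1, t + n + 1))

def anterior_refs(t, n):
    """Refs strictly before the target: [t-n, ..., t-1]."""
    return list(range(t - n, t))

print(symmetric_refs(10, 2))  # [8, 9, 11, 12]
print(anterior_refs(10, 2))   # [8, 9]
```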
Cheers,
Rui
The network can still converge, but since the camera in KITTI is more or less always moving forward, restricting the references this way will keep your network from getting useful information.
Restricting the reference frames to anterior frames makes warping always do the same thing, which is zooming out. It's better than restricting to posterior frames, which produce a whole set of out-of-bounds pixels, but you will lack precision for the pixels closer to the center of the image, where the optical flow is very low because it is close to the focus of expansion.
A quick analysis of which translation between reference frame and target frame (excluding rotations) works best leads to the following:

- forward translation is good for pixels close to the focus of expansion
- backward translation (your case) is good for pixels close to the boundaries
- lateral translation is good for every pixel, which is partly why monodepth is better than sfmlearner, but you then have to be careful with occlusions, which are more prominent with this kind of translation.
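The geometry behind these three cases can be sketched with a pinhole model under pure translation; all names and values below are illustrative assumptions, not code from the repository:

```python
import numpy as np

# Translational optical flow for a pinhole camera (rotation excluded).
# With focal length f, depth Z and centered pixel coordinates (x, y):
#   u = (tz * x - f * tx) / Z,   v = (tz * y - f * ty) / Z
f, Z = 500.0, 20.0  # assumed focal length (px) and scene depth (m)

def flow(x, y, tx, ty, tz):
    """Image-plane flow induced by camera translation (tx, ty, tz)."""
    return np.array([(tz * x - f * tx) / Z, (tz * y - f * ty) / Z])

center = (0.0, 0.0)      # focus of expansion for forward motion
border = (300.0, 200.0)  # a pixel near the image boundary

# Forward translation: flow grows with distance from the FoE,
# so pixels near the center barely move (little depth signal there).
fwd_center = np.linalg.norm(flow(*center, 0.0, 0.0, 1.0))
fwd_border = np.linalg.norm(flow(*border, 0.0, 0.0, 1.0))
print(fwd_center, fwd_border)  # 0.0 vs ~18.0 px

# Lateral translation: flow is the same at every pixel (for fixed depth),
# which is why it constrains depth everywhere in the image.
lat_center = np.linalg.norm(flow(*center, 1.0, 0.0, 0.0))
lat_border = np.linalg.norm(flow(*border, 1.0, 0.0, 0.0))
print(lat_center, lat_border)  # 25.0 at both pixels
```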
All in all, the best is probably a mix of everything, which is exactly what placing the target in the center of the sequence provides through its mix of posterior and anterior references.
You may be thinking of an online learning setting, where posterior frames are not available yet, but even in that case I think it would be better to set the target frame a bit back in time, so that you can compare it to more recent frames and gain more translation heterogeneity.