
Commit: updated result
btilmon committed Mar 21, 2022
1 parent 7f52c72 commit 14eb871
Showing 8 changed files with 28 additions and 38 deletions.
20 changes: 5 additions & 15 deletions README.md
@@ -1,25 +1,15 @@
# illumiGrad

Automatically calibrate RGBD cameras with PyTorch. The intrinsics and extrinsics of the camera pair are optimized based on photometric consistency after projecting the ToF camera into the color camera. I tested on semi-rectified color and Kinect continuous wave ToF cameras from the [NYU Depth V2 dataset](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html). Related work using photometric consistency as a loss signal: [LSD-SLAM](https://jakobengel.github.io/pdf/engel14eccv.pdf), [KinectFusion](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ismar2011.pdf), [CVPR 2017](https://arxiv.org/pdf/1704.07813.pdf), [ICCV 2019](https://arxiv.org/pdf/1806.01260.pdf), [ICCV 2021](https://arxiv.org/pdf/2108.13826.pdf).
Automatically calibrate RGBD cameras with PyTorch. The intrinsics and extrinsics of the camera pair are optimized based on photometric consistency after projecting the depth camera into the color camera. I tested on semi-rectified color and Kinect V1 depth cameras from the [NYU Depth V2 dataset](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html). Related work using photometric consistency as a loss signal: [LSD-SLAM](https://jakobengel.github.io/pdf/engel14eccv.pdf), [KinectFusion](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ismar2011.pdf), [CVPR 2017](https://arxiv.org/pdf/1704.07813.pdf), [ICCV 2019](https://arxiv.org/pdf/1806.01260.pdf), [ICCV 2021](https://arxiv.org/pdf/2108.13826.pdf).
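The core computation is a single differentiable warp from the depth camera into the color camera. Below is a minimal sketch of that idea; the tensor shapes, the L1 penalty, and the name of the reference image are illustrative assumptions, not the exact loss used in this repo.

```python
import torch
import torch.nn.functional as F

def photometric_loss(depth, color, reference, K_depth, K_color, E, eps=1e-7):
    """Sample the color image at the depth camera's projected pixels and compare
    to a reference. depth: (B,1,H,W) metric depth; color, reference: (B,3,H,W);
    K_depth, K_color, E: (B,4,4) camera matrices being optimized."""
    B, _, H, W = depth.shape
    # homogeneous pixel grid of the depth camera, shape (B, 3, H*W)
    ys, xs = torch.meshgrid(torch.arange(H, device=depth.device),
                            torch.arange(W, device=depth.device), indexing="ij")
    pix = torch.stack((xs, ys, torch.ones_like(xs)), 0).float().view(1, 3, -1).expand(B, -1, -1)
    # backproject: X = D * K_depth^-1 * p
    cam = depth.view(B, 1, -1) * (torch.linalg.inv(K_depth[:, :3, :3]) @ pix)
    cam = torch.cat((cam, torch.ones(B, 1, H * W, device=depth.device)), dim=1)
    # project into the color camera: p' ~ K_color * E * X
    proj = K_color @ (E @ cam)
    uv = proj[:, :2] / (proj[:, 2:3] + eps)
    # grid_sample wants (B, H, W, 2) coordinates normalized to [-1, 1]
    uv = uv.view(B, 2, H, W).permute(0, 2, 3, 1)
    uv = torch.stack((2 * uv[..., 0] / (W - 1) - 1, 2 * uv[..., 1] / (H - 1) - 1), dim=-1)
    warped = F.grid_sample(color, uv, padding_mode="border", align_corners=True)
    return (warped - reference).abs().mean()   # photometric consistency
```

Minimizing a loss of this form with gradient descent updates K_depth, K_color, and E jointly.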

<p align="center">
<img src="results/1.gif" alt="example input output gif" width="800" />
<img src="results/result.gif" alt="example input output gif" />
</p>

<p align="center">
<img src="results/2.gif" alt="example input output gif" width="800" />
</p>

<p align="center">
<img src="results/0.gif" alt="example input output gif" width="800" />
</p>




## Setting up color camera and ToF camera
## Setting up color camera and depth camera

1. I tested on semi-rectified color and ToF cameras in a stereo arrangement. This makes initialization much nicer because we can assume identity rotation. The translation and rotation vectors are updated during optimization, but there is a better chance of convergence if you tune the x component of the translation vector from the ToF camera to the color camera. I initialize the x component of the translation vector to 0.1.
1. I tested on semi-rectified color and depth cameras in a stereo arrangement. This lets us constrain optimization by initializing with identity rotation. The translation and rotation vectors are updated during optimization, but there is a better chance of convergence if you tune the x component of the translation vector from the depth camera to the color camera. I initialize the x component of the translation vector to 0.1.

2. There is a better chance of convergence if you initialize the focal lengths in a sensible range. The focal length in pixels is fx = F * s, where F is your lens focal length in mm and s is the horizontal resolution in pixels per mm of sensor width; repeat for fy. F and s can be found in your image metadata or looked up in the camera's technical docs online. I tested with randomly initialized focal lengths between 400 and 600, and I initialize the other intrinsic matrix parameters to 0.5 (see the initialization sketch after the summary below).

@@ -30,7 +20,7 @@ In summary, get the cameras decently rectified and initialize the camera matrices
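As a concrete illustration of the initialization described above, here is a small sketch; the lens and sensor numbers are placeholders rather than NYU Depth V2 values:

```python
import torch

# fx in pixels = lens focal length [mm] * horizontal pixels per mm of sensor
F_mm, width_px, sensor_width_mm = 4.0, 640.0, 5.6          # placeholder values
fx = fy = F_mm * width_px / sensor_width_mm                 # ~457 px, i.e. a sensible range

# 4x4 intrinsics; principal-point entries start at 0.5
K_init = torch.tensor([[fx,  0.0, 0.5, 0.0],
                       [0.0, fy,  0.5, 0.0],
                       [0.0, 0.0, 1.0, 0.0],
                       [0.0, 0.0, 0.0, 1.0]])[None]
K = torch.nn.Parameter(K_init, requires_grad=True)

# extrinsics: identity rotation, x-translation of 0.1 from the depth to the color camera
E_init = torch.eye(4)
E_init[0, 3] = 0.1
E = torch.nn.Parameter(E_init[None], requires_grad=True)
```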

## Calibrate

I only tested NYU Depth V2 and provide a short segment of it. I recommend using scenes with weak perspective and valid ToF pixels to calibrate since they optimized better from my experience (middle and bottom video from above). When there is strong perspective and less valid ToF pixels optimization struggled more (top video from above). Taking a varied video of a dynamic environment will potentially improve performance because it gives optimization a chance to get out of local minima. Optimization infrequently diverged after quality convergence even for long videos with varied scenes, so it seems that camera matrix initialization matters most and quality scene content initialization matters second for good final convergence.
I only tested NYU Depth V2 and provide a short segment of it. I recommend calibrating on scenes with weak perspective and plenty of valid depth pixels, since in my experience they optimized better (middle and bottom video above). With strong perspective and fewer valid depth pixels, optimization struggled more (top video above). Taking a varied video of a dynamic environment can improve performance because it gives optimization a chance to escape local minima. Optimization rarely diverged after quality convergence, even for long videos with varied scenes, so camera matrix initialization seems to matter most for good final convergence, with quality scene content second.

Calibrate with:

Binary file modified __pycache__/camera.cpython-36.pyc
36 changes: 18 additions & 18 deletions camera.py
@@ -4,7 +4,7 @@

class BackprojectDepth(torch.nn.Module):
"""
Backproject absolute depth from ToF to point cloud
Backproject absolute depth from depth camera to point cloud
(adapted from https://github.com/nianticlabs/monodepth2)
"""
def __init__(self, opt):
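The body of this class is collapsed in the diff, but the monodepth2 pattern it cites looks roughly like the sketch below: a fixed homogeneous pixel grid is built once, and the forward pass scales the ray through each pixel by the depth map. Details in illumiGrad's own implementation may differ.

```python
import torch

class BackprojectDepthSketch(torch.nn.Module):
    """Lift a (B,1,H,W) depth map to a homogeneous point cloud with intrinsics K."""
    def __init__(self, height, width):
        super().__init__()
        ys, xs = torch.meshgrid(torch.arange(height), torch.arange(width), indexing="ij")
        pix = torch.stack((xs, ys, torch.ones_like(xs)), 0).float().view(1, 3, -1)
        # fixed pixel grid; a buffer so it follows the module across devices
        self.register_buffer("pix_coords", pix)

    def forward(self, depth, K):
        B = depth.shape[0]
        rays = torch.linalg.inv(K[:, :3, :3]) @ self.pix_coords   # ray per pixel, (B,3,H*W)
        cam_points = depth.view(B, 1, -1) * rays                   # scale rays by metric depth
        ones = torch.ones(B, 1, cam_points.shape[-1], device=depth.device)
        return torch.cat((cam_points, ones), dim=1)                # homogeneous (B,4,H*W)
```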
@@ -68,16 +68,16 @@ def __init__(self, opt):

# initialize color camera intrinsics K and extrinsics E
self.initColorCamera()
# initialize ToF camera intrinsics K
self.initTofCamera()
# initialize depth camera intrinsics K
self.initDepthCamera()


def initTofCamera(self):
def initDepthCamera(self):
# NYU Depth V2 has calibrated camera parameters
# self.opt.refine is for when you already have a decently good calibration but
# still want to tune it
if self.opt.refine:
self.tofK = torch.tensor(
self.depthK = torch.tensor(
[[582.624, 0, 0.0313, 0],
[0, 582.691, 0.024, 0],
[0, 0, 1, 0],
@@ -92,7 +92,7 @@ def initTofCamera(self):
col3 = torch.cat((offsets, offsets), dim=0)
col3 = torch.cat((col3, torch.tensor([[1], [0]], requires_grad=False)), dim=0)
col4 = torch.tensor([[0], [0], [0], [1]], requires_grad=False)
self.tofK = torch.nn.Parameter(
self.depthK = torch.nn.Parameter(
torch.cat((col1, col2, col3, col4), dim=1)[None],
requires_grad=True)

Expand All @@ -104,15 +104,15 @@ def initColorCamera(self):
[0, 519.470, 0.024, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]], requires_grad=True)[None]
self.colorEgt = torch.tensor(
self.colorE = torch.tensor(
[[0.999, 0.0051, 0.0043, 0.025],
[-0.0050, 0.999, -0.0037, -0.000293],
[-0.00432, 0.0037, 0.999, 0.000662],
[0, 0, 0, 1]])[None]
self.colorEgt = self.colorEgt.transpose(1,2)
self.colorEgt = torch.linalg.inv(self.colorEgt)
print(self.colorEgt); sys.exit()
self.colorEgt = torch.nn.Parameter(self.colorEgt, requires_grad=True)
self.colorE = self.colorE.transpose(1,2)
self.colorE = torch.linalg.inv(self.colorE)
print(self.colorE); sys.exit()
self.colorE = torch.nn.Parameter(self.colorE, requires_grad=True)
else:
# randomly generate focal lengths in a range
# randomly generate remaining intrinsic parameters between 0 and 1
@@ -133,14 +133,14 @@ def initColorCamera(self):
a = torch.cat((a, torch.zeros(1, 3)), dim=0)
t = torch.tensor([[.1], [0.], [0.]], requires_grad=True) # translation vec
t = torch.cat((t, torch.tensor([[1.]])))
self.colorEgt = torch.cat((a, t), dim=1)[None]
self.colorEgt = self.colorEgt.transpose(1, 2)
self.colorEgt = torch.linalg.inv(self.colorEgt)
self.colorEgt = torch.nn.Parameter(self.colorEgt, requires_grad=True)
self.colorE = torch.cat((a, t), dim=1)[None]
self.colorE = self.colorE.transpose(1, 2)
self.colorE = torch.linalg.inv(self.colorE)
self.colorE = torch.nn.Parameter(self.colorE, requires_grad=True)

def forward(self, tofDepth, color):
pointCloud = self.backprojectDepth(tofDepth, self.tofK)
predCoords = self.projectDepth(pointCloud, self.colorK, self.colorEgt)
def forward(self, depth, color):
pointCloud = self.backprojectDepth(depth, self.depthK)
predCoords = self.projectDepth(pointCloud, self.colorK, self.colorE)
predColor = torch.nn.functional.grid_sample(color,
predCoords,
padding_mode="border",
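One detail the forward pass relies on: torch.nn.functional.grid_sample expects sampling coordinates of shape (B, H, W, 2) normalized to [-1, 1], so the projected pixel coordinates have to be rescaled first, presumably inside projectDepth, whose body is collapsed in this diff. A sketch of that conversion:

```python
import torch

def to_grid_sample_coords(uv_pixels, height, width):
    """uv_pixels: (B, H*W, 2) projected pixel coordinates -> (B, H, W, 2) in [-1, 1].
    Points that project outside the image land outside [-1, 1];
    padding_mode="border" then clamps them to the nearest edge pixel."""
    u = 2.0 * uv_pixels[..., 0] / (width - 1) - 1.0
    v = 2.0 * uv_pixels[..., 1] / (height - 1) - 1.0
    return torch.stack((u, v), dim=-1).view(-1, height, width, 2)
```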
10 changes: 5 additions & 5 deletions main.py
@@ -26,11 +26,11 @@ def run(self):
color = self.data[("color", 0)]
depth = self.data[("depth", 0)]
mask = self.data[("mask", 0)]
beforecolorE = self.camera.colorEgt.clone()
beforedepthK = self.camera.tofK.clone()
beforecolorE = self.camera.colorE.clone()
beforedepthK = self.camera.depthK.clone()
initNum = 200
# initialize training on single pair
print("\n Begin optimizing on initial color-ToF pair in trajectory \n")
print("\n Begin optimizing on initial color-depth pair in trajectory \n")
for i in range(initNum):
self.optimizer.zero_grad()
predColor = self.camera(depth, color)
@@ -76,10 +76,10 @@ def run(self):

print("EXTRINSICS")
print(beforecolorE)
print(self.camera.colorEgt)
print(self.camera.colorE)
print("\n INTRINSICS")
print(beforedepthK)
print(self.camera.tofK)
print(self.camera.depthK)

if __name__ == "__main__":
opt = Options()
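The run() method above boils down to a two-stage loop: overfit the camera matrices on the first color-depth pair for initNum steps, then keep optimizing along the rest of the trajectory. A rough self-contained sketch follows; the optimizer choice, learning rate, step counts other than the 200 initial iterations, and the target image are assumptions.

```python
import torch

def calibrate(camera, frames, init_steps=200, steps_per_frame=20, lr=1e-3):
    """camera: module whose learnable parameters are the camera matrices and whose
    forward(depth, color) returns the color image warped onto the depth grid.
    frames: sequence of (depth, color, target, mask) tuples; target stands in for
    whatever the warped color is compared against."""
    optimizer = torch.optim.Adam(camera.parameters(), lr=lr)

    def step(depth, color, target, mask):
        optimizer.zero_grad()
        pred_color = camera(depth, color)                    # differentiable warp
        loss = (mask * (pred_color - target).abs()).mean()   # assumed L1 photometric error
        loss.backward()
        optimizer.step()
        return loss.item()

    depth0, color0, target0, mask0 = frames[0]
    for _ in range(init_steps):                              # stage 1: fit the initial pair
        step(depth0, color0, target0, mask0)
    for depth, color, target, mask in frames[1:]:            # stage 2: rest of the trajectory
        for _ in range(steps_per_frame):
            step(depth, color, target, mask)
    return camera
```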
Binary file removed results/0.gif
Binary file removed results/1.gif
Binary file removed results/2.gif
Binary file added results/result.gif
