Training on a different dataset - Intrinsic parameters #6

Closed
bl0up opened this issue May 21, 2019 · 10 comments

@bl0up

bl0up commented May 21, 2019

Hello,

This is not an issue with the code, but rather a question about training on Cityscapes instead of KITTI. I created a new Dataset class that reads the camera parameters from a JSON file and stores them in a NumPy array:

K = np.array([[fx, 0, u0, 0], [0, fy, v0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=np.float32)

These intrinsics are then used in generate_images_pred. On TensorBoard, I noticed that color_pred_s_0 was sometimes really messy, while the other outputs looked good. Is there anything else I need to change in the code?

Best regards

@daniyar-niantic
Collaborator

daniyar-niantic commented May 21, 2019

Hello!

The intrinsics in kitti_dataset are normalized by the image size, so you need to do the same:
K[0,:] /= image_width
K[1,:] /= image_height

Make sure to use the image resolution of the original files that were used for calibration.

Regards,

Daniyar
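
The normalization described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the helper names `normalize_intrinsics` and `scale_to_resolution` are hypothetical, the calibration values are the Cityscapes ones quoted later in this thread, and the 512 x 256 training resolution is an assumption chosen for the example.

```python
import numpy as np


def normalize_intrinsics(K_pixels, orig_width, orig_height):
    """Return a normalized copy of a 4x4 pixel-unit intrinsics matrix.

    Row 0 is divided by the width and row 1 by the height of the images
    that were used for calibration (hypothetical helper).
    """
    K = np.array(K_pixels, dtype=np.float32)  # np.array copies by default
    K[0, :] /= orig_width
    K[1, :] /= orig_height
    return K


def scale_to_resolution(K_norm, width, height):
    """Convert normalized intrinsics back to pixel units at a given resolution."""
    K = K_norm.copy()
    K[0, :] *= width
    K[1, :] *= height
    return K


# Pixel-unit calibration for the original 2048 x 1024 Cityscapes images.
fx, fy, u0, v0 = 2262.52, 2265.30, 1096.98, 513.137
K_pixels = np.array([[fx, 0, u0, 0],
                     [0, fy, v0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]], dtype=np.float32)

K_norm = normalize_intrinsics(K_pixels, orig_width=2048, orig_height=1024)

# Assumed training resolution, for illustration only.
K_train = scale_to_resolution(K_norm, width=512, height=256)
```

Because the normalized matrix does not depend on resolution, the same `K_norm` can be rescaled to whatever size the images are resized to for training.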

@bl0up
Author

bl0up commented May 21, 2019

Ok, thanks a lot.

Best regards,
Ambroise

@ezorfa

ezorfa commented Jan 6, 2020

@bl0up Could you please let me know the intrinsics values you used for the Cityscapes dataset?

@bl0up
Author

bl0up commented Jan 6, 2020

@ezorfa Sure, I used this matrix:

self.K = np.array([[2262.52 / 2048, 0, 0.5, 0],
                           [0, 1096.98 / 1024, 0.5, 0],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=np.float32)

@ezorfa

ezorfa commented Jan 6, 2020

@bl0up Thank you so much for the quick reply! Appreciated.

@mrharicot
Collaborator

@ezorfa @bl0up There is a mistake in your intrinsics matrix: the second row uses u0 (1096.98) where fy should be.
In the vast majority of cases you should have fx = fy (unless you have an anamorphic lens).
Here is a single calibration from a Cityscapes sequence:

{
    "extrinsic": {
        "baseline": 0.209313, 
        "pitch": 0.038, 
        "roll": 0.0, 
        "x": 1.7, 
        "y": 0.1, 
        "yaw": -0.0195, 
        "z": 1.22
    }, 
    "intrinsic": {
        "fx": 2262.52, 
        "fy": 2265.3017905988554, 
        "u0": 1096.98, 
        "v0": 513.137
    }
}

Thus the intrinsics matrix should be:

self.K = np.array([[2262 / 2048, 0, 0.5, 0],
                   [0, 2262 / 1024, 0.5, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]], dtype=np.float32)
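
As a rough sketch of how such a matrix could be built directly from the calibration file, with a guard against the fx/u0 mix-up discussed above: the function name `build_normalized_K`, the 1% tolerance, and the `center_principal_point` switch are all assumptions made for this example, and only the "intrinsic" part of the JSON is reproduced.

```python
import json

import numpy as np

# Intrinsic part of the calibration quoted above.
CALIBRATION_JSON = """
{
    "intrinsic": {
        "fx": 2262.52,
        "fy": 2265.3017905988554,
        "u0": 1096.98,
        "v0": 513.137
    }
}
"""


def build_normalized_K(calib, width, height, center_principal_point=True):
    """Build a normalized 4x4 intrinsics matrix from a calibration dict.

    Hypothetical helper: raises if fx and fy differ by more than 1%,
    which usually points to a swapped or misread field.
    """
    intr = calib["intrinsic"]
    fx, fy = intr["fx"], intr["fy"]
    if abs(fx - fy) / fx > 0.01:
        raise ValueError(f"fx ({fx}) and fy ({fy}) differ by more than 1%")

    # Either place the principal point at the image center or use the
    # calibrated u0/v0, normalized by the calibration image size.
    cx = 0.5 if center_principal_point else intr["u0"] / width
    cy = 0.5 if center_principal_point else intr["v0"] / height

    return np.array([[fx / width, 0, cx, 0],
                     [0, fy / height, cy, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]], dtype=np.float32)


calib = json.loads(CALIBRATION_JSON)
K = build_normalized_K(calib, width=2048, height=1024)
```

With the values above, the normalized focal lengths come out at roughly 1.105 horizontally and 2.212 vertically, which matches the corrected matrix up to rounding of fx and fy to 2262.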

@ezorfa

ezorfa commented Jan 6, 2020

@bl0up @mrharicot Hmm, I see the mistake. Thank you!

@ezorfa

ezorfa commented Jan 7, 2020

@mrharicot @daniyar-niantic Could you please clarify this?
When I prepare the Cityscapes data using https://github.com/anuragranj/cc/blob/afd407869b89d61ee6911b13659e8d1c39dc3634/data/prepare_train_data.py, the images are cropped (to remove the car logo) and the resulting shape is 2048 × 768.

Will the intrinsics keep the same values, i.e.
self.K = np.array([[2262 / 2048, 0, 0.5, 0], [0, 2262 / 1024, 0.5, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=np.float32)
or should I divide by img_height = 768 instead?

@ezorfa

ezorfa commented Jan 7, 2020

@mrharicot In my understanding, for the resulting image size (img_width = 2048, img_ht = 768), the intrinsics should be:

self.K = np.array([[2262 / 2048, 0, 1096 / 2048, 0],
                   [0, 2262 / 768, 513 / 768, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]], dtype=np.float32)

Could you please verify the u0 and v0 values?

Thank you!

@mrharicot
Collaborator

@ezorfa For simplicity, we set the principal point to the center of the image. This is slightly inaccurate but has little impact in practice, and it makes horizontal flipping easy because the intrinsics do not need to change.
If you crop the bottom part of the image, you will need to recompute the location of the principal point.
See here for more details: https://github.com/BerkeleyAutomation/perception/blob/master/perception/camera_intrinsics.py#L184
If you also resize the image, you will also need to scale the vertical focal length by the inverse ratio of heights.
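
A minimal sketch of the crop and resize bookkeeping, under several assumptions: the crop removes only the bottom 256 rows of the 2048 x 1024 image (so nothing is cut from the top or left), half-pixel offsets are ignored, the 512 x 192 output size is made up for the example, and the helper names are hypothetical rather than taken from either linked repository.

```python
import numpy as np


def crop_intrinsics(fx, fy, cx, cy, left, top):
    """Cropping shifts the principal point by the margin removed on the
    left/top; focal lengths in pixels are unchanged."""
    return fx, fy, cx - left, cy - top


def resize_intrinsics(fx, fy, cx, cy, scale_x, scale_y):
    """Resizing scales all pixel-unit intrinsics by the resize factors."""
    return fx * scale_x, fy * scale_y, cx * scale_x, cy * scale_y


def normalized_K(fx, fy, cx, cy, width, height):
    """Normalize pixel-unit intrinsics by the size of the image they refer to."""
    return np.array([[fx / width, 0, cx / width, 0],
                     [0, fy / height, cy / height, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1]], dtype=np.float32)


# Calibration for the original 2048 x 1024 images.
fx, fy, cx, cy = 2262.52, 2265.30, 1096.98, 513.137

# Assumed crop: keep rows 0..767, so left = 0 and top = 0.
fx_c, fy_c, cx_c, cy_c = crop_intrinsics(fx, fy, cx, cy, left=0, top=0)
K_cropped = normalized_K(fx_c, fy_c, cx_c, cy_c, width=2048, height=768)

# Assumed resize of the cropped image to 512 x 192.
fx_r, fy_r, cx_r, cy_r = resize_intrinsics(
    fx_c, fy_c, cx_c, cy_c, scale_x=512 / 2048, scale_y=192 / 768)
K_resized = normalized_K(fx_r, fy_r, cx_r, cy_r, width=512, height=192)
```

Under these assumptions, the crop changes the second row of the normalized matrix (fy and v0 are now divided by 768 instead of 1024), while the subsequent resize leaves the normalized matrix unchanged. If the crop also removed rows from the top, `top` would be nonzero and v0 would shift accordingly.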
