Skip to content

jinlinyi/PerspectiveFields

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

65 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Perspective Fields for Single Image Camera Calibration

Hugging Face Spaces

CVPR 2023 (✨Highlight)

Linyi Jin1, Jianming Zhang2, Yannick Hold-Geoffroy2, Oliver Wang2, Kevin Matzen2, Matthew Sticha1, David Fouhey1

1University of Michigan, 2Adobe Research


alt text

We propose Perspective Fields as a representation that models the local perspective properties of an image. Perspective Fields contain per-pixel information about the camera view, parameterized as an up vector and a latitude value.

swiping-1 swiping-2 swiping-3 swiping-4

📷 From Perspective Fields, you can also get camera parameters if you assume certain camera models. We provide models to recover camera roll, pitch, fov and principal point location.

Image 1 Image 2 Image 2

Updates

  • [April 2024]: 🚀 We've launched an inference version (main branch) with minimal dependencies. For training and evaluation, please checkout train_eval branch.
  • [July 2023]: We released a new model trained on 360cities and EDINA dataset, consisting of indoor🏠, outdoor🏙️, natural🌳, and egocentric👋 data!
  • [May 2023]: Live demo released 🤗. https://huggingface.co/spaces/jinlinyi/PerspectiveFields. Thanks Huggingface for funding this demo!

Table of Contents

Environment Setup

Inference

PerspectiveFields requires python >= 3.8 and PyTorch. | Pro tip: use mamba in place of conda for much faster installs.

# install pytorch compatible to your system https://pytorch.org/get-started/previous-versions/
conda install pytorch=1.10.0 torchvision cudatoolkit=11.3 -c pytorch
pip install git+https://github.com/jinlinyi/PerspectiveFields.git

Alternatively, install the package locally,

git clone git@github.com:jinlinyi/PerspectiveFields.git
# create virtual env
conda create -n perspective python=3.9
conda activate perspective
# install pytorch compatible to your system https://pytorch.org/get-started/previous-versions/
# conda install pytorch torchvision cudatoolkit -c pytorch
conda install pytorch=1.10.0 torchvision cudatoolkit=11.3 -c pytorch
# install Perspective Fields.
cd PerspectiveFields
pip install -e .

Train / Eval

For training and evaluation, please checkout the train_eval branch.

Demo

Here is a minimal script to run on a single image, see demo/demo.py:

import cv2
from perspective2d import PerspectiveFields
# specify model version
version = 'Paramnet-360Cities-edina-centered'
# load model
pf_model = PerspectiveFields(version).eval().cuda()
# load image
img_bgr = cv2.imread('assets/imgs/cityscape.jpg')
# inference
predictions = pf_model.inference(img_bgr=img_bgr)

# alternatively, inference a batch of images
predictions = pf_model.inference_batch(img_bgr_list=[img_bgr_0, img_bgr_1, img_bgr_2])

Model Zoo

Model Name and Weights Training Dataset Config File Outputs Expected input
[NEW]Paramnet-360Cities-edina-centered 360cities and EDINA paramnet_360cities_edina_rpf.yaml Perspective Field + camera parameters (roll, pitch, vfov) Uncropped, indoor🏠, outdoor🏙️, natural🌳, and egocentric👋 data
[NEW]Paramnet-360Cities-edina-uncentered 360cities and EDINA paramnet_360cities_edina_rpfpp.yaml Perspective Field + camera parameters (roll, pitch, vfov, cx, cy) Cropped, indoor🏠, outdoor🏙️, natural🌳, and egocentric👋 data
PersNet-360Cities 360cities cvpr2023.yaml Perspective Field Indoor🏠, outdoor🏙️, and natural🌳 data.
PersNet_paramnet-GSV-centered GSV paramnet_gsv_rpf.yaml Perspective Field + camera parameters (roll, pitch, vfov) Uncropped, street view🏙️ data.
PersNet_Paramnet-GSV-uncentered GSV paramnet_gsv_rpfpp.yaml Perspective Field + camera parameters (roll, pitch, vfov, cx, cy) Cropped, street view🏙️ data.

Coordinate Frame

alt text

yaw / azimuth: camera rotation about the y-axis pitch / elevation: camera rotation about the x-axis roll: camera rotation about the z-axis

Extrinsics: rotz(roll).dot(rotx(elevation)).dot(roty(azimuth))

Camera Parameters to Perspective Fields

Checkout Jupyter Notebook. Perspective Fields can be calculated from camera parameters. If you prefer, you can also manually calculate the corresponding Up-vector and Latitude map by following Equations 1 and 2 in our paper. Our code currently supports:

  1. Pinhole model [Hartley and Zisserman 2004] (Perspective Projection)
from perspective2d.utils.panocam import PanoCam
# define parameters
roll = 0
pitch = 20
vfov = 70
width = 640
height = 480
# get Up-vectors.
up = PanoCam.get_up(np.radians(vfov), width, height, np.radians(pitch), np.radians(roll))
# get Latitude.
lati = PanoCam.get_lat(np.radians(vfov), width, height, np.radians(pitch), np.radians(roll))
  1. Unified Spherical Model [Barreto 2006; Mei and Rives 2007] (Distortion).
xi = 0.5 # distortion parameter from Unified Spherical Model

x = -np.sin(np.radians(vfov/2))
z = np.sqrt(1 - x**2)
f_px_effective = -0.5*(width/2)*(xi+z)/x
crop, _, _, _, up, lat, xy_map = PanoCam.crop_distortion(equi_img,
                                             f=f_px_effective,
                                             xi=xi,
                                             H=height,
                                             W=width,
                                             az=yaw, # degrees
                                             el=-pitch,
                                             roll=-roll)

Visualize Perspective Fields

We provide a one-line code to blend Perspective Fields onto input image.

import matplotlib.pyplot as plt
from perspective2d.utils import draw_perspective_fields
# Draw up and lati on img. lati is in radians.
blend = draw_perspective_fields(img, up, lati)
# visualize with matplotlib
plt.imshow(blend)
plt.show()

Perspective Fields can serve as an easy visual check for correctness of the camera parameters.

  • For example, we can visualize the Perspective Fields based on calibration results from this awesome repo.

alt text

  • Left: We plot the perspective fields based on the numbers printed on the image, they look accurate😊;

  • Mid: If we try a number that is 10% off (0.72*0.9=0.648), we see mismatch in Up directions at the top right corner;

  • Right: If distortion is 20% off (0.72*0.8=0.576), the mismatch becomes more obvious.

Citation

If you find this code useful, please consider citing:

@inproceedings{jin2023perspective,
      title={Perspective Fields for Single Image Camera Calibration},
      author={Linyi Jin and Jianming Zhang and Yannick Hold-Geoffroy and Oliver Wang and Kevin Matzen and Matthew Sticha and David F. Fouhey},
      booktitle = {CVPR},
      year={2023}
}

Acknowledgment

This work was partially funded by the DARPA Machine Common Sense Program. We thank authors from A Deep Perceptual Measure for Lens and Camera Calibration for releasing their code on Unified Spherical Model.