- 1) Public Datasets and Challenges
- 2) Pioneers and Experts
- 3) Related Materials (Papers, Source Code, Blogs, Videos and Applications)
- Flickr-Faces-HQ (FFHQ) Dataset: Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN). The dataset consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity and image background. It also has good coverage of accessories such as eyeglasses, sunglasses, hats, etc. The images were crawled from Flickr, thus inheriting all the biases of that website, and automatically aligned and cropped using dlib. (CVPR2019) A Style-Based Generator Architecture for Generative Adversarial Networks
- BIWI RGBD-ID Dataset: The BIWI RGBD-ID Dataset is an RGB-D dataset of people, intended for long-term person re-identification from RGB-D cameras. It contains 50 training and 56 testing sequences of 50 different people.
- 300W-LP & AFLW2000-3D: 300W-LP contains large-pose face images synthesized from 300W. AFLW2000-3D contains fitted 3D faces for the first 2000 AFLW samples, which can be used for 3D face alignment evaluation.
- CMU Panoptic Studio Dataset: Currently, 480 VGA videos, 31 HD videos, 3D body pose, and calibration data are available. PointCloud DB from 10 Kinects (with corresponding 41 RGB videos) is also available (6+ hours of data). Please refer to the official website for details. Dataset paper link: Panoptic Studio: A Massively Multiview System for Social Interaction Capture.
- HollywoodHeads dataset: HollywoodHeads is a head detection dataset. It contains 369846 human heads annotated in 224740 video frames from 21 Hollywood movies.
- Brainwash dataset: The Brainwash dataset is related to face detection. It contains 11917 images with 91146 labeled people.
- SCUT-HEAD-Dataset-Release: SCUT-HEAD is a large-scale head detection dataset, including 4405 images labeled with 111251 heads. The dataset consists of two parts. PartA includes 2000 images sampled from monitor videos of classrooms in a university, with 67321 heads annotated. PartB includes 2405 images crawled from the Internet, with 43930 heads annotated.
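- ShanghaiTech dataset: The dataset appeared in Single Image Crowd Counting via Multi Column Convolutional Neural Network (MCNN) in CVPR2016. [Overview]: It contains 1198 annotated images with 330165 people in total, split into Part A and Part B. Part A contains 482 images, all downloaded from the Internet and showing highly congested crowd scenes, with crowd counts ranging from 33 to 3139; its training set has 300 images and its test set has 182 images. Part B contains 716 images of relatively sparse crowd scenes, taken by fixed street cameras, with crowd counts ranging from 12 to 578; its training set has 400 images and its test set has 316 images.
- UCF-QNRF - A Large Crowd Counting Data Set: It contains 1535 images which are divided into train and test sets of 1201 and 334 images respectively. The paper was published in ECCV2018. [Overview]: At the time of writing, this is the most recently released and largest crowd dataset. It contains 1535 dense-crowd images from Flickr, web search and Hajj footage. The dataset covers a wide range of scenes with rich variety in viewpoint, lighting and density, and counts range from 49 to 12865, which makes it more difficult and more realistic. In addition, the image resolution is high, which leads to large variation in head size.
- UCSD Pedestrian Dataset: Video of people on pedestrian walkways at UCSD, and the corresponding motion segmentations. Currently two scenes are available. [Overview]: It consists of 2000 frames captured by a surveillance camera, each 238×158 pixels. The crowd density is relatively low, ranging from 11 to 46 people per image with an average of about 25. Frames 601 to 1400 form the training set and the remaining frames form the test set.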
- Megvii CrowdHuman: CrowdHuman is a benchmark dataset to better evaluate detectors in crowd scenarios. The CrowdHuman dataset is large, richly annotated and highly diverse. CrowdHuman contains 15000, 4370 and 5000 images for training, validation, and testing, respectively. The train and validation subsets contain a total of 470K human instances, an average of 23 persons per image, with various kinds of occlusions. Each human instance is annotated with a head bounding-box, a human visible-region bounding-box and a human full-body bounding-box. The authors hope the dataset will serve as a solid baseline and help promote future research in human detection tasks.
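The split sizes quoted in the dataset entries above can be cross-checked with simple arithmetic. The sketch below hard-codes the figures from those entries and verifies that the parts add up to the stated totals. The dictionary layout and the helper name `check_splits` are illustrative only and do not belong to any dataset toolkit.

```python
# Figures copied from the dataset descriptions above; the structure is illustrative only.
SPLITS = {
    "ShanghaiTech (all images)": (1198, [482, 716]),   # Part A + Part B
    "ShanghaiTech Part A": (482, [300, 182]),           # train + test
    "ShanghaiTech Part B": (716, [400, 316]),           # train + test
    "UCF-QNRF": (1535, [1201, 334]),                    # train + test
    "SCUT-HEAD images": (4405, [2000, 2405]),           # PartA + PartB
    "SCUT-HEAD heads": (111251, [67321, 43930]),        # PartA + PartB
}


def check_splits(splits):
    """Return the names of entries whose parts do not sum to the stated total."""
    return [name for name, (total, parts) in splits.items() if sum(parts) != total]


# UCSD: frames 601..1400 (inclusive) are training; the rest of the 2000 frames are test.
ucsd_train = len(range(601, 1401))
ucsd_test = 2000 - ucsd_train

mismatches = check_splits(SPLITS)
print(mismatches, ucsd_train, ucsd_test)
```

An empty `mismatches` list means every quoted split is internally consistent.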
👍Michael Black; 👍Jian Sun; 👍Gang YU; 👍Yuliang Xiu 修宇亮; 👍(website) face-rec
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- TUD(CVPR2010) Monocular 3D Pose Estimation and Tracking by Detection [paper link][TUD Dataset]
- (ICCV2015) Uncovering Interactions and Interactors: Joint Estimation of Head, Body Orientation and F-Formations From Surveillance Videos [paper link]
- AKRF-VW(IJCV2017) Growing Regression Tree Forests by Classification for Continuous Object Pose Estimation [paper link]
- CPOEHK(ISCAS2019) Continuous Pedestrian Orientation Estimation using Human Keypoints [paper link]
- ❤ MEBOW(CVPR2020) MEBOW: Monocular Estimation of Body Orientation in the Wild [paper link][project link][codes|official][COCO-MEBOW dataset, Body Orientation Estimation]
- PedRecNet(IV2022) PedRecNet: Multi-task deep neural network for full 3D human pose and orientation estimation [paper link][codes|official]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- DM-Count(NIPS2020) Distribution Matching for Crowd Counting [paper link][arxiv link][code|official][CVLab@StonyBrook]
- LearningToCountEverything(CVPR2021) Learning To Count Everything [arxiv link][code|official][CVLab@StonyBrook]
- CrowdCounting-P2PNet(ICCV2021 Oral) Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework [paper link][code|official][Tencent Youtu Research]
- ZeroShotCounting(CVPR2023) Zero-shot Object Counting [arxiv link][code|official][CVLab@StonyBrook]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- HGM(CVPR2018) A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation [paper link]
- ETH-XGaze(ECCV2020) ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation [arxiv link][project link][Codes|PyTorch(official)]
- EVE(ECCV2020) Towards End-to-end Video-based Eye-tracking [arxiv link][project link][Codes|PyTorch(official)]
- MTGLS(WACV2022) MTGLS: Multi-Task Gaze Estimation With Limited Supervision [paper link]
- RUDA(CVPR2022) Generalizing Gaze Estimation With Rotation Consistency [paper link]
- ❤ GazeOnce/MPSGaze(CVPR2022) GazeOnce: Real-Time Multi-Person Gaze Estimation [paper link][codes|official][MPSGaze is a synthetic dataset (ETH-XGaze + WiderFace) containing full images (instead of only cropped faces) that provides ground truth 3D gaze directions for multiple people in one image.]
- ❤ GAFA(CVPR2022) Dynamic 3D Gaze From Afar: Deep Gaze Estimation From Temporal Eye-Head-Body Coordination [paper link][project link][codes|official][The GAze From Afar (GAFA) dataset consists of surveillance videos of freely moving people with automatically annotated 3D gaze, head, and body orientations.]
- NeRF-Gaze(arxiv2022) NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation [paper link][HKVision]
- GazeNeRF(arxiv2022) GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields [paper link][ETH]
- PARKS-Gaze(arxiv2023) Towards Precision in Appearance-based Gaze Estimation in the Wild [paper link][code|official][PARKS-Gaze dataset]
- CUDA-GHR(WACV2023) CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection [paper link][code|official]
- 👍PJAE(ICCV2023) Interaction-aware Joint Attention Estimation Using People Attributes [paper link][arxiv link][project link][code|official][Japan, Toyota Technological Institute and University of Hyogo]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- (jianshu) Face keypoint alignment
- Procrustes Analysis [CSDN blog][wikipedia][scipy.spatial.procrustes][github]
- (website) Procrustes Analysis and its application in computer graphics
- (github) ASM-for-human-face-feature-points-matching
- (github) align_dataset_mtcnn
- (Website) Face Alignment Across Large Poses: A 3D Solution (official website)
- (github) 🔥🔥The PyTorch implementation of head pose estimation (yaw, roll, pitch) and emotion detection
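As a rough illustration of what Procrustes analysis does for landmark alignment, the sketch below aligns one set of 2D points to another with a similarity transform (translation, uniform scale, rotation). It is a minimal pure-Python version that represents points as complex numbers. It is not the `scipy.spatial.procrustes` implementation (which also standardizes both shapes), it does not allow reflections, and the function name `align_2d` is hypothetical.

```python
import cmath


def align_2d(src, dst):
    """Least-squares similarity alignment of 2D point set src onto dst.

    Points are (x, y) tuples in corresponding order. Returns the aligned
    copy of src and the residual sum of squared distances. Assumes the
    points in src are not all identical.
    """
    a = [complex(x, y) for x, y in src]
    b = [complex(x, y) for x, y in dst]
    n = len(a)
    ca = sum(a) / n  # centroid of src
    cb = sum(b) / n  # centroid of dst
    a0 = [p - ca for p in a]
    b0 = [p - cb for p in b]
    # One complex number w encodes scale (|w|) and rotation (arg w).
    # Minimizing sum |w*a0 - b0|^2 gives w = sum(conj(a0)*b0) / sum(|a0|^2).
    w = sum(p.conjugate() * q for p, q in zip(a0, b0)) / sum(abs(p) ** 2 for p in a0)
    aligned = [w * p + cb for p in a0]
    residual = sum(abs(p - q) ** 2 for p, q in zip(aligned, b))
    return [(p.real, p.imag) for p in aligned], residual


# Toy "landmarks": dst is src rotated by 90 degrees, scaled by 2 and shifted.
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
t = 2 * cmath.exp(1j * cmath.pi / 2)
dst = [((t * complex(x, y)).real + 3.0, (t * complex(x, y)).imag - 1.0) for x, y in src]

aligned, residual = align_2d(src, dst)
print(residual)
```

Because `dst` is an exact similarity transform of `src`, the residual should be zero up to floating-point rounding. With real landmark data it measures the remaining shape difference.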
- 300-W(ICCV2013) 300 Faces In-the-Wild Challenge (300-W), ICCV 2013 [project link] [(IMAVIS) 300 faces In-the-wild challenge: Database and results] [(ICCV-W) 300 Faces in-the-Wild Challenge: The first facial landmark localization Challenge]
- FaceSynthetics(ICCV2021) Fake It Till You Make It: Face analysis in the wild using synthetic data alone [paper link][project link][code|official]
- Dlib(CVPR2014) One Millisecond Face Alignment with an Ensemble of Regression Trees [paper link][codes|official C++][pip install dlib]
- 3000FPS(CVPR2014) Face Alignment at 3000 FPS via Regressing Local Binary Features [paper link][Codes|opencv(official)][Codes|liblinear(unofficial)][CSDN blog]
- ❤3DDFA(CVPR2016) Face Alignment Across Large Poses: A 3D Solution [paper link][project link][codes|PyTorch 3DDFA]
- FAN(ICCV2017) How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks) [paper link][Adrian Bulat][Codes|PyTorch(official)][CSDN blogs][pip install face-alignment]
- PRNet(ECCV2018) Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network [arxiv link][Codes|TensorFlow(official)]
- ❤3DDFA_V2(ECCV2020) Towards Fast, Accurate and Stable 3D Dense Face Alignment [paper link][codes|PyTorch 3DDFA_V2]
- ❤SPIGA(BMVC2022) Shape Preserving Facial Landmarks with Graph Attention Networks [paper link][project link][codes|official PyTorch]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- (github) A-Light-and-Fast-Face-Detector-for-Edge-Devices
- (website) FDDB: Face Detection Data Set and Benchmark Home
- (CSDN blogs) Face Detection (18): TinyFace (S3FD, SSH, HR, RSA, Face R-CNN, PyramidBox)
- (github) e2e-joint-face-detection-and-alignment
- (github) libfacedetection in PyTorch
- (github) 1MB lightweight face detection model
- (blog) LFFD upgraded again: new pedestrian and head detection models, plus an optimized C++ implementation
- (github) YOLO-FaceV2: A Scale and Occlusion Aware Face Detector [paper link]
- WIDER FACE(CVPR2016) WIDER FACE: A Face Detection Benchmark [paper link][project link origin][project link new]
- ❤MTCNN(SPL2016) Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks [paper link][project link][Codes|Caffe&Matlab(official)][Codes|MXNet(unofficial)][Codes|Tensorflow(unofficial)][CSDN blog]
- TinyFace(CVPR2017) Finding Tiny Faces [arxiv link][project link][Codes|MATLAB(official)][Codes|PyTorch(unofficial)][Codes|MXNet(unofficial)][Codes|Tensorflow(unofficial)]
- FaceBoxes(IJCB2017) FaceBoxes: A CPU Real-time Face Detector with High Accuracy [arxiv link][Codes|Caffe(official)][Codes|PyTorch(unofficial)]
- SSH(ICCV2017) SSH: Single Stage Headless Face Detector [arxiv link][Codes|Caffe(official)][Codes|MXNet(unofficial SSH with Alignment)][Codes|(unofficial enhanced-ssh-mxnet)]
- ❤S3FD(ICCV2017) S³FD: Single Shot Scale-invariant Face Detector [arxiv link][Codes|Caffe(official)]
- RSA(ICCV2017) Recurrent Scale Approximation (RSA) for Object Detection [arxiv link][Codes|Caffe(official)]
- DSFD(CVPR2019) DSFD: Dual Shot Face Detector [arxiv link][Codes|PyTorch(official)][CSDN blog]
- LFFD(arxiv2019) LFFD: A Light and Fast Face Detector for Edge Devices [arxiv link][Codes|PyTorch, official V1][Codes|PyTorch, official V2]
- ❤RetinaFace(CVPR2020) RetinaFace: Single-shot Multi-level Face Localisation in the Wild [paper link][Github - insightface][Project - insightface][codes|PyTorch(not official)][codes|MXNet(official)][RetinaFace: Single-stage Dense Face Localisation in the Wild is the same work, released on arXiv in 2019]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- (website) EyeKey 眼神科技
- (CSDN blogs) Face matching (1:N)
- (github) Face Recognition (dlib with deep learning reaching 99.38% acc in LFW)
- (website) face_recognition package
###3 Papers
- ArcFace/InsightFace(CVPR2019) ArcFace: Additive Angular Margin Loss for Deep Face Recognition [arxiv link][Codes|MXNet(official insightface)][Codes|MXNet(official ArcFace)][CSDN blog]
- SubCenter-ArcFace(ECCV2020) Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces [paper link][Codes|MXNet(official SubCenter-ArcFace)][CSDN blogs]
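To make the "additive angular margin" in the ArcFace title concrete, here is a minimal sketch of the loss for a single sample. It assumes the commonly cited form in which the target-class logit cos θ is replaced by cos(θ + m) and all logits are multiplied by a scale s. The defaults s = 64 and m = 0.5, the function name and the toy cosines are illustrative choices; the official implementations handle further details (for example angles where θ + m exceeds π) that are not shown here.

```python
import math


def arcface_loss(cosines, target, s=64.0, m=0.5):
    """Cross-entropy over margin-adjusted, scaled cosine logits for one sample.

    cosines[j] is the cosine between the L2-normalized feature and the
    L2-normalized weight vector of class j; target is the true class index.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))          # guard acos against rounding
        if j == target:
            c = math.cos(math.acos(c) + m)  # add the angular margin to the target class
        logits.append(s * c)
    # Numerically stable log-sum-exp.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]


cos_sim = [0.5, 0.4, 0.3]  # toy similarities; class 0 is the true class
with_margin = arcface_loss(cos_sim, target=0)
without_margin = arcface_loss(cos_sim, target=0, m=0.0)
print(with_margin > without_margin)
```

The margin makes the same sample look harder to the loss, which pushes training toward features that sit closer (in angle) to their class center.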
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- (CSDNblogs) 3D face reconstruction: study notes
- (CSDNblogs) PRNet face reconstruction: study notes
- (github) Python tools for 3D face: 3DMM, Mesh processing(transform, camera, light, render), 3D face representations.
- (zhihu) 1. Generating 2D images from a 3D mesh 2. Face 3DMM 3. 3D reconstruction from 2D images (3DMM)
- (website) searching '3D Face Reconstruction' in the website catalyzex
- (github) Awesome-Talking-Face (papers, code and projects)
- (github) awesome 3d human reconstruction --> 3d_human_face
- [Papers With Code Ranks][NoW Benchmark] [FaceScape] [D3DFACS] [AFLW2000-3D]
- [CelebA] (ICCV2015) Deep Learning Face Attributes in the Wild [project link] [zhihu-zhuanlan] [(ICLR2018 by NVIDIA) CelebA-HQ (paperswithcode), CelebA-HQ (tensorflow-download), CelebA-HQ (how to generate this dataset?), CelebA-HQ (upload by somebody)] [(CVPR2020 by MMLab) CelebAMask-HQ (codes)] [(CVPR2021 by MMLab) Multi-Modal-CelebA-HQ (codes)] [not a face reconstruction dataset]
- [Feng et al. using Stirling meshes (Stirling/ESRC Benchmark)] (FG2018) Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild [pdf page]
- [NoW ("Not quite in-the-Wild")] RingNet(CVPR2019) Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision [NoW Challenge]
- [FaceScape] FaceScape(CVPR2020) FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
- [FaceSynthetics] FaceSynthetics(ICCV2021) Fake It Till You Make It: Face analysis in the wild using synthetic data alone [synthetic face images with 70 landmarks]
- [DAD-3DHeads] DAD-3DNet(CVPR2022) DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image
- [Multiface] Multiface(arxiv2022.07) Multiface: A Dataset for Neural Face Rendering [github link][Facebook]
- Survey of optimization-based methods(CGForum2018) State of the Art on Monocular 3D Face Reconstruction, Tracking, and Applications [paper link][pdf page]
- Survey of face models(TOG2020) 3D Morphable Face Models—Past, Present, and Future [paper link][pdf page]
- Survey of regression-based methods(CSReview2021) Survey on 3D face reconstruction from uncalibrated images [paper link][pdf page]
- Survey on SOTA 3D reconstruction with single RGB image (arxiv2022) State of the Art in Dense Monocular Non-Rigid 3D Reconstruction [paper link]
- Blanz et al.(SIGGRAPH1999) A morphable model for the synthesis of 3D faces [paper link][3DMM of face/head][The seminal work of 3DMM]
- ⭐BFM(AVSS2009) A 3D Face Model for Pose and Illumination Invariant Face Recognition [paper link][project link][bfm2019 model downloading][Basel Face Model 2019 Viewer][3DMM of face/head (BFM)][Well-known 3DMM by University of Basel, Switzerland]
- LSFM(CVPR2016) A 3D Morphable Model learnt from 10,000 faces [paper link][project link][code|official][(IJCV2017) Large Scale 3D Morphable Models][3DMM of face/head (LSFM)][By the iBUG group at Imperial, UK]
- ⭐FLAME(SIGGRAPH2017) Learning a model of facial shape and expression from 4D scans [paper link][project link][code|official Chumpy FLAME fitting][code|official FLAME_PyTorch][code|official FLAME texture fitting][3DMM of face/head (FLAME)][MPII, Max Planck Institute]
- 3DMM-CNN(CVPR2017) Regressing Robust and Discriminative 3D Morphable Models with a very Deep Neural Network [paper link][code|official]
- MoFA(ICCV2017) MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction [paper link]
- VRN(ICCV2017) Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression [arxiv link][project link][online website][Codes|Torch7(official)]
- BIP(IJCV2018) Occlusion-Aware 3D Morphable Models and an Illumination Prior for Face Image Analysis [paper link][project link][code|official][Basel Illumination Prior 2017]
- PRNet(ECCV2018) Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network [arxiv link][Codes|TensorFlow(official)]
- LYHM(IJCV2019) Statistical Modeling of Craniofacial Shape and Texture [paper link][project link][3DMM of face/head (LYHM)][By Liverpool-York: Liverpool (UK) and the University of York (UK)]
- ⭐Syn&Real(ICCV2019) 3D Face Modeling From Diverse Raw Scan Data [paper link][codes|official][A subset of Stirling/ESRC 3D face database]
- 👍Deep3DFaceRecon(CVPRW2019) Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set [paper link][code|official][code|not official, a better version using PyTorch]
- ⭐RingNet(CVPR2019) Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision [paper link][project link][codes|official Tensorflow][NoW evaluation code][NoW challenge page][NoW dataset][FLAME based][MPII, Max Planck Institute]
- FaceScape(CVPR2020) FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction [paper link][project link][codes|official][3DMM of face/head (FaceScape) and 3D face dataset (FaceScape)][By NJU]
- UMDFA(ECCV2020) “Look Ma, no landmarks!” - Unsupervised, model-based dense face alignment [paper link][code|official (not released)]
- MGCNet(ECCV2020) Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency [paper link][code|official]
- ⭐3DDFA_V2(ECCV2020) Towards Fast, Accurate and Stable 3D Dense Face Alignment [paper link][codes|PyTorch 3DDFA_V2]
- ⭐SynergyNet(3DV2021) Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry [paper link][project link][codes|PyTorch]
- 👍H3D-Net(ICCV2021) H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [paper link][arxiv link][project link][H3DS Dataset]
- (ICCV2021) Towards High Fidelity Monocular Face Reconstruction with Rich Reflectance using Self-supervised Learning and Ray Tracing [paper link][MPII, Max Planck Institute]
- ToFu(ICCV2021) ToFu: Topologically Consistent Multi-View Face Inference Using Volumetric Sampling [paper link][arxiv link][project link][code|official][USC Institute for Creative Technologies and MPII, Max Planck Institute]
- HIFI3D(TOG2021) High-Fidelity 3D Digital Human Head Creation from RGB-D Selfies [paper link][project link][codes|official][3DMM of face/head (HIFI3D)][By Tencent]
- ⭐DECA(TOG2021)(SIGGRAPH2021) Learning an animatable detailed 3D face model from in-the-wild images [paper link][project link][code|official][MPII, Max Planck Institute]
- (TOG2021) Semi-supervised video-driven facial animation transfer for production [paper link][Digital Domain, transfer of facial expressions, based on unsupervised image-to-image translation]
- 👍FOCUS(arxiv2021) To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision [paper link][code|official]
- ⭐DAD-3DNet(CVPR2022) DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image [paper link][project link👍][codes|official PyTorch][benchmark challenge👍][DAD-3DHeads dataset][By pinatafarm]
- ImFace(CVPR2022) ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations [paper link][arxiv link][code | official][Beihang University]
- REALY(ECCV2022) REALY: Rethinking the Evaluation of 3D Face Reconstruction [paper link][project link][codes|official][blogs|zhihu][3DMM of face/head (HIFI3D++) and 3D face dataset (REALY)][By Tsinghua]
- DenseLandmarks(ECCV2022) 3D Face Reconstruction with Dense Landmarks [paper link][project link][Microsoft]
- MICA(ECCV2022) Towards Metrical Reconstruction of Human Faces [paper link][project link][code|official][used multiple datasets][SoTA results in NoW][MPII, Max Planck Institute]
- JMLR(ECCVW2022) Perspective Reconstruction of Human Faces by Joint Mesh and Landmark Regression [paper link][code|official]
- ⭐DSFNet(CVPR2023) DSFNet: Dual Space Fusion Network for Occlusion-Robust Dense 3D Face Alignment [paper link][arxiv link][paperwithcode link][code|official][Head Pose Estimation + Face Alignment + 3D Face Reconstruction]
- TEMPEH(CVPR2023) Instant Multi-View Head Capture Through Learnable Registration [paper link][arxiv link][project link][code|official][MPII, Max Planck Institute, based on ToFu(ICCV2021)]
- 3DDFA+ & DAD-3DNet+ (CVPR2023) 3D-Aware Facial Landmark Detection via Multi-View Consistent Training on Synthetic Data [paper link][project link][Texas A&M University, new dataset DAD-3DHeads-Syn based on NeRF]
- FOCUS(CVPR2023) Robust Model-based Face Reconstruction through Weakly-Supervised Outlier Segmentation [paper link][arxiv link][code|official][the accepted paper of FOCUS(arxiv2021), Weakly-Supervised Learning]
- TokenHead (ICCV2023) Accurate 3D Face Reconstruction with Facial Component Tokens [paper link][THU(Shenzhen) + IDEA]
- HiFace (ICCV2023) HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details [paper link][arxiv link][project link][Microsoft, based on DenseLandmarks(ECCV2022)]
- SIRA++(arxiv2023.10) Implicit Shape and Appearance Priors for Few-Shot Full Head Reconstruction [arxiv link][few-shot learning][extended journal version of SIRA(WACV2023) --> SIRA: Relightable Avatars From a Single Image [paper link][arxiv link]]
- 3DDFA-V3(arxiv2023.12)(CVPR2024) 3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation [arxiv link][code|official][tested on the dataset REALY]
- PPR-CNet(Computers & Graphics 2023)(CCF C) 3D face reconstruction from a single image based on hybrid-level contextual information with weak supervision [paper link][no code is available, Xinjiang University]
- ImFace++(arxiv2023.12) ImFace++: A Sophisticated Nonlinear 3D Morphable Face Model with Implicit Neural Representations [arxiv link][code|official][Beihang University, the extended journal version of ImFace]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
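- (zhihu) Understanding YOLO V5 and YOLO V4 in one article
- (zhihu) How should YOLOv5 be evaluated?
- (csdn blog) Introduction to the YOLO V1, V2, V3 object detection series
- (csdn blog) Smart Object Detection 26: building a YOLOv3 object detection platform in PyTorch
- (csdn blog) Smart Object Detection 30: building a YOLOv4 object detection platform in PyTorch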
- YOLOv5(2020) YOLOv5 is from the family of object detection architectures YOLO and has no paper [YOLOv5 Docs]
- ThroughHand (CHI2021) ThroughHand: 2D Tactile Interaction to Simultaneously Recognize and Touch Multiple Objects [paper link][a novel tactile interaction that enables users with visual impairments to interact with multiple dynamic objects in real time, utilize the potential of the human tactile sense, enable users to perceive the objects using the palm]
- 👍SoloFinger (CHI2021) SoloFinger: Robust Microgestures while Grasping Everyday Objects [paper link][project link][Input / Spatial Interaction / Practice Support, 36 everyday hand-object actions, simple SoloFinger gestures can relieve the need for complex finger configurations or delimiting gestures]
- Gaze-Supported (CHI2021) Gaze-Supported 3D Object Manipulation in Virtual Reality [paper link][Input / Spatial Interaction / Practice Support, investigates integration, coordination, and transition strategies of gaze and hand input for 3D object manipulation in VR, help guide the design of future VR systems that incorporate gaze input for 3D object manipulation]
- ARnnotate (UIST2022)(CCF-A) ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation [paper link][pdf link][Purdue University, application in Augmented Reality]
- Ubi-TOUCH (UIST2023)(CCF-A) Ubi-TOUCH: Ubiquitous Tangible Object Utilization through Consistent Hand-object interaction in Augmented Reality [paper link][Purdue University, application in Augmented Reality]
- InstruMentAR (CHI2023) InstruMentAR: Auto-Generation of Augmented Reality Tutorials for Operating Digital Instruments Through Recording Embodied Demonstration [paper link][pdf link][Purdue University, application in Augmented Reality]
including Crowd Person Detection, Pedestrian Detection, Crowded Pedestrian Detection
- ReInspect, Lhungarian(CVPR2016) End-To-End People Detection in Crowded Scenes [arxiv link]
- PRNet(ECCV2020) Progressive Refinement Network for Occluded Pedestrian Detection [paper link][code|official][for Crowded Human Detection]
- Pedestron(CVPR2021) Generalizable Pedestrian Detection: The Elephant In The Room [paper link][code|official][Pedestrian Detection]
- OTP-NMS(TIP2023) OTP-NMS: Towards Optimal Threshold Prediction of NMS for Crowded Pedestrian Detection [paper link][CrowdHuman and CityPersons datasets, HNU]
- VLPD(CVPR2023) VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision [arxiv link][code|official][Vision-Language semantic self-supervision for context-aware Pedestrian Detection]
- LSFM (Localized Semantic Feature Mixers)(CVPR2023) Localized Semantic Feature Mixers for Efficient Pedestrian Detection in Autonomous Driving [paper link][Caltech, CityPersons, Euro City Persons, and TJU-Traffic-Pedestrian datasets][LSFM beats the human baseline for the first time in the history of pedestrian detection]
- SSCP (Sample Selection for Crowded Pedestrians)(arxiv2023.05) Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection [arxiv link][CrowdHuman and CityPersons datasets]
- OPL (Optimal Proposal Learning)(CVPR2023) Optimal Proposal Learning for Deployable End-to-End Pedestrian Detection [paper link][code is not available][BUPT]
- LOAF (ICCV2023) Large-Scale Person Detection and Localization Using Overhead Fisheye Cameras [paper link][project link][arxiv link][code|official][dataset, BUPT-PRIV]
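Several entries above (OTP-NMS, for example) concern the IoU threshold used by non-maximum suppression (NMS) in crowded scenes. The sketch below is plain greedy NMS with a fixed threshold, not the method of any listed paper. The box format (x1, y1, x2, y2), the function names and the toy numbers are assumptions made for illustration. It shows why the threshold matters: a strict threshold removes the duplicate detection but also suppresses a genuinely separate, heavily overlapping person, while a looser one keeps both people.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Visit boxes by descending score; keep a box only if it overlaps no kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two pedestrians standing close together (boxes 0 and 1)
# plus one near-duplicate of box 0 (box 2).
boxes = [(0, 0, 10, 20), (4, 0, 14, 20), (0.5, 0, 10.5, 20)]
scores = [0.9, 0.8, 0.7]

strict = greedy_nms(boxes, scores, iou_threshold=0.3)
loose = greedy_nms(boxes, scores, iou_threshold=0.5)
print(strict, loose)
```

Here boxes 0 and 1 have an IoU of about 0.43, so the strict setting keeps only box 0 while the loose setting keeps boxes 0 and 1; the near-duplicate box 2 is removed in both cases.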
including Hand Detection, Hand Tracking, Hand-Object Contact, Hand Pressure Estimation, Hand-Object Interaction, Hand Contact Reconstruction and Hand-Object Manipulation
- Hand_detection_rotation_estimation(TIP2017) Joint Hand Detection and Rotation Estimation Using CNN [paper link][arxiv link]
- Hand-CNN(hand_det_attention)(ICCV2019) Contextual Attention for Hand Detection in the Wild [paper link][project][code|official]
- ⭐BodyHands(CVPR2022) Whose Hands Are These? Hand Detection and Hand-Body Association in the Wild [paper link][project link][code|official][CVLab@StonyBrook][joint detection of person body and hands][BodyHands dataset]
- ⭐HandLer(CVPR2022) Forward Propagation, Backward Regression, and Pose Association for Hand Tracking in the Wild [paper link][project link][code|official][CVLab@StonyBrook][YoutubeHands dataset, Hand-tracking]
including Head Detection, Head Counting
- HollywoodHeads(ICCV2015) Context-Aware CNNs for Person Head Detection [paper link][project link][It introduces a large dataset with 369,846 human heads annotated in 224,740 movie frames.]
- DA-RCNN(arxiv2018) Double Anchor R-CNN for Human Detection in a Crowd [arxiv link][CSDN blog1][CSDN blog2]
- FCHD(arxiv2018,ICIP2019) FCHD: Fast and accurate head detection in crowded scenes [arxiv link][Codes|PyTorch(official)][CSDN blog]
- LSC-CNN(TPAMI2020) Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection [arxiv link][Codes|Pytorch(official)]
- PedHunter(AAAI2020) PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes [paper link][joint body-head detection]
- ⭐JointDet(AAAI2020) Relational Learning for Joint Head and Human Detection [paper link][codes|not released]
- FastNFusion(PRCV2021) Fast and Fusion: Real-Time Pedestrian Detector Boosted by Body-Head Fusion [paper link][Pedestrian Detector using Body-Head Association]
- ⭐BFJDet(ICCV2021) Body-Face Joint Detection via Embedding and Head Hook [paper link][codes|official][joint detection of person body, head and face]
- HeadHunter(CVPR2021) Tracking Pedestrian Heads in Dense Crowd [paper link][project link][code|official][Head_Tracking_21 challenge][Pedestrian Tracking]
- 👍Head-body-Tracking(arxiv2023.04) Handling Heavy Occlusion in Dense Crowd Tracking by Focusing on the Heads [arxiv link]
- 👍👍PanoHead(CVPR2023) PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° [paper link][arxiv link][project link][code|official]
- VGGHeads(arxiv2024.07) VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads [arxiv link][project link][code|official][University of Oxford + Ukrainian Catholic University + PiñataFarms AI][Diffusion-based image generation]
including Human-Parts Detection, Human Activity Understanding, Human and Object Reconstruction, Human-Aware Object Placement, Human-Scene Contact, Human-Object Contact, Human-Object Interaction Tracking and Close Human Interaction
- DID-Net(ACCV2018) Detector-in-Detector: Multi-level Analysis for Human-Parts [paper link][code | official][HumanParts dataset]
- PROX(ICCV2019) Resolving 3D Human Pose Ambiguities with 3D Scene Constraints [paper link][project link][MPII, The contact constraint encourages specific parts of the body to be in contact with scene surfaces if they are close enough in distance and orientation.]
- ⭐Hier-R-CNN(TIP2020) Hier R-CNN: Instance-Level Human Parts Detection and A New Benchmark [paper link][code|official][Mask R-CNN as Backbone][FCOS as Hier Branch which needs many hand-crafted tricks][COCOHumanParts dataset]
- ContactDynamics(ECCV2020) Contact and Human Dynamics from Monocular Video [paper link][project link][code|official][Stanford University, Adobe Research]
- PaStaNet(CVPR2020) PaStaNet: Toward Human Activity Knowledge Engine [paper link][project link][SJTU, body-part state annotations in the context of HOI][HAKE 1.0 (Human Activity Knowledge Engine) dataset]
- CHORE(ECCV2022) CHORE: Contact, Human and Object Reconstruction from a Single RGB Image [paper link][project link][MPII, single-person, reasons about the interactions and recovers the spatial arrangement and fine-grained contacts between the human and the object]
- MOVER(CVPR2022) Human-Aware Object Placement for Visual Environment Reconstruction [paper link][project link][code|official][human-scene interactions (HSIs), MPII]
- 👍BSTRO(Body-Scene contact TRansfOrmer)(CVPR2022) Capturing and Inferring Dense Full-Body Human-Scene Contact [paper link][project link][code|official][dataset RICH, Interaction-Contact-Humans, MPII, single-person]
- HAKE(TPAMI2023) HAKE: A Knowledge Engine Foundation for Human Activity Understanding [paper link][arxiv link][project link][HAKE 2.0 (Human Activity Knowledge Engine) dataset]
- 👍HOT(CVPR2023) Detecting Human-Object Contact in Images [paper link][project link][Max Planck Institute, HOT dataset, single-person]
- VisTracker(CVPR2023) Visibility Aware Human-Object Interaction Tracking from Single RGB Camera [arxiv link][project link][MPII, An approach to jointly track the human, the object and the contacts between them, in 3D, from a monocular RGB video.]
- Hi4D(Humans interacting in 4D)(CVPR2023) Hi4D: 4D Instance Segmentation of Close Human Interaction [arxiv link][project link][ETH Zürich, A dataset of humans in close physical interaction]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
also 2D/3D Hand Keypoints Detection or Hand Shape Estimation or 3D Hand Shape and Pose Regression
- 👍 (github)(Hand3DResearch) Recent Progress in 3D Hand Tasks [github link]
- (github) awesome 3d human reconstruction --> 3d_human_hand [github link]
- 👍 (github) Awesome work on hand pose estimation/tracking [github link]
- [HANDS17: (arxiv2017)] The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation [arxiv link][
3D Hand Pose Estimation
,21 joints
] - [FreiHAND: (ICCV2019)] FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images [paper link][arxiv link][
University of Freiburg
,A dataset that uses MANO
] - [ObMan: (CVPR2019)] Learning joint reconstruction of hands and manipulated objects [paper link][arxiv link][
MPII
,hand-object manipulations
,A dataset that uses MANO
,A new large-scale synthetic dataset with hand-object manipulations
] - [InterHand2.6M: (ECCV2020)] InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image [paper link][github link][
facebookresearch
,A dataset that uses MANO
] - [GanHand or YCB_Affordance: (CVPR2020)] GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes [paper link][github link (method)][github link (dataset)][
A dataset that uses MANO
,Human Grasp Affordances
] - [HO-3D: (CVPR2020)] HOnnotate: A method for 3D Annotation of Hand and Object Poses [arxiv link][github link 1][github link 2][
The first markerless dataset of color images with 3D annotations of both hand and object
,This dataset is currently made of 80,000 frames, 65 sequences, 10 persons, and 10 objects
] - [YouTube 3D Hands: (CVPR2020 Oral)] Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild [arxiv link][github link][
Ariel AI
] - [HanCo: (GCPR2021)] Contrastive Representation Learning for Hand Shape Estimation [paper link][arxiv link][
University of Freiburg
,A dataset that uses MANO
,An extended version of FreiHAND with calibration and multiple-views
] - [DARTset: (NIPS2022)] DART: Articulated Hand Model with Diverse Accessories and Rich Textures [arxiv link][github link][
Alibaba XR Lab + MPII + SJTU
][A dataset (DARTset) that uses MANO and proposes a new hand morphable model DART
,for hand pose estimation & surface reconstruction tasks
,with large-scale (800K), diverse, and high-fidelity hand images, paired with perfect-aligned 3D labels
]
- Survey(IJCV2023) Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey [paper link][arxiv link][slides link][
The code is not available
][University of Tokyo + ETH
, the first author Take Ohkawa (大川 武彦)
]
-
MANO (TOG2017, SIGGRAPH ASIA 2017) Embodied Hands: Modeling and Capturing Hands and Bodies Together [paper link][arxiv link][project link (keep updating)][
MPII
,It attempts to learn hand shape variation with Linear Blend Skinning (LBS)
[SIGGRAPH 2000]][it learns from a large variety of high-quality hand scans and represents the geometric changes in the low-dimensional pose and shape space
] -
NIMBLE (TOG2022) NIMBLE: A Non-rigid Hand Model with Bones and Muscles [paper link][arxiv link][project link][code|official][
ShanghaiTech University
]
- hand3d(ICCV2017) Learning to Estimate 3D Hand Pose From Single RGB Images [paper link][arxiv link][project link][code|official][
University of Freiburg
, new dataset Rendered Hand Pose Dataset (RHD)
,3D Hand Keypoints Detection
]
also 3D Hand Shape and Pose Regression
-
(ECCV2018) Hand Pose Estimation via Latent 2.5D Heatmap Regression [paper link][arxiv link][
No code is available
,NVIDIA
] -
(CVPR2019) 3D Hand Shape and Pose From Images in the Wild [paper link][arxiv link][
No code is available
, based on MANO
] -
(CVPR2019) Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering [paper link][arxiv link][
No code is available
, based on MANO
] -
Hand+Object(CVPR2019) H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions [paper link][arxiv link][
No code is available
,6DoF Object Pose Estimation
+3D Hand Keypoints Detection
] -
👍hand-graph-cnn(CVPR2019) 3D Hand Shape and Pose Estimation From a Single RGB Image [paper link][arxiv link][code|official][based on
MANO
,2D/3D Hand Keypoints Detection
+3D Hand Mesh
] -
HAMR(ICCV2019) End-to-End Hand Mesh Recovery From a Monocular RGB Image [paper link][arxiv link][code|official][based on
MANO
,2D/3D Hand Keypoints Detection
+3D Hand Mesh
] -
👍MobileHand(ICONIP2020) MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image [paper link][project link][code|official][
Nanyang Technological University
][based on MANO
,2D/3D Hand Keypoints Detection
+3D Hand Mesh
] -
I2L-MeshNet(ECCV2020) I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image [paper link][arxiv link][code|official][
Seoul National University
,whole body and related hands
] -
mesh_hands(CVPR2020) Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild [paper link][arxiv link][project link][based on
MANO
] -
RGB2Hands(SIGGRAPH Asia 2020) RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video [paper link][arxiv link][project link][new dataset
RGB2Hands
, based on MANO
] -
InterShape(ICCV2021) Interacting Two-Hand 3D Pose and Shape Reconstruction From Single Color Image [paper link][pdf link][project link][code|official][
Yangang Wang
, based on MANO
, using the dataset InterHand2.6M
] -
👍MobRecon(CVPR2022) MobRecon: Mobile-Friendly Hand Mesh Reconstruction From Monocular Image [paper link][arxiv link][code|official][
Kuaishou Technology
] -
IntagHand(CVPR2022) Interacting Attention Graph for Single Image Two-Hand Reconstruction [paper link][arxiv link][code|official][based on
MANO
, using the dataset InterHand2.6M
] -
👍HandOccNet(CVPR2022) HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network [paper link][arxiv link][code|official][based on
MANO
] -
MeMaHand(CVPR2023) MeMaHand: Exploiting Mesh-Mano Interaction for Single Image Two-Hand Reconstruction [paper link][arxiv link][
ByteDance
,No code is available
, based on MANO
, compared to IntagHand
and InterShape
] -
Im2Hands(CVPR2023) Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes [paper link][arxiv link][project link][code|official][
KAIST
, compared to IntagHand
, based on HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands (3DV 2021)
and Occupancy Networks
] -
ACR(CVPR2023) ACR: Attention Collaboration-Based Regressor for Arbitrary Two-Hand Reconstruction [paper link][arxiv link][code|official][
Tencent AI Lab
, based on MANO
, compared to IntagHand
] -
InterWild(CVPR2023) Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild [paper link][arxiv link][code|official][
facebookresearch
, single author Gyeongsik Moon
, based on MANO
, compared to IntagHand
] -
H2ONet(CVPR2023) H2ONet: Hand-Occlusion-and-Orientation-aware Network for Real-time 3D Hand Mesh Reconstruction [paper link][code|official][
CUHK
, first author Hao XU (徐昊)
, tested on datasets DexYCB
and HO3D
] -
DIR (ICCV2023 Oral) Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image [paper link][arxiv link][project link][code|official][
PICO IDL ByteDance
+BUPT
]
including Sign Language Recognition
and Sign Language Translation
-
BSL(ECCV2020) BSL-1K: Scaling Up Co-articulated Sign Language Recognition Using Mouthing Cues [paper link]
-
HMA(AAAI2021) Hand-Model-Aware Sign Language Recognition [paper link][
Sign Language Recognition (SLR)
] -
SignBERT (ICCV2021) SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition [paper link][arxiv link][
Sign Language Recognition (SLR)
] -
👍SignBERT+ (TPAMI2023) SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding [paper link][arxiv link][project link][
Sign Language Understanding (SLU)
]
-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-|-+-
- (tutorial & blog) Head Pose Estimation using OpenCV and Dlib
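- (blogs) Face pose estimation based on Dlib and OpenCV (HeadPoseEstimation)
- (blogs) Face pose estimation using opencv and dlib (python)
- (cnblogs) paper 154: A summary of pose estimation (Hand Pose Estimation)
- (blogs) solvepnp 3D pose estimation | PnP monocular camera pose estimation (Parts 1, 2, 3)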
- (github) OpenFace 2.2.0: a facial behavior analysis toolkit
- (github) Deepgaze contains useful packages including Head Pose Estimation
- (github) [Suggestion] Annotate rigid objects in 2D image with standard 3D cube
- (github) head pose estimation system based on 3d facial landmarks (3DDFA_v2)
- (paper-CVPR2019) On the Continuity of Rotation Representations in Neural Networks (the 6D representation is the most suitable for head pose)
- (blogs) What is The Difference Between 2D and 3D Image Annotations: Use Cases
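- (zhihu) How to explain Euler angles in plain terms? Why are quaternions introduced afterwards?
- (blogs) Conversion between quaternions and Euler angles (Yaw, Pitch, Roll)
- (blogs) Quaternions and rotation + Euler angles
- (blogs) Understanding Quaternions (Chinese translation)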
- [Head Pose Estimation on AFLW2000], [Head Pose Estimation on BIWI ranking]
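The 6D rotation representation referenced in the CVPR2019 entry above maps a 6D vector to a rotation matrix by Gram-Schmidt orthogonalization of two 3D vectors. Below is a minimal NumPy sketch of that mapping and its inverse. The function names `rotation_from_6d` and `rotation_to_6d` and the column-stacking convention are assumptions made for illustration; they are not taken from any specific repository listed here.

```python
import numpy as np


def rotation_from_6d(x):
    """Map a 6D vector to a 3x3 rotation matrix via Gram-Schmidt.

    The first three entries are treated as an unnormalized first column,
    the last three as an unnormalized second column (assumed convention).
    """
    x = np.asarray(x, dtype=float)
    a1, a2 = x[:3], x[3:]
    # Normalize the first vector.
    b1 = a1 / np.linalg.norm(a1)
    # Remove the component of a2 along b1, then normalize.
    a2_orth = a2 - np.dot(b1, a2) * b1
    b2 = a2_orth / np.linalg.norm(a2_orth)
    # The third column is fixed by the right-hand rule.
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=1)


def rotation_to_6d(R):
    """Keep the first two columns of a rotation matrix as a 6D vector."""
    R = np.asarray(R, dtype=float)
    return np.concatenate([R[:, 0], R[:, 1]])


if __name__ == "__main__":
    R = rotation_from_6d([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    print(np.round(R.T @ R, 6))  # should be close to the identity
```

The input must contain two non-zero, non-parallel 3D vectors, otherwise one of the normalizations divides by zero; a real training pipeline would usually add a small epsilon there.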
- BIWI Kinect Head Pose Database: (IJCV2013) Random forests for real time 3d face analysis[
pitch-yaw-roll
] - 300W-LP & AFLW2000: (CVPR2016) Face Alignment Across Large Poses: A 3D Solution[
pitch-yaw-roll
] - LPHD: (ICME2019) LPHD: A Large-Scale Head Pose Dataset for RGB Images[
pitch-yaw-roll
][un-released
] - S-HOCK: (CVIU2017) The S-Hock dataset: A new benchmark for spectator crowd analysis[paper link][
far left, left, frontal, right, far right, away, down
] - SynHead: (CVPR2017) Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network[paper link][
NVIDIA Synthetic Head Dataset (SynHead)
]
-
⭐Survey(TPAMI2009) Head Pose Estimation in Computer Vision: A Survey [paper link][CSDN blog]
-
Survey(SPI2021) Head pose estimation: A survey of the last ten years [paper link]
-
Survey(PR2022) Head pose estimation: An extensive survey on recent techniques and applications [paper link]
-
HyperFace(TPAMI2017) HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition [paper link]
-
(Neurocomputing2018) Appearance based pedestrians head pose and body orientation estimation using deep learning [paper link][
eight orientation bins
] -
HeadFusion(TPAMI2018) HeadFusion: 360 Head Pose Tracking Combining 3D Morphable Model and 3D Reconstruction [paper link]
-
⭐QuatNet(TMM2019) Quatnet: Quaternion-based head pose estimation with multiregression loss [paper link][
unit quaternion representation
] -
(IVC2020) Improving head pose estimation using two-stage ensembles with top-k regression [paper link]
-
MLD(TPAMI2020) Head Pose Estimation Based on Multivariate Label Distribution [paper link]
-
⭐MNN(TPAMI2021) Multi-Task Head Pose Estimation in-the-Wild [paper link][codes|Tensorflow / C++]
-
⭐MFDNet(TMM2021) MFDNet: Collaborative Poses Perception and Matrix Fisher Distribution for Head Pose Estimation [paper link][
matrix representation
] -
⭐2DHeadPose(NN2023) 2DHeadPose: A simple and effective annotation method for the head pose in RGB images and its dataset [paper link][codes|official][
annotation tool, dataset, and source code
] -
6dof_face(TIP2023) Towards 3D Face Reconstruction in Perspective Projection: Estimating 6DoF Face Pose from Monocular Image [paper link][code|official]
-
CIT(IJCV2023) Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose [paper link][code|official][
SYSU
,Facial Landmark
+Head Pose
] -
TokenHPE(TIP2023) Orientation Cues-Aware Facial Relationship Representation for Head Pose Estimation via Transformer [paper link][The journal version of the conference paper
TokenHPE(CVPR2023)
] -
OPAL(SRHP+WRHP)(PR2024) On the representation and methodology for wide and short range head pose estimation [paper link][arxiv link][
Universidad Politécnica de Madrid
] -
HeadDiff(TIP2024) HeadDiff: Exploring Rotation Uncertainty with Diffusion Models for Head Pose Estimation [paper link][
Ningxia University
] -
HHP-Net-Plus(CVIU2024) Head pose estimation with uncertainty and an application to dyadic interaction detection [paper link][code|official][
Università degli Studi di Genova, Italy
, the extended journal version of HHP-Net(WACV2022)
]
-
(ITSC2014) Head detection and orientation estimation for pedestrian safety [paper link]
-
Dlib(68 points)(CVPR2014) One Millisecond Face Alignment with an Ensemble of Regression Trees [paper link]
-
⭐3DDFA(CVPR2016) Face Alignment Across Large Poses: A 3D Solution [paper link]
-
⭐FAN(12 points)(ICCV2017) How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks) [paper link]
-
KEPLER(FG2017) KEPLER: Keypoint and Pose Estimation of Unconstrained Faces by Learning Efficient H-CNN Regressors [paper link]
-
FasterRCNN+regression(ACCV2018) Simultaneous Face Detection and Head Pose Estimation: A Fast and Unified Framework [paper link][dataset|AFW and AFLW dataset: from coarse face pose by using Subcategory to generate 12 clusters to fine Euler angles prediction][
following the HyperFace
] -
WNet(ACCVW2018) WNet: Joint Multiple Head Detection and Head Pose Estimation from a Spectator Crowd Image [paper link][dataset|spectator crowd S-HOCK dataset: rough orientation labels]
-
SSR-Net-MD(IJCAI2018) SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation [paper link][codes|Tensorflow+Dlib+MTCNN][
Inspired FSA-Net
] -
⭐HopeNet(CVPRW2018) Fine-Grained Head Pose Estimation Without Keypoints [arxiv link][Codes|PyTorch(official)][CSDN blog]
-
HeadPose(FG2019) Improving Head Pose Estimation with a Combined Loss and Bounding Box Margin Adjustment [paper link][codes|TensorFlow]
-
⭐FSA-Net(CVPR2019) FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation from a Single Image [paper link][Codes|Keras&Tensorflow(official)][Codes|PyTorch(unofficial)]
-
PADACO(ICCV2019) Deep Head Pose Estimation Using Synthetic Images and Partial Adversarial Domain Adaption for Continuous Label Spaces [paper link][project link][
SynHead and BIWI --> SynHead++, SynBiwi+, Biwi+
] -
⭐WHENet(BMVC2020) WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose [arxiv link][Codes|Keras&tensorflow(official)][codes|PyTorch(unofficial)][codes|DMHead(unofficial)]
-
RAFA-Net(ACCV2020) Rotation Axis Focused Attention Network (RAFA-Net) for Estimating Head Pose [paper link][codes|keras+tensorflow]
-
⭐FDN(AAAI2020) FDN: Feature decoupling network for head pose estimation [paper link]
-
Rankpose(BMVC2020) RankPose: Learning Generalised Feature with Rank Supervision for Head Pose Estimation [paper link][codes|PyTorch][
vector representation
] -
⭐3DDFA_V2(ECCV2020) Towards Fast, Accurate and Stable 3D Dense Face Alignment [paper link][codes|PyTorch 3DDFA_V2][
3D Dense Face Alignment
,3D Face Reconstruction
,3DMM
,Lightweight
] -
EVA-GCN(CVPRW2021) EVA-GCN: Head Pose Estimation Based on Graph Convolutional Networks [paper link][codes|PyTorch]
-
⭐TriNet(WACV2021) A Vector-Based Representation to Enhance Head Pose Estimation [paper link][codes|Tensorflow+Keras][
vector representation
] -
⭐img2pose(CVPR2021) img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation [paper link][codes|PyTorch]
-
⭐OsGG-Net(ACMMM2021) OsGG-Net: One-step Graph Generation Network for Unbiased Head Pose Estimation [paper link][codes|PyTorch]
-
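(KSE2021) Simultaneous face detection and 360 degree head pose estimation [paper link][The paper uses an FPN + multi-task approach to detect heads and estimate head pose at the same time; the datasets used are mainly CMU-Panoptic, 300WLP and BIWI. For the head pose representation, it uses a Rotation Matrix in addition to Euler angles]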
-
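(KSE2021) UET-Headpose: A sensor-based top-view head pose dataset [paper link][The whole paper describes the hardware system used to collect the dataset, but the dataset has not been released; the HPE algorithm is FSA-Net, extended following the idea in WHENet into a full-range 360° single-person head pose estimation method]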
-
(FG2021) Relative Pose Consistency for Semi-Supervised Head Pose Estimation [paper link][pdf link][
Semi-Supervised
] -
HeadPosr(FG2021) HeadPosr: End-to-end Trainable Head Pose Estimation using Transformer Encoders [paper link][arxiv link][
Naina Dhingra
] -
⭐SynergyNet(3DV2021) Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry [paper link][project link][codes|PyTorch]
-
⭐MOS(BMVC2021) MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation [paper link][codes|PyTorch][
re-annotate the WIDER FACE with head pose label
] -
LwPosr(WACV2022) LwPosr: Lightweight Efficient Fine Grained Head Pose Estimation [paper link][
Naina Dhingra
] -
HHP-Net(WACV2022) HHP-Net: A Light Heteroscedastic Neural Network for Head Pose Estimation With Uncertainty [paper link][codes|TensorFlow]
-
⭐6DRepNet(ICIP2022) 6D Rotation Representation For Unconstrained Head Pose Estimation [paper link][codes|PyTorch+RepVGG][Journal Version (6DRepNet360) -->
Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation
][vector representation
] -
⭐DAD-3DNet(CVPR2022) DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image [paper link][project link👍][codes|official PyTorch][benchmark challenge👍][
DAD-3DHeads dataset
, by pinatafarm
][used as an off-the-shelf head pose estimator
in HairNeRF(ICCV2023)] -
TokenHPE(CVPR2023) TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers [paper link][code|official][
Transformer-based method
] -
⭐DSFNet(CVPR2023) DSFNet: Dual Space Fusion Network for Occlusion-Robust Dense 3D Face Alignment [paper link][arxiv link][paperwithcode link][code|official][
Head Pose Estimation
+Face Alignment
+3D Face Reconstruction
] -
PFA(arxiv2023.08) 3D Face Alignment Through Fusion of Head Pose Information and Features [arxiv link][
Soongsil University
] -
OrdinalRegression(ICASSP2024) Language-Driven Ordinal Learning for Imbalanced Head Pose Estimation [paper link][
Ningxia University
] -
StructuredLight(ICASSP2024) Adaptive Head Pose Estimation with Real-Time Structured Light [paper link][
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China
] -
FaceXFormer(arxiv2024.03) FaceXFormer: A Unified Transformer for Facial Analysis [arxiv link][project link][code|official][
Johns Hopkins University
] -
HPE-CogVLM(arxiv2024.06) HPE-CogVLM: New Head Pose Grounding Task Exploration on Vision Language Model [arxiv link][
Docomo Innovations + Santa Clara University
] -
TRG(ECCV2024)(arxiv2024.07) 6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry [arxiv link][code|official][
Kwangwoon University
] -
Lercpose(ICIP2024) Lercpose: Learned Ranking and Contrastive Loss for Robust Head Pose Estimation [paper link][code|official][
Mercedes-Benz R&D, India
]