Project for the Challenge Cup 2023 Huawei Industrial Contest: Fatigue Driving Detection
Knowledge stacks involved: face detection, facial keypoint recognition, and sequential judgement.
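To give a sense of how those pieces fit together, here is a minimal sketch of the per-frame pipeline. The function names, callables, and window size are illustrative placeholders, not the project's actual API.

```python
from collections import deque

WINDOW = 90  # ~3 s of history at 30 fps (illustrative)

def process_stream(frames, detect_face, get_keypoints, judge_fatigue):
    """Per-frame pipeline: face detection -> keypoints -> sliding-window judgement."""
    history = deque(maxlen=WINDOW)
    for frame in frames:
        box = detect_face(frame)                   # stage 1: face detection
        kps = get_keypoints(frame, box) if box is not None else None
        history.append(kps)                        # stage 2: facial keypoints
        yield judge_fatigue(history)               # stage 3: sequential judgement
```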
For the key models, we've tried the following:
- YOLOv5 + Dlib
- YOLOv5 + SPIGA
- YOLOv7 + SPIGA
- YOLOv7 + Retina (baseline)
- YOLOv8 + SPIGA
- YOLOv8 + Retina (last submit)
- The YOLOv5s model is slightly inferior to YOLOv8n in both accuracy and speed.
- The YOLOv7 model takes several times longer to train than a v8 model with a similar parameter count.
- SPIGA is a SOTA model for facial keypoint detection, but that "SOTA" means accuracy on large benchmark datasets, not the accuracy/performance trade-off. Its average inference time on the officially required hardware (an old 2-core, 8 GB CPU) reached an astonishing 1.404 s per frame (😓). It was eventually eliminated because of its poor edge-deployment capability.
Just so you know, at 640 × 640 (a rough timing sketch follows this list):
- YOLOv8: 230 ms per frame
- Retina: 56 ms per frame
- Retina is the model referenced by the official baseline, and the one we finally chose. It performs well in a car-interior environment where occlusion and lighting conditions are not complicated.
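For the curious, here is a minimal sketch of how such per-frame numbers can be collected. The `infer` callable stands in for whichever model is being profiled; this is not the project's actual benchmarking code.

```python
import time

def mean_latency(infer, frames, warmup=5):
    """Average wall-clock seconds per frame, excluding warm-up iterations."""
    for frame in frames[:warmup]:
        infer(frame)  # warm-up: first runs pay one-off initialization costs
    start = time.perf_counter()
    for frame in frames[warmup:]:
        infer(frame)
    return (time.perf_counter() - start) / max(1, len(frames) - warmup)
```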
Where to find things:
- BaselineLandmark/detectionx.2: our final bow
- submit/detection8: the last act of YOLOv8 + SPIGA
- BaselineLandmark/makeup: our backstage pass for tweaking Retina + YOLOv8
Why does the main code look like one huge pile? Didn't we do any encapsulation?
- To stick close to the baseline's submission format and make tuning easier, the main program stays away from encapsulation. Even though it ends up looking similar to the baseline, it was written without any peeking at the baseline code, which gave the person who later started porting quite a headache.
- And with this whole project being a solo, linear gig, "everyone" knew their role inside out, so there was no need to worry about others not getting it. Nobody had the time or energy anyway, and let's be honest, nobody was reading it except me.
The best way to divide the work? One person covers all bases.
How'd we fare in the competition?
The competition judged us on F1-score (accuracy) and speed. On the facial-keypoint front, pulling in sequential pattern recognition let us cover all the bases. Top 50 now!
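For a taste of what that sequential judgement can look like, here is a hedged sketch of one common approach: compute an eye aspect ratio (EAR) per frame from eye landmarks, then flag fatigue when the fraction of recent closed-eye frames (PERCLOS-style) gets too high. The 6-point eye layout, thresholds, and window handling are illustrative, not the values we actually tuned.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR for one eye, given 6 (x, y) landmarks in dlib's contour order."""
    a = np.linalg.norm(eye[1] - eye[5])  # vertical distance 1
    b = np.linalg.norm(eye[2] - eye[4])  # vertical distance 2
    c = np.linalg.norm(eye[0] - eye[3])  # horizontal distance
    return (a + b) / (2.0 * c)

def is_fatigued(ear_history, ear_thresh=0.2, perclos_thresh=0.5):
    """True if the share of closed-eye frames in the window is too high."""
    if not ear_history:
        return False
    closed = [ear < ear_thresh for ear in ear_history]
    return sum(closed) / len(closed) > perclos_thresh
```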
Any hurdles during the competition?
- The dataset was a beast: 2066 video sequences of 8 s each.
- The documentation for Huawei Cloud ModelArts' online deployment feature (which we had to use for submission) was as clear as mud; practically nothing was there. We had to play detective and work out how it behaves before we could make a stable submission 😓.
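For reference, this is roughly the skeleton we converged on for the custom inference service, based on our reading of the ModelArts PyTorch serving docs at the time. The class name and method bodies here are placeholders (only the base class and hook names come from ModelArts); double-check against the current documentation before relying on it.

```python
# customize_service.py: shape of a ModelArts custom inference service
from model_service.pytorch_model_service import PTServingBaseService

class FatigueService(PTServingBaseService):  # class name is our own
    def __init__(self, model_name, model_path):
        super().__init__(model_name, model_path)
        # load the detector / landmark models from model_path here

    def _preprocess(self, data):
        # decode the uploaded video payload into frames
        return data

    def _inference(self, data):
        # run detection -> keypoints -> sequential judgement
        return data

    def _postprocess(self, data):
        # format the result as the JSON the grader expects
        return data
```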