| Details | |
| --- | --- |
| Programming Language | Python 3.6+ |
| Intel OpenVINO ToolKit | 2020.2.120 |
| Docker (Ubuntu, OpenVINO pre-installed) | `mmphego/intel-openvino` |
| Hardware Used | Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz |
| Device | CPU |
In this project (3 of 3), I used an Intel® OpenVINO gaze estimation model to control my computer's mouse pointer. The Gaze Estimation model estimates the gaze of the user's eyes, and the mouse pointer position changes accordingly. This project demonstrates running multiple models on the same machine and coordinating the flow of data between them.

The project is built with the Inference Engine API from Intel's OpenVINO ToolKit.
The gaze estimation model requires three inputs:

- The head pose
- The left eye image
- The right eye image

To get these inputs, use the three other OpenVINO models listed below:
Implementations (in `computer-pointer-controller/src/model.py`, commit `bb5f13c`):

- Line 156
- Line 239
- Line 305
- Line 422
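As a rough sketch of how the three inputs might be packed for the gaze model: the `gaze-estimation-adas-0002` model documents inputs named `left_eye_image`, `right_eye_image` (1x3x60x60) and `head_pose_angles` (1x3); the helper below (a hypothetical name, not from the project) assembles that dict from raw crops and angles.

```python
import numpy as np

# Hypothetical helper: pack the three gaze-estimation inputs using the input
# names documented for gaze-estimation-adas-0002; shapes assumed here.
def build_gaze_inputs(left_eye, right_eye, head_pose_angles):
    """left_eye/right_eye: HxWxC uint8 crops; head_pose_angles: (yaw, pitch, roll)."""
    def to_nchw(img):
        # HWC -> NCHW with a leading batch dimension of 1
        return img.transpose(2, 0, 1)[np.newaxis, ...].astype(np.float32)

    return {
        "left_eye_image": to_nchw(left_eye),
        "right_eye_image": to_nchw(right_eye),
        "head_pose_angles": np.asarray(head_pose_angles, dtype=np.float32)[np.newaxis, ...],
    }

left = np.zeros((60, 60, 3), dtype=np.uint8)
right = np.zeros((60, 60, 3), dtype=np.uint8)
inputs = build_gaze_inputs(left, right, (10.0, -5.0, 2.0))
print(inputs["left_eye_image"].shape)    # (1, 3, 60, 60)
print(inputs["head_pose_angles"].shape)  # (1, 3)
```

The resulting dict can be passed straight to an Inference Engine request (e.g. `exec_net.infer(inputs)`).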
Coordinate the flow of data from the input, then between the different models, and finally to the mouse controller: the input frame goes to the face detection model; the cropped face is fed to the facial landmarks and head pose models; the resulting eye crops and head-pose angles feed the gaze estimation model, whose gaze vector drives the mouse controller.
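As a minimal sketch of that per-frame coordination (with plain callables standing in for the real OpenVINO inference calls):

```python
# Sketch of the per-frame pipeline; each model argument stands in for an
# OpenVINO inference call in the real application.
def process_frame(frame, face_model, landmarks_model, head_pose_model,
                  gaze_model, mouse):
    face = face_model(frame)                        # 1. crop the face
    if face is None:
        return None                                 # no face: skip the frame
    left_eye, right_eye = landmarks_model(face)     # 2. crop both eyes
    angles = head_pose_model(face)                  # 3. yaw, pitch, roll
    gaze = gaze_model(left_eye, right_eye, angles)  # 4. gaze vector (x, y, z)
    mouse(gaze[0], gaze[1])                         # 5. move pointer by x/y
    return gaze

# Tiny stubs to exercise the flow:
gaze = process_frame(
    frame="frame",
    face_model=lambda f: "face",
    landmarks_model=lambda face: ("left", "right"),
    head_pose_model=lambda face: (10.0, -5.0, 2.0),
    gaze_model=lambda l, r, a: (0.3, -0.1, 0.95),
    mouse=lambda x, y: None,
)
print(gaze)  # (0.3, -0.1, 0.95)
```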
```shell
$ tree && du -sh
.
├── LICENSE
├── main.py
├── models
│   ├── face-detection-adas-binary-0001.bin
│   ├── face-detection-adas-binary-0001.xml
│   ├── gaze-estimation-adas-0002.bin
│   ├── gaze-estimation-adas-0002.xml
│   ├── head-pose-estimation-adas-0001.bin
│   ├── head-pose-estimation-adas-0001.xml
│   ├── landmarks-regression-retail-0009.bin
│   └── landmarks-regression-retail-0009.xml
├── README.md
├── requirements.txt
├── resources
└── src
    ├── __init__.py
    ├── input_feeder.py
    ├── model.py
    └── mouse_controller.py

3 directories, 16 files
37M	.
```
There are two ways to run the project:

- Download and install the Intel OpenVINO Toolkit. After you've cloned the repo, install the dependencies using this command:

  ```shell
  pip3 install -r requirements.txt
  ```

- Run the project in the Docker image into which I have baked Intel OpenVINO and the dependencies:

  ```shell
  docker pull mmphego/intel-openvino
  ```

  Not sure what Docker is? Watch this.

For this project I used the latter method.
I have already downloaded the models; they are located in `./models/`.
Should you wish to download your own models, run:

```shell
MODEL_NAME=<<name of model to download>>
docker run --rm -ti \
  --volume "$PWD":/app \
  mmphego/intel-openvino \
  bash -c "\
    /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
    --name ${MODEL_NAME}"
```
Models used in this project:
- Face Detection Model
- Facial Landmarks Detection Model
- Head Pose Estimation Model
- Gaze Estimation Model
```shell
$ python main.py -h
usage: main.py [-h] -fm FACE_MODEL -hp HEAD_POSE_MODEL -fl
               FACIAL_LANDMARKS_MODEL -gm GAZE_MODEL [-d DEVICE]
               [-pt PROB_THRESHOLD] -i INPUT [--out] [-mp [{high,low,medium}]]
               [-ms [{fast,slow,medium}]] [--enable-mouse] [--show-bbox]
               [--debug] [--stats]

optional arguments:
  -h, --help            show this help message and exit
  -fm FACE_MODEL, --face-model FACE_MODEL
                        Path to an xml file with a trained model.
  -hp HEAD_POSE_MODEL, --head-pose-model HEAD_POSE_MODEL
                        Path to an IR model representative for head-pose-model
  -fl FACIAL_LANDMARKS_MODEL, --facial-landmarks-model FACIAL_LANDMARKS_MODEL
                        Path to an IR model representative for facial-
                        landmarks-model
  -gm GAZE_MODEL, --gaze-model GAZE_MODEL
                        Path to an IR model representative for gaze-model
  -d DEVICE, --device DEVICE
                        Specify the target device to infer on: CPU, GPU, FPGA
                        or MYRIAD is acceptable. Sample will look for a
                        suitable plugin for device specified (Default: CPU)
  -pt PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Probability threshold for detections
                        filtering (Default: 0.8)
  -i INPUT, --input INPUT
                        Path to image, video file or 'cam' for Webcam.
  --out                 Write video to file.
  -mp [{high,low,medium}], --mouse-precision [{high,low,medium}]
                        The precision for mouse movement (how much the mouse
                        moves). [Default: low]
  -ms [{fast,slow,medium}], --mouse-speed [{fast,slow,medium}]
                        The speed (how fast the mouse moves). [Default: fast]
  --enable-mouse        Enable Mouse Movement
  --show-bbox           Show bounding box and stats on screen [debugging].
  --debug               Show output on screen [debugging].
  --stats               Verbose OpenVINO layer performance stats [debugging].
```
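A minimal `argparse` sketch of how a few of these options might be declared (flag names mirror the help text above; the real `main.py` may differ):

```python
import argparse

def build_parser():
    # Sketch of a subset of the CLI shown in the help text above.
    parser = argparse.ArgumentParser()
    parser.add_argument("-fm", "--face-model", required=True,
                        help="Path to an xml file with a trained model.")
    parser.add_argument("-d", "--device", default="CPU",
                        help="Target device: CPU, GPU, FPGA or MYRIAD.")
    parser.add_argument("-pt", "--prob_threshold", type=float, default=0.8,
                        help="Probability threshold for detection filtering.")
    parser.add_argument("-mp", "--mouse-precision", nargs="?",
                        choices=["high", "low", "medium"], default="low",
                        help="Precision of mouse movement.")
    parser.add_argument("--enable-mouse", action="store_true",
                        help="Enable mouse movement.")
    return parser

args = build_parser().parse_args(
    ["-fm", "models/face-detection-adas-binary-0001", "--enable-mouse"]
)
print(args.device, args.mouse_precision, args.enable_mouse)  # CPU low True
```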
In order to run the application, run the following (assuming you have Docker installed):

```shell
xhost +;
docker run --rm -ti \
  --volume "$PWD":/app \
  --env DISPLAY=$DISPLAY \
  --volume=$HOME/.Xauthority:/root/.Xauthority \
  --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --device /dev/video0 \
  mmphego/intel-openvino \
  bash -c "\
    source /opt/intel/openvino/bin/setupvars.sh && \
    python main.py \
      --face-model models/face-detection-adas-binary-0001 \
      --head-pose-model models/head-pose-estimation-adas-0001 \
      --facial-landmarks-model models/landmarks-regression-retail-0009 \
      --gaze-model models/gaze-estimation-adas-0002 \
      --input resources/demo.mp4 \
      --debug \
      --show-bbox \
      --enable-mouse \
      --mouse-precision low \
      --mouse-speed fast"
```
We can use the Deployment Manager present in OpenVINO to create a runtime package from our application. These packages can be easily sent to other hardware devices to be deployed.
To deploy the application to various devices using the Deployment Manager run the steps below.
Note: Choose from the devices listed below.
```shell
DEVICE='cpu' # or gpu, vpu, gna, hddl
docker run --rm -ti \
  --volume "$PWD":/app \
  mmphego/intel-openvino bash -c "\
    python /opt/intel/openvino/deployment_tools/tools/deployment_manager/deployment_manager.py \
    --targets ${DEVICE} \
    --user_data /app \
    --output_dir . \
    --archive_name computer_pointer_controller_${DEVICE}"
```
Query performance measures per layer to find out which layers are the most time-consuming: read the docs.
```shell
xhost +;
docker run --rm -ti \
  --volume "$PWD":/app \
  --env DISPLAY=$DISPLAY \
  --volume=$HOME/.Xauthority:/root/.Xauthority \
  --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --device /dev/video0 \
  mmphego/intel-openvino \
  bash -c "\
    source /opt/intel/openvino/bin/setupvars.sh && \
    python main.py \
      --face-model models/face-detection-adas-binary-0001 \
      --head-pose-model models/head-pose-estimation-adas-0001 \
      --facial-landmarks-model models/landmarks-regression-retail-0009 \
      --gaze-model models/gaze-estimation-adas-0002 \
      --input resources/demo.mp4 \
      --stats"
```
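In the OpenVINO Python API, per-layer timings come from `InferRequest.get_perf_counts()`, which returns a dict keyed by layer name. A sketch of ranking layers by real execution time (the dict below is a fabricated stand-in for real output):

```python
# Fabricated example of the structure returned by get_perf_counts():
perf_counts = {
    "conv1": {"status": "EXECUTED", "layer_type": "Convolution",
              "exec_type": "jit_avx2_FP32", "real_time": 523, "cpu_time": 523},
    "fc1":   {"status": "EXECUTED", "layer_type": "FullyConnected",
              "exec_type": "jit_gemm_FP32", "real_time": 1210, "cpu_time": 1210},
    "relu1": {"status": "NOT_RUN", "layer_type": "ReLU",
              "exec_type": "undef", "real_time": 0, "cpu_time": 0},
}

# Rank executed layers by real execution time (microseconds), slowest first.
slowest = sorted(
    (item for item in perf_counts.items() if item[1]["status"] == "EXECUTED"),
    key=lambda item: item[1]["real_time"],
    reverse=True,
)
for name, stats in slowest:
    print(f"{name}: {stats['real_time']} us ({stats['layer_type']})")
# fc1 comes out as the most time-consuming layer in this example
```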
- Multiple People Scenario: if multiple people appear in the video frame, the application always uses and reports results for only one face, even though several faces are detected.
- No Face Detection: the application skips the frame and informs the user.
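One way to make the multiple-people behaviour explicit is to pick a single detection deterministically, e.g. the highest-confidence face. A sketch, assuming detections are tuples of `(confidence, xmin, ymin, xmax, ymax)` (a layout chosen for illustration, not necessarily the project's):

```python
def pick_primary_face(detections, prob_threshold=0.8):
    """Return the highest-confidence detection above the threshold, or None."""
    candidates = [d for d in detections if d[0] >= prob_threshold]
    if not candidates:
        return None  # no face: caller should skip the frame and warn the user
    return max(candidates, key=lambda d: d[0])

detections = [
    (0.99, 100, 80, 220, 240),  # person A
    (0.91, 400, 90, 500, 230),  # person B
    (0.40, 10, 10, 30, 30),     # spurious low-confidence detection
]
print(pick_primary_face(detections))           # (0.99, 100, 80, 220, 240)
print(pick_primary_face([(0.5, 0, 0, 1, 1)]))  # None
```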
- Intel® DevCloud: benchmark the application on various devices to ensure optimum performance.
- Intel® VTune™ Profiler: profile the application and locate any bottlenecks.
- Gaze estimation: revisit the logic for determining and calculating the pointer coordinates, as it is a bit flaky.
- Lighting conditions: use HSV-based pre-processing steps to minimize errors due to different lighting conditions.
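As a sketch of the HSV idea: given a frame already converted to HSV (e.g. via OpenCV's `cv2.cvtColor`), stretch the V (brightness) channel to the full range before detection so dim frames look more like well-lit ones. A numpy-only version:

```python
import numpy as np

def normalize_value_channel(hsv):
    """Min-max stretch the V channel of an HSV uint8 image to [0, 255]."""
    hsv = hsv.copy()
    v = hsv[..., 2].astype(np.float32)
    lo, hi = v.min(), v.max()
    if hi > lo:  # avoid division by zero on flat images
        hsv[..., 2] = ((v - lo) * 255.0 / (hi - lo)).astype(np.uint8)
    return hsv

# Dim, low-contrast dummy frame: V values between 40 and 90
hsv = np.zeros((4, 4, 3), dtype=np.uint8)
hsv[..., 2] = np.linspace(40, 90, 16, dtype=np.uint8).reshape(4, 4)
out = normalize_value_channel(hsv)
print(out[..., 2].min(), out[..., 2].max())  # 0 255
```

Converting back to BGR afterwards restores a normalized colour frame for the face detector.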