The Monkeytyping Solution to the Youtube-8M Video Understanding Challenge

This is the solution repository of the 2nd place team monkeytyping, licensed under the Apache License 2.0.

Dependencies

Python 2.7
Tensorflow 1.0
Numpy 1.12
GNU Bash

Resources

To understand our system, read the report on our solution:

https://arxiv.org/abs/1706.05150

Our source code:

https://github.com/wangheda/youtube-8m

Useful scripts

Training scripts (training a model may take 3-5 days) are in

youtube-8m-wangheda/training_scripts
youtube-8m-zhangteng/train_scripts

Evaluation scripts for selecting the best-performing checkpoints are in

youtube-8m-wangheda/eval_scripts
youtube-8m-zhangteng/eval_scripts

Inference scripts for generating the intermediate files used by the ensemble scripts are in

youtube-8m-wangheda/infer_scripts
youtube-8m-zhangteng/infer_scripts

Ensemble scripts are in

youtube-8m-ensemble/ensemble_scripts

Paths of models and data

There are some conventions that we use in our code:

models are saved in

./model

train1 data is saved in

/Youtube-8M/data/frame/train
/Youtube-8M/data/video/train

validate1 data is saved in

/Youtube-8M/data/frame/validate
/Youtube-8M/data/video/validate

test data is saved in

/Youtube-8M/data/frame/test
/Youtube-8M/data/video/test

train2 data is saved in

/Youtube-8M/data/frame/ensemble_train
/Youtube-8M/data/video/ensemble_train

validate2 data is saved in

/Youtube-8M/data/frame/ensemble_validate
/Youtube-8M/data/video/ensemble_validate

intermediate results are stored in

/Youtube-8M/model_predictions/ensemble_train/[method]
/Youtube-8M/model_predictions/ensemble_validate/[method]
/Youtube-8M/model_predictions/test/[method]
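
The conventions above can be summarized as a small lookup. This is a minimal sketch, not code from the repository: the helper names `data_dir` and `prediction_dir`, the `SPLIT_DIRS` table, and the example method name are hypothetical, and only the directory strings come from the list above.

```python
import posixpath

# Root directories taken from the conventions listed above.
DATA_ROOT = "/Youtube-8M/data"
PREDICTION_ROOT = "/Youtube-8M/model_predictions"

# Split name used in this README -> directory name on disk.
SPLIT_DIRS = {
    "train1": "train",
    "validate1": "validate",
    "test": "test",
    "train2": "ensemble_train",
    "validate2": "ensemble_validate",
}


def data_dir(split, level):
    """Return the directory of "frame" or "video" level data for a split.

    Hypothetical helper, written only to illustrate the layout.
    """
    if level not in ("frame", "video"):
        raise ValueError("level must be 'frame' or 'video'")
    return posixpath.join(DATA_ROOT, level, SPLIT_DIRS[split])


def prediction_dir(split, method):
    """Return the directory of intermediate results for one method.

    Intermediate results are listed above only for train2, validate2 and test.
    """
    if split not in ("train2", "validate2", "test"):
        raise ValueError("no prediction directory is listed for this split")
    return posixpath.join(PREDICTION_ROOT, SPLIT_DIRS[split], method)


# "my_method" is a placeholder for the [method] component.
print(data_dir("train2", "frame"))
print(prediction_dir("test", "my_method"))
```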

How to generate a solution

Single model

Train a single model.
Evaluate the checkpoints to find the best one.
Run inference with that checkpoint to get the intermediate results.
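
The checkpoint selection step can be sketched as follows. This is an illustration under assumptions, not the logic of the eval scripts: the function `best_checkpoint` is hypothetical, and the checkpoint names and scores are made-up placeholders rather than real results. It assumes the evaluation yields one validation score per checkpoint, where higher is better.

```python
def best_checkpoint(scores):
    """Return the (checkpoint, score) pair with the highest score.

    `scores` maps a checkpoint name to its validation score.
    Hypothetical helper; the repository's eval scripts may work differently.
    """
    if not scores:
        raise ValueError("no evaluated checkpoints")
    return max(scores.items(), key=lambda item: item[1])


# Placeholder values for illustration only, not measured results.
example_scores = {
    "model.ckpt-10000": 0.71,
    "model.ckpt-20000": 0.78,
    "model.ckpt-30000": 0.76,
}

name, score = best_checkpoint(example_scores)
print("{} {}".format(name, score))
```

The selected checkpoint is the one to pass to the inference step.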

Ensemble model

Write a configuration file.
Train a stacking model.
Evaluate the stacking model and pick the best checkpoint.
Run inference with that checkpoint to get a submission file.
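
To show how intermediate results from several single models feed into one prediction, here is a deliberately simplified sketch. It is not the repository's stacking model, which is trained: it uses fixed weights and a plain weighted average, and the function `weighted_average` and the numbers are hypothetical.

```python
def weighted_average(predictions, weights):
    """Combine per-model label scores for one video into one score per label.

    `predictions` is a list with one entry per model; each entry is a list
    of scores, one per label. `weights` holds one fixed weight per model.
    Assumption: fixed weights stand in for what a trained stacking model learns.
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per model")
    num_labels = len(predictions[0])
    if any(len(p) != num_labels for p in predictions):
        raise ValueError("all models must score the same labels")
    total = float(sum(weights))
    combined = []
    for label in range(num_labels):
        weighted_sum = sum(w * p[label] for w, p in zip(weights, predictions))
        combined.append(weighted_sum / total)
    return combined


# Two hypothetical models scoring two labels for one video.
model_a = [0.2, 0.8]
model_b = [0.6, 0.4]
combined = weighted_average([model_a, model_b], [1.0, 3.0])
print(combined)
```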

Note

Some of the single models were developed by Heda and some by Teng, so they are split across two folders.

Bagging models are in youtube-8m-wangheda/bagging_scripts.

Boosting and distillation models are in youtube-8m-wangheda/bagging_scripts.

Cascade models are in youtube-8m-wangheda/cascade_scripts.

Stacking models are in youtube-8m-ensemble/ensemble_scripts.
