Use the following model configs:
model_configs/i3d_resnet50.py
model_configs/i3dnonlocal_resnet50.py
Reference: facebookresearch/SlowFast
architecture | depth | pretrain | frame length x sample rate | top1 | top5 | model | config |
---|---|---|---|---|---|---|---|
I3D | R50 | - | 8 x 8 | 73.5 | 90.8 | link | Kinetics/c2/I3D_8x8_R50 |
I3D NLN | R50 | - | 8 x 8 | 74.0 | 91.1 | link | Kinetics/c2/I3D_NLN_8x8_R50 |
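
The "8 x 8" entry in the "frame length x sample rate" column means each clip contains 8 frames taken every 8th frame of the video. A minimal sketch of that indexing is shown here. The function name `clip_frame_indices` and its arguments are hypothetical and are not part of SlowFast or PyVideoAI, whose data loaders may pick frames differently (for example with jitter or interpolation).

```python
def clip_frame_indices(start, num_frames=8, sample_rate=8):
    """Return the video frame indices that make up one clip.

    Hypothetical helper: takes `num_frames` frames, one every
    `sample_rate` frames, beginning at frame index `start`.
    """
    return [start + i * sample_rate for i in range(num_frames)]


# An 8 x 8 clip starting at frame 0 spans (8 - 1) * 8 + 1 = 57 video frames.
indices = clip_frame_indices(0)
print(indices)
```

Under this reading, a video needs at least 57 frames to yield one full 8 x 8 clip without padding or looping.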
Reference: facebookresearch/video-nonlocal-net
script | input frames | freeze bn? | 3D conv? | non-local? | top1 | in paper | top5 | model | logs |
---|---|---|---|---|---|---|---|---|---|
run_i3d_baseline_400k_32f.sh | 32 | - | Yes | - | 73.6 | 73.3 | 90.8 | link | link |
run_i3d_nlnet_400k_32f.sh | 32 | - | Yes | Yes | 74.9 | 74.9 | 91.6 | link | link |
This model is trained with PyVideoAI.
Top1/5 accuracy is calculated using 1 spatial centre crop and 5 temporal crops.
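
As a rough illustration of multi-crop evaluation, the sketch below averages per-class scores over the temporal crops of each video and then checks whether the true label is among the top-k classes. Averaging the scores is an assumption: the text above only states how many crops are used, not how their predictions are combined. All function names are hypothetical and are not PyVideoAI APIs.

```python
def average_scores(crop_scores):
    """Average per-class scores over crops.

    `crop_scores` is a list with one entry per crop, each a list of
    per-class scores. Averaging is an assumed aggregation rule.
    """
    num_crops = len(crop_scores)
    return [sum(per_class) / num_crops for per_class in zip(*crop_scores)]


def topk_correct(scores, label, k):
    """Return True if `label` is among the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return label in ranked[:k]


def topk_accuracy(videos, k):
    """Video-level top-k accuracy in percent.

    `videos` is a list of (crop_scores, label) pairs.
    """
    hits = sum(topk_correct(average_scores(cs), label, k) for cs, label in videos)
    return 100.0 * hits / len(videos)


# Toy data: 2 videos, 2 crops each, 3 classes (the real setup uses 5 temporal crops).
toy_videos = [
    ([[0.1, 0.7, 0.2], [0.3, 0.4, 0.3]], 1),
    ([[0.5, 0.3, 0.2], [0.6, 0.1, 0.3]], 2),
]
print(topk_accuracy(toy_videos, 1), topk_accuracy(toy_videos, 2))
```

With 1 spatial centre crop and 5 temporal crops, each video would contribute 5 score vectors to `average_scores`.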
dataset_configs/hmdb.py
model_configs/i3d_resnet50.py
exp_configs/hmdb/i3d_resnet50-crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1.py
architecture | Pretrain | frame length x sampling stride | Top1 (highest/last) | Top5 (highest/last) | config, log | model (last) | TensorBoard |
---|---|---|---|---|---|---|---|
I3D-ResNet50 | Kinetics | 8 x 8 | 73.20 / 72.94 | 94.05 / 94.05 | link | link | link |
Reference: facebookresearch/SlowFast
architecture | depth | Top1 error | Top5 error | model |
---|---|---|---|---|
ResNet | R50 | 23.6 | 6.8 | link |