[Fix] Fix configs for detection #1903

Merged
merged 12 commits into from
Sep 8, 2022
5 changes: 4 additions & 1 deletion configs/detection/_base_/models/slowonly_r50.py
@@ -4,7 +4,10 @@
backbone=dict(
type='ResNet3dSlowOnly',
depth=50,
pretrained=None,
pretrained=(
'https://download.openmmlab.com/mmaction/recognition/slowonly/'
'slowonly_r50_4x16x1_256e_kinetics400_rgb/'
'slowonly_r50_4x16x1_256e_kinetics400_rgb_20200704-a69556c6.pth'),
pretrained2d=False,
lateral=False,
num_stages=4,
29 changes: 14 additions & 15 deletions configs/detection/acrn/README.md
@@ -20,25 +20,23 @@ Current state-of-the-art approaches for spatio-temporal action localization rely

### AVA2.1

| Model | Modality | Pretrained | Backbone | Input | gpus | mAP | log | ckpt |
| :-------------------------------------------------------------------------------: | :------: | :----------: | :------: | :---: | :--: | :---: | :------------------------------------: | :-------------------------------------: |
| [slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb](/configs/detection/acrn/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb.py) | RGB | Kinetics-400 | ResNet50 | 32x2 | 8 | 27.58 | [log](https://download.openmmlab.com/) | [ckpt](https://download.openmmlab.com/) |
| frame sampling strategy | resolution | gpus | backbone | pretrain | mAP | gpu_mem(M) | config | ckpt | log |
| :---------------------: | :--------: | :--: | :---------------: | :----------: | :---: | :--------: | :---------------------------------------: | :-------------------------------------: | :-------------------------------------: |
| 8x8x1 | raw | 8 | SlowFast ResNet50 | Kinetics-400 | 27.58 | 15263 | [config](/configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb_20220906-0dae1a90.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.log) |

### AVA2.2

| Model | Modality | Pretrained | Backbone | Input | gpus | mAP | log | ckpt |
| :-------------------------------------------------------------------------------: | :------: | :----------: | :------: | :---: | :--: | :---: | :------------------------------------: | :-------------------------------------: |
| [slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava22_rgb](/configs/detection/acrn/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava22_rgb.py) | RGB | Kinetics-400 | ResNet50 | 32x2 | 8 | 27.63 | [log](https://download.openmmlab.com/) | [ckpt](https://download.openmmlab.com/) |
| frame sampling strategy | resolution | gpus | backbone | pretrain | mAP | gpu_mem(M) | config | ckpt | log |
| :---------------------: | :--------: | :--: | :---------------: | :----------: | :---: | :--------: | :---------------------------------------: | :-------------------------------------: | :-------------------------------------: |
| 8x8x1 | raw | 8 | SlowFast ResNet50 | Kinetics-400 | 27.63 | 15263 | [config](/configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb.py) | [ckpt](https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb_20220906-0dae1a90.pth) | [log](https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb.log) |

:::{note}
Note:

1. The **gpus** column indicates the number of GPUs used to train the checkpoint.
According to the [Linear Scaling Rule](https://arxiv.org/abs/1706.02677), you may set the learning rate proportional to the batch size if you use a different number of GPUs or videos per GPU,
e.g., lr=0.01 for 4 GPUs x 2 videos/GPU and lr=0.08 for 16 GPUs x 4 videos/GPU.

:::

For more details on data preparation, you can refer to AVA in [Data Preparation](/docs/data_preparation.md).
For more details on data preparation, you can refer to [AVA Data Preparation](/tools/data/ava/README.md).
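
The Linear Scaling Rule mentioned in the note can be turned into a small helper. This is only a sketch: the function name `scaled_lr` and its parameters are hypothetical, and the reference point (lr=0.01 at 4 GPUs x 2 videos per GPU, i.e. a total batch size of 8) is taken from the example in the note rather than from any particular config.

```python
def scaled_lr(num_gpus: int, videos_per_gpu: int,
              ref_lr: float = 0.01, ref_batch_size: int = 8) -> float:
    """Scale the learning rate linearly with the total batch size.

    The defaults encode the reference point from the note:
    lr=0.01 at 4 GPUs x 2 videos/GPU (total batch size 8).
    """
    total_batch_size = num_gpus * videos_per_gpu
    return ref_lr * total_batch_size / ref_batch_size


print(scaled_lr(4, 2))   # the reference setting itself
print(scaled_lr(16, 4))  # 8x the batch size, so 8x the reference lr
```

The result could then be passed on the command line, for example through `--cfg-options`, assuming the learning rate lives under `optim_wrapper.optimizer.lr` as in the config shown later in this diff.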

## Train

@@ -51,11 +49,11 @@ python tools/train.py ${CONFIG_FILE} [optional arguments]
Example: train ACRN with SlowFast backbone on AVA with deterministic settings.

```shell
python tools/train.py configs/detection/acrn/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb.py \
python tools/train.py configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.py \
--cfg-options randomness.seed=0 randomness.deterministic=True
```
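
The `--cfg-options` overrides in the command above address nested config keys by dotted path. Assuming the standard MMEngine `randomness` field, the same setting written directly in a config file would look roughly like this sketch:

```python
# Assumed config-file equivalent of
#   --cfg-options randomness.seed=0 randomness.deterministic=True
# (a sketch, not a line taken from the configs in this PR)
randomness = dict(seed=0, deterministic=True)
```

Keeping the override on the command line leaves the shared config untouched, which is convenient for one-off reproducibility runs.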

For more details and optional arguments infos, you can refer to **Training setting** part in [getting_started](/docs/getting_started.md#training-setting).
For more details and information on optional arguments, you can refer to the **Training** part in the [Training and Test Tutorial](/docs/en/user_guides/4_train_test.md).

## Test

@@ -65,13 +63,14 @@ You can use the following command to test a model.
```shell
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
```

Example: test ACRN with SlowFast backbone.
Example: test ACRN with SlowFast backbone on AVA and dump the result to a pkl file.

```shell
python tools/test.py configs/detection/acrn/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb.py checkpoints/SOME_CHECKPOINT.pth
python tools/test.py configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.py \
checkpoints/SOME_CHECKPOINT.pth --dump result.pkl
```

For more details and optional arguments infos, you can refer to **Test a dataset** part in [getting_started](/docs/getting_started.md#test-a-dataset) .
For more details and information on optional arguments, you can refer to the **Test** part in the [Training and Test Tutorial](/docs/en/user_guides/4_train_test.md).
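
The file written by `--dump result.pkl` can be inspected with the standard library. The layout of the dumped object is not described here, so this sketch only reports the top-level type and length; the helper name `summarize_dump` is hypothetical, and unpickling may additionally require the packages that defined the stored objects to be importable.

```python
import pickle
from pathlib import Path


def summarize_dump(path):
    """Load a pickled result file and describe its top-level object."""
    with Path(path).open('rb') as f:
        results = pickle.load(f)
    # Not every object has a length, so fall back to None.
    size = len(results) if hasattr(results, '__len__') else None
    return type(results).__name__, size


if __name__ == '__main__':
    result_file = Path('result.pkl')  # file name from the command above
    if result_file.exists():
        print(summarize_dump(result_file))
```

Only unpickle files you produced yourself or otherwise trust, since `pickle.load` can execute arbitrary code.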

## Citation

81 changes: 0 additions & 81 deletions configs/detection/acrn/README_zh-CN.md

This file was deleted.

12 changes: 8 additions & 4 deletions configs/detection/acrn/metafile.yml
@@ -6,8 +6,8 @@ Collections:
Title: "Actor-Centric Relation Network"

Models:
- Name: slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb
Config: configs/detection/ava/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava_rgb.py
- Name: slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb
Config: configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.py
In Collection: ACRN
Metadata:
Architecture: ResNet50
@@ -23,9 +23,11 @@ Models:
Task: Action Detection
Metrics:
mAP: 27.58
Training Log: https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb.log
Weights: https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava21-rgb_20220906-0dae1a90.pth

- Name: slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava22_rgb
Config: configs/detection/ava/slowfast_acrn_kinetics400_pretrained_r50_8x8x1_cosine_10e_8xb8_ava22_rgb.py
- Name: slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb
Config: configs/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb.py
In Collection: ACRN
Metadata:
Architecture: ResNet50
@@ -41,3 +43,5 @@ Models:
Task: Action Detection
Metrics:
mAP: 27.63
Training Log: https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb.log
Weights: https://download.openmmlab.com/mmaction/v1.0/detection/acrn/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb/slowfast-acrn_kinetics400-pretrained-r50_8xb8-8x8x1-cosine-10e_ava22-rgb_20220906-0dae1a90.pth
@@ -7,7 +7,10 @@
_delete_=True,
type='ResNet3dSlowFast',
_scope_='mmaction',
pretrained=None,
pretrained=(
'https://download.openmmlab.com/mmaction/recognition/slowfast/'
'slowfast_r50_8x8x1_256e_kinetics400_rgb/'
'slowfast_r50_8x8x1_256e_kinetics400_rgb_20200716-73547d2b.pth'),
resample_rate=4,
speed_ratio=4,
channel_ratio=8,
@@ -134,7 +137,3 @@
optim_wrapper = dict(
optimizer=dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.00001),
clip_grad=dict(max_norm=40, norm_type=2))

load_from = ('https://download.openmmlab.com/mmaction/recognition/slowfast/'
'slowfast_r50_8x8x1_256e_kinetics400_rgb/'
'slowfast_r50_8x8x1_256e_kinetics400_rgb_20200716-73547d2b.pth')
@@ -0,0 +1,73 @@
_base_ = [('slowfast-acrn_kinetics400-pretrained-r50'
'_8xb8-8x8x1-cosine-10e_ava21-rgb.py')]

dataset_type = 'AVADataset'
data_root = 'data/ava/rawframes'
anno_root = 'data/ava/annotations'

ann_file_train = f'{anno_root}/ava_train_v2.2.csv'
ann_file_val = f'{anno_root}/ava_val_v2.2.csv'

exclude_file_train = f'{anno_root}/ava_train_excluded_timestamps_v2.2.csv'
exclude_file_val = f'{anno_root}/ava_val_excluded_timestamps_v2.2.csv'

label_file = f'{anno_root}/ava_action_list_v2.2_for_activitynet_2019.pbtxt'

proposal_file_train = (f'{anno_root}/ava_dense_proposals_train.FAIR.'
'recall_93.9.pkl')
proposal_file_val = f'{anno_root}/ava_dense_proposals_val.FAIR.recall_93.9.pkl'

train_pipeline = [
dict(type='SampleAVAFrames', clip_len=32, frame_interval=2),
dict(type='RawFrameDecode'),
dict(type='RandomRescale', scale_range=(256, 320)),
dict(type='RandomCrop', size=256),
dict(type='Flip', flip_ratio=0.5),
dict(type='FormatShape', input_format='NCTHW', collapse=True),
dict(type='PackActionInputs')
]
# Testing is done without any cropping or flipping
val_pipeline = [
dict(
type='SampleAVAFrames', clip_len=32, frame_interval=2, test_mode=True),
dict(type='RawFrameDecode'),
dict(type='Resize', scale=(-1, 256)),
dict(type='FormatShape', input_format='NCTHW', collapse=True),
dict(type='PackActionInputs')
]

train_dataloader = dict(
batch_size=8,
num_workers=8,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=True),
dataset=dict(
type=dataset_type,
ann_file=ann_file_train,
exclude_file=exclude_file_train,
pipeline=train_pipeline,
label_file=label_file,
proposal_file=proposal_file_train,
data_prefix=dict(img=data_root)))
val_dataloader = dict(
batch_size=1,
num_workers=8,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
ann_file=ann_file_val,
exclude_file=exclude_file_val,
pipeline=val_pipeline,
label_file=label_file,
proposal_file=proposal_file_val,
data_prefix=dict(img=data_root),
test_mode=True))
test_dataloader = val_dataloader

val_evaluator = dict(
type='AVAMetric',
ann_file=ann_file_val,
label_file=label_file,
exclude_file=exclude_file_val)
test_evaluator = val_evaluator
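
Both pipelines in the config above use `clip_len=32, frame_interval=2`, that is, 32 frames taken every second frame. The sketch below illustrates what that means for frame indices, under the assumption that the clip is centered on the annotated keyframe. It is a simplified illustration with a hypothetical function name, not the implementation of `SampleAVAFrames`, and it ignores clip boundaries and any training-time jitter.

```python
def sample_clip_indices(center, clip_len=32, frame_interval=2):
    """Return clip_len frame indices, frame_interval apart, around center.

    Simplified illustration only: boundary clamping and random offsets
    that a real sampling transform may apply are not modeled here.
    """
    start = center - (clip_len // 2) * frame_interval
    return [start + i * frame_interval for i in range(clip_len)]


indices = sample_clip_indices(center=100)
print(len(indices))             # 32
print(indices[0], indices[-1])  # 68 130
```

With these settings each clip covers a window of roughly 64 raw frames around the keyframe.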