diff --git a/docs/en/advanced_guides/customize_dataset.md b/docs/en/advanced_guides/customize_dataset.md index 31a6e16b2b..690ce3859c 100644 --- a/docs/en/advanced_guides/customize_dataset.md +++ b/docs/en/advanced_guides/customize_dataset.md @@ -1,34 +1,34 @@ -# Customize Datasets +# Customize Dataset In this tutorial, we will introduce some methods about how to customize your own dataset by online conversion. -- [Customize Datasets](#customize-datasets) +- [Customize Dataset](#customize-dataset) - [General understanding of the Dataset in MMAction2](#general-understanding-of-the-dataset-in-mmaction2) - [Customize new datasets](#customize-new-datasets) - [Customize keypoint format for PoseDataset](#customize-keypoint-format-for-posedataset) ## General understanding of the Dataset in MMAction2 -MMAction2 provides specific Dataset class according to the task, e.g. `VideoDataset`/`RawframeDataset` for action recognition, `AVADataset` for spatio-temporal action detection, `PoseDataset` for skeleton-based action recognition. All these specific datasets only need to implement `get_data_info(self, idx)` to build a data list from the annotation file, while other functions are handled by the superclass. The following table shows the inherent relationship and the main function of the modules. +MMAction2 provides task-specific `Dataset` class, e.g. `VideoDataset`/`RawframeDataset` for action recognition, `AVADataset` for spatio-temporal action detection, `PoseDataset` for skeleton-based action recognition. These task-specific datasets only require the implementation of `load_data_list(self)` for generating a data list from the annotation file. The remaining functions are automatically handled by the superclass (i.e., `BaseActionDataset` and `BaseDataset`). The following table shows the inherent relationship and the main method of the modules. -| Class Name | Functions | -| ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| MMAction2::VideoDataset | `load_data_list(self)`
Build data list from the annotation file. | -| MMAction2::BaseActionDataset | `get_data_info(self, idx)`
Given the `idx`, return the corresponding data sample from data list | -| MMEngine::BaseDataset | `__getitem__(self, idx)`
Given the `idx`, call `get_data_info` to get data sample, then call the `pipeline` to perform transforms and augmentation in `train_pipeline` or `val_pipeline` | +| Class Name | Class Method | +| ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `MMAction2::VideoDataset` | `load_data_list(self)`
Build data list from the annotation file. | +| `MMAction2::BaseActionDataset` | `get_data_info(self, idx)`
Given the `idx`, return the corresponding data sample from the data list. | +| `MMEngine::BaseDataset` | `__getitem__(self, idx)`
Given the `idx`, call `get_data_info` to get the data sample, then call the `pipeline` to perform transforms and augmentation in `train_pipeline` or `val_pipeline` . | ## Customize new datasets -For most scenarios, we don't need to customize a new dataset class, offline conversion is recommended way to use your data. But customizing a new dataset class is also easy in MMAction2. As above mentioned, a dataset for a specific task usually only needs to implement `load_data_list(self)` to generate the data list from the annotation file. It is worth noting that elements in the `data_list` are `dict` with fields required in the following pipeline. +Although offline conversion is the preferred method for utilizing your own data in most cases, MMAction2 offers a convenient process for creating a customized `Dataset` class. As mentioned previously, task-specific datasets only require the implementation of `load_data_list(self)` for generating a data list from the annotation file. It is noteworthy that the elements in the `data_list` are `dict` with fields that are essential for the subsequent processes in the `pipeline`. -Take `VideoDataset` as an example, `train_pipeline`/`val_pipeline` requires `'filename'` in `DecordInit` and `'label'` in `PackActionInput`, so data samples in the data list have 2 fields: `'filename'` and `'label'`. -you can refer to [customize pipeline](customize_pipeline.md) for more details about the pipeline. +Taking `VideoDataset` as an example, `train_pipeline`/`val_pipeline` require `'filename'` in `DecordInit` and `'label'` in `PackActionInputs`. Consequently, the data samples in the `data_list` must contain 2 fields: `'filename'` and `'label'`. +Please refer to [customize pipeline](customize_pipeline.md) for more details about the `pipeline`. ``` data_list.append(dict(filename=filename, label=label)) ``` -While `AVADataset` is more complex, elements in the data list consist of several fields about video data, and it further overwrites `get_data_info(self, idx)` to convert keys, which are required in spatio-temporal action detection pipeline. +However, `AVADataset` is more complex, data samples in the `data_list` consist of several fields about the video data. Moreover, it overwrites `get_data_info(self, idx)` to convert keys that are indispensable in the spatio-temporal action detection pipeline. ```python @@ -60,21 +60,21 @@ class AVADataset(BaseActionDataset): ## Customize keypoint format for PoseDataset -MMAction2 currently supports three kinds of keypoint formats: `coco`, `nturgb+d` and `openpose`. If your use one of them, just specify the corresponding format in the following modules: +MMAction2 currently supports three keypoint formats: `coco`, `nturgb+d` and `openpose`. If you use one of these formats, you may simply specify the corresponding format in the following modules: -For Graph Convolutional Networks, such as AAGCN, STGCN... +For Graph Convolutional Networks, such as AAGCN, STGCN, ... -- transform: argument `dataset` in `JointToBone`. -- backbone: argument `graph_cfg` in Graph Convolutional Networks. +- `pipeline`: argument `dataset` in `JointToBone`. +- `backbone`: argument `graph_cfg` in Graph Convolutional Networks. -And for PoseC3D: +For PoseC3D: -- transform: In `Flip`, specify `left_kp` and `right_kp` according to the keypoint symmetrical relationship, or remove the transform for asymmetric keypoints structure. 
-- transform: In `GeneratePoseTarget`, specify `skeletons`, `left_limb`, `right_limb` if `with_limb` is `true`, and `left_kp`, `right_kp` if `with_kp` is `true`. +- `pipeline`: In `Flip`, specify `left_kp` and `right_kp` based on the symmetrical relationship between keypoints. +- `pipeline`: In `GeneratePoseTarget`, specify `skeletons`, `left_limb`, `right_limb` if `with_limb` is `True`, and `left_kp`, `right_kp` if `with_kp` is `True`. -For a custom format, you need to add a new graph layout into models and transforms, which defines the keypoints and their connection relationship. +If using a custom keypoint format, it is necessary to include a new graph layout in both the `backbone` and `pipeline`. This layout will define the keypoints and their connection relationship. -Take the coco dataset as an example, we define a layout named `coco` in `Graph`, and set its `inward` as followed, which includes all connections between nodes, each connection is a pair of nodes from far to near. The order of connections does not matter. Other settings about coco are to set the number of nodes to 17, and set node 0 as the center node. +Taking the `coco` dataset as an example, we define a layout named `coco` in `Graph`. The `inward` connections of this layout comprise all node connections, with each **centripetal** connection consisting of a tuple of nodes. Additional settings for `coco` include specifying the number of nodes as `17` the `node 0` as the central node. ```python @@ -85,7 +85,7 @@ self.inward = [(15, 13), (13, 11), (16, 14), (14, 12), (11, 5), self.center = 0 ``` -Similarly, we define the `pairs` in `JointToBone`, adding a bone of `(0, 0)` to align the number of bones to the nodes. The `pairs` of coco dataset is as followed, same as above mentioned, the order of pairs does not matter. +Similarly, we define the `pairs` in `JointToBone`, adding a bone of `(0, 0)` to align the number of bones to the nodes. The `pairs` of coco dataset are shown below, and the order of `pairs` in `JointToBone` is irrelevant. ```python @@ -94,10 +94,9 @@ self.pairs = ((0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (5, 0), (12, 0), (13, 11), (14, 12), (15, 13), (16, 14)) ``` -For your custom format, just define the above setting as your graph structure, and specify in your config file as followed, we take `STGCN` as an example, assuming you already define a `custom_dataset` in `Graph` and `JointToBone`, and num_classes is n. +To use your custom keypoint format, simply define the aforementioned settings as your graph structure and specify them in your config file as shown below, In this example, we will use `STGCN`, with `n` denoting the number of classes and `custom_dataset` defined in `Graph` and `JointToBone`. ```python - model = dict( type='RecognizerGCN', backbone=dict( diff --git a/docs/en/advanced_guides/customize_logging.md b/docs/en/advanced_guides/customize_logging.md index aabaad949f..ccbbeeafed 100644 --- a/docs/en/advanced_guides/customize_logging.md +++ b/docs/en/advanced_guides/customize_logging.md @@ -1,6 +1,6 @@ # Customize Logging -MMAction2 produces a lot of logs during the running process, such as loss, iteration time, learning rate, etc. In this section, we will introduce you how to output custom log. More details about the logging system, please refer to [MMEngine](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/logging.html). +MMAction2 produces a lot of logs during the running process, such as loss, iteration time, learning rate, etc. 
In this section, we will introduce how to output custom logs. For more details about the logging system, please refer to [MMEngine Tutorial](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/logging.html). - [Customize Logging](#customize-logging) - [Flexible Logging System](#flexible-logging-system) @@ -9,13 +9,13 @@ MMAction2 produces a lot of logs during the running process, such as loss, itera ## Flexible Logging System -MMAction2 configures the logging system by LogProcessor in [default_runtime](/configs/_base_/default_runtime.py) in default, which is equivalent to: +The MMAction2 logging system is configured by the `LogProcessor` in [default_runtime](/configs/_base_/default_runtime.py) by default, which is equivalent to: ```python log_processor = dict(type='LogProcessor', window_size=20, by_epoch=True) ``` -Defaultly, LogProcessor catches all filed start with `loss` return by `model.forward`. For example in the following model, `loss1` and `loss2` will be logged automatically without additional configuration. +By default, the `LogProcessor` captures all fields that begin with `loss` returned by `model.forward`. For instance, in the following model, `loss1` and `loss2` will be logged automatically without any additional configuration. ```python from mmengine.model import BaseModel @@ -32,14 +32,14 @@ class ToyModel(BaseModel): return dict(loss1=loss1, loss2=loss2) ``` -The format of the output log is as followed: +The output log has the following format: ``` 08/21 02:58:41 - mmengine - INFO - Epoch(train) [1][10/25] lr: 1.0000e-02 eta: 0:00:00 time: 0.0019 data_time: 0.0004 loss1: 0.8381 loss2: 0.9007 loss: 1.7388 08/21 02:58:41 - mmengine - INFO - Epoch(train) [1][20/25] lr: 1.0000e-02 eta: 0:00:00 time: 0.0029 data_time: 0.0010 loss1: 0.1978 loss2: 0.4312 loss: 0.6290 ``` -LogProcessor will output the log in the following format: +`LogProcessor` will output the log in the following format: - The prefix of the log: - epoch mode(`by_epoch=True`): `Epoch(train) [{current_epoch}/{current_iteration}]/{dataloader_length}` @@ -55,11 +55,11 @@ LogProcessor will output the log in the following format: log_processor outputs the epoch based log by default(`by_epoch=True`). To get an expected log matched with the `train_cfg`, we should set the same value for `by_epoch` in `train_cfg` and `log_processor`. ``` -Based on the rules above, the code snippet will count the average value of the loss1 and the loss2 every 20 iterations. More types of statistical methods, please refer to [MMEngine.LogProcessor](mmengine.runner.LogProcessor). +Based on the rules above, the code snippet will count the average values of `loss1` and `loss2` every 20 iterations. For more types of statistical methods, please refer to [mmengine.runner.LogProcessor](mmengine.runner.LogProcessor). ## Customize log -The logging system could not only log the loss, lr, .etc but also collect and output the custom log. For example, if we want to statistic the intermediate loss: +The logging system can not only log the `loss`, `lr`, etc., but also collect and output custom logs. For example, if we want to log statistics of an intermediate loss: The `ToyModel` calculate `loss_tmp` in forward, but don't save it into the return dict. 
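The snippet that the sentence above refers to is unchanged context and therefore not shown in this hunk. As a rough, illustrative sketch (not part of the patch), an intermediate value such as `loss_tmp` can be exposed to the logging system via MMEngine's `MessageHub` and then aggregated through the `custom_cfg` option of `LogProcessor`; the exact names below are assumptions:

```python
import torch
from mmengine.logging import MessageHub
from mmengine.model import BaseModel


class ToyModel(BaseModel):

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, img, label, mode):
        feat = self.linear(img)
        loss_tmp = (feat - label).abs()
        loss = loss_tmp.pow(2)
        # `loss_tmp` is not returned, so LogProcessor would not see it.
        # Push it to the message hub explicitly instead.
        message_hub = MessageHub.get_current_instance()
        message_hub.update_scalar('train/loss_tmp', loss_tmp.mean())
        return dict(loss=loss)


# Aggregate the custom scalar with a mean over a 10-iteration window.
log_processor = dict(
    type='LogProcessor',
    window_size=20,
    by_epoch=True,
    custom_cfg=[
        dict(
            data_src='loss_tmp',
            log_name='loss_tmp',
            method_name='mean',
            window_size=10)
    ])
```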
@@ -108,7 +108,7 @@ The `loss_tmp` will be added to the output log: ## Export the debug log -To export the debug log to the `work_dir`, you can set log_level in config file as followed: +To export the debug log to the `work_dir`, you can set log_level in config file as follows: ``` log_level='DEBUG' diff --git a/docs/en/advanced_guides/customize_pipeline.md b/docs/en/advanced_guides/customize_pipeline.md index 632216ba10..ba7a6a8e5b 100644 --- a/docs/en/advanced_guides/customize_pipeline.md +++ b/docs/en/advanced_guides/customize_pipeline.md @@ -1,21 +1,20 @@ # Customize Data Pipeline -In this tutorial, we will introduce some methods about how to build the data pipeline (i.e., data transformations)for your tasks. +In this tutorial, we will introduce some methods about how to build the data pipeline (i.e., data transformations) for your tasks. - [Customize Data Pipeline](#customize-data-pipeline) - - [Design of Data pipelines](#design-of-data-pipelines) - - [Modify the training/test pipeline](#modify-the-trainingtest-pipeline) + - [Design of Data Pipeline](#design-of-data-pipeline) + - [Modify the Training/Testing Pipeline](#modify-the-trainingtest-pipeline) - [Loading](#loading) - - [Sampling frames and other processing](#sampling-frames-and-other-processing) + - [Sampling Frames and Other Processing](#sampling-frames-and-other-processing) - [Formatting](#formatting) - - [Add new data transforms](#add-new-data-transforms) + - [Add New Data Transforms](#add-new-data-transforms) -## Design of Data pipelines +## Design of Data Pipeline -The data pipeline means how to process the sample dict when indexing a sample from the dataset. And it -consists of a sequence of data transforms. Each data transform takes a dict as input, processes it, and outputs a dict for the next data transform. +The data pipeline refers to the procedure of handling the data sample dict when indexing a sample from the dataset, and comprises a series of data transforms. Each data transform accepts a `dict` as input, processes it, and produces a `dict` as output for the subsequent data transform in the sequence. -Here is a data pipeline example for SlowFast training on Kinetics for `VideoDataset`. It first use [`decord`](https://github.com/dmlc/decord) to read the raw videos and randomly sample one video clip (the clip has 32 frames, and the interval between frames is 2). Next it applies the random resized crop and random horizontal flip to all frames. Finally the data shape is formatted as `NCTHW`. +Below is an example data pipeline for training SlowFast on Kinetics using `VideoDataset`. The pipeline initially employs [`decord`](https://github.com/dmlc/decord) to read the raw videos and randomly sample one video clip, which comprises `32` frames with a frame interval of `2`. Subsequently, it applies random resized crop and random horizontal flip to all frames before formatting the data shape as `NCTHW`, which is `(1, 3, 32, 224, 224)` in this example. ```python train_pipeline = [ @@ -31,18 +30,17 @@ train_pipeline = [ ] ``` -All available data transforms in MMAction2 can be found in the [data transforms docs](mmaction.datasets.transforms). +A comprehensive list of all available data transforms in MMAction2 can be found in the [mmaction.datasets.transforms](mmaction.datasets.transforms). -## Modify the training/test pipeline +## Modify the Training/Testing Pipeline -The data pipeline in MMAction2 is pretty flexible. 
You can control almost every step of the data -preprocessing from the config file, but on the other hand, you may be confused facing so many options. +The data pipeline in MMAction2 is highly adaptable, as nearly every step of the data preprocessing can be configured from the config file. However, the wide array of options may be overwhelming for some users. -Here is a common practice and guidance for action recognition tasks. +Below are some general practices and guidance for building a data pipeline for action recognition tasks. ### Loading -At the beginning of a data pipeline, we usually need to load videos. But if you already extract the frames, you should use `RawFrameDecode` and change the dataset type to `RawframeDataset`: +At the beginning of a data pipeline, it is customary to load videos. However, if the frames have already been extracted, you should utilize `RawFrameDecode` and modify the dataset type to `RawframeDataset`. ```python train_pipeline = [ @@ -57,14 +55,13 @@ train_pipeline = [ ] ``` -If you want to load data from files with special formats or special locations, you can [implement a new loading -transform](#add-new-data-transforms) and add it at the beginning of the data pipeline. +If you need to load data from files with distinct formats (e.g., `pkl`, `bin`, etc.) or from specific locations, you may create a new loading transform and include it at the beginning of the data pipeline. Please refer to [Add New Data Transforms](#add-new-data-transforms) for more details. -### Sampling frames and other processing +### Sampling Frames and Other Processing During training and testing, we may have different strategies to sample frames from the video. -For example, during testing of SlowFast, we sample multiple clips uniformly: +For instance, when testing SlowFast, we uniformly sample multiple clips as follows: ```python test_pipeline = [ @@ -79,9 +76,9 @@ test_pipeline = [ ] ``` -In the above example, 10 clips of 32-frame video clips will be sampled for each video. We use `test_mode=True` to uniformly sample these clips (as opposed to randomly sample during training). +In the above example, 10 video clips, each comprising 32 frames, will be uniformly sampled from each video. `test_mode=True` is employed to accomplish this, as opposed to random sampling during training. -Another example is that TSN/TSM models sample multiple segments from the video: +Another example involves `TSN/TSM` models, which sample multiple segments from the video: ```python train_pipeline = [ @@ -91,20 +88,15 @@ train_pipeline = [ ] ``` -```{note} -Usually, the data augmentation part in the data pipeline handles only video-wise transforms, but not transforms -like video normalization or mixup/cutmix. It's because we can do image normalization and mixup/cutmix on batch data -to accelerate with GPUs. To configure video normalization and mixup/cutmix, please use the [data preprocessor] -(mmaction.models.utils.data_preprocessor). -``` +Typically, the data augmentations in the data pipeline handles only video-level transforms, such as resizing or cropping, but not transforms like video normalization or mixup/cutmix. This is because we can do video normalization and mixup/cutmix on batched video data +to accelerate processing using GPUs. To configure video normalization and mixup/cutmix, please use the [mmaction.models.utils.data_preprocessor](mmaction.models.utils.data_preprocessor). 
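As a concrete illustration of the paragraph above (not part of this patch), normalization and batch-level augmentation are declared on the model's data preprocessor rather than in the pipeline. A minimal sketch, assuming the `ActionDataPreprocessor` and `MixupBlending` options found in typical MMAction2 configs; the values are placeholders:

```python
model = dict(
    type='Recognizer3D',
    # backbone and cls_head settings are omitted here for brevity
    data_preprocessor=dict(
        type='ActionDataPreprocessor',
        # Per-channel normalization is applied to batched data on the GPU.
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        format_shape='NCTHW',
        # Optional batch-level augmentation such as mixup.
        blending=dict(type='MixupBlending', num_classes=400, alpha=.2)))
```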
### Formatting -The formatting is to collect training data from the data information dict and convert these data to -model-friendly format. +Formatting involves collecting training data from the data information dict and converting it into a format that is compatible with the model. -In most cases, you can simply use [`PackActionInputs`](mmaction.datasets.transforms.PackActionInputs), and it will -convert the image in NumPy array format to PyTorch tensor, and pack the ground truth categories information and +In most cases, you can simply employ [`PackActionInputs`](mmaction.datasets.transforms.PackActionInputs), and it will +convert the image in `NumPy Array` format to `PyTorch Tensor`, and pack the ground truth category information and other meta information as a dict-like object [`ActionDataSample`](mmaction.structures.ActionDataSample). ```python @@ -114,12 +106,10 @@ train_pipeline = [ ] ``` -## Add new data transforms +## Add New Data Transforms -1. Write a new data transform in any file, e.g., `my_transform.py`, and place it in - the folder `mmaction/datasets/transforms/`. The data transform class needs to inherit - the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class and override - the `transform` method which takes a dict as input and returns a dict. +1. To create a new data transform, write a new transform class in a python file named, for example, `my_transforms.py`. The data transform classes must inherit + the [`mmcv.transforms.BaseTransform`](mmcv.transforms.BaseTransform) class and override the `transform` method which takes a `dict` as input and returns a `dict`. Finally, place `my_transforms.py` in the folder `mmaction/datasets/transforms/`. ```python from mmcv.transforms import BaseTransform @@ -127,9 +117,12 @@ train_pipeline = [ @TRANSFORMS.register_module() class MyTransform(BaseTransform): + def __init__(self, msg): + self.msg = msg def transform(self, results): # Modify the data information dict `results`. + print(msg, 'MMAction2.') return results ``` @@ -149,7 +142,7 @@ train_pipeline = [ ```python train_pipeline = [ ... - dict(type='MyTransform'), + dict(type='MyTransform', msg='Hello!'), ... ] ``` diff --git a/docs/en/get_started/overview.md b/docs/en/get_started/overview.md index 8bdccd3451..498b093830 100644 --- a/docs/en/get_started/overview.md +++ b/docs/en/get_started/overview.md @@ -2,18 +2,18 @@ ## What is MMAction2 -MMAction2 is an open source toolkit based on PyTorch, supporting numerous video understanding models, including action recognition, skeleton-based action recognition, spatio-temporal action detection and temporal action localization. In addition, it supports widely-used academic datasets and provides many useful tools, assisting users in exploring various aspects of models and datasets and implementing high-quality algorithms. Generally, it has the following features. +MMAction2 is an open source toolkit based on PyTorch, supporting numerous video understanding models, including **action recognition, skeleton-based action recognition, spatio-temporal action detection and temporal action localization**. Moreover, it supports widely-used academic datasets and offers many useful tools, assisting users in exploring various aspects of models and datasets, as well as implementing high-quality algorithms. Generally, the toolkit boasts the following features: -One-stop, Multi-model: MMAction2 supports various video understanding tasks and implements the latest models for action recognition, localization, detection. 
+**One-stop, Multi-model**: MMAction2 supports various video understanding tasks and implements state-of-the-art models for action recognition, localization, detection. -Modular Design: MMAction2’s modular design allows users to define and reuse modules in the model on demand. +**Modular Design**: The modular design of MMAction2 enables users to define and reuse modules in the model as required. -Various Useful Tools: MMAction2 provides many analysis tools, including visualizers, validation scripts, evaluators, etc., to help users troubleshoot, finetune or compare models. +**Various Useful Tools**: MMAction2 provides an array of analysis tools, such as visualizers, validation scripts, evaluators, etc., to aid users in troubleshooting, fine-tuning, or comparing models. -Powered by OpenMMLab: Like other algorithm libraries in OpenMMLab family, MMAction2 follows OpenMMLab’s rigorous development guidelines and interface conventions, significantly reducing the learning cost of users familiar with other projects in OpenMMLab family. In addition, benefiting from the unified interfaces among OpenMMLab, you can easily call the models implemented in other OpenMMLab projects (e.g. MMClassification) in MMAction2, facilitating cross-domain research and real-world applications. +**Powered by OpenMMLab**: Similar to other algorithm libraries in the OpenMMLab family, MMAction2 adheres to OpenMMLab's rigorous development guidelines and interface conventions, considerably reducing the learning cost for users familiar with other OpenMMLab projects. Furthermore, due to the unified interfaces among OpenMMLab projects, it is easy to call models implemented in other OpenMMLab projects (such as MMClassification) in MMAction2, which greatly facilitates cross-domain research and real-world applications. - @@ -21,7 +21,7 @@ Powered by OpenMMLab: Like other algorithm libraries in OpenMMLab family, MMActi
+

Action Recognition


Skeleton-based Action Recognition

-

Spatio-Temporal Action Detection


+

Spatio-Temporal Action Detection

@@ -34,23 +34,23 @@ We have prepared a wealth of documents to meet your various needs: - [Installation](installation.md) - [Quick Run](quick_run.md) -- [Inference](../user_guides/inference.md) +- [Inference with existing models](../user_guides/inference.md)
For training on supported dataset -- [learn about configs](../user_guides/config.md) -- [prepare dataset](../user_guides/prepare_dataset.md) -- [Training and testing](../user_guides/train_test.md) +- [Learn about Configs](../user_guides/config.md) +- [Prepare Dataset](../user_guides/prepare_dataset.md) +- [Training and Test](../user_guides/train_test.md)
For looking for some common issues -- [FAQs](faq.md) +- [FAQ](faq.md) - [Useful tools](../useful_tools.md)
@@ -58,19 +58,19 @@ We have prepared a wealth of documents to meet your various needs:
For a general understanding about MMAction2 -- [20-minute tour to MMAction2](guide_to_framework.md) -- [Data flow in MMAction2](../advanced_guides/dataflow.md) +- [A 20-Minute Guide to MMAction2 Framework](guide_to_framework.md) +- [Dataflow in MMAction2](../advanced_guides/dataflow.md)
For advanced usage about custom training -- [Customize models](../advanced_guides/customize_models.md) -- [Customize datasets](../advanced_guides/customize_dataset.md) -- [Customize data transformation and augmentation](../advanced_guides/customize_pipeline.md) -- [Customize optimizer and scheduler](../advanced_guides/customize_optimizer.md) -- [Customize logging](../advanced_guides/customize_logging.md) +- [Customize Model](../advanced_guides/customize_models.md) +- [Customize Dataset](../advanced_guides/customize_dataset.md) +- [Customize Data Pipeline](../advanced_guides/customize_pipeline.md) +- [Customize Optimizer](../advanced_guides/customize_optimizer.md) +- [Customize Logging](../advanced_guides/customize_logging.md)
@@ -92,6 +92,6 @@ We have prepared a wealth of documents to meet your various needs:
For researchers and developers who are willing to contribute to MMAction2 -- [Contribution Guide](contribution_guide.md) +- [How to contribute to MMAction2](contribution_guide.md)
diff --git a/docs/en/get_started/quick_run.md b/docs/en/get_started/quick_run.md index 84ae5b985f..619e904370 100644 --- a/docs/en/get_started/quick_run.md +++ b/docs/en/get_started/quick_run.md @@ -1,6 +1,6 @@ # Quick Run -This chapter will take you through the basic functions of MMAction2. And we assume you [installed MMAction2 from source](../installation#best-practices). +This chapter will introduce you to the fundamental functionalities of MMAction2. We assume that you have [installed MMAction2 from source](../installation#best-practices). - [Quick Run](#quick-run) - [Inference](#inference) @@ -15,7 +15,7 @@ This chapter will take you through the basic functions of MMAction2. And we assu ## Inference -Run the following in MMAction2's root directory: +Run the following command in the root directory of MMAction2: ```shell python demo/demo_inferencer.py demo/demo.mp4 \ @@ -36,10 +36,10 @@ You should be able to see a pop-up video and the inference result printed out in ``` ```{note} -If you are running MMAction2 on a server without GUI or via SSH tunnel with X11 forwarding disabled, you may not see the pop-up window. +If you are running MMAction2 on a server without a GUI or via an SSH tunnel with X11 forwarding disabled, you may not see the pop-up window. ``` -A detailed description of MMAction2's inference interface can be found [here](/demo/README#inferencer) +A detailed description of MMAction2's inference interface can be found [here](/demo/README.md#inferencer). In addition to using our well-provided pre-trained models, you can also train models on your own datasets. In the next section, we will take you through the basic functions of MMAction2 by training TSN on the tiny [Kinetics](https://download.openmmlab.com/mmaction/kinetics400_tiny.zip) dataset as an example. @@ -51,7 +51,7 @@ Since the variety of video dataset formats are not conducive to switching datase But here, efficiency means everything. ``` -Here, we have prepared a lite version of Kinetics dataset for demonstration purposes. Download our pre-prepared [zip](https://download.openmmlab.com/mmaction/kinetics400_tiny.zip) and extract it to the `data/` directory under mmaction2 to get our prepared video and annotation file. +To get started, please download our pre-prepared [kinetics400_tiny.zip](https://download.openmmlab.com/mmaction/kinetics400_tiny.zip) and extract it to the `data/` directory in the root directory of MMAction2. This will provide you with the necessary videos and annotation file. ```Bash wget https://download.openmmlab.com/mmaction/kinetics400_tiny.zip @@ -61,7 +61,7 @@ unzip kinetics400_tiny.zip -d data/ ## Modify the Config -Once the dataset is prepared, we will then specify the location of the training set and the training parameters by modifying the config file. +After preparing the dataset, the next step is to modify the config file to specify the location of the training set and training parameters. In this example, we will train a TSN using resnet50 as its backbone. Since MMAction2 already has a config file for the full Kinetics400 dataset (`configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py`), we just need to make some modifications on top of it. 
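For orientation, the dataset-related keys that the above modification touches would look roughly as follows. This is a sketch: apart from `ann_file_val`, which appears as context in the next hunk, the variable names and file layout are assumed from the kinetics400_tiny archive.

```python
# Point the config at the tiny dataset extracted under `data/`.
data_root = 'data/kinetics400_tiny/train'
data_root_val = 'data/kinetics400_tiny/val'
ann_file_train = 'data/kinetics400_tiny/kinetics_tiny_train_video.txt'
ann_file_val = 'data/kinetics400_tiny/kinetics_tiny_val_video.txt'
```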
@@ -78,18 +78,17 @@ ann_file_val = 'data/kinetics400_tiny/kinetics_tiny_val_video.txt' ### Modify Runtime Config -Also, because of the reduced dataset size, we'd better reduce training batchsize to 4 and the number of training epochs to 10 accordingly, shorten the validation interval as well as the weight storage interval to 1 rounds, and modify the learning rate decay strategy. Modify corresponding keys in `configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py` as following lines to take effect. +Additionally, due to the reduced size of the dataset, we recommend decreasing the training batch size to 4 and the number of training epochs to 10 accordingly. Furthermore, we suggest shortening the validation and weight storage intervals to 1 round each, and modifying the learning rate decay strategy. Modify the corresponding keys in `configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py` as following lines to take effect. -```Python +```python # set training batch size to 4 train_dataloader['batch_size'] = 4 # Save checkpoints every epoch, and only keep the latest checkpoint default_hooks = dict( - checkpoint=dict(type='CheckpointHook', interval=3, max_keep_ckpts=1,), - ) -# Set the maximum number of epochs to 10, and validate the model every 3 epochs -train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=10, val_interval=3) + checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=1)) +# Set the maximum number of epochs to 10, and validate the model every 1 epochs +train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=10, val_interval=1) # adjust learning rate schedule according to 10 epochs param_scheduler = [ dict( @@ -104,10 +103,9 @@ param_scheduler = [ ### Modify Model Config -Further, due to the small size of tiny kinetics dataset, we'd better to load a pre-trained model on original Kinetics dataset. We also need to modify the model according to the actual number of classes. Just directly put the following lines into `configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py`. - -```Python +Further, due to the small size of tiny Kinetics dataset, it is recommended to load a pre-trained model on the original Kinetics dataset. Additionally, the model needs to be modified according to the actual number of classes. Please directly add the following lines to `configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py`. +```python model = dict( cls_head=dict(num_classes=2)) load_from = 'https://download.openmmlab.com/mmaction/v1.0/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb_20220906-cd10898e.pth' @@ -121,7 +119,7 @@ For a more detailed description of config, please refer to [here](../user_guides ## Browse the Dataset -Before we start the training, we can also visualize the frames processed by training-time [data transforms](<>). It's quite simple: pass the config file we need to visualize into the [browse_dataset.py](/tools/analysis_tools/browse_dataset.py) script. +Before we start the training, we can also visualize the frames processed by training-time data transforms. It's quite simple: pass the config file we need to visualize into the [browse_dataset.py](/tools/analysis_tools/browse_dataset.py) script. 
```Bash python tools/visualizations/browse_dataset.py \ diff --git a/docs/en/merge_docs.sh b/docs/en/merge_docs.sh index 0d3c90ef0e..a2a4e0ba6c 100644 --- a/docs/en/merge_docs.sh +++ b/docs/en/merge_docs.sh @@ -2,11 +2,11 @@ # gather models mkdir -p model_zoo -cat ../../configs/localization/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Action Localization Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' |sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/localization_models.md -cat ../../configs/recognition/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Action Recognition Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/recognition_models.md -cat ../../configs/recognition_audio/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" >> model_zoo/recognition_models.md -cat ../../configs/detection/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Spatio Temporal Action Detection Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/detection_models.md -cat ../../configs/skeleton/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Skeleton-based Action Recognition Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/skeleton_models.md +cat ../../configs/localization/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Action Localization Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' |sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/localization_models.md +cat ../../configs/recognition/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Action Recognition Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/recognition_models.md +cat ../../configs/recognition_audio/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" >> model_zoo/recognition_models.md +cat ../../configs/detection/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Spatio Temporal Action Detection Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/detection_models.md +cat ../../configs/skeleton/*/README.md | sed "s/md#t/html#t/g" | sed "s/#/#&/" | sed '1i\# Skeleton-based Action Recognition Models' | sed 's/](\/docs\/en/](../g' | sed 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' | sed "s/getting_started.html##t/getting_started.html#t/g" > model_zoo/skeleton_models.md # gather projects # TODO: generate table of contents for project zoo @@ -40,9 +40,10 @@ sed -i 's/(\/tools\/data\/skeleton\/README.md/(#skeleton-dataset/g' 
datasetzoo.m cat prepare_data.md >> datasetzoo.md - sed -i 's=](/=](https://github.com/open-mmlab/mmaction2/tree/latest/=g' *.md +sed -i 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' *.md +sed -i 's=](/=](https://github.com/open-mmlab/mmaction2/tree/main/=g' */*.md sed -i 's/](\/docs\/en\//](g' datasetzoo.md -sed -i 's/](\/docs\/en\//](g' changelog.md +sed -i 's/](\/docs\/en\//](g' notes/changelog.md sed -i 's/](\/docs\/en\//](..g' ./get_stated/*.md sed -i 's/](\/docs\/en\//](..g' ./tutorials/*.md