
Guides | Exchanging Data among Containers

ML-TANGO edited this page Jul 12, 2024 · 60 revisions



Volume for Exchanging Data

TANGO uses a Docker host volume for exchanging project-level data among containers, as follows:

shared/common/<user_id>/<project_id>/
  • Note that the shared volume is a Docker host volume defined in the volumes: section of the top-level docker-compose.yml file.

You can check the physical file system path (mount point) of the shared volume with the docker volume inspect command.

  • Note that the shared volume name is prefixed with tango_ in the following command.
# list the Docker host volumes in use
TANGO $ docker volume ls | grep tango
local     tango_postgreSQL
local     tango_shared

How to inspect shared volume info

TANGO $ docker volume inspect tango_shared
[
    {
        "CreatedAt": "2023-04-14T16:36:31+09:00",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "tango",
            "com.docker.compose.version": "2.6.1",
            "com.docker.compose.volume": "shared"
        },
        "Mountpoint": "/var/lib/docker/volumes/tango_shared/_data",
        "Name": "tango_shared",
        "Options": null,
        "Scope": "local"
    }
]

Check structure of shared volume

TANGO $ sudo tree /var/lib/docker/volumes/tango_shared/_data

Hence, all TANGO containers, including the Project Manager container, should mount the shared volume in the above layout (e.g. with the -v option) when they start.
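For illustration, a container service could mount the shared volume in docker-compose like this (the service name and image below are hypothetical; only the volume name shared matches the top-level volumes: section described above):

```yaml
# sketch of a docker-compose.yml fragment — service name and image are examples
services:
  project_manager:                        # hypothetical service name
    image: tango/project_manager:latest   # hypothetical image
    volumes:
      - shared:/shared                    # mount the shared volume inside the container

volumes:
  shared:                                 # the host volume shared by all TANGO containers
```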

Overall shared volume structure is as follows:

shared  (docker volume name shared by all TANGO containers)
   ├── common
   │   └── user_id
   │       └── project_id
   │           ├── project_info.yaml     // generated by project manager
   │           ├── basemodel.yaml        // generated by Base Model Selector
   │           ├── neural_net_info.yaml  // generated by AutoNN
   │           ├── bestmodel.onnx        // generated by AutoNN
   │           ├── bestmodel.pt          // generated by AutoNN
   │           ├── model.py              // generated by AutoNN
   │           ├── deployment.yaml       // generated by Code_Gen
   │           ├── nn_model              // folder generated by Code_Gen
   │           └── nn_model.zip          // zip of the nn_model folder for OnDevice developers
   │  
   └── datasets
       └── coco
           ├── dataset.yaml // dataset info on current folder
           ├── train        // folder for train images
           ├── train.txt
           ├── test         // folder for test images
           ├── test.txt
       ├── val          // folder for val images
           └── val.txt
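Inside a container, the per-project paths above can be assembled with standard path joins. A minimal sketch (the mount point /shared and the helper names are assumptions, not part of the TANGO API):

```python
from pathlib import Path

SHARED_ROOT = Path("/shared")  # assumed mount point of the shared volume inside a container

def project_dir(user_id, project_id):
    """Return the project-level exchange folder shared/common/<user_id>/<project_id>/."""
    return SHARED_ROOT / "common" / user_id / project_id

# artifacts the containers exchange in the project folder (per the tree above)
ARTIFACTS = ["project_info.yaml", "basemodel.yaml", "neural_net_info.yaml",
             "bestmodel.onnx", "bestmodel.pt", "model.py", "deployment.yaml"]

def missing_artifacts(user_id, project_id):
    """List expected files that are not (yet) present for a project."""
    base = project_dir(user_id, project_id)
    return [name for name in ARTIFACTS if not (base / name).exists()]
```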

The following information is shared among containers at the project level.

| Data | Format | Owner | Description | Status |
|------|--------|-------|-------------|--------|
| project_info.yaml | YAML | Project Manager | User requirements and target info | file structure should be finalized |
| dataset.yaml | YAML | Dataset Creator | Information on the labelled dataset used as input to AutoNN (bb_nas/neck_nas/yolo7_e) | - |
| basemodel.yaml | YAML | Base Model Selector | Info on the base NN model for AutoNN (bb_nas/neck_nas) | - |
| neural_net_info.yaml | YAML | AutoNN | Info on the NN model from AutoNN | file structure finalized |
| best.onnx | ONNX | AutoNN | NN model file from AutoNN | - |
| best.pt | PyTorch | AutoNN | NN model file from AutoNN | - |
| model.py | Python | AutoNN | NN model file from AutoNN | - |
| deployment.yaml | YAML | Code_Gen | Info required for deployment | file structure finalized |
| nn_model | Folder | Code_Gen | Folder for code generated by Code_Gen; template code also included | - |
| nn_model.zip | ZIP | Code_Gen | Zip of the nn_model folder, for OnDevice developers | - |

DataSet Folder for Integration Test

For the TANGO integration test, a detection task with the COCO dataset is used under the following shared volume path:

shared/datasets/coco
  • Note that shared refers to the Docker host volume name; its physical path could be /var/lib/docker/volumes/tango_shared/_data.

Shared Data in TANGO workflow

This section explains the data shared among TANGO containers during the project workflow.

project_info.yaml

This YAML file is generated by the Project Manager to share information on the user requirements, including the deployment target, for a specific user project.

Example project_info.yaml

# common
task_type : classification     # classification/detection
learning_type : normal         # normal/incremental/transfer
weight_file :                  # weight file path if learning_type is transfer 
target_info : Cloud            # Cloud/K8S/K8S_Jetson_Nano/PC_Web/PC/Jetson_AGX_Orin/Jetson_AGX_Xavier/Jetson_Nano/Galaxy_S22/Odroid_N2
                               # K8S_Jetson means the deployment with Jetson Nano boards + K8S
                               # PC_Web means a PC with a Web Server and PC means a standalone PC with a console application
cpu : x86                      # arm/x86
acc : cpu                      # cuda/opencl/cpu
memory: 32                     # GByte unit
os : ubuntu                    # windows/ubuntu/android
engine : pytorch               # acl/pytorch/tvm/tensorrt/tflite
target_hostip : 1.2.3.4        # default value = "" but this value must be set if a user wants to make a server/client application
                               # optional, (only applicable for target_info == Cloud, K8S, K8S_Jetson_Nano, or PC_Web)
target_hostport : 8080         # default value = "" but this value must be set if a user wants to make a server/client application
                               # optional, (only applicable for target_info == Cloud, K8S, K8S_Jetson_Nano, or PC_Web)
target_serviceport : 5051      # default value = "" but this value must be set if a user wants to make a server/client application
                               # optional, (only applicable for target_info == Cloud, K8S, K8S_Jetson_Nano, or PC_Web)
nfs_ip: 1.2.3.4                # optional, (only applicable for target_info == k8s) requested by KPST  on May. 08, 2023
nfs_path : /tango/common/model # optional, (only applicable for target_info == k8s) requested by KPST  on May. 08, 2023

# for autonn
nas_type : neck_nas            # possible enums are 'bb_nas' | 'neck_nas', default 'neck_nas'
dataset : coco                 # dataset folder path relative to shared/datasets
batchsize: 32                  # Batch size calculated for a host machine (It is generated by BMS)
# basemodel: basemodel.yaml    # removed: fixed file name 'basemodel.yaml'

# for deploy  (optional)
# lightweight_level : 5        # 0 .. 10, that specifies the level of model optimization, 10 means "maximal optimization"
# precision_level : 5          # 0 .. 10, that specifies the level of precision, 10 means "do not modify neural network model"
# input_source : ./images      # url to receive input image stream, file/directory path, or camera device ID number(0-9) : default = 0
# output_method : 0            # 0=screen display, 1=text output, url to send output image stream, or directory path     : default = 0
# user_editing : no            # allow users to modify template codes (yes/no)
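Since the top-level fields of project_info.yaml are flat key : value pairs, a consumer container can read them with a few lines of standard-library Python. A minimal sketch (the function name is an assumption; a real consumer would more likely use a full YAML parser such as PyYAML):

```python
def read_flat_yaml(text):
    """Parse flat 'key : value  # comment' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if ":" not in line:
            continue                           # skip blanks and comment-only lines
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info

sample = """\
task_type : classification     # classification/detection
target_info : Cloud
cpu : x86
memory: 32                     # GByte unit
"""
cfg = read_flat_yaml(sample)
```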

dataset.yaml

This YAML file is generated by the dataset creator (uploader or labelling tool) to share info on a specific dataset.

  • The dataset.yaml file should be under the top folder of the given dataset: for example, shared/datasets/coco/dataset.yaml.

Example dataset.yaml

# COCO 2017 dataset http://cocodataset.org

# download command/URL (optional)
download: bash ./scripts/get_coco.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: train2017.txt  # 118287 images
val: val2017.txt  # 5000 images
test: test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# number of classes
nc: 80

# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]
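As the comment in the example notes, the train/val/test entries may be a directory path, a .txt list file, or a list of directories. A small helper to normalize them could look like this (the helper name is an assumption):

```python
def normalize_split(entry):
    """Normalize a dataset split entry to a list of path strings.

    Accepts 1) a directory or .txt file path as a string, or
            2) a list of such paths, per the dataset.yaml convention.
    """
    if isinstance(entry, str):
        return [entry]
    if isinstance(entry, list):
        return [str(p) for p in entry]
    raise TypeError("unsupported split entry: %r" % (entry,))
```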

basemodel.yaml

This YAML file is generated by the Base Model Selector to share info on the base NN model.

  • The Base Model Selector should generate YAML-formatted files for all combinations of targets and datasets specified in project_info.yaml and dataset.yaml.

Example basemodel.yaml

backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [40, 3, 1]],  # 0
  
   [-1, 1, Conv, [80, 3, 2]],  # 1-P1/2      
   [-1, 1, Conv, [80, 3, 1]],
   
   [-1, 1, Conv, [160, 3, 2]],  # 3-P2/4  
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 13
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 18-P3/8  
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 28
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 33-P4/16  
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 43
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [640, 1, 1]],
   [-3, 1, Conv, [640, 1, 1]],
   [-1, 1, Conv, [640, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 48-P5/32  
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 58
  ]

head:
  [[-1, 1, SPPCSPC, [640]], # 59
  
   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [43, 1, Conv, [320, 1, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]], # 73
   
   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [28, 1, Conv, [160, 1, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]], # 87
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3, 73], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]], # 102
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3, 59], 1, Concat, [1]],
   
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]], # 117
   
   [87, 1, Conv, [320, 3, 1]],
   [102, 1, Conv, [640, 3, 1]],
   [117, 1, Conv, [1280, 3, 1]],

   [[118,119,120], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]

The whole model structure should be included in the file for completeness, but the 2022 implementation includes the backbone only. Example basemodel.yaml:

backbone:
  [ [ layer #0 ],
    [ layer #1 ],
        :
    [ layer #x ] 
   ]

head:
  [ [ layer #x+1 ],
   [ layer #x+2 ],
      :
    [ layer #z ]
  ]

Layer Notation

Each layer in basemodel.yaml uses the following convention:

 [<from>, <number> (repetition), <module_name>, <module_args>]

<from> designates which layer's output is used as input to this layer.

  • from = -1 : the output of the immediately preceding layer is used as input.
  • from != -1: the output of the layer at that index is used as input. Layer indices start from 0 and increment by one; negative values (e.g. -2, -3 in the example above) are offsets relative to the current layer, while non-negative values are absolute layer indices.

<number> (repetition) designates how many times this layer is repeated.
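The <from> resolution described above can be sketched in a few lines of Python: negative values are offsets relative to the current layer, non-negative values are absolute indices (the function name is an assumption):

```python
def resolve_from(layer_idx, frm):
    """Resolve a layer's <from> field to absolute input-layer indices.

    frm may be a single int or a list of ints; negative values are
    relative to the current layer, non-negative values are absolute.
    """
    sources = frm if isinstance(frm, list) else [frm]
    return [layer_idx + f if f < 0 else f for f in sources]

# e.g. in the example head, the Concat at layer 63 with from [-1, -2]
# takes its inputs from layers 62 and 61, while the route layer 62
# with from 43 takes its input from backbone layer 43 directly.
```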


neural_net_info.yaml

This YAML file is generated by AutoNN to share info on the NN model it produced (through Neck NAS or Backbone NAS).

Example neural_net_info.yaml

# 'neural_net_info.yaml'
# meta file from auto_nn

# NN Model
class_file: ['models/yolo.py', 'basemodel.yaml', 'models/common.py', 'models/experimental.py', 'utils/autoanchor.py', 'utils/general.py', 'utils/torch_utils.py', 'utils/loss.py']        # for pytorch model
class_name: 'Model(cfg=basemodel.yaml)'         # for pytorch model, model class name and initial params
weight_file: yoloe.pt
base_dir_autonn : yolo7/utils

# Input
input_tensor_shape: [1, 3, 640, 640]
input_data_type: fp32           # fp32, fp16, int8, etc
anchors:
  - [10,13, 16,30, 33,23]       # P3
  - [30,61, 62,45, 59,119]      # P4
  - [116,90, 156,198, 373,326]  # P5

# Output
output_number: 3                # number of output layers (ex. 3-floor pyramid; P3, P4, P5)
output_size:                    # [batch_size, anchors, height, width, pred]
  [[1, 3, 20, 20, 85],
   [1, 3, 40, 40, 85],
   [1, 3, 80, 80, 85],
  ]
output_pred_format:             # 85 = 4(coordinate) + 1(confidence score) + 80(probability of classes)
  ['x', 'y', 'w', 'h', 'confidence', 'probability_of_classes']

# Post-processing
conf_thres: 0.25                # for NMS
iou_thres: 0.45                 # for NMS
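The output sizes in the example follow directly from the input shape: each grid side is 640 divided by the stride of that pyramid level, and the last dimension is 4 coordinates + 1 confidence + 80 class probabilities = 85. A minimal sketch of that arithmetic (the function name and the stride ordering are assumptions matching the example values):

```python
def expected_output_sizes(input_hw=640, strides=(32, 16, 8), anchors=3, nc=80):
    """Compute [batch, anchors, height, width, pred] per output layer."""
    pred = 4 + 1 + nc  # x, y, w, h + confidence + class probabilities
    return [[1, anchors, input_hw // s, input_hw // s, pred] for s in strides]
```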

deployment.yaml

This YAML file is generated by Code_Gen and will be used by LablUp and ETRI.

Example deployment.yaml

build:
    architecture: x86
    accelerator: cpu
    os: ubuntu
    components:
        engine: pytorch
        libs: [python==3.9, torch>=1.1.0]
        custom_packages:
            apt:                                             # when build docker image, install the items with apt command
                - vim
                - hello
            pypi:                                            # when build docker image, install the items with pip command
                - flask==1.2.3
                
deploy:
    type: docker #or native
    work_dir: /test/test
    pre_exec: [['tensorrt-converter.py', param1, param2], ['hello.py']] # run these items sequentially when building the docker image
                                                             # if engine is tensorrt or else, 
                                                             # pre_exec must be executed before deployment to generate tensorrt model 
                                                             # for the target machine
    entrypoint: [run.sh, -p, "opt1", "arg"]
    network:
        service_host_ip: 1.2.3.4                             # for cloud
        service_host_port: 8088                              # for cloud
    k8s:
        nfsip: 192.168.0.189                                 # for k8s, NFS-server IP
        nfspath: /var/lib/docker/volumes/tango_shared/_data  # for k8s, NFS-server path

optional: 
    nn_file: abc.py 
    weight_file: abc.pt
    annotation_file: coco.dat

nn_model.zip

This zip-format compressed file includes the generated neural net model code and the template code for the user application.
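As an illustration, packaging an nn_model folder into nn_model.zip (as Code_Gen does) takes only a few lines with the Python standard library; this is a generic sketch, not Code_Gen's actual implementation, and the function name is an assumption:

```python
import zipfile
from pathlib import Path

def zip_nn_model(folder, out_zip):
    """Compress the nn_model folder into a zip, storing paths relative to the folder."""
    root = Path(folder)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(root))
```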