Skip to content

Data collection, saving and publishing code for ARIO dataset. Collect multi-sensor rostopic data and store it in a specific structure.

License

Notifications You must be signed in to change notification settings

ario-dataset/ario-tools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ARIO Dataset Tools

License ubuntu ROS1


Introduction

These are ARIO dataset tools. Data collection, data saving, and data publishing codes are included. The data is saved in a fixed format. For details on the format, see Directory structure translation.

Testing environment

build PCL rapidyaml rapidjson

Hardware support list

Tested successfully on Agilex Cobot Magic robot.

Installation

Download our source code:

git clone https://github.com/ario-dataset/ario-tools.git

cd ario-tools

git submodule update --init --recursive

Build

cd ario-tools

catkin_make

Required sensors & data types

hardware:

  • Depth camera with RGB x 3
  • Six-degree-of-freedom robotic arm with gripper x 4
  • agilex tracer wheel encoder

sensors type:

  • sensor_msgs::Image
  • sensor_msgs::PointCloud2
  • sensor_msgs::JointState
  • sensor_msgs::CameraInfo
  • geometry_msgs::PoseStamped
  • nav_msgs::Odometry

topic name:

  • /camera_l/color/image_raw
  • /camera_f/color/image_raw
  • /camera_r/color/image_raw
  • /camera_l/depth/image_raw
  • /camera_f/depth/image_raw
  • /camera_r/depth/image_raw
  • /master/joint_left
  • /master/joint_right
  • /puppet/joint_left
  • /puppet/joint_right
  • /master/end_left
  • /master/end_right
  • /puppet/end_left
  • /puppet/end_right

Package file documentation

# package name : collect_data

src/
└── collect_data
    ├── CMakeLists.txt
    ├── config
    │   ├── data_params.yaml
    │   └── information.yaml
    ├── include
    │   ├── blockingDeque.h
    │   ├── dataUtility.h
    │   └── getCurrentPath.h
    ├── launch
    │   ├── run_data_capture.launch # Run the data collection launch
    │   ├── run_data_publish.launch
    │   └── run_data_sync.launch # Synchronize the datasets
    ├── package.xml
    ├── scripts
    │   ├── data_to_hdf5_example.py # Convert raw data to HDF5 format
    │   └── load_data_example.py # load data from an HDF5 file for training
    ├── src
    │   ├── dataCapture.cpp
    │   ├── dataPublish.cpp
    │   ├── dataSync.cpp
    │   └── getCurrentPath.cpp
    └── Thirdparty # git submodule
        ├── rapidjson
        └── rapidyaml

Run

Run the data collection code.

# Activate environment variables.
source source devel/setup.sh
# Assume it is dataset group 0. You need to specify the path where it is located. 
# The path must already exist; if it does not, an error will be reported.
mkdir /workspace/task0
# Note, the path following datasetDir must be an absolute path.
roslaunch collect_data run_data_capture.launch episodeIndex:=0 datasetDir:=/workspace/task0

If successful, the following status will be displayed.

path: /workspace/task0/episode0
total time: 3.00223
topic: frame in 1 second / total frame
/camera_l/color/image_raw: 20 / 65
/camera_f/color/image_raw: 20 / 62
/camera_r/color/image_raw: 20 / 61
/camera_l/depth/image_raw: 19 / 60
/camera_f/depth/image_raw: 21 / 60
/camera_r/depth/image_raw: 23 / 63
/camera_l/depth/color/points: 21 / 59
/camera_f/depth/color/points: 18 / 54
/camera_r/depth/color/points: 22 / 65
/master/joint_left: 87 / 248
/master/joint_right: 84 / 238
/puppet/joint_left: 83 / 248
/puppet/joint_right: 89 / 243
/master/end_left: 88 / 245
/master/end_right: 84 / 233
/puppet/end_left: 83 / 241
/puppet/end_right: 93 / 250
sum total frame: 2495

config topic: total frame
/camera_l/color/camera_info: 0
/camera_f/color/camera_info: 0
/camera_r/color/camera_info: 0
/camera_l/depth/camera_info: 1
/camera_f/depth/camera_info: 1
/camera_r/depth/camera_info: 1
/camera_l/depth/camera_info: 1
/camera_f/depth/camera_info: 1
/camera_r/depth/camera_info: 1

press ENTER to stop capture:

To stop data collection, press ENTER to stop. Writing to the file takes time, so you can try waiting or pressing ENTER several times.

If "Done" is displayed, it means the data just collected has been written to the file.

Done
[collect_data_dataCapture-1] process has finished cleanly
log file: /home/noetic/.ros/log/21114750-1995-11ef-b6f1-578b5ce9ba2e/collect_data_dataCapture-1*.log
all processes on machine have died, roslaunch will exit
shutting down processing monitor...
... shutting down processing monitor complete
done

Note: If there is no data in the topic, the terminal output will be stuck, you need to use Ctrl+C to exit.

Directory structure translation

For example, dataset number 0 collected:

episode0
.
├── arm
│   ├── endPose
│   │   ├── masterLeft
│   │   │   └── 1714373556.360405.json
│   │   ├── masterRight
│   │   │   └── 1714373556.360135.json
│   │   ├── puppetLeft
│   │   │   └── 1714373556.393465.json
│   │   └── puppetRight
│   │       └── 1714373556.386106.json
│   └── jointState
│       ├── masterLeft
│       │   └── 1714373556.360460.json
│       ├── masterRight
│       │   └── 1714373556.360205.json
│       ├── puppetLeft
│       │   └── 1714373556.363036.json
│       └── puppetRight
│           └── 1714373556.363178.json
├── camera
│   ├── color
│   │   ├── front
│   │   │   ├── 1714373556.409885.png
│   │   │   └── config.json
│   │   ├── left
│   │   │   ├── 1714373556.370113.png
│   │   │   └── config.json
│   │   └── right
│   │       ├── 1714373556.358616.png
│   │       └── config.json
│   ├── depth
│   │   ├── front
│   │   │   ├── 1714373556.388914.png
│   │   │   └── config.json
│   │   ├── left
│   │   │   ├── 1714373556.353924.png
│   │   │   └── config.json
│   │   └── right
│   │       ├── 1714373556.376457.png
│   │       └── config.json
│   └── pointCloud
│       ├── front
│       │   ├── 1714373556.248885.pcd
│       │   └── config.json
│       ├── left
│       │   ├── 1714373556.247312.pcd
│       │   └── config.json
│       └── right
│           ├── 1714373556.273297.pcd
│           └── config.json
├── instructions.json
└── robotBase
    └── vel

Example data

Synchronize the datasets

Run the following code, specifying the path of the task and the index of the episodeIndex. When the dataset index is -1, it means that all datasets in the task path will be synchronized.

roslaunch data_tools run_data_sync.launch datasetDir:=/workspace/task0 episodeIndex:=-1

After execution, a sync.txt file will be generated in the path of each specific dataset. For example, the image data synchronization index file path: task0/episode0/camera/color/left/sync.txt.

Convert raw data to HDF5 format

Run the following code to generate a data.hdf5 file in the path of the task0 task, which includes all datasets (episode0, episode1, ..., episodeX). This file contains the synchronized joint information and the synchronized index file of the image data.

python3 data_to_hdf5_example.py --datasetDir /workspace/task0

How to load data from an HDF5 file for training

Here's an example of loading files. The paths in the program need to be manually specified as absolute paths. This code snippet does not have any actual functionality; it's just an example program. You can use this example to integrate into your own training code.

python3 load_data_example.py

How to add sensors tailored to your needs

Licence

The source code is released under Apache-2.0 license.

Contributions

Agilex robotics

TODO list

TODO state
Add support for more hardware devices, such as 2D LiDAR and 3D LiDAR.
Add support for tactile and force control messages.
Add a web UI to support visualizing data collection, reducing the difficulty of collecting data angles.
Add wandb support, make it easy to trace the training status.
Add LeRobot data format into support.

Contact us

alt text

Nick.xie@agilex.ai

About

Data collection, saving and publishing code for ARIO dataset. Collect multi-sensor rostopic data and store it in a specific structure.

Resources

License

Stars

Watchers

Forks