Upload content

Oriol Romaní edited this page Oct 31, 2017 · 13 revisions

Create .json files with the Data Set description

The data set has to be copied to /mnt/asplab-web/asplab-shared/similarity-annotator/import. To upload your sounds to the website, you have to create a .json file that describes where the sounds are stored, which exercises have to be created, and which sound is the reference for each exercise. Optionally, you can specify which tiers have to be created for each exercise. If segmentation data exists for some sounds, it can also be uploaded to the website using another .json file placed in the same directory as the corresponding sound file. This file must have the same file name as the sound file, but with the .trans_json extension.

Data set description file format

The data set description file must be a .json file and has to be placed in the root directory of the data set.

{
  "exercise 1": {
                  "name": "exercise name to be shown in the website",

                  "recs": [ {
                              "_id": "sound 1 identifier",
                              "path": "sound file path 1"
                            },
                            {
                              "_id": "sound 2 identifier",
                              "path": "sound file path 2"
                            }
                           ...
                          ],

                  "ref_media": "path to the reference sound of the exercise",
                  "tanpura": "path to the pitch reference sound file (optional)"

                  }
    ...
}
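The description file can also be generated with a short script instead of being written by hand. The following is a minimal sketch of that idea, not part of the similarity-annotator code: the helper names `build_description` and `write_description`, the input spec layout, and the default file name `description.json` are assumptions made for illustration. Only the output keys (`name`, `recs`, `_id`, `path`, `ref_media`, `tanpura`) come from the format above.

```python
import json
from pathlib import Path


def build_description(exercises):
    """Build the data set description dict.

    `exercises` is a list of dicts with the (hypothetical) keys:
    key, name, recs (list of (identifier, path) pairs), ref_media,
    and optionally tanpura.
    """
    description = {}
    for spec in exercises:
        entry = {
            "name": spec["name"],
            "recs": [{"_id": rec_id, "path": path} for rec_id, path in spec["recs"]],
            "ref_media": spec["ref_media"],
        }
        # "tanpura" is optional, so it is only written when provided.
        if spec.get("tanpura"):
            entry["tanpura"] = spec["tanpura"]
        description[spec["key"]] = entry
    return description


def write_description(description, dataset_root, filename="description.json"):
    """Write the description into the root directory of the data set.

    The file name is an assumption; the format only requires a .json file.
    """
    target = Path(dataset_root) / filename
    target.write_text(json.dumps(description, indent=2, ensure_ascii=False),
                      encoding="utf-8")
    return target


if __name__ == "__main__":
    example = build_description([
        {
            "key": "exercise 1",
            "name": "exercise name to be shown in the website",
            "recs": [("sound 1 identifier", "sound file path 1"),
                     ("sound 2 identifier", "sound file path 2")],
            "ref_media": "path to the reference sound of the exercise",
        }
    ])
    print(json.dumps(example, indent=2))
```

Generating the file with `json.dumps` guarantees valid JSON (balanced braces, no trailing commas), which is easy to get wrong when editing the template by hand.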

Rubric description file format

This file has to be named rubric.json and must also be placed in the root directory of the data set.

{
    "tier name 1": {
        "rubric": {
            "ratings": [
                0, 
                1
            ]
        }
    }, 
    "tier name 2": {
        "parent_tier": "parent tier name" (optional, e.g. "tier name 1"),
        "rubric": {
            "ratings": [_value 1_, _value 2_, ...]
        }
    }, 
    "tier name 3": {
        "rubric": {
            "ratings": [_value 1_, _value 2_, ...]
        }
    },
    ...
}
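As with the description file, rubric.json can be produced by a script. The sketch below is illustrative only: the helper name `build_rubric` and its input layout are assumptions, and the check that a parent tier has already been defined is a sanity check added here, not a documented requirement of the website.

```python
import json


def build_rubric(tiers):
    """Build the rubric dict.

    `tiers` is a list of (tier_name, ratings, parent_tier) tuples, where
    parent_tier is None for tiers without a parent.
    """
    rubric = {}
    for name, ratings, parent in tiers:
        entry = {"rubric": {"ratings": list(ratings)}}
        if parent is not None:
            # Assumed sanity check: a parent tier must be listed before its child.
            if parent not in rubric:
                raise ValueError(f"parent tier {parent!r} is not defined before {name!r}")
            entry["parent_tier"] = parent
        rubric[name] = entry
    return rubric


if __name__ == "__main__":
    rubric = build_rubric([
        ("tier name 1", [0, 1], None),
        ("tier name 2", [1, 2, 3], "tier name 1"),
    ])
    # Save this output as rubric.json in the root directory of the data set.
    print(json.dumps(rubric, indent=4))
```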

Segmentation annotations file format

If segmentation annotations exist for some sounds in the data set, they can also be uploaded automatically to the website using a .json file. One such file has to be created for each sound in the data set that has segmentation annotations. The file must have the same file name as the corresponding audio file, but with the .trans_json extension, and must be located in the same directory as that audio file.

{
    "tier name 1" (should correspond to tiers defined in previous section):
                [
                   {"start_time": _start_time_value_,
                    "end_time": _end_time_value_,
                    "label": "name of the segment" e.g. A1 note
                   },
                   {"start_time": _start_time_value_,
                    "end_time": _end_time_value_,
                    "label": "name of the segment" e.g. D2 note
                   }
                   ...
                 ],
    "tier name 2":
                 [
                   {"start_time": _start_time_value_,
                    "end_time": _end_time_value_,
                    "label": "name of the segment"
                   },
                   {"start_time": _start_time_value_,
                    "end_time": _end_time_value_,
                    "label": "name of the segment"
                   }
                   ...
                 ],
                 ...
}
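The naming rule and the file layout above can be sketched in a few lines. This is a hedged example rather than the project's own tooling: the helper names `trans_json_path` and `build_annotations` are hypothetical, the example assumes that the audio extension is replaced by .trans_json (rather than appended to it), and the check that a segment does not end before it starts is an added sanity check.

```python
import json
from pathlib import Path


def trans_json_path(audio_path):
    """Return the annotation path next to the audio file.

    Assumption: the audio extension is replaced, e.g. take1.wav -> take1.trans_json.
    """
    return Path(audio_path).with_suffix(".trans_json")


def build_annotations(tiers):
    """Build the annotation dict.

    `tiers` maps a tier name (as defined in rubric.json) to a list of
    (start_time, end_time, label) tuples.
    """
    annotations = {}
    for tier_name, segments in tiers.items():
        entries = []
        for start, end, label in segments:
            # Added sanity check: a segment must not end before it starts.
            if end < start:
                raise ValueError(f"segment {label!r} ends before it starts")
            entries.append({"start_time": start, "end_time": end, "label": label})
        annotations[tier_name] = entries
    return annotations


def write_annotations(audio_path, tiers):
    """Write the .trans_json file in the same directory as the audio file."""
    target = trans_json_path(audio_path)
    target.write_text(json.dumps(build_annotations(tiers), indent=2),
                      encoding="utf-8")
    return target


if __name__ == "__main__":
    print(trans_json_path("sounds/take1.wav"))
    print(json.dumps(build_annotations({"tier name 1": [(0.0, 1.5, "A1 note")]}), indent=2))
```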

Run the script

The script has to be run on the server.

ssh to asplab-web1.s.upf.edu

cd /asplab-configuration/similarity-annotator

docker-compose run --rm web python manage.py upload_data_set _data_set_path_ _path_to_data_set_description_file_ _data_set_name_ _username_