Mass Labeling is an open source project for data assessment. It may be used to assess data for classification tasks.
The main advantages of this project are:
- easy to deploy,
- easy to manage,
- data is not distributed,
- no need to manually collect results.
The last two advantages are worth describing in detail.
If you decide to use any proprietary online data assessment service, then you should transfer your data to a server belonging to the service. Sometimes it is not comfortable, and sometimes it is not even possible due to the privacy rights to the data. In this case, you need to host an assessment tool on your own server.
If you decide to use an offline data assessment tool, then you should distribute data between assessors. So, each of them gets a whole dataset, which is valuable for you or your company. Also, after the assessment job is done the new problem occurs. You will need to collect all the labels and merge them into one dataset.
So, if you wish to keep all the data and labels on your server together, then mass-labeling is the right choice for you.
Mass Labeling has
- built-in slider mechanism to look throw labeled data,
- statistics to measure assessors work quality,
- multilanguage support.
- node.js
- mongodb
Installation is described in this guide
See the user guide which describes the basic user and administrator operations.
This project was separated from the family of internal projects. So, some variables in the code may be confusing. The refactoring is welcome.