This dataset has two sets of manually labeled categories: locations and topics. The category names are listed in locations.txt and topics.txt, respectively. Every line in phrase_text.txt is one document, and the line at the same position in label.txt gives the ground-truth labels of that document. The document labels are used only for classification evaluation and are not needed for topic mining.
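The line-by-line alignment described above can be sketched in Python as follows. This is a minimal sketch, not part of the original repository: the helper names `read_lines` and `load_dataset` are hypothetical, UTF-8 encoding and one category name per line are assumptions, and each label line is kept as a raw string because the label format is not specified here.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a text file without trailing newlines (UTF-8 assumed)."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def load_dataset(data_dir):
    """Load documents, raw label lines, and category names from data_dir.

    Hypothetical helper: assumes the four files named in the description
    sit directly inside data_dir.
    """
    data_dir = Path(data_dir)

    # One document per line; line i of label.txt belongs to document i.
    docs = read_lines(data_dir / "phrase_text.txt")
    labels = read_lines(data_dir / "label.txt")
    if len(docs) != len(labels):
        raise ValueError(
            f"phrase_text.txt has {len(docs)} lines but label.txt has {len(labels)}"
        )

    # Assumption: one category name per line; blank lines are skipped.
    locations = [name for name in read_lines(data_dir / "locations.txt") if name.strip()]
    topics = [name for name in read_lines(data_dir / "topics.txt") if name.strip()]

    return docs, labels, locations, topics
```

For topic mining only `docs` is needed; `labels` can be set aside until classification evaluation.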