collection of datasets
- set up downloads for books and other public works, using Wikisource
- set up a periodical scraper for blogs and news (can use the existing crawler)
- scrape every dictionary
  - Winslow
  - Fabricius
- develop a quasi-schema to merge the different dictionaries into one
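
One possible shape for the quasi-schema in the last item, as a minimal sketch rather than a settled design. It assumes each scraped dictionary can be reduced to (headword, definition) pairs, and that entries are matched on a normalized headword. The names `Entry`, `normalize`, and `merge`, and the source keys "winslow" and "fabricius", are hypothetical; the sample rows are placeholders, not real dictionary content.

```python
from __future__ import annotations

import unicodedata
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One headword in the merged dictionary (hypothetical quasi-schema)."""

    headword: str
    # Maps a source name (e.g. "winslow") to the definitions from that source,
    # so the origin of every definition is preserved after merging.
    senses: dict[str, list[str]] = field(default_factory=dict)


def normalize(headword: str) -> str:
    """Normalize a headword so the same word matches across sources.

    Assumption: Unicode NFC, trimmed whitespace, and case folding are enough.
    Real sources may need extra rules (transliteration, diacritics, variants).
    """
    return unicodedata.normalize("NFC", headword).strip().casefold()


def merge(sources: dict[str, list[tuple[str, str]]]) -> dict[str, Entry]:
    """Merge per-source (headword, definition) rows into one keyed mapping."""
    merged: dict[str, Entry] = {}
    for source_name, rows in sources.items():
        for headword, definition in rows:
            key = normalize(headword)
            entry = merged.setdefault(key, Entry(headword=key))
            entry.senses.setdefault(source_name, []).append(definition.strip())
    return merged


if __name__ == "__main__":
    # Placeholder rows only; real rows would come from the scrapers.
    sample = {
        "winslow": [("Alpha", "placeholder definition 1")],
        "fabricius": [("alpha ", "placeholder definition 2")],
    }
    for key, entry in merge(sample).items():
        print(key, entry.senses)
```

Keeping definitions grouped by source, instead of flattening them into one list, means conflicting or overlapping definitions can be reconciled later without re-scraping.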