The final project for Unsupervised Methods for Syntactic Structure Induction, WS 2016
My task was to compare the performance of three unsupervised dependency parsers – DMV, HDP-DEP and UDP.
You need Python 2, Python 3, NLTK 2, NLTK 3, g++, GNU Make and standard UNIX tools; some other packages may also be required.
You also need to obtain some training data. I ran my evaluation on PDT-3.0, but you can use any Czech dataset, as long as it is in CoNLL-X format and uses Hajič's tagset for the part-of-speech tags. The CoNLL-2006 dataset is already in the required format and should be readily available. PDT-3.0 is available from LINDAT: https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-1AAF-3
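
If you are unsure whether your data is in CoNLL-X format, a quick check like the one below may help. It is only a sketch and not part of this project: the names `read_conllx`, `Token` and `SAMPLE` are made up for illustration, and the two sample tokens are invented placeholders rather than real PDT data. It relies on the standard CoNLL-X layout, with ten tab-separated columns per token and a blank line between sentences.

```python
from collections import namedtuple

# The ten tab-separated CoNLL-X columns, in order.
CONLLX_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]

# Hypothetical helper type, not part of this project.
Token = namedtuple("Token", CONLLX_FIELDS)


def read_conllx(lines):
    """Yield sentences (lists of Token) from an iterable of CoNLL-X lines.

    Raises ValueError on a token line that does not have ten columns.
    """
    sentence = []
    for line_no, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLX_FIELDS):
            raise ValueError("line %d: expected %d columns, got %d"
                             % (line_no, len(CONLLX_FIELDS), len(columns)))
        sentence.append(Token(*columns))
    # The last sentence may not be followed by a blank line.
    if sentence:
        yield sentence


# Invented two-token sample; all values are placeholders.
SAMPLE = (
    "1\tPes\t_\tN\tN\t_\t2\tSb\t_\t_\n"
    "2\tleze\t_\tV\tB\t_\t0\tPred\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sent in read_conllx(SAMPLE.splitlines()):
        print(" ".join("%s/%s->%s" % (t.form, t.cpostag, t.head) for t in sent))
```

To check a real file, pass an open file object instead of `SAMPLE.splitlines()`; the function only needs an iterable of lines. It verifies the column count only and does not check that the tags follow Hajič's tagset.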