umssi-project

The final project for Unsupervised Methods for Syntactic Structure Induction, WS 2016

My task was to compare the performance of three unsupervised dependency parsers – DMV, HDP-DEP and UDP.

Requirements

You need Python 2, Python 3, NLTK 2, NLTK 3, g++, GNU Make, standard UNIX tools and perhaps some other packages.

You also need to obtain some training data. My evaluation was done on PDT-3.0, but you can use any Czech dataset, as long as it's in CoNLL-X format and uses Hajič's tagset for the part-of-speech tags. The CoNLL-2006 dataset is already in the required format and should be readily available. PDT-3.0 is available from LINDAT: https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-1AAF-3
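
If you are unsure whether a dataset is really in CoNLL-X format, a quick sanity check along the following lines may help. This is only a minimal sketch and not part of the project's scripts: the function name `read_conllx` and the sample sentence are made up for illustration. It assumes the usual CoNLL-X layout of ten tab-separated columns per token (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) with a blank line between sentences, and it does not verify the tagset.

```python
import io

# The ten tab-separated CoNLL-X columns, in order.
CONLLX_FIELDS = ("id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel")


def read_conllx(stream):
    """Yield each sentence as a list of token dicts (hypothetical helper).

    Raises ValueError on any token line that does not have ten columns.
    """
    sentence = []
    for line_number, line in enumerate(stream, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLX_FIELDS):
            raise ValueError("line %d: expected %d columns, got %d"
                             % (line_number, len(CONLLX_FIELDS), len(columns)))
        sentence.append(dict(zip(CONLLX_FIELDS, columns)))
    # The last sentence may not be followed by a blank line.
    if sentence:
        yield sentence


# Made-up two-token sample; unknown fields are left as "_".
sample = (u"1\tPes\t_\tN\tN\t_\t2\tSb\t_\t_\n"
          u"2\tspal\t_\tV\tV\t_\t0\tPred\t_\t_\n"
          u"\n")
sentences = list(read_conllx(io.StringIO(sample)))
print(len(sentences), [token["form"] for token in sentences[0]])
```

To check a real file, open it with `io.open(path, encoding="utf-8")` and pass the file object to `read_conllx` in place of the `io.StringIO` sample; a `ValueError` then points at the first malformed line.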

