WiDec (Wizard Decoder) 🧙‍♂️

This project implements the decoding step of statistical machine translation (SMT), with support for phrase reordering.

It is based on HW3 (description, github) of the JHU Machine Translation class.

I have implemented two approaches to decoding: beam search and greedy decoding. Combining the two techniques significantly improved translation quality over the baseline. Detailed evaluation results can be found in the report.

Repository

The folder structure is as follows:

'data'
- input French sentences
- language model in ARPA format
- translation model
'meta' - meta-information; currently it contains only the report file with the full description of the project
'model_translations' - translations produced by the different decoding algorithms
'src/cpp' - initial version of the C++ code; currently only the translation model is implemented
'src/py' - main code, written in Python

There are several Python programs in 'src/py' (run with -h for usage):
- decode translates input sentences from French to English using monotone decoding.
- widecode translates input sentences from French to English using beam search decoding.
- widecode_greedy translates input sentences from French to English using greedy decoding.
- compute-model-score computes the model score of a translated sentence.
- helper.py holds functions shared by the different models in one place.
- models.py implements very simple interfaces for language models and translation models.

These commands work in a pipeline or via files. For example:

python3 decode | python3 compute-model-score
python3 decode > output.txt
python3 compute-model-score < output.txt
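
For orientation, the monotone strategy that decode is described as using can be sketched as a small stack decoder with beam pruning. This is a minimal illustration under stated assumptions, not the repository's code: the TM and LM dictionaries are hypothetical toy data, the language model is a unigram model for brevity, and the names beam_decode, lm_score and Hypothesis are made up for this example. The real models are loaded from the files in 'data' through models.py.

```python
from collections import namedtuple

# Hypothetical toy translation model: source phrase -> [(English phrase, log-prob)].
TM = {
    ("un",): [("a", -0.2), ("one", -1.6)],
    ("chat",): [("cat", -0.1)],
    ("noir",): [("black", -0.1)],
    ("chat", "noir"): [("black cat", -0.1)],
}
# Hypothetical toy unigram language model (log-probs); unknown words get UNK.
LM = {"a": -1.0, "one": -2.0, "black": -1.5, "cat": -1.5}
UNK = -10.0

Hypothesis = namedtuple("Hypothesis", "logprob words")


def lm_score(words):
    """Sum of unigram log-probs for the given English words."""
    return sum(LM.get(w, UNK) for w in words)


def beam_decode(source, beam_size=5, max_phrase_len=3):
    """Monotone stack decoding: stacks[i] holds hypotheses that have
    translated exactly the first i source words, left to right."""
    stacks = [[] for _ in range(len(source) + 1)]
    stacks[0].append(Hypothesis(0.0, ()))
    for i in range(len(source)):
        # Beam pruning: expand only the best beam_size hypotheses per stack.
        beam = sorted(stacks[i], key=lambda h: -h.logprob)[:beam_size]
        for hyp in beam:
            last = min(len(source), i + max_phrase_len)
            for j in range(i + 1, last + 1):
                for english, tm_logprob in TM.get(tuple(source[i:j]), []):
                    new_words = tuple(english.split())
                    score = hyp.logprob + tm_logprob + lm_score(new_words)
                    stacks[j].append(Hypothesis(score, hyp.words + new_words))
    if not stacks[-1]:
        return None  # no sequence of known phrases covers the sentence
    best = max(stacks[-1], key=lambda h: h.logprob)
    return " ".join(best.words), best.logprob


if __name__ == "__main__":
    print(beam_decode(["un", "chat", "noir"]))
```

Because hypotheses only ever extend the covered prefix, this sketch cannot reorder phrases on its own; the two-word phrase pair is what lets it produce "black cat" for "chat noir". A decoder that supports reordering would also need to track which source positions each hypothesis has already covered.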

