!!! Please note that the new official repository of Appraise is now https://github.com/AppraiseDev/Appraise. The newest release available from there includes code used at WMT 2020 and better documentation. !!!
The current release was used to run the evaluation of the ACL 2016 First Conference on Machine Translation (WMT16). It has also been used for WMT 2015, 2014 and 2013. The second major release came out in time for the Seventh MT Marathon 2012, which took place September 3-8, 2012 in Edinburgh, Scotland. The initial import into GitHub was on Oct 23, 2011. First versions of this software appeared in summer 2008...
We are currently finishing preparations for WMT16. The evaluation campaign will run at http://appraise.cf/. Stay tuned for the official kick-off.
Appraise has been updated for WMT15. The evaluation campaign runs at http://appraise.cf/. Follow #WMT15 on https://twitter.com/cfedermann/ for updates. Invite tokens have been sent out to participants. For research group registration details or problems, drop me an email: cfedermann [at] gmail [dot] com
2015-05-08
Here we go again! #WMT15 evaluation campaign is running!
Happy annotating! appraise.cf -- statmt.org/wmt15/ #WMT #Appraise
— Christian Federmann (@cfedermann) May 8, 2015
Follow #WMT14 on https://twitter.com/cfedermann/. The evaluation campaign runs at http://appraise.cf/.
For research group registration details or problems, drop me a note via email: cfedermann [at] gmail [dot] com
2014-03-19 User-changeable passwords and a new action menu in the navigation bar: go to http://www.appraise.cf/password/ when logged in to change the password for your Appraise account. You can also use the lovely new user action menu at the top right of the navigation bar (the "Admin" entry is, of course, only visible to some users).
2014-03-18
Finally! #WMT14 evaluation campaign is live! http://t.co/XaZ1Cfkqsk -- http://t.co/DrzOY2w8KG
— Christian Federmann (@cfedermann) March 18, 2014
There's a new release of Appraise for use in WMT '14; see the new Django app inside appraise.wmt14 for more details. This version also integrates with Amazon's Mechanical Turk, which makes it possible to collect even more manual annotations.
Appraise is an open-source tool for manual evaluation of Machine Translation output. It lets you collect human judgments on translation output and implements annotation tasks such as
- translation quality checking;
- ranking of translations;
- error classification;
- manual post-editing.
It features an extensible XML import/output format and can easily be adapted to new annotation tasks. The next version of Appraise will also include automatic computation of inter-annotator agreements allowing quick access to evaluation results.
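The agreement computation planned for Appraise is not described in this README. As a rough illustration of what an inter-annotator agreement score looks like, the sketch below computes Cohen's kappa for two annotators in plain Python. The function name cohen_kappa and the label values are assumptions made for this example; they are not part of Appraise.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Illustrative helper only; not taken from the Appraise code base.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = float(len(labels_a))

    # Observed agreement: share of items on which both annotators agree.
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n

    # Expected chance agreement, from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n) for label in freq_a)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical judgments from two annotators on four items.
    annotator_1 = ["A", "A", "B", "B"]
    annotator_2 = ["A", "B", "B", "B"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value of 1.0 means perfect agreement and 0.0 means agreement no better than chance. Other measures (for example, ones that handle more than two annotators) may be a better fit for ranking tasks.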
Appraise is available under an open, BSD-style license.
You can see a deployed version of Appraise here. If you want to play around with it, you will need an account in order to log in to the system. I’ll be happy to create an account for you; just drop me an email: cfedermann [at] gmail [dot] com.
Appraise is based on the Django framework, version 1.3 or newer. You will need Python 2.7 to run it locally. For deployment, a FastCGI compatible web server such as lighttpd is required.
Assuming you have already installed Python and Django, you can clone a local copy of Appraise using the following command; you can change the folder name Appraise-Software to anything you like.
$ git clone git://github.com/cfedermann/Appraise.git Appraise-Software
...
After cloning the GitHub project, you have to initialise Appraise. This is a two-step process:
- Initialise the SQLite database:
$ cd Appraise-Software/appraise
$ python manage.py syncdb
...
- Collect static files and copy them into Appraise-Software/appraise/static-files. Answer yes when asked whether you want to overwrite existing files.
$ python manage.py collectstatic
...
More information on handling of static files in Django 1.3+ is available here.
Finally, you can start up your local copy of Appraise using Django's runserver command:
$ python manage.py runserver
You should be greeted with the following output from your terminal:
Validating models...
0 errors found
Django version 1.3.1, using settings 'appraise.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Point your browser to http://127.0.0.1:8000/appraise/ and there it is…
Users can be added here.
Evaluation tasks can be created here.
You need an XML file in the proper format to upload a task; an example file can be found in examples/sample-ranking-task.xml.
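Before uploading a task file, it can save a round trip to confirm that the file at least parses as XML. The sketch below uses only Python's standard library. It checks well-formedness and nothing more: it does not know Appraise's task format, so a file that passes may still be rejected on upload. The function name check_xml_file is an assumption made for this example.

```python
import sys
import xml.etree.ElementTree as ET


def check_xml_file(path):
    """Return (True, root_tag) if the file parses as XML, else (False, message).

    Illustrative helper only; it does not validate Appraise's task format.
    """
    try:
        tree = ET.parse(path)
    except ET.ParseError as exc:
        # The message includes the line and column of the first problem.
        return False, "not well-formed: %s" % exc
    except (IOError, OSError) as exc:
        return False, "cannot read file: %s" % exc
    return True, tree.getroot().tag


if __name__ == "__main__":
    # Check every file named on the command line.
    for path in sys.argv[1:]:
        ok, detail = check_xml_file(path)
        print("%s: %s (%s)" % (path, "OK" if ok else "FAILED", detail))
```

Saved under a name of your choice, the script can be pointed at examples/sample-ranking-task.xml or at your own task file; comparing your file against the sample remains the best way to check the expected structure.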
You will need to create a customised start-server.sh script inside Appraise-Software/appraise. There is a .sample file available in this folder which should help you get started quickly. In a nutshell, you have to uncomment and edit the last two lines:
# /path/to/bin/python manage.py runfcgi host=127.0.0.1 port=1234 method=threaded pidfile=$DJANGO_PID
The first line tells Django to start up in FastCGI mode, binding to hostname 127.0.0.1 and port 1234 in our example, running a threaded server and writing the process ID to the file $DJANGO_PID. The .pid files will be used by stop-server.sh to shut down Appraise properly.
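The contents of stop-server.sh are not shown here. To illustrate the general idea of stopping a server through its PID file, the following sketch reads a process ID from a file and sends that process SIGTERM. The function names read_pid and stop_from_pidfile are assumptions made for this example, and the code assumes a POSIX system.

```python
import os
import signal


def read_pid(pidfile):
    """Return the PID stored in `pidfile`, or None if missing or invalid."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None
    # Reject 0 and negative values: os.kill() treats them as process groups.
    return pid if pid > 0 else None


def stop_from_pidfile(pidfile):
    """Send SIGTERM to the process named in `pidfile`.

    Returns True if the signal was sent, False otherwise.
    Illustrative helper only; not the logic of stop-server.sh.
    """
    pid = read_pid(pidfile)
    if pid is None:
        return False
    try:
        os.kill(pid, signal.SIGTERM)
    except OSError:
        # No such process, or not permitted to signal it.
        return False
    return True
```

A stale PID file can name a process that has since exited, or an unrelated process that reused the number, so a production script would normally verify what it is about to signal.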
Using Django’s manage.py with the runfcgi command requires you to also install flup into the site-packages folder of your Python installation. It is available from here.
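A quick way to confirm that flup is visible to the interpreter is to try importing it. The sketch below is a generic import check, not something shipped with Appraise; the function name has_module is an assumption made for this example. Run it with the same Python binary that start-server.sh points to, since each Python installation has its own site-packages folder.

```python
import importlib


def has_module(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if has_module("flup"):
        print("flup is importable")
    else:
        print("flup is missing; install it before using runfcgi")
```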
# /path/to/sbin/lighttpd -f /path/to/lighttpd/etc/appraise.conf
The second line starts up the lighttpd server using an appropriate configuration file appraise.conf. Have a look at Appraise-Software/examples/appraise-lighttpd.conf to create your own.
Once the various /path/to/XYZ settings are properly configured, you should be able to launch Appraise in production mode.
If you use Appraise in your research, please cite the MT Marathon 2012 paper:
Christian Federmann. Appraise: An Open-Source Toolkit for Manual Evaluation of Machine Translation Output. In The Prague Bulletin of Mathematical Linguistics, volume 98, Prague, Czech Republic, 9/2012.
@Article{mtm12_appraise,
author = {Christian Federmann},
title = {Appraise: An Open-Source Toolkit for Manual Evaluation of Machine Translation Output},
journal = {The Prague Bulletin of Mathematical Linguistics},
volume = {98},
pages = {25--35},
year = {2012},
address = {Prague, Czech Republic},
month = {September}
}
A previous version of Appraise was published at LREC 2010:
Christian Federmann. Appraise: An Open-Source Toolkit for Manual Phrase-Based Evaluation of Translations. In Proceedings of the Seventh Conference on International Language Resources and Evaluation, Valletta, Malta, LREC, 5/2010.