UDPipe finds detailed part-of-speech tags (noun, verb, ...) in Swedish sentences. This code makes UDPipe available via a JSON API.
Play with it at: https://json-tagger.sammanfatta.se
JSON-Tagger is built for Python 3.6. I haven't tested it on other versions, so it might work on other 3.x versions, but it will not work on Python 2.
- Clone this project from GitHub:
git clone https://github.com/EmilStenstrom/json-tagger.git json-tagger
- Install dependencies:
cd json-tagger
pip install -r requirements.txt
- Get a UDPipe model file
Download the latest version of the UDPipe models from http://ufal.mff.cuni.cz/udpipe#download. Pick the language you are interested in, create a data directory in the root of the project, and put the .udpipe file there. Then update the path to the file in ud_helper, and also in actions.py if you use a language other than Swedish. Done!
- Start the local web server
python run.py --run
- Surf to http://localhost:8000 in your browser!
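Once the server is running you can also call it from code. The sketch below is only an illustration: the /tag path, the plain-text POST body, and the helper names build_request and tag are assumptions that do not come from this README, so check the page served at http://localhost:8000 for the actual request format before relying on it.

```python
import json
import urllib.request

# Assumption: this endpoint path is hypothetical. Check the page served at
# http://localhost:8000 for the real path and request format.
API_URL = "http://localhost:8000/tag"


def build_request(text, url=API_URL):
    """Build a POST request whose body is the sentence encoded as UTF-8."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),  # encode explicitly so å, ä and ö survive
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def tag(text, url=API_URL):
    """Send the sentence to the server and decode the JSON response."""
    with urllib.request.urlopen(build_request(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local server running, tag("Fördomen har alltid sin rot i vardagslivet") should return the parsed JSON response, provided the assumed endpoint matches the real one.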
The trickiest part of delivering an API like JSON Tagger is handling encodings. I've found that the easiest way to make sure I don't mess them up is to run code that accesses the API from several different languages. To run some simple integration tests against a version running locally:
- Install dependencies
The scripts assume you are running them inside a virtualenv with python pointing to Python 3, and that python2 and curl are available on the PATH.
pip2 install requests
pip install requests
gem install http
npm install -g request
- Run all the tests
tests/run_all
If any of the tests fail, the script prints the difference between the actual output and the expected output.
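To illustrate the kind of encoding problem these tests are meant to catch, here is a minimal Python sketch. It is not how tests/run_all is implemented; the helper names diff_output and roundtrip and the {"text": ...} payload shape are made up for this example. It round-trips a Swedish sentence through UTF-8 encoded JSON and prints a unified diff if the result differs from what was expected.

```python
import difflib
import json


def diff_output(expected, actual):
    """Return unified-diff lines for two strings; an empty list means they match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


def roundtrip(text):
    """Serialize text to UTF-8 JSON bytes and parse it back, as a client would."""
    # The {"text": ...} shape is hypothetical, chosen only for this example.
    payload = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return json.loads(payload.decode("utf-8"))["text"]


sentence = "Fördomen har alltid sin rot i vardagslivet"

# A correct round trip produces no diff lines, so nothing is printed here.
for line in diff_output(sentence, roundtrip(sentence)):
    print(line)
```

Decoding the bytes with the wrong charset, for example sentence.encode("utf-8").decode("latin-1"), turns ö into two garbage characters, and diff_output then reports the mismatch.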