A small service for returning the synonyms for each meaningful word in a sentence using the WordNet 3.1 synset database.
The synonyms service is written in Go. If Go is correctly installed, we can fetch and install the source in the usual way with:
go get github.com/deciphernow/synonyms
go install github.com/deciphernow/synonyms
For quickest results, we can use the prebuilt image on Docker Hub:
docker pull deciphernow/synonyms
If $GOPATH/bin is in $PATH, we can launch the synonyms service with
synonyms [port]
Otherwise, run with
$GOPATH/bin/synonyms [port]
The service first checks for the WordNet database files in $TMPDIR/synonyms-service-wordnet-db/dict and, if they are not yet present, downloads them to this location (16 MB download, 53 MB uncompressed). Thereafter, the service listens on port 8080, or on the port given as the first argument if one is provided.
We can run the deciphernow/synonyms image with, e.g.:
docker run --publish 8080:8080 --rm deciphernow/synonyms
This runs the deciphernow/synonyms image in a container, publishing the container's port 8080 on host port 8080 and removing the container's filesystem on exit.
Synonyms supports text and JSON output. A simple GET to localhost:8080?q=Hello, World! returns:
synonyms of 'hello': [hello hullo hi howdy how-do-you-do]
synonyms of 'world': [universe existence creation world cosmos macrocosm domain reality Earth earth globe populace public worldly_concern earthly_concern human_race humanity humankind human_beings humans mankind man]
A GET to localhost:8080/synonyms.json?q=Hello, World! returns:
[{"word":"hello","synonyms":["hello","hullo","hi","howdy","how-do-you-do"]},{"word":"world","synonyms":["universe","existence","creation","world","cosmos","macrocosm","domain","reality","Earth","earth","globe","populace","public","worldly_concern","earthly_concern","human_race","humanity","humankind","human_beings","humans","mankind","man"]}]
The synonyms service also supports querying by header instead. A GET to localhost:8080/synonyms.json with the header Q: Hello, World! returns the same JSON output as above. Similarly, a GET to localhost:8080 with that header returns the same text output as above.
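From Go, a header-based query might be built as in the sketch below. The helper name `newHeaderQuery` is hypothetical, and the base URL assumes the service is running locally on the default port. The request is only constructed and printed, so the example runs without a live service; `http.DefaultClient.Do(req)` would send it.

```go
package main

import (
	"fmt"
	"net/http"
)

// newHeaderQuery builds a GET request for the JSON endpoint that carries
// the sentence in the Q header rather than in the query string.
func newHeaderQuery(base, sentence string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, base+"/synonyms.json", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Q", sentence)
	return req, nil
}

func main() {
	req, err := newHeaderQuery("http://localhost:8080", "Hello, World!")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	// Sending is left out on purpose; http.DefaultClient.Do(req) would
	// perform the request against a running service.
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Q:", req.Header.Get("Q"))
}
```

Passing the sentence in a header avoids URL-encoding punctuation such as the comma and exclamation mark.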
A list of features and changes we'd like to make:
- ! Improve error handling (return an accurate error code to the requester instead of exiting)
- Load a third-party stop words list
- More sophisticated tokenization (e.g., better support for contractions like "It's", which is currently tokenized as "it" and "s")
- A separate endpoint that applies heuristics to guess the intended WordNet sense of each word in a sentence and returns only the synset for that particular sense
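As one possible direction for the tokenization item above, the sketch below keeps contractions together by allowing an apostrophe between runs of letters. It is not the service's current tokenizer: the names `wordPattern` and `tokenize` are hypothetical, only the ASCII apostrophe is handled, and hyphenated words are still split.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordPattern matches runs of letters, optionally joined by single
// ASCII apostrophes, so "it's" stays one token.
var wordPattern = regexp.MustCompile(`\p{L}+(?:'\p{L}+)*`)

// tokenize lowercases the sentence and returns its word tokens.
func tokenize(sentence string) []string {
	return wordPattern.FindAllString(strings.ToLower(sentence), -1)
}

func main() {
	fmt.Println(tokenize("It's a small world"))
	fmt.Println(tokenize("Hello, World!"))
}
```

A contraction kept whole would still need to be mapped to dictionary forms before a WordNet lookup, so this covers only the splitting step.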
- Princeton University "About WordNet." WordNet. Princeton University. 2010. http://wordnet.princeton.edu
- http://blog.ralch.com/tutorial/golang-working-with-tar-and-gzip/
© Copyright 2016 Decipher Technology Studios
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.