loving build of wordnet in JSON.
no memory pointers, no python, no DSL, no guff. no crazy-framework stuff at all.
the data is zipped for github, but it automatically unzips when you first use it.
npm install wordnetjs
if you just want the JSON, unzip ./data.zip and do whatever you like with it. it goes from 6mb zipped to 32mb unzipped.
it's the cutest way to use wordnet by a pretty wide margin.
#API
const wn = require("wordnetjs")
//generic lookup
wn.lookup('warrant')
//6 results
//pos-specific lookups
wn.verb('warrant') // (1 result)
wn.adjective('cheeky')
wn.adverb('slightly')
wn.noun('grape')
//sugar
wn.synonyms("perverse")
// [{id:"depraved.adjective.01"...}]
wn.antonyms("perverse")
// [{id:"docile.adjective.01"}]
wn.pos("swim")
//[ 'Verb', 'Noun' ]
//unique, alphabetical list of all words
wn.words((arr)=>{
  console.log(arr.filter((w)=> w.match(/cool/)))
})
//[ 'air-cool', 'air-cooled', 'cool', 'cool down', ...
if 'sausage meat' is a meronym (a part or ingredient) of 'sausage', then the reverse, 'sausage' as a holonym of 'sausage meat', is almost always true too.
As the beautiful George Miller explains, the meronym-holonym and hypernym-hyponym relations are symmetric with few exceptions. Ignoring these exceptions cuts the filesize in half, so I did it.
To go from 'sausage meat' to 'sausage', just query the opposite direction.
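A minimal sketch of querying the opposite direction, assuming you have the synsets loaded as an array of objects shaped like the noun records in this README. The two sample records and the `invert` helper are hypothetical, made up for illustration, and not part of the wordnetjs API.

```javascript
// Hypothetical sample records, shaped like the noun synsets in this README.
// The ids and relationship contents are illustrative, not copied from the data.
const synsets = [
  {
    id: "sausage.noun.01",
    relationships: { type_of: [], made_with: ["sausage meat.noun.01"], members: [], parts: [], instances: [] }
  },
  {
    id: "sausage meat.noun.01",
    relationships: { type_of: [], made_with: [], members: [], parts: [], instances: [] }
  }
];

// Build a reverse index for one relation: target id -> ids of synsets pointing at it.
function invert(synsets, relation) {
  const index = new Map();
  for (const synset of synsets) {
    const targets = (synset.relationships && synset.relationships[relation]) || [];
    for (const target of targets) {
      if (!index.has(target)) index.set(target, []);
      index.get(target).push(synset.id);
    }
  }
  return index;
}

// Stored direction: sausage -> sausage meat. Inverted: sausage meat -> sausage.
const usedIn = invert(synsets, "made_with");
console.log(usedIn.get("sausage meat.noun.01"));
// [ 'sausage.noun.01' ]
```

Building the index once and reusing it keeps each reverse lookup cheap, at the cost of one pass over the data.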
Adjective synsets in wordnet have no antonyms, but rather each individual word-sense has an antonym. This makes wordnet's antonym data really specific, but for most purposes, that's probably overdoing it. delete.
Given wordnet is a graph, this is just redundant data. delete.
Use only 1 gloss (description) per synset, and split it on semicolon separators.
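One way to picture the semicolon split, as a rough sketch rather than the actual build code. The `splitGloss` helper and the raw gloss string are hypothetical, written only to show the idea.

```javascript
// Split a raw gloss on semicolons, trim each piece, and drop empty pieces.
function splitGloss(gloss) {
  return gloss
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Hypothetical raw gloss, invented for illustration: a definition, then an example.
const raw = 'move the upper body backwards and down; "she leaned back in her chair"';
const pieces = splitGloss(raw);

console.log(pieces.length); // 2
console.log(pieces[0]);     // move the upper body backwards and down
```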
Most Nouns include freebase ids and wikipedia titles. These were reconciled in a mostly-manual process by freebase in 2010.
117,657 synsets in total
##82,113 Noun Synsets
{
id: "candy cane.noun.01",
lexname: "noun.food",
syntactic_category: "Noun",
description: "a hard candy in the shape of a rod (usually with stripes)",
words: ["candy cane"],
relationships: {
type_of: ["candy.noun.01"],
made_with: [],
members: [],
parts: [],
instances: []
},
same_as: {
freebase_topic: "/m/01hrm7",
wikipedia_page: "Candy_cane"
}
}
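The ids in these records look like `word.pos.sense`, so they can be pulled apart if you need the pieces. This is a small sketch that assumes every id follows that pattern, which is inferred from the examples here rather than guaranteed. `parseId` is a hypothetical helper, not part of wordnetjs.

```javascript
// Split a synset id such as "candy cane.noun.01" into its pieces.
// Assumes the last two dot-separated fields are the part of speech and the sense number.
function parseId(id) {
  const parts = id.split(".");
  if (parts.length < 3) {
    throw new Error(`unexpected synset id: ${id}`);
  }
  const sense = Number(parts.pop()); // "01" -> 1
  const pos = parts.pop();           // "noun"
  const word = parts.join(".");      // re-join in case the word itself contains a dot
  return { word, pos, sense };
}

console.log(parseId("candy cane.noun.01"));
// { word: 'candy cane', pos: 'noun', sense: 1 }
```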
##13,767 Verb Synsets
{
id: "lean back.verb.01",
lexname: "verb.motion",
syntactic_category: "Verb",
description: "move the upper body backwards and down",
words: ["lean back", "recline"],
assumes: [],
causes: []
}
hypernym: the verb Y is a hypernym of the verb X if the activity X is a (kind of) Y (to perceive is a hypernym of to listen)
troponym: the verb Y is a troponym of the verb X if the activity Y is doing X in some manner (to lisp is a troponym of to talk)
entailment: the verb Y is entailed by X if by doing X you must be doing Y (to sleep is entailed by to snore)
##18,156 Adjective Synsets
{
id: "phantasmagoric.adjective.01",
lexname: "adj.all",
syntactic_category: "Adjective",
description: "characterized by fantastic imagery and incongruous juxtapositions",
words: ["phantasmagoric", "surreal", "phantasmagorical", "surrealistic"],
similar: ["unrealistic.adjective.01"]
}
related nouns, similar to, participle of verb
##3,621 Adverb Synsets
{
id: "refreshingly.adverb.01",
lexname: "adv.all",
syntactic_category: "Adverb",
description: "in a manner that relieves fatigue and restores vitality",
words: ["refreshingly", "refreshfully"]
}
to build your own, get a freebase key and put it in ./build/build.js, then run 'npm install' and 'node ./build.js'