Unix-friendly copy of Moby Words II by Grady Ward.
See https://en.wikipedia.org/wiki/Moby_Project#Words
Also see https://www.gutenberg.org/ebooks/3201
I want there to be NO barriers to using these files. But "public domain" does not have an internationally agreed upon definition, so I use CC0:
Copyright 2021 Steven Ford http://geeky-boy.com and licensed "public domain" style under CC0:
To the extent possible under law, the contributors to this project have waived all copyright and related or neighboring rights to this work. In other words, you can use this code for any purpose without any restrictions. This work is published from: United States. The project home is https://github.com/fordsfords/moby_words_2
To contact me, Steve Ford, project owner, you can find my email address at http://geeky-boy.com. Can't see it? Keep looking.
This is a verbatim copy of the Moby Words II ebook from https://www.gutenberg.org/ebooks/3201
My only changes are to convert the file names to lower-case and to remove the carriage returns from the files.
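The two changes described above can be sketched in Python. This is only an illustration under stated assumptions, not the script actually used to prepare this project: the function names are hypothetical, and the directory name in the usage comment is a placeholder.

```python
from pathlib import Path


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every carriage-return byte, so CRLF line endings become LF."""
    return data.replace(b"\r", b"")


def normalize_file(path: Path) -> Path:
    """Strip carriage returns from one file, then give it a lower-case name."""
    path.write_bytes(strip_carriage_returns(path.read_bytes()))
    target = path.with_name(path.name.lower())
    if target != path:
        path.rename(target)
    return target


# Hypothetical usage: "moby_words" is a placeholder directory name.
# for p in list(Path("moby_words").iterdir()):
#     if p.is_file():
#         normalize_file(p)
```

Working on bytes rather than decoded text avoids having to guess the character encoding of the original files.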
The following is extracted from https://www.gutenberg.org/files/3201/3201.txt
MOBY WORDS II CONTENTS
6,213 acronyms (acronyms.txt)
common acronyms & abbreviations
74,550 common dictionary words (common.txt)
A list of words in common with two or more published dictionaries.
This gives the developer of a custom spelling checker a good
beginning pool of relatively common words.
256,772 compound words (compound.txt)
Over 256,700 hyphenated or other entries containing more than one
word as well as all capitalized words and acronyms. Phrases were
considered 'common' if they or variations of them occur in standard
dictionaries or thesauruses.
113,809 official crosswords (crosswd.txt)
A list of words permitted in crossword games such as Scrabble(tm).
Compatible with the first edition of the Official Scrabble Players
Dictionary(tm). Since this list has all forms: -ing, -ed, -s, and so
on of words, it makes a good addition when building a custom spelling
dictionary.
4,160 official crosswords delta (crswd-d.txt)
When combined with the 113,809 crosswords file, it produces the
official crossword list compatible with the second edition of the
Official Scrabble Players Dictionary. (Scrabble is a registered
trademark of Milton-Bradley licensed to Merriam-Webster.)
467 current fiction substrings (fiction.txt)
The 467 most frequently occurring substrings in a
best-selling novel by Amy Tan in 1990.
1,000 by frequency (freq.txt)
This file consists of the 1,000 most frequently used English words
from a wide variety of common texts listed in decreasing order of
frequency.
1,000 by frequency internet (freq-int.txt)
This file consists of the 1,000 most frequently used English words
as used on the Internet computer network in 1992.
1,185 King James Version frequent substrings (KJVfreq.txt)
The most frequently occurring 1,185 substrings in the King James
Version Bible ranked and counted by order of frequency.
21,986 names (names.txt)
This database contains the most common names used in the United
States and Great Britain. Spelling checkers may want to supplement
their basic word list with this one.
4,946 female names (names-f.txt)
frequent given names of females in English-speaking countries
3,897 male names (names-m.txt)
frequent given names of males in English-speaking countries
366 often misspelled words (oftenmis.txt)
many of the most commonly misspelled words in English-speaking countries
10,196 places (places.txt)
a large selection of place names in the United States
354,984 single words (single.txt)
Over 354,000 single words, excluding proper names, acronyms, or
compound words and phrases. This list does not exclude archaic words
or significant variant spellings.
USA Constitution (usaconst.txt)
The Constitution of the United States, including the Bill of Rights
and all amendments current to 1993.
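Several entries above suggest combining lists, for example crosswd.txt with crswd-d.txt, or a base word list with names.txt for a spelling checker. A minimal Python sketch of that idea follows. It assumes each file holds one entry per line and can be decoded as Latin-1; neither assumption is stated in the listing, and the function names are hypothetical.

```python
from pathlib import Path


def read_words(path):
    """Return the non-empty lines of a word-list file as a set.

    Assumes one entry per line. Latin-1 is used only because it can
    decode any byte sequence; the true encoding is an assumption.
    """
    text = Path(path).read_text(encoding="latin-1")
    return {line.strip() for line in text.splitlines() if line.strip()}


def merge_word_lists(*word_sets):
    """Union any number of word sets and return the result sorted."""
    merged = set()
    for words in word_sets:
        merged |= words
    return sorted(merged)


# Hypothetical usage, run from the directory containing the files:
# second_edition = merge_word_lists(read_words("crosswd.txt"),
#                                   read_words("crswd-d.txt"))
```

Using sets means an entry that appears in both files is kept only once.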
NOTE: Accents have been stripped from words, e.g., 'etude' does not mark the accent on the initial 'e'.
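Because the lists contain no accents, accented input has to be reduced to the same form before it is looked up. The Python sketch below shows one common way to do that with the standard unicodedata module. It is not necessarily how the original lists were prepared, and the function name is hypothetical.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Drop combining marks so that an accented word matches the lists."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("étude"))  # prints "etude"
```

Letters that have no decomposition, such as "ø", pass through unchanged, so this approach covers only accents that Unicode represents as combining marks.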