A tool to extract plain (unformatted) multilingual text, redirects, links, and categories from Wikipedia backups (dumps). Designed to prepare clean training data for AI and machine-learning software.
machine-learning
training-data
ai-learning
wiki-parser
ai-training
machine-learning-tool
wiki-to-txt
wiki-to-text
wiki-to-plaintext
wikidump-to-txt
wikidump-to-plaintext
wikidump-parser
ai-learning-tool
tool-for-ai
wikidumps-parser
wiki2plaintext
data-parser-for-ai
data-for-robots
plaintext-data-for-ai
wikipedia-to-txt
Updated Nov 11, 2023 - Python
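
To illustrate the kind of extraction the description refers to, here is a minimal sketch in Python. It is not this tool's actual code or API: the function name `parse_wikitext`, the regular expressions, and the returned dictionary layout are all assumptions made for illustration, and the sketch handles only simple `[[...]]` links, `#REDIRECT` lines, and bold/italic quote markup (no templates, tables, or nested links).

```python
import re

# Hypothetical patterns for illustration only; real wikitext is far messier.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def parse_wikitext(wikitext):
    """Split one page's wikitext into plain text, links, categories, redirect."""
    result = {"redirect": None, "links": [], "categories": [], "text": ""}

    # A redirect page carries no article text of its own.
    redirect = REDIRECT_RE.match(wikitext)
    if redirect:
        result["redirect"] = redirect.group(1).strip()
        return result

    def replace_link(match):
        target = match.group(1).strip()
        label = match.group(2)
        # Category links are collected and dropped from the visible text.
        if target.lower().startswith("category:"):
            result["categories"].append(target.split(":", 1)[1].strip())
            return ""
        result["links"].append(target)
        # Keep the display label if present, otherwise the link target.
        return label if label else target

    text = LINK_RE.sub(replace_link, wikitext)
    text = re.sub(r"'{2,}", "", text)      # strip bold/italic quote markup
    text = re.sub(r"[ \t]+", " ", text)    # collapse runs of spaces and tabs
    result["text"] = text.strip()
    return result


sample = (
    "'''Python''' is a [[programming language|language]] "
    "created by [[Guido van Rossum]].\n"
    "[[Category:Programming languages]]"
)
page = parse_wikitext(sample)
print(page["text"])
print(page["links"])
print(page["categories"])
```

A production parser would also need to stream the dump's XML rather than load it whole, and to handle templates, references, and language-specific namespace names, none of which this sketch attempts.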