A Python-based text generator built on the markovify library.
Example data can be found in /data/input.jsonl. In this case, the data was obtained from Twitter using either Tweepy or twarc, but the source does not matter: all we care about is how the text corpus (body) is formatted. You're more than welcome to have a look at my Python Twitter Scraper using Tweepy for a basic implementation.
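As a rough illustration of the expected input shape, the sketch below writes two made-up records to a temporary .jsonl file and reads the text back out. The helper name read_texts and the "text" key are assumptions for the example only; they are not taken from this project's code, and your own file may store the body under a different key.

```python
import json
import tempfile
from pathlib import Path


def read_texts(path, text_key="text"):
    """Yield the value stored under text_key for each JSON line in path.

    Hypothetical helper for illustration; not this project's API.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)  # one JSON object per line
            text = record.get(text_key)
            if text:
                yield text


# Two made-up records; "text" is an assumed key name.
sample_lines = [
    json.dumps({"id": 1, "text": "Markov chains are fun."}),
    json.dumps({"id": 2, "text": "Each line is its own JSON object."}),
]

with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.jsonl"
    sample_path.write_text("\n".join(sample_lines) + "\n", encoding="utf-8")
    texts = list(read_texts(sample_path))

print(texts)
```

Any other fields in a record (IDs, timestamps, user data) are simply ignored; only the value under the chosen key ends up in the corpus.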
- Load an input (.jsonl) file with (ideally) more than 1000 sentences. A larger corpus generally produces more varied and more convincing output.
- Run create_text_body and specify the text_key to look for in your input file.
- Run create_markov_chain with your resulting text_body and pass in the state_size.
- Run generate_text to create a specified number of sentences, each between a minimum and maximum number of characters. An output file is required; this is where the newly created sentences are saved.
- What are .jsonl files?
- Markov chains visually explained.