Overview on Medium
After you add the CUDA binaries to PATH and LD_LIBRARY_PATH, you can install this project:
git clone git@github.com:mrapplexz/aiijc-visualstories.git
cd aiijc-visualstories
./install.sh
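The CUDA prerequisite mentioned above could be satisfied with something like the following, run before ./install.sh or added to a shell profile. This is a sketch under one assumption: the toolkit is installed under /usr/local/cuda with libraries in lib64, which is a common Linux layout but not something this project specifies. Set CUDA_HOME first if your installation lives elsewhere.

```shell
# Assumed install prefix; override by exporting CUDA_HOME beforehand.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the CUDA binaries (nvcc etc.) to the executable search path.
export PATH="$CUDA_HOME/bin:$PATH"

# Prepend the CUDA libraries; the ${VAR:+...} form avoids a trailing
# colon when LD_LIBRARY_PATH was previously unset or empty.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```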
If your host doesn't provide access to the Hugging Face Hub, download pytorch_model.bin, tokenizer_config.json, vocab.json, config.json, special_tokens_map.json, and merges.txt from here, put them in one directory, and start with --local_model MODEL_PATH
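Before passing --local_model, it can help to confirm that all six files are actually in the directory. The helper below is a hypothetical convenience, not part of this repository; missing_model_files is an invented name, and it assumes the files sit directly inside MODEL_PATH.

```shell
# Print the name of every required model file missing from a directory.
# Prints nothing when the directory is complete.
# Usage: missing_model_files MODEL_PATH
missing_model_files() {
    dir="$1"
    for f in pytorch_model.bin tokenizer_config.json vocab.json \
             config.json special_tokens_map.json merges.txt; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

For example, `missing_model_files ./model` lists whatever still needs to be downloaded into ./model.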
python3_text generate_text.py --device cuda:0 \
--output_filename ./output/texts/text.txt \
--temperature 0.1 \
--top_k 10000 \
--top_p 0.95 \
--repetition_penalty 5.0 \
--max_length 1000 \
--seed 42 \
--start "The kingdom and a princess" \
--genre fairy_tale
All the sub-parts of part 2 can be executed in any order or in parallel.
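Since the sub-parts are independent, one way to launch them together from a single shell is a small helper built on background jobs and wait. This is only a sketch: run_parallel is a hypothetical name, and it assumes python3_tts, python3_image, and python3_music resolve as executables on PATH (shell aliases would not be visible inside sh -c). Whether your GPUs have enough memory for concurrent runs is not stated here, so treat parallel execution as something to try rather than a guarantee.

```shell
# Run each argument as a separate shell command in the background,
# wait for all of them, and return non-zero if any command failed.
# Usage: run_parallel "command one" "command two" ...
run_parallel() {
    pids=""
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

Each of the part 2 commands below, written on a single line, would be passed as one quoted argument.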
Before you start, download the pretrained model LibriTTS_800000.tar from here and put it at ./tts_generation/FastSpeech2/output/ckpt/LibriTTS/800000.pth.tar
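Placing the checkpoint could be scripted as below. The sketch reads the instruction as a plain move and rename of the downloaded file (not an unpack), which is an assumption. place_tts_checkpoint is a hypothetical helper name, and the optional second argument exists only to make the destination overridable.

```shell
# Move a downloaded checkpoint to the location the TTS step expects.
# Usage: place_tts_checkpoint PATH_TO_DOWNLOAD [DEST_DIR]
place_tts_checkpoint() {
    src="$1"
    dest_dir="${2:-./tts_generation/FastSpeech2/output/ckpt/LibriTTS}"
    # Fail early if the download is not where the caller says it is.
    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
    # Create the checkpoint directory if needed, then rename into place.
    mkdir -p "$dest_dir" && mv "$src" "$dest_dir/800000.pth.tar"
}
```

From the repository root, `place_tts_checkpoint ./LibriTTS_800000.tar` would put the file at the path given above.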
If your host doesn't provide access to the nltk hub, you need to install the cmudict and averaged_perceptron_tagger packages manually by following this instruction
python3_tts generate_tts.py --input_filename ./output/texts/text.txt \
--temp_dir ./tmp \
--speaker_id 205 \
--output_dir ./output/tts
If your host doesn't provide access to the OpenAI hub, you need to download RN50.pt, ViT-B-16.pt, and ViT-B-32.pt and put them in ~/.cache/clip
python3_image generate_images.py --input_filename ./output/texts/text.txt \
--devices cuda:0,cuda:1 \
--main_dir ./output/frames
If your host doesn't provide access to the OpenAI hub, you need to download vqvae.pth.tar, prior_level_0.pth.tar, and prior_level_1.pth.tar and put them in ~/.cache/jukebox/models/5b, and put prior_level_2.pth.tar in ~/.cache/jukebox/models/5b_lyrics
python3_music generate_music.py --music_genre country \
--artist john_denver \
--save_path ./output/music \
--sample_len 30
After music generation you will have three different tracks: ./output/music/item_0.wav, ./output/music/item_1.wav, and ./output/music/item_2.wav. Choose one and pass it to --music_filename
python3_video generate_video.py --frame_dir ./output/frames \
--tts_dir ./output/tts \
--music_filename ./output/music/item_0.wav \
--temp_dir ./tmp \
--video_name ./output/video/video.avi \
--quality 6 \
--music_corrector -3