Curious George Adventure Creator: AI Children's Book Generator

Welcome to the future of storytelling with the Curious George Adventure Creator! Dive into the playful and adventurous world of Curious George like never before. Our innovative AI model, designed for children and fans of the lovable monkey, brings George's stories to life with just a starting image and a chosen theme.

Here is a video demo of a story generated and voiced by our app.

What is it?

The Curious George Adventure Creator allows you to craft unique episodes featuring the sweet and curious African monkey, Curious George, and his ever-patient friend, "The Man in the Yellow Hat." George's adventures, often filled with playful curiosity and unforeseen trouble, are brought to life through our cutting-edge AI technology.

Key Features

Easy Episode Creation: Simply upload a starting image and select a theme, and our AI model will generate a personalized Curious George episode. Watch George explore, learn, and get into his usual delightful mishaps, all tailored to your input.
Learning and Fun: Each episode emphasizes themes of learning, forgiveness, and curiosity. It's not just entertainment; it's an educational experience wrapped in fun and adventure.
Voice Options:
- Lend Your Voice: Bring a personal touch to your episode by lending your own voice to the characters.
- Audio Library: Choose from a variety of pre-recorded voices, including cloned voices that perfectly match the characters.

Technical Aspects

This project is developed using SEED-Story

SEED-Story, built on SEED-X, is a Multimodal Large Language Model (MLLM) that generates long multimodal stories: rich, coherent narrative text together with images that stay consistent in character and style.

How It Works

Stage 1: Visual Tokenization & De-tokenization
- Pre-train an SD-XL-based de-tokenizer to reconstruct images by taking the features of a pre-trained Vision Transformer (ViT) as inputs.
Stage 2: Multimodal Sequence Training
- Sample an interleaved image-text sequence of a random length.
- Train the MLLM with two objectives: next-word prediction, and image feature regression between the output hidden states of the learnable queries and the ViT features of the target image.
Stage 3: De-tokenizer Adaptation
- The regressed image features from the MLLM are fed into the de-tokenizer for tuning SD-XL, enhancing the consistency of the characters and styles in the generated images.
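
The Stage 2 objective can be illustrated with a toy, dependency-free sketch. It assumes the two terms are a cross-entropy loss for next-word prediction and a mean-squared-error loss for image feature regression, combined with a weight. The MSE choice, the `image_weight` parameter, and every function name here are illustrative assumptions, not taken from the SEED-Story code.

```python
import math
import random
from typing import Sequence


def sample_interleaved_length(max_pairs: int, rng: random.Random) -> int:
    """Pick how many (image, text) pairs to keep: a random length in [1, max_pairs]."""
    return rng.randint(1, max_pairs)


def cross_entropy(logits: Sequence[float], target_index: int) -> float:
    """Next-word prediction loss for one position: -log softmax(logits)[target]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


def feature_mse(query_states: Sequence[float], vit_features: Sequence[float]) -> float:
    """Regression loss between learnable-query hidden states and target ViT features.

    MSE is an assumption made for this sketch.
    """
    if len(query_states) != len(vit_features):
        raise ValueError("feature vectors must have the same length")
    squared = [(q - v) ** 2 for q, v in zip(query_states, vit_features)]
    return sum(squared) / len(squared)


def stage2_loss(
    token_logits: Sequence[Sequence[float]],
    token_targets: Sequence[int],
    query_states: Sequence[float],
    vit_features: Sequence[float],
    image_weight: float = 1.0,  # hypothetical weighting between the two terms
) -> float:
    """Average text loss over positions, plus the weighted image regression loss."""
    text_losses = [cross_entropy(l, t) for l, t in zip(token_logits, token_targets)]
    text_loss = sum(text_losses) / len(text_losses)
    return text_loss + image_weight * feature_mse(query_states, vit_features)


if __name__ == "__main__":
    rng = random.Random(0)
    print("pairs in this sample:", sample_interleaved_length(10, rng))
    print("loss:", stage2_loss([[2.0, 0.5, -1.0]], [0], [0.1, 0.2], [0.0, 0.3]))
```

In a real model the logits and features would be tensors and the losses would be computed per batch; plain lists are used here only to keep the arithmetic visible.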

Given the same initial image but different opening texts, SEED-Story can generate different multimodal stories. For instance, starting with text referencing “the man in the yellow hat” will lead to images that include the character, while omitting this reference will result in a different narrative direction.

Usage

Dependencies

Python >= 3.8 (Anaconda is recommended)
PyTorch >= 2.0.1 (https://pytorch.org/). Recommended builds: torch==2.1.2+cu121 for NVIDIA GPUs, torch 2.4.1 with ROCm 6.1 for AMD GPUs
GPU VRAM > 40 GB
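
Before installing, the requirements above can be checked with a short script. This is a minimal sketch: the thresholds come from the list above, while the function names and the choice to inspect only GPU 0 are assumptions made for illustration. It runs even when PyTorch is not installed.

```python
import re
import sys
from typing import List, Optional, Sequence, Tuple

MIN_PYTHON = (3, 8)
MIN_TORCH = (2, 0, 1)
MIN_VRAM_GB = 40.0


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.1.2+cu121' into (2, 1, 2), ignoring the local build suffix."""
    core = text.split("+", 1)[0]
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_requirements(
    python_version: Sequence[int],
    torch_version: Optional[str],
    vram_gb: Optional[float],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python >= 3.8 is required")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) < MIN_TORCH:
        problems.append(f"PyTorch >= 2.0.1 is required, found {torch_version}")
    if vram_gb is None:
        problems.append("No GPU detected")
    elif vram_gb <= MIN_VRAM_GB:
        problems.append(f"More than 40 GB of VRAM is required, found {vram_gb:.1f} GB")
    return problems


def detect_torch() -> Tuple[Optional[str], Optional[float]]:
    """Return (torch version, VRAM of GPU 0 in GB); None where unavailable."""
    try:
        import torch
    except ImportError:
        return None, None
    if not torch.cuda.is_available():
        return torch.__version__, None
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    return torch.__version__, total_bytes / 1024**3


if __name__ == "__main__":
    version, vram = detect_torch()
    issues = check_requirements(sys.version_info, version, vram)
    for line in issues or ["Environment looks OK"]:
        print(line)
```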

Installation

Clone the repo and install dependent packages:

sudo apt-get update
sudo apt-get upgrade
git clone https://github.com/krishnaadithya/AIChildrensBookGen.git
cd AIChildrensBookGen
pip install --ignore-installed blinker
pip install -r requirements.txt
# AMD GPUs (ROCm 6.1) only:
pip install -U xformers --index-url https://download.pytorch.org/whl/rocm6.1
# NVIDIA GPUs only:
pip install xformers==0.0.23.post1
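
Only one of the two xformers commands applies to a given machine. As a rough sketch, the choice can be made from the installed PyTorch build. The helper names below are hypothetical; the pip arguments are copied from the commands above, and detection relies on `torch.version.hip` being set on ROCm builds and `torch.version.cuda` on CUDA builds.

```python
from typing import List, Optional

ROCM_INDEX_URL = "https://download.pytorch.org/whl/rocm6.1"
NVIDIA_XFORMERS_PIN = "xformers==0.0.23.post1"


def xformers_pip_args(backend: str) -> List[str]:
    """Return the pip command (as an argument list) for the given GPU backend."""
    if backend == "amd":
        return ["pip", "install", "-U", "xformers", "--index-url", ROCM_INDEX_URL]
    if backend == "nvidia":
        return ["pip", "install", NVIDIA_XFORMERS_PIN]
    raise ValueError(f"unsupported backend: {backend!r} (expected 'amd' or 'nvidia')")


def detect_backend() -> Optional[str]:
    """Guess the backend from the installed PyTorch build; None if it cannot be determined."""
    try:
        import torch
    except ImportError:
        return None
    if getattr(torch.version, "hip", None):
        return "amd"
    if getattr(torch.version, "cuda", None):
        return "nvidia"
    return None


if __name__ == "__main__":
    backend = detect_backend()
    if backend is None:
        print("Could not detect a GPU build of PyTorch; pick the command manually.")
    else:
        print(" ".join(xformers_pip_args(backend)))
```

The script only prints the command, so nothing is installed without being reviewed first.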

Model Weights

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
apt-get install git-lfs
# Run the following from the repository root (AIChildrensBookGen)
cd pretrained
git clone https://huggingface.co/Qwen/Qwen-VL-Chat
git clone https://huggingface.co/TencentARC/SEED-Story
mv SEED-Story/* ./
git clone https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
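# Llama 2 is a gated model: use your own Hugging Face username and an access token with permission for it
git clone https://<hf_username>:<hf_token>@huggingface.co/meta-llama/Llama-2-7b-hf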
cd ..
touch .project-root
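
After the downloads finish, a quick check can confirm that the expected layout exists. The sketch below assumes it is given the repository root and looks only for the three cloned folders and the `.project-root` marker named in the commands above. The files moved out of `SEED-Story` are not checked because their names are not listed here, and the function name is hypothetical.

```python
from pathlib import Path
from typing import List

# Folder names taken from the git clone commands above.
EXPECTED_MODEL_DIRS = (
    "Qwen-VL-Chat",
    "stable-diffusion-xl-base-1.0",
    "Llama-2-7b-hf",
)


def missing_weight_paths(repo_root: Path) -> List[str]:
    """Return the expected paths that are absent or empty under repo_root."""
    missing = []
    if not (repo_root / ".project-root").is_file():
        missing.append(".project-root")
    for name in EXPECTED_MODEL_DIRS:
        folder = repo_root / "pretrained" / name
        # An empty folder usually means the clone did not complete.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(f"pretrained/{name}")
    return missing


if __name__ == "__main__":
    problems = missing_weight_paths(Path.cwd())
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All expected weight folders are present.")
```

A folder that is present but contains only Git LFS pointer files would still pass this check, so it is a sanity check on layout rather than on file contents.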

Run the following script to reload the model:

python3 src/tools/reload_qwen_vit.py

App

Experience the magic of Curious George's world like never before. Start creating your own episodes today and let your imagination soar!

Ready to embark on an adventure with Curious George?

python app.py

Video Demo

Here is a video demo of our App.

Based on:

@article{yang2024seedstory,
      title={SEED-Story: Multimodal Long Story Generation with Large Language Model}, 
      author={Shuai Yang and Yuying Ge and Yang Li and Yukang Chen and Yixiao Ge and Ying Shan and Yingcong Chen},
      year={2024},
      journal={arXiv preprint arXiv:2407.08683},
      url={https://arxiv.org/abs/2407.08683}, 
}
