AI Novel Prompter can generate writing prompts for novels based on user-specified characteristics.
AI Novel Prompter is a desktop application designed to help writers create consistent and well-structured prompts for AI writing assistants like ChatGPT and Claude. The tool helps manage story elements, character details, and generate properly formatted prompts for continuing your novel.
The executable is in the build/bin directory.
- Task Type Selection: Define and customize different types of writing tasks
- Sample Chapter Management: Store and reference sample chapters for style consistency
- Chapter Content Tabs:
- Story Beats: Plan the main points for your next chapter
- Previous Chapter: Reference the last written chapter
- Future Notes: Keep track of planned future developments
Each category can be edited, saved, and reused across different prompts:
- Rules: Define writing rules and style guidelines
- Characters: Manage character profiles and details
- Locations: Keep track of story locations and their descriptions
- Codex: Store world-building elements and lore
- Dual AI Support:
- ChatGPT-optimized formatting
- Claude-optimized XML formatting
- Real-time Preview: See your formatted prompt as you build it
- Token Counting: Track token usage for AI model limits
- Custom Instructions: Add specific requirements or guidelines
- All data is automatically saved locally
- Categories include:
- Task types
- Sample chapters
- Rules
- Characters
- Locations
- Codex entries
- Clean, Modern Design: Built with shadcn/ui components
- Responsive Layout: Adapts to different window sizes
- Tabbed Interface: Organized content access
- Modal Editors: Easy editing of story elements
- Frontend:
- React
- TypeScript
- Tailwind CSS
- shadcn/ui components
- Backend:
- Go
- Wails framework
- Saves data in the user's home directory under .ai-novel-prompter
- Supports files up to 500KB
- Includes file and folder selection capabilities
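The storage rules above can be sketched in a few lines. This is written in Python for brevity (the application's backend is Go); the helper names are hypothetical, and reading "500KB" as 500 × 1024 bytes is an assumption rather than something the application documents.

```python
from pathlib import Path

# Assumption: "500KB" is interpreted as 500 * 1024 bytes.
MAX_FILE_BYTES = 500 * 1024


def data_dir(home=None):
    """Return the data folder under the user's home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".ai-novel-prompter"


def is_within_limit(size_bytes):
    """True when a file of this size fits the 500KB limit."""
    return 0 <= size_bytes <= MAX_FILE_BYTES


def can_load(path):
    """Check an existing file against the size limit before reading it."""
    p = Path(path)
    return p.is_file() and is_within_limit(p.stat().st_size)
```

Checking the size before reading avoids loading an oversized chapter file into memory only to reject it afterwards.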
# Clone the repository
git clone [repository-url]
# Install frontend dependencies
cd frontend
npm install
# Build and run the application
cd ..
wails dev
To build a redistributable, production-mode package, use wails build:
wails build
The executable is placed in build/bin.
Or generate a Windows installer with:
wails build -nsis
The same can be done for macOS; see the last part of this guide.
The built application will be available in the build directory.
- Initial Setup:
- Define your task types (e.g., "Write Next Chapter", "Revise Chapter")
- Add sample chapters for style reference
- Set up your rules and guidelines
- Creating a Prompt:
- Select your task type
- Reference or add your previous chapter
- Write your story beats
- Select relevant rules, characters, and locations
- Add any custom instructions
- Generating Output:
- Choose between ChatGPT or Claude formatting
- Review the generated prompt
- Copy to clipboard
- Paste into your preferred AI assistant
- The codebase supports easy addition of new selectors and options
- Modal components follow a consistent pattern
- Data persistence is handled automatically
- All components use Tailwind CSS for styling
- UI components can be customized through shadcn/ui
- Prompt formatting can be modified in the promptGenerators utility
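To illustrate the difference between the two output formats, here is a minimal sketch in Python. It is not the promptGenerators utility itself (that lives in the TypeScript frontend); the function names, XML tag names, and Markdown headings are all hypothetical.

```python
from xml.sax.saxutils import escape


def build_claude_prompt(task, beats, rules):
    """Wrap each section in XML-style tags (tag names are hypothetical)."""
    rule_lines = "\n".join(f"  <rule>{escape(r)}</rule>" for r in rules)
    return (
        f"<task>{escape(task)}</task>\n"
        f"<rules>\n{rule_lines}\n</rules>\n"
        f"<story_beats>{escape(beats)}</story_beats>"
    )


def build_chatgpt_prompt(task, beats, rules):
    """Use Markdown headings and bullets (headings are hypothetical)."""
    rule_lines = "\n".join(f"- {r}" for r in rules)
    return f"# Task\n{task}\n\n# Rules\n{rule_lines}\n\n# Story Beats\n{beats}"


if __name__ == "__main__":
    print(build_claude_prompt("Write Next Chapter", "Anna finds the key.", ["Past tense"]))
    print(build_chatgpt_prompt("Write Next Chapter", "Anna finds the key.", ["Past tense"]))
```

Both builders take the same inputs, so a new output format only needs one more function with the same signature.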
- User registration and authentication
- Text creation and management
- Chapter creation and management
- Feedback submission and management
- Prompt generation based on traits
- Integration with a local ollama service
- Based on the Berry template (https://codedthemes.gitbook.io/berry)
- Inspired by Jason Hamilton's YouTube channel (https://www.youtube.com/@TheNerdyNovelist)
- Frontend:
- React
- TypeScript
- Axios
- React Router
- React Toastify
- Backend:
- Go
- Gin Web Framework
- GORM (Go ORM)
- PostgreSQL
Before running the application, make sure you have the following installed:
- Node.js (v18 or higher)
- Go (v1.18 or higher)
- PostgreSQL
- Docker
- Docker Compose
- Clone the repository:
git clone https://github.com/danielsobrado/ainovelprompter.git
- Navigate to the project directory:
cd ainovelprompter
- Set up the backend:
- Navigate to the server directory:
cd server
- Install the Go dependencies:
go mod download
- Update the config.yaml file with your database configuration.
- Run the database migrations:
go run cmd/main.go migrate
- Start the backend server:
go run cmd/main.go
- Set up the frontend:
- Navigate to the client directory:
cd ../client
- Install the frontend dependencies:
npm install
- Start the frontend development server:
npm start
- Open your web browser and visit http://localhost:3000 to access the application.
- Clone the repository:
git clone https://github.com/danielsobrado/ainovelprompter.git
- Navigate to the project directory:
cd ainovelprompter
- Update the docker-compose.yml file with your database configuration.
- Start the application using Docker Compose:
docker-compose up -d
- Open your web browser and visit http://localhost:3000 to access the application.
- Backend configuration can be modified in the server/config.yaml file.
- Frontend configuration can be modified in the client/src/config.ts file.
To build the frontend for production, run the following command in the client directory:
npm run build
The production-ready files will be generated in the client/build directory.
This small guide provides instructions on how to install PostgreSQL on the Windows Subsystem for Linux (WSL), along with steps to manage user permissions and troubleshoot common issues.
- Windows 10 or higher with WSL enabled (or a native Ubuntu installation).
- Basic familiarity with Linux command line and SQL.
- Open WSL Terminal: Launch your WSL distribution (Ubuntu recommended).
- Update Packages:
sudo apt update
- Install PostgreSQL:
sudo apt install postgresql postgresql-contrib
- Check Installation:
psql --version
- Set a Password for the postgres System User:
sudo passwd postgres
- Create Database:
createdb mydb
- Access Database:
psql mydb
- Import Tables from SQL File:
psql -U postgres -q mydb < /path/to/file.sql
- List Databases and Tables:
\l # List databases
\dt # List tables in the current database
- Switch Database:
\c dbname
Create New User:
CREATE USER your_db_user WITH PASSWORD 'your_db_password';
-
Grant Privileges:
ALTER USER your_db_user CREATEDB;
-
Role Does Not Exist Error: Switch to the 'postgres' user:
sudo -i -u postgres createdb your_db_name
-
Permission Denied to Create Extension: Login as 'postgres' and execute:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-
Unknown User Error: Ensure you are using a recognized system user or correctly refer to a PostgreSQL user within the SQL environment, not via
sudo
.
To generate custom training data for fine-tuning a language model to emulate the writing style of George MacDonald, the process begins by obtaining the full text of one of his novels, "The Princess and the Goblin," from Project Gutenberg. The text is then broken down into individual story beats or key moments using a prompt that instructs the AI to generate a JSON object for each beat, capturing the author, emotional tone, type of writing, and the actual text excerpt.
Next, GPT-4 is used to rewrite each of these story beats in its own words, generating a parallel set of JSON data with unique identifiers linking each rewritten beat to its original counterpart. To simplify the data and make it more useful for training, the wide variety of emotional tones is mapped to a smaller set of core tones using a Python function. The two JSON files (original and rewritten beats) are then used to generate training prompts, where the model is asked to rephrase the GPT-4 generated text in the style of the original author. Finally, these prompts and their target outputs are formatted into JSONL and JSON files, ready to be used for fine-tuning the language model to capture MacDonald's distinctive writing style.
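The tone-mapping and prompt-building steps described above might look like the following sketch. The field names (`id`, `author`, `tone`, `text`), the tone table, and the prompt wording are hypothetical; the actual scripts may use different keys and a larger mapping.

```python
import json

# Hypothetical mapping from fine-grained tones to a small set of core tones.
CORE_TONES = {
    "gleeful": "joy",
    "cheerful": "joy",
    "ominous": "fear",
    "dread": "fear",
    "mournful": "sadness",
}


def map_tone(tone, default="neutral"):
    """Collapse a free-form tone label into one of a few core tones."""
    return CORE_TONES.get(tone.strip().lower(), default)


def build_training_records(original_beats, rewritten_beats):
    """Pair each rewritten beat with its original via the shared id."""
    originals = {beat["id"]: beat for beat in original_beats}
    records = []
    for rewritten in rewritten_beats:
        original = originals.get(rewritten["id"])
        if original is None:
            continue  # no original counterpart, so nothing to learn from
        prompt = (
            f"Rewrite the following text in the style of {original['author']} "
            f"with a {map_tone(original['tone'])} tone:\n{rewritten['text']}"
        )
        # The original excerpt is the target the model should learn to produce.
        records.append({"prompt": prompt, "completion": original["text"]})
    return records


def to_jsonl(records):
    """Serialize one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

The rewritten text goes into the prompt and the author's own text becomes the completion, which is what teaches the model to move from plain prose toward the target style.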
In the previous example, the process of generating paraphrased text using a language model involved some manual tasks. The user had to manually provide the input text, run the script, and then review the generated output to ensure its quality. If the output did not meet the desired criteria, the user would need to manually retry the generation process with different parameters or make adjustments to the input text.
However, with the updated version of the process_text_file function, the entire process has been fully automated. The function takes care of reading the input text file, splitting it into paragraphs, and automatically sending each paragraph to the language model for paraphrasing. It incorporates various checks and retry mechanisms to handle cases where the generated output does not meet the specified criteria, such as containing unwanted phrases, being too short or too long, or consisting of multiple paragraphs.
The automation process includes several key features:
- Resuming from the last processed paragraph: If the script is interrupted or needs to be run multiple times, it automatically checks the output file and resumes processing from the last successfully paraphrased paragraph. This ensures that progress is not lost and the script can pick up where it left off.
- Retry mechanism with random seed and temperature: If a generated paraphrase fails to meet the specified criteria, the script automatically retries the generation process up to a specified number of times. With each retry, it randomly changes the seed and temperature values to introduce variation in the generated responses, increasing the chances of obtaining a satisfactory output.
- Progress saving: The script saves the progress to the output file every specified number of paragraphs (e.g., every 500 paragraphs). This safeguards against data loss in case of any interruptions or errors during the processing of a large text file.
- Detailed logging and summary: The script provides detailed logging information, including the input paragraph, generated output, retry attempts, and reasons for failure. It also generates a summary at the end, displaying the total number of paragraphs, successfully paraphrased paragraphs, skipped paragraphs, and the total number of retries.
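The resume, retry, and summary behaviour described above can be sketched as follows. This is not the actual process_text_file implementation: the model call is replaced by a `generate(paragraph, seed, temperature)` callable you supply, and the phrase list, length ratios, and temperature range are assumptions chosen for illustration.

```python
import random

# Hypothetical filter list; adjust to the phrases your model tends to emit.
UNWANTED_PHRASES = ("as an ai", "here is a")


def is_acceptable(source, output, min_ratio=0.5, max_ratio=2.0):
    """Reject empty, multi-paragraph, off-length, or boilerplate outputs."""
    text = output.strip()
    if not text or "\n\n" in text:
        return False
    if any(phrase in text.lower() for phrase in UNWANTED_PHRASES):
        return False
    ratio = len(text) / max(1, len(source))
    return min_ratio <= ratio <= max_ratio


def paraphrase_with_retry(paragraph, generate, max_retries=3, rng=None):
    """Retry with a fresh random seed and temperature until the checks pass."""
    rng = rng or random.Random()
    for attempt in range(1, max_retries + 1):
        seed = rng.randint(0, 2**31 - 1)
        temperature = round(rng.uniform(0.3, 1.0), 2)
        output = generate(paragraph, seed, temperature)
        if is_acceptable(paragraph, output):
            return output.strip(), attempt
    return None, max_retries


def process_paragraphs(paragraphs, generate, done=0, max_retries=3):
    """Skip the first `done` paragraphs (already processed) and summarize."""
    results, skipped, retries = [], 0, 0
    for paragraph in paragraphs[done:]:
        output, attempts = paraphrase_with_retry(paragraph, generate, max_retries)
        retries += attempts - 1
        if output is None:
            skipped += 1
        else:
            results.append(output)
    summary = {"total": len(paragraphs), "written": len(results),
               "skipped": skipped, "retries": retries}
    return results, summary
```

In a real run, `done` would come from counting the paragraphs already present in the output file, and `results` would be appended to that file at the chosen save interval.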
Generating Custom Training Data to Fine-Tune a Language Model with Local LLM and LM Studio using ORPO
This step generates custom ORPO training data for fine-tuning a language model to emulate the writing style of George MacDonald.
The input data should be in JSONL format, with each line containing a JSON object that includes the prompt and chosen response (these can be taken from the previous fine-tuning step). To use the script, you need to set up the OpenAI client with your API key and specify the input and output file paths. Running the script will process the JSONL file and generate a CSV file with columns for the prompt, chosen response, and a generated rejected response. The script saves progress every 100 lines and can resume from where it left off if interrupted. Upon completion, it provides a summary of the total lines processed, written lines, skipped lines, and retry details.
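A minimal sketch of the JSONL-to-CSV conversion follows. The model call is abstracted as a `generate_rejected(prompt)` callable (hypothetical) so the example runs without an API key; the key names `prompt` and `chosen` follow the description above, while the skip rules are assumptions.

```python
import csv
import io
import json


def jsonl_to_orpo_rows(jsonl_text, generate_rejected):
    """Turn JSONL lines with prompt/chosen into prompt/chosen/rejected rows."""
    rows, skipped = [], 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        try:
            record = json.loads(line)
            prompt, chosen = record["prompt"], record["chosen"]
        except (json.JSONDecodeError, KeyError, TypeError):
            skipped += 1  # malformed line or missing field
            continue
        rejected = generate_rejected(prompt)
        if not rejected or rejected.strip() == chosen.strip():
            skipped += 1  # a rejected answer must exist and differ from chosen
            continue
        rows.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return rows, skipped


def rows_to_csv(rows):
    """Write the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["prompt", "chosen", "rejected"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Using `csv.DictWriter` rather than manual string joining keeps prompts that contain commas, quotes, or newlines correctly escaped.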
- Dataset Quality Matters: 95% of outcomes depend on dataset quality. A clean dataset is essential since even a little bad data can hurt the model.
- Manual Data Review: Cleaning and evaluating the dataset can greatly improve the model. This is a time-consuming but necessary step because no amount of parameter adjusting can fix a defective dataset.
- Training parameters should not be expected to improve the model, only to prevent its degradation. With a robust dataset, the goal is to steer the model while avoiding negative side effects. There is no single optimal learning rate.
- Model Scale and Hardware Limitations: Larger models (33B parameters) may enable better fine-tuning but require at least 48GB of VRAM, making them impractical for the majority of home setups.
- Gradient Accumulation and Batch Size: Gradient accumulation helps reduce overfitting by enhancing generalisation across different datasets, but it may lower quality after a few batches.
- The size of the dataset matters more when fine-tuning a base model than when fine-tuning an already well-tuned model. Overloading a well-tuned model with excessive data might degrade its previous fine-tuning.
- An ideal learning rate schedule starts with a warmup phase, holds steady for an epoch, and then gradually decreases using a cosine schedule.
- Model Rank and Generalisation: The number of trainable parameters affects the model's detail and generalisation. Lower-rank models generalise better but lose detail.
- LoRA's Applicability: Parameter-Efficient Fine-Tuning (PEFT) is applicable to large language models (LLMs) and systems like Stable Diffusion (SD), demonstrating its versatility.
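The warmup, hold, and cosine-decay schedule mentioned in the list above can be written as a small function. This is a sketch; the parameter names are hypothetical, the linear shape of the warmup is an assumption, and `hold_steps` would be set to the number of optimizer steps in one epoch.

```python
import math


def learning_rate(step, peak_lr, warmup_steps, hold_steps, total_steps, min_lr=0.0):
    """Linear warmup, constant hold, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up linearly so the first update is not taken at full rate.
        return peak_lr * (step + 1) / warmup_steps
    if step < warmup_steps + hold_steps:
        return peak_lr
    decay_steps = max(1, total_steps - warmup_steps - hold_steps)
    progress = min(1.0, (step - warmup_steps - hold_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 9, 50, 110, 160, 210):
        print(s, learning_rate(s, 1e-4, warmup_steps=10, hold_steps=100, total_steps=210))
```

Most training frameworks accept a per-step multiplier or a custom scheduler, so a function of this shape can usually be plugged in directly.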
The Unsloth community has helped resolve several issues with finetuning Llama3. Here are some key points to keep in mind:
- Double BOS tokens: Double BOS tokens during finetuning can break things. Unsloth automatically fixes this issue.
- GGUF conversion: GGUF conversion is broken. Be careful of double BOS and use CPU instead of GPU for conversion. Unsloth has built-in automatic GGUF conversions.
- Buggy base weights: Some of Llama 3's base (not instruct) weights are "buggy" (untrained): <|reserved_special_token_{0->250}|> <|eot_id|> <|start_header_id|> <|end_header_id|>. This can cause NaNs and buggy results. Unsloth automatically fixes this.
- System prompt: According to the Unsloth community, adding a system prompt makes finetuning of the Instruct version (and possibly the base version) much better.
- Quantization issues: Quantization issues are common. See this comparison which shows that you can get good performance with Llama3, but using the wrong quantization can hurt performance. For finetuning, use bitsandbytes nf4 to boost accuracy. For GGUF, use the I versions as much as possible.
- Long context models: Long context models are poorly trained. They simply extend the RoPE theta, sometimes without any training, and then train on a weird concatenated dataset to make it a long dataset. This approach does not work well. A smooth, continuous long context scaling would have been much better if scaling from 8K to 1M context length.
To resolve some of these issues, use Unsloth for finetuning Llama3.
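If you are not using Unsloth, the double-BOS problem can at least be detected on formatted training text before tokenization. One way it can arise is a chat template that writes the BOS marker while the tokenizer adds another. The sketch below works on plain strings; `<|begin_of_text|>` is Llama 3's BOS token, and the function names are hypothetical.

```python
BOS = "<|begin_of_text|>"  # Llama 3's beginning-of-sequence token


def count_leading_bos(text, bos=BOS):
    """Count how many times the BOS marker repeats at the start of the text."""
    if not bos:
        return 0
    count = 0
    while text.startswith(bos, count * len(bos)):
        count += 1
    return count


def strip_duplicate_bos(text, bos=BOS):
    """Keep at most one leading BOS marker."""
    count = count_leading_bos(text, bos)
    if count <= 1:
        return text
    return bos + text[count * len(bos):]
```

Running `count_leading_bos` over a sample of the formatted dataset is a cheap sanity check before starting a long training run.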
When fine-tuning a language model for paraphrasing in an author's style, it's important to evaluate the quality and effectiveness of the generated paraphrases.
The following evaluation metrics can be used to assess the model's performance:
- BLEU (Bilingual Evaluation Understudy):
- BLEU measures the n-gram overlap between the generated paraphrase and the reference text. The score is defined on a scale from 0 to 1; sacrebleu reports it scaled to 0 to 100.
- To calculate BLEU scores, you can use the sacrebleu library in Python.
- Example usage:
from sacrebleu import corpus_bleu; bleu_score = corpus_bleu(generated_paraphrases, [original_paragraphs])
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation):
- ROUGE measures the overlap of n-grams between the generated paraphrase and the reference text, focusing on recall.
- To calculate ROUGE scores, you can use the rouge library in Python.
- Example usage:
from rouge import Rouge; rouge = Rouge(); scores = rouge.get_scores(generated_paraphrases, original_paragraphs)
- Perplexity:
- Perplexity quantifies the uncertainty or confusion of the model when generating text. Lower values mean the model finds the text more predictable.
- To calculate perplexity, you can use the fine-tuned language model itself.
- Example usage:
perplexity = model.perplexity(generated_paraphrases)
- Stylometric Measures:
- Stylometric measures capture the writing style characteristics of the target author.
- To extract stylometric features, you can use the stylometry library in Python.
- Example usage:
from stylometry import extract_features; features = extract_features(generated_paraphrases)
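The `model.perplexity(...)` call in the usage examples is shorthand rather than a method of a specific library. Perplexity is the exponential of the average negative log-likelihood per token, so it can be computed from per-token log-probabilities, assuming your inference stack can return them (natural log). A minimal sketch:

```python
import math


def perplexity(token_logprobs):
    """Exponential of the mean negative log-probability (natural log) per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token has perplexity 4.
    print(perplexity([math.log(0.25)] * 4))
```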
To integrate these evaluation metrics into your Axolotl pipeline, follow these steps:
- Prepare your training data by creating a dataset of paragraphs from the target author's works and splitting it into training and validation sets.
- Fine-tune your language model using the training set, following the approach discussed earlier.
- Generate paraphrases for the paragraphs in the validation set using the fine-tuned model.
- Implement the evaluation metrics using the respective libraries (sacrebleu, rouge, stylometry) and calculate the scores for each generated paraphrase.
- Perform human evaluation by collecting ratings and feedback from human evaluators.
- Analyze the evaluation results to assess the quality and style of the generated paraphrases and make informed decisions to improve your fine-tuning process.
Here's an example of how you can integrate these metrics into your pipeline:
from sacrebleu import corpus_bleu
from rouge import Rouge
from stylometry import extract_features
# Fine-tune the model using the training set
fine_tuned_model = train_model(training_data)
# Generate paraphrases for the validation set
generated_paraphrases = generate_paraphrases(fine_tuned_model, validation_data)
# Calculate evaluation metrics
bleu_score = corpus_bleu(generated_paraphrases, [original_paragraphs])
rouge = Rouge()
rouge_scores = rouge.get_scores(generated_paraphrases, original_paragraphs)
perplexity = fine_tuned_model.perplexity(generated_paraphrases)
stylometric_features = extract_features(generated_paraphrases)
# Perform human evaluation
human_scores = collect_human_evaluations(generated_paraphrases)
# Analyze and interpret the results
analyze_results(bleu_score, rouge_scores, perplexity, stylometric_features, human_scores)
Remember to install the necessary libraries (sacrebleu, rouge, stylometry) and adapt the code to fit your implementation in Axolotl or similar.
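If a stylometry package is not available in your environment, a few basic style features can be computed with the standard library alone. The sketch below is a simplified stand-in, not the stylometry library's API; the feature set and the regular expressions are assumptions chosen for illustration.

```python
import re


def stylometric_features(text):
    """Compute average sentence length, average word length, and type-token ratio."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        "avg_sentence_len": len(words) / len(sentences),  # words per sentence
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),  # vocabulary variety
    }
```

Comparing these numbers between the generated paraphrases and the author's original paragraphs gives a rough signal of how close the style is, alongside the overlap metrics.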
In this experiment, I explored the capabilities and differences between various AI models in generating a 1500-word text based on a detailed prompt. I tested models from https://chat.lmsys.org/, ChatGPT 4, Claude 3 Opus, and some local models in LM Studio. Each model generated the text three times to observe variability in their outputs. I also created a separate prompt for evaluating the writing of the first iteration from each model and asked ChatGPT 4 and Claude 3 Opus to provide feedback.
Through this process, I observed that some models exhibit higher variability between executions, while others tend to use similar wording. There were also significant differences in the number of words generated and the amount of dialogue, descriptions, and paragraphs produced by each model. The evaluation feedback revealed that ChatGPT suggests a more "refined" prose, while Claude recommends less purple prose. Based on these findings, I compiled a list of takeaways to incorporate into the next prompt, focusing on precision, varied sentence structures, strong verbs, unique twists on fantasy motifs, consistent tone, distinct narrator voice, and engaging pacing. Another technique to consider is asking for feedback and then rewriting the text based on that feedback.
I'm open to collaborating with others to further fine-tune prompts for each model and explore their capabilities in creative writing tasks.
- Direct Instructions:
- Use clean, specific, and direct commands.
- Avoid verbosity and unnecessary phrases.
- Adjective Management:
- Be cautious with adjectives; they may influence the model's response inappropriately.
- Delimiters and Markdown:
- Use backticks, brackets, or markdown to separate distinct parts of the text.
- Markdown helps structure and segregate sections effectively.
- Structured Formats:
- Utilize JSON, markdown, HTML, etc., for input and output.
- Constrain output using JSON schema when necessary.
- Few-shot Examples:
- Provide few-shot examples from various niches to avoid overfitting.
- Use these examples to "teach" the model steps in a process.
- Chain-of-Thought:
- Implement chain-of-thought prompts to improve reasoning and procedural understanding.
- Break down tasks into steps and guide the model through them.
- Description Before Completion:
- Prompt the model to describe entities before answering.
- Ensure that description doesn’t bleed into completion unintentionally.
- Context Management:
- Provide essential context only, avoid unstructured paragraph dumps.
- Direct the model towards the desired answer with sufficient but concise context.
- Testing and Verification:
- Test prompts multiple times to catch unexpected outputs.
- Use completion ranking for relevance, clarity, and coherence.
- Use Stories:
- Control output with storytelling techniques.
- For example, write a narrative that includes the desired output format.
- GBNF Grammars:
- Explore GBNF grammars to constrain and control model output.
- Read and Refine:
- Review and refine generated prompts to remove unnecessary phrases and ensure clarity.
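Several of the practices above (delimiters, few-shot examples, structured output) can be combined in one small helper. This is a sketch: the `[CONTEXT]` delimiter, the section headings, and the function names are hypothetical, and the output check only verifies required keys rather than a full JSON schema.

```python
import json


def build_prompt(instruction, context, examples):
    """Separate instruction, few-shot examples, and context with delimiters."""
    shots = "\n".join(
        f"Input: {inp}\nOutput: {json.dumps(out)}" for inp, out in examples
    )
    return (
        f"{instruction}\n"
        "Respond with a single JSON object and nothing else.\n\n"
        f"### Examples\n{shots}\n\n"
        f"[CONTEXT]\n{context}\n[/CONTEXT]"
    )


def parse_constrained_output(raw, required_keys):
    """Parse the reply as JSON and check that the required keys are present."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model added prose or produced invalid JSON
    if not isinstance(data, dict) or not set(required_keys) <= data.keys():
        return None
    return data
```

Returning `None` for anything that is not clean JSON makes it easy to re-run the prompt, which fits the advice to test prompts multiple times.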
Models have inherent formatting biases. Some models prefer hyphens for lists, others asterisks. When using these models, it's helpful to mirror their preferences for consistent outputs.
- Formatting Tendencies:
- Llama 3 prefers lists with bolded headings and asterisks.
- Example:
**Bolded Title Case Heading**

* List items with asterisks after two newlines
* List items separated by one newline

**Next List**

* More list items
* Etc...
- Few-shot Examples:
- Llama 3 follows both system prompts and few-shot examples.
- It is flexible with prompting methods but may quote few-shot examples verbatim.
- System Prompt Adherence:
- Llama 3 responds well to system prompts with detailed instructions.
- Combining system prompts and few-shot examples yields better results.
- Context Window:
- The current context window is small, limiting the use of extensive few-shot examples.
- This may be addressed in future updates.
- Censorship:
- The instruct version has some censorship but is less restricted than previous versions.
- Intelligence:
- Performs well in zero-shot chain-of-thought reasoning.
- Capable of understanding and adapting to varied inputs.
- Consistency:
- Generally consistent but may directly quote examples.
- Performance can degrade with higher temperatures.
- Lists and Formatting:
- Use the preferred list format for better accuracy.
- Explicitly instruct Llama 3 on desired output formats if different from its default.
- Chat Settings:
- Suitable for tasks requiring intelligence and instruction following.
- Limited by context window for large tasks.
- Pipeline Settings:
- Effective for GPT-4 style pipelines using system prompts.
- Context window limitations restrict some tasks.
Llama 3 is flexible and intelligent but has context and quoting limitations. Adjust prompting methods accordingly.
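Combining a system prompt with few-shot examples, as recommended above, means laying the turns out in the Llama 3 Instruct chat format. The sketch below builds that string by hand; the function name and arguments are hypothetical, and in practice your tokenizer's chat template may do this for you. If your tokenizer already adds a BOS token, drop the leading `<|begin_of_text|>` to avoid the double-BOS problem.

```python
def llama3_prompt(system, turns, user):
    """Format a system prompt, few-shot turns, and the final user message."""

    def block(role, content):
        return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

    parts = ["<|begin_of_text|>", block("system", system)]
    for example_user, example_assistant in turns:
        # Each few-shot example is a completed user/assistant exchange.
        parts.append(block("user", example_user))
        parts.append(block("assistant", example_assistant))
    parts.append(block("user", user))
    # Leave the assistant header open so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```

Because the context window is limited, keep the few-shot turns short and vary them so the model is less likely to quote one verbatim.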
All comments are welcome. Open an issue or send a pull request if you find any bugs or have recommendations for improvement.
This project is licensed under the Attribution-NonCommercial-NoDerivatives (BY-NC-ND) license. See: https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en