ProtosAI

A Study in Artificial Intelligence

This project consists of a collection of scripts that explore capabilities provided by neural networks (NN), generative pre-trained transformers (GPT) and large language models (LLM). Most of these scripts are based on models hosted by Hugging Face (https://huggingface.co/).

Google Colab Example: ProtosAI.ipynb

Classes:

Large Language Models: How do they work? (YouTube Video) (Notebook)
Retrieval Augmented Generation (RAG) (YouTube Video) (Notebook)

Setup

Setup required for these scripts:

# Requirements
pip install transformers datasets
pip install torch

Note that during the first run, the library downloads the model required to process the inputs.

Sentiment Analysis

The sentiment.py script prompts the user for a line of text and uses a model to determine the sentiment of the text (positive, neutral or negative).

Enter some text (or empty to end): I love you.
Sentiment score: [{'label': 'positive', 'score': 0.9286843538284302}]

Enter some text (or empty to end): I am sad.
Sentiment score: [{'label': 'negative', 'score': 0.7978498935699463}]

Enter some text (or empty to end): I hate dirty pots.
Sentiment score: [{'label': 'negative', 'score': 0.9309694170951843}]

Enter some text (or empty to end): Don't move!
Sentiment score: [{'label': 'neutral', 'score': 0.6040788292884827}]
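
A minimal sketch of how a script like sentiment.py could be structured, assuming the Hugging Face transformers pipeline API. The model name is an assumption (picked because it emits positive/neutral/negative labels like the output above) and may not be the one the script uses; `classify_lines` and `main` are illustrative names.

```python
# Assumption: any sentiment model with positive/neutral/negative labels would work here.
MODEL_NAME = "cardiffnlp/twitter-roberta-base-sentiment-latest"


def classify_lines(lines, classifier):
    """Yield (text, result) for each line, stopping at the first empty line."""
    for text in lines:
        if not text.strip():
            return
        yield text, classifier(text)


def main():
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=MODEL_NAME)
    # iter(callable, sentinel) keeps prompting until the user enters an empty line.
    prompts = iter(lambda: input("Enter some text (or empty to end): "), "")
    for _, result in classify_lines(prompts, classifier):
        print(f"Sentiment score: {result}\n")
```

Calling `main()` starts the interactive loop. Passing the classifier in as an argument keeps the loop logic independent of the model, so it can be exercised with a stub.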

Summarization

The summary.py script takes a text file input and uses the summarization model to produce a single paragraph summary.

$ python3 summary.py pottery.txt                                     
Loading transformer...

Reading pottery.txt...
Number of lines: 14
Number of words: 566
Number of characters: 3416

Summarizing...
Text:  The key to becoming a great artist, writer, musician, etc., is to keep creating!
Keep drawing, keep writing, keep playing! Quality emerges from the quantity of practice
and continuous learning that makes them more perfect . The prize of perfection comes by
delivering and learning, says Jason Cox .
Number of lines: 1
Number of words: 49
Number of characters: 299
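
The flow above (read the file, report its size, summarize, report the summary's size) can be sketched as follows. This assumes a transformers version that provides the `summarization` pipeline with its default model; the function names are illustrative and not taken from summary.py.

```python
def text_stats(text):
    """Return (lines, words, characters) counts like those printed above."""
    return len(text.splitlines()), len(text.split()), len(text)


def print_stats(text):
    lines, words, chars = text_stats(text)
    print(f"Number of lines: {lines}")
    print(f"Number of words: {words}")
    print(f"Number of characters: {chars}")


def summarize_file(path, summarizer):
    """Read a text file, report its size, and return a single-paragraph summary."""
    print(f"Reading {path}...")
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    print_stats(text)

    print("\nSummarizing...")
    # truncation=True clips input that exceeds the model's maximum input length.
    summary = summarizer(text, truncation=True)[0]["summary_text"]
    print("Text:", summary)
    print_stats(summary)
    return summary


def main(path):
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    print("Loading transformer...")
    summarize_file(path, pipeline("summarization"))
```

For example, `main("pottery.txt")` would run the whole flow on the sample file.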

Transcribe

The transcribe.py script takes an audio file (mp3 or wav) and uses a speech model to produce a basic text transcription. An additional tool, record.py, uses your laptop's microphone to record your dictation into audio.wav, which can then be passed to transcribe.py.

# Requirements - MacOS
brew install ffmpeg   

# Requirements - Ubuntu Linux
sudo apt install ffmpeg

$ python3 transcribe.py test.wav

Loading model...

Transcribing test.wav...
HELLO THIS IS A TEST
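
A minimal sketch of this kind of transcription, assuming the transformers `automatic-speech-recognition` pipeline. The model name is an assumption (wav2vec2 models produce uppercase text like the output above) and may differ from what transcribe.py loads. Decoding audio from a file path relies on ffmpeg, hence the requirement above.

```python
# Assumption: a wav2vec2 checkpoint; transcribe.py may use a different speech model.
MODEL_NAME = "facebook/wav2vec2-base-960h"


def transcribe(path, recognizer):
    """Return the transcription text for an mp3 or wav file."""
    if not path.lower().endswith((".mp3", ".wav")):
        raise ValueError(f"Expected an mp3 or wav file, got: {path}")
    # The pipeline returns a dict whose "text" entry holds the transcription.
    return recognizer(path)["text"]


def main(path):
    # Imported lazily so transcribe() can be used with any compatible callable.
    from transformers import pipeline

    print("Loading model...")
    recognizer = pipeline("automatic-speech-recognition", model=MODEL_NAME)
    print(f"\nTranscribing {path}...")
    print(transcribe(path, recognizer))
```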

Text to Speech

The speech.py script converts a text string into an audio file. The script requires additional libraries:

# Requirements MacOS
brew install portaudio  

# Requirements Ubuntu Linux
sudo apt install portaudio19-dev
sudo apt install python3-pyaudio

pip install espnet torchaudio sentencepiece pyaudio

$ python3 speech.py

Loading models...

Converting text to speech...

Writing to audio.wav...

Speaking: Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.

output.mp4

Speech to Text

OpenAI's Whisper model can also be used for transcription. Sample scripts are located in the whisper folder.

Convert MP3 audio files to Text - transcribe-mp3.py
Convert YouTube videos to Text - transcribe-youtube.py
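
As a rough illustration of what an MP3 transcription script might do, here is a sketch that assumes the `openai-whisper` package (`pip install openai-whisper`) and ffmpeg are installed. The function names and the `"base"` model size are illustrative choices, not taken from the scripts in the whisper folder.

```python
def format_segments(segments):
    """Render Whisper segments as '[start -> end] text' lines."""
    return [
        f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text'].strip()}"
        for seg in segments
    ]


def transcribe_mp3(path, model_size="base"):
    """Transcribe an audio file and return (full_text, timestamped_lines)."""
    # Imported lazily so format_segments() works without whisper installed.
    import whisper

    model = whisper.load_model(model_size)
    result = model.transcribe(path)
    return result["text"].strip(), format_segments(result["segments"])
```

Larger model sizes generally trade speed for accuracy, so `"base"` is only a convenient starting point.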

Voice Cloning

Several models and toolkits are emerging that let you build your own speech model from sample speech. One of them is the TTS Python package by coqui-ai: https://github.com/coqui-ai/TTS

# Install TTS
pip install TTS

Example (TBD)

from TTS.api import TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)

# generate speech by cloning a voice using default settings
tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
                file_path="output.wav",
                speaker_wav="/path/to/target/speaker.wav",
                language="en")

Handwriting to Text

The handwriting.py script converts an image of a single line of handwritten text into a text string.

# Requirements
pip install image

$ python3 handwriting.py test.png
Converting image to text: test.png

Loading transformer...
 * microsoft/trocr-base-handwritten

Analyzing handwriting from test.png...

Resulting text:
This is a test-Can you read this?
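
A minimal sketch of the recognition step, assuming the `microsoft/trocr-base-handwritten` model shown in the output above is loaded through the transformers `TrOCRProcessor` and `VisionEncoderDecoderModel` classes, with Pillow used to read the image. The function names are illustrative and not taken from handwriting.py.

```python
MODEL_NAME = "microsoft/trocr-base-handwritten"


def image_to_text(image, processor, model):
    """Convert one image of a single handwritten line into a string."""
    # The processor resizes and normalizes the image into model input tensors.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


def read_handwriting(path):
    # Imported lazily so image_to_text() can be exercised without the heavy dependencies.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(MODEL_NAME)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_NAME)
    # TrOCR expects three-channel input, so convert grayscale or RGBA images to RGB.
    image = Image.open(path).convert("RGB")
    return image_to_text(image, processor, model)
```

For example, `print(read_handwriting("test.png"))` would print the recognized line.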

Large Language Models (LLM)

Experiments with different LLMs are located in the llm folder. The goal of this section is to explore building, training, tuning and using these models.

BiGram - This experiment uses an introductory training model based on the "Let's build a GPT from scratch" video by Andrej Karpathy.
nanoGPT - Similar to the above, but using nanoGPT, Andrej Karpathy's small GPT implementation.
LLaMA - The llama.cpp project's goal is to run LLaMA models using integer quantization, which makes these LLMs usable on small local machines such as a MacBook.
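
To illustrate the idea behind the BiGram experiment, here is a character-level bigram model in plain Python. It is a count-based stand-in, not the neural network version built in the video or the code in the llm folder: it records which character follows which, then samples from those counts. The function names are illustrative.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Sample up to `length` more characters, each conditioned only on the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last character was never seen with a successor
        chars, weights = zip(*followers.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)
```

Because each prediction looks only one character back, the generated text captures local letter patterns but no longer-range structure, which is the limitation that attention-based models such as nanoGPT address.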

OpenAI Test

The openai.py script prompts the OpenAI gpt-3.5 model and prints the response.

# Requirements
pip install openai

# Test
$ python3 openai.py
What do you want to ask? Can you say something to inspire engineers?

Answer: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Of course! Here's a quote to inspire engineers:\n\n\"Engineering is not only about creating solutions, it's about creating a better world. Every time you solve a problem, you make the world a little bit better.\" - Unknown\n\nAs an engineer, you have the power to make a positive impact on society through your work. Whether you're designing new technologies, improving existing systems, or solving complex problems, your contributions are essential to advancing our world. So keep pushing the boundaries of what's possible, and never forget the impact that your work can have on the world around you.",
        "role": "assistant"
      }
    }
  ],
  "created": 1685856679,
  "id": "chatcmpl-7Nach0z2sJQ5FzZOVl6jZWPU4O6zV",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 117,
    "prompt_tokens": 26,
    "total_tokens": 143
  }
}
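
The answer text sits inside the nested response at `choices[0].message.content`. The sketch below shows one way to send a question and pull out just that text. It assumes version 1.0 or later of the `openai` package and an `OPENAI_API_KEY` environment variable, which may differ from what the script in this repository uses; `extract_answer` and `ask` are illustrative names.

```python
def extract_answer(response):
    """Return the assistant's text from a chat completion shaped like the JSON above."""
    return response["choices"][0]["message"]["content"]


def ask(question, model="gpt-3.5-turbo"):
    # Assumption: openai>=1.0, which reads the API key from OPENAI_API_KEY.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    # model_dump() converts the response object into a plain dict.
    return extract_answer(response.model_dump())
```

If you try this sketch, save it under a name other than openai.py so that `from openai import OpenAI` resolves to the installed package and not to the script itself.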

GPT-2 Text Generation

The gpt-2.py script uses the gpt2-xl model to generate text based on a prompt.

$ python3 gpt-2.py

[{'generated_text': "Hello, I'm a language model, but what I do you need to know isn't that hard. But if you want to understand us, you"}, {'generated_text': "Hello, I'm a language model, this is my first commit and I'd like to get some feedback to see if I understand this commit.\n"}, {'generated_text': "Hello, I'm a language model, and I'll guide you on your journey!\n\nLet's get to it.\n\nBefore we start"}, {'generated_text': 'Hello, I\'m a language model, not a developer." If everything you\'re learning about code is through books, you\'ll never get to know about'}, {'generated_text': 'Hello, I\'m a language model, please tell me what you think!" – I started out on this track, and now I am doing a lot'}]
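
A minimal sketch that would produce output of this shape, assuming the transformers `text-generation` pipeline. The prompt matches the output above, while the seed, `max_length` and the helper names are assumptions, not values taken from gpt-2.py.

```python
def generated_texts(outputs):
    """Extract the plain strings from the pipeline's list of result dicts."""
    return [item["generated_text"] for item in outputs]


def main(prompt="Hello, I'm a language model,"):
    # Imported lazily so generated_texts() works without transformers installed.
    from transformers import pipeline, set_seed

    generator = pipeline("text-generation", model="gpt2-xl")
    set_seed(42)  # assumption: fixed seed so repeated runs sample the same text
    # do_sample=True is required to get several different continuations.
    outputs = generator(prompt, max_length=30, num_return_sequences=5, do_sample=True)
    for text in generated_texts(outputs):
        print(text)
        print("---")
```

Printing each continuation separately is easier to read than the raw list of dicts shown above.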
