YouTube to notes Convertor #27

Legedith · 2024-09-21T19:28:33Z

Tune AI PyCon India 2024 Contest

YouTube to notes Convertor

New: we can now generate images and add them to the notes, section by section. The prompts for these images are generated by a model hosted on Tune, whereas the image generation is performed by a Tune Assistant (tools/image/tune_img.py).
This is done as a post-processing step, as not all notes might need it. When they do, the flow is:

Download the YouTube video and transcribe the audio using Whisper AI.
Clean the transcript using an LLM (Tune AI).
Parse the transcript and turn it into notes (Tune AI).
Add timestamped URLs to different sections of the notes.
Perform OCR on the input slides/images and add them to the notes.
(the steps below are optional)
Derive a line-to-prompt map using Tune AI
Create the images for each prompt.
Add the images to the markdown notes.
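
As a rough illustration of the timestamped-URL step above, here is a minimal sketch. It assumes the transcript is available as `(start_seconds, text)` segments and that note sections are markdown `##` headings. The helper names (`timestamp_url`, `best_segment`, `link_heading`) are hypothetical and not taken from this PR, and the matching uses the standard library's `difflib` rather than whatever the project's `FuzzyMatcher` does.

```python
from difflib import SequenceMatcher
from urllib.parse import urlencode


def timestamp_url(video_id: str, seconds: float) -> str:
    """Build a YouTube watch URL that starts playback at `seconds`."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


def best_segment(heading: str, segments: list[tuple[float, str]]) -> tuple[float, str]:
    """Return the (start_seconds, text) segment most similar to `heading`."""
    return max(
        segments,
        key=lambda seg: SequenceMatcher(None, heading.lower(), seg[1].lower()).ratio(),
    )


def link_heading(heading: str, video_id: str, segments: list[tuple[float, str]]) -> str:
    """Turn a section heading into a markdown heading linked to the matching timestamp."""
    start, _ = best_segment(heading, segments)
    return f"## [{heading}]({timestamp_url(video_id, start)})"


if __name__ == "__main__":
    # Made-up segments and video id, purely for illustration.
    demo_segments = [
        (0.0, "welcome and introductions"),
        (95.7, "gradient descent explained"),
        (300.0, "questions from the audience"),
    ]
    print(link_heading("Gradient Descent", "abc123", demo_segments))
```

Matching a heading against single segments is deliberately crude; a real pipeline would likely match against windows of several segments and apply a similarity threshold before linking.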

Note: this is a personal project, recently revamped using Tune. There's only so much you can do with a free context length of 20k when you have to process hours and hours of data. With Tune, I'm able to play around with models, change them on the go, and not worry about the context length (as long as I have credits). In my testing, 16k tokens consumed around 2p of credits, which is pretty decent. If I process a 2-hour lecture, some 250k tokens in total, it would cost me around Rs. 3, which is way better than the 3 days that I have to wait for Gemini to generate a decent set of notes for me.

Tune AI Products Used (Required)

Tune Studio
Tune Chat
Tune Assistants

Checklist

Includes a README
Includes instructions on how to run the app

Demo

Refer to these sample notes

- Added a new file `desktop.ini` to the `audio_extractor/samples` directory.
- Created a new file `extractor.py` in the `audio_extractor` directory, which contains an abstract class `AudioExtractor` with an abstract method `extract_text`.
- Added a new file `text_formatter.py` to the `text_formatter` directory, which contains a class `TextFormatter` with an `__init__` method and a `format_text` method.
- Created a new file `img_handler.py` in the `slide_processor/extractors` directory, which contains a class `ImageHandler` with an `__init__` method and a `process_images` method.
- Added a new file `text_cleaner.py` to the `text_formatter` directory, which contains a class `TextCleaner` with a static method `clean_text`.
- Created a new file `markdown_formatter.py` in the `text_formatter` directory, which contains a class `MarkdownFormatter` with an `__init__` method, a `fix_formatting` method, and a `save_file` method.
- Added a new file `fuzzy.py` to the `text/matching` directory, which contains a class `FuzzyMatcher` with a static method `fuzzy_matching`.
- Created a new file `timestamp.py` in the `core/post_processing` directory, which contains a class `TimestampedNoteProcessor` with an `__init__` method and a `process_notes` method.
- Added a new file `pdf_handler.py` to the `slide_processor/extractors` directory, which contains a function `extract_text_from_pdf` that extracts text from a PDF file using PyMuPDF.
- Added a new file `pyproject.toml` to the root directory, which contains project metadata and dependencies.
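
For readers unfamiliar with the abstract-class pattern mentioned for `extractor.py`, a minimal sketch might look like the following. Only the names `AudioExtractor` and `extract_text` come from the summary above; the `audio_path` parameter, the `str` return type, and the `StaticExtractor` subclass are assumptions made for illustration, not the PR's actual code.

```python
from abc import ABC, abstractmethod


class AudioExtractor(ABC):
    """Base class: every concrete extractor must implement extract_text."""

    @abstractmethod
    def extract_text(self, audio_path: str) -> str:
        """Return the transcript for the audio file at `audio_path` (assumed signature)."""


class StaticExtractor(AudioExtractor):
    """Hypothetical stand-in that returns a fixed transcript, handy for tests."""

    def __init__(self, transcript: str) -> None:
        self._transcript = transcript

    def extract_text(self, audio_path: str) -> str:
        # A real subclass would run a speech-to-text model on audio_path here.
        return self._transcript


if __name__ == "__main__":
    extractor = StaticExtractor("hello from the lecture")
    print(extractor.extract_text("lecture.mp3"))
```

Keeping the interface abstract means a different transcription backend can be swapped in without touching the cleaning and note-generation steps that consume the transcript.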

Legedith and others added 2 commits September 22, 2024 00:55

Update README.md

e5d3836

Legedith changed the title ~~Add audio and text processing modules~~ YouTube to notes Convertor Sep 21, 2024

Add image support using Tune Assistant

23f8c89

abhishekmishragithub added the mergeable label Oct 1, 2024
