YouTube to notes Convertor #27

Open
wants to merge 3 commits into
base: main

@Legedith Legedith commented Sep 21, 2024

Tune AI PyCon India 2024 Contest

YouTube to notes Convertor

New: we can now generate images and add them to the notes, section by section. The prompts for these images are generated by a model hosted on Tune, whereas the image generation itself is performed by a Tune Assistant (tools/image/tune_img.py).
This is done as a post-processing step, since not all notes need it. When they do, the flow is:

  • Download the YT video and transcribe the audio using Whisper AI.
  • Clean the transcript using an LLM (Tune AI).
  • Parse the transcript and turn it into notes (Tune AI).
  • Add timestamped URLs to the different sections of the notes.
  • Perform OCR on the input slides/images and add the results to the notes.
    (the steps below are optional)
  • Derive a line-to-prompt map using Tune AI
  • Create the images for each prompt.
  • Add the images to the markdown notes.
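
The timestamp and image steps above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function names `timestamped_url`, `add_section_links`, and `insert_images`, the shape of the section-start and line-to-image maps, and the assumption that sections are Markdown headings are all hypothetical. The only external fact relied on is that a YouTube watch URL accepts a `t=<seconds>s` query parameter.

```python
from urllib.parse import urlencode


def timestamped_url(video_id: str, seconds: float) -> str:
    """Build a YouTube watch URL that starts playback at `seconds`."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


def add_section_links(notes: str, section_starts: dict, video_id: str) -> str:
    """Append a timestamp link to each Markdown heading listed in `section_starts`.

    `section_starts` maps a heading title to its start time in seconds
    (hypothetical format; the real project may store this differently).
    """
    out = []
    for line in notes.splitlines():
        title = line.lstrip("#").strip()
        if line.startswith("#") and title in section_starts:
            url = timestamped_url(video_id, section_starts[title])
            line = f"{line} ([watch]({url}))"
        out.append(line)
    return "\n".join(out)


def insert_images(notes: str, line_to_image: dict) -> str:
    """Insert a Markdown image after every line that has a generated image.

    `line_to_image` maps an exact line of the notes to an image path,
    standing in for the line-to-prompt map once the images exist.
    """
    out = []
    for line in notes.splitlines():
        out.append(line)
        if line in line_to_image:
            out.append("")
            out.append(f"![]({line_to_image[line]})")
    return "\n".join(out)
```

Keeping the image step as a separate pass over the finished Markdown matches the description of it as optional post-processing: notes that do not need images simply skip `insert_images`.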

Note: this is a personal project, recently revamped using Tune. There's only so much you can do with a free context length of 20k when you have to process hours and hours of data. With Tune, I can play around with models, change them on the go, and not worry about the context length (as long as I have credits). In my testing, 16k tokens consumed around 2p of credits, which is pretty decent. If I process a 2-hour lecture, some 250k tokens in total, it would cost me around Rs. 3, which is way better than the 3 days I have to wait for Gemini to generate a decent set of notes for me.

Tune AI Products Used (Required)

  • Tune Studio
  • Tune Chat
  • Tune Assistants

Checklist

  • Includes a README
  • Includes instructions on how to run the app

Demo

Refer to these sample notes

Legedith and others added 2 commits September 22, 2024 00:55
- Added a new file `desktop.ini` to the `audio_extractor/samples` directory.
- Created a new file `extractor.py` in the `audio_extractor` directory, which contains an abstract class `AudioExtractor` with an abstract method `extract_text`.
- Added a new file `text_formatter.py` to the `text_formatter` directory, which contains a class `TextFormatter` with an `__init__` method and a `format_text` method.
- Created a new file `img_handler.py` in the `slide_processor/extractors` directory, which contains a class `ImageHandler` with an `__init__` method and a `process_images` method.
- Added a new file `text_cleaner.py` to the `text_formatter` directory, which contains a class `TextCleaner` with a static method `clean_text`.
- Created a new file `markdown_formatter.py` in the `text_formatter` directory, which contains a class `MarkdownFormatter` with an `__init__` method, a `fix_formatting` method, and a `save_file` method.
- Added a new file `fuzzy.py` to the `text/matching` directory, which contains a class `FuzzyMatcher` with a static method `fuzzy_matching`.
- Created a new file `timestamp.py` in the `core/post_processing` directory, which contains a class `TimestampedNoteProcessor` with an `__init__` method and a `process_notes` method.
- Added a new file `pdf_handler.py` to the `slide_processor/extractors` directory, which contains a function `extract_text_from_pdf` that extracts text from a PDF file using PyMuPDF.
- Added a new file `pyproject.toml` to the root directory, which contains project metadata and dependencies.
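
As a rough illustration of two of the pieces listed above, here is a minimal sketch of an abstract `AudioExtractor` with an `extract_text` method and a `FuzzyMatcher` with a static `fuzzy_matching` method. Only the class and method names come from the commit message; the parameters, return types, the `EchoExtractor` subclass, and the use of `difflib.SequenceMatcher` for scoring are assumptions made for this example, and the actual files may be implemented differently.

```python
from abc import ABC, abstractmethod
from difflib import SequenceMatcher


class AudioExtractor(ABC):
    """Interface for anything that turns an audio file into text."""

    @abstractmethod
    def extract_text(self, audio_path: str) -> str:
        """Return the transcript of the audio at `audio_path`."""


class EchoExtractor(AudioExtractor):
    """Hypothetical stand-in for a real (e.g. Whisper-backed) extractor."""

    def extract_text(self, audio_path: str) -> str:
        return f"transcript of {audio_path}"


class FuzzyMatcher:
    @staticmethod
    def fuzzy_matching(query: str, candidates: list, threshold: float = 0.6):
        """Return (best_candidate, score), or None if nothing reaches `threshold`.

        Scoring uses difflib's similarity ratio on lowercased strings;
        this is an assumed technique, not necessarily the PR's.
        """
        best, best_score = None, threshold
        for candidate in candidates:
            score = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
            if score >= best_score:
                best, best_score = candidate, score
        return (best, best_score) if best is not None else None
```

A matcher like this could, for instance, align a section title in the generated notes with the closest transcript segment in order to look up its timestamp.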
@Legedith Legedith changed the title from "Add audio and text processing modules" to "YouTube to notes Convertor" Sep 21, 2024