Convert Audio to meaningful JSON.
This project is a proof of concept.
- Define extractors: each extractor specifies a set of JSON keys and the prompts used to fill them.
- Upload an audio file and choose an extractor.
- Receive JSON at your webhook of choice. (not implemented)
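
To make the idea of an extractor concrete, here is a minimal sketch of how one could be represented. The `Extractor` class, its `fields` attribute, and the example keys are hypothetical names chosen for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Extractor:
    """Hypothetical shape of an extractor: a name plus key-to-prompt pairs."""
    name: str
    # Maps each JSON key to the prompt used to fill it.
    fields: dict[str, str] = field(default_factory=dict)


# Example extractor for a support call (keys and prompts are made up).
call_summary = Extractor(
    name="call_summary",
    fields={
        "customer_name": "What is the customer's name?",
        "issue": "Summarize the customer's problem in one sentence.",
        "resolved": "Was the problem resolved? Answer true or false.",
    },
)
```

With an extractor like this, the resulting JSON would be expected to contain exactly the keys `customer_name`, `issue`, and `resolved`.
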
Demo video: jaison-demo.mp4
Behind the scenes, whenever an audio file is uploaded, it is sent to the rev.ai speech-to-text service.
Once the transcript comes back, we build a prompt that combines the keys and prompts the user specified in the extractor, the complete transcript, and custom instructions that tell GPT to respond in the required format.
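
The prompt assembly described above could look roughly like the sketch below. The function name `build_prompt`, the wording of the instructions, and the JSON skeleton are assumptions made for illustration; the project's real prompt may differ.

```python
import json


def build_prompt(fields: dict[str, str], transcript: str) -> str:
    """Combine extractor fields, the transcript, and format instructions.

    `fields` maps each JSON key to the prompt used to fill it
    (a hypothetical representation of an extractor).
    """
    field_lines = "\n".join(
        f'- "{key}": {question}' for key, question in fields.items()
    )
    # An empty skeleton shows the model the exact shape we expect back.
    skeleton = json.dumps({key: None for key in fields}, indent=2)
    return (
        "Fill in each JSON key below using only the transcript.\n\n"
        f"Keys and instructions:\n{field_lines}\n\n"
        f"Transcript:\n{transcript}\n\n"
        "Respond with a single JSON object of exactly this shape "
        "and nothing else:\n"
        f"{skeleton}"
    )


prompt = build_prompt(
    {"issue": "Summarize the problem in one sentence."},
    "Hi, my router keeps dropping the connection.",
)
```

Including an explicit skeleton of the expected object is one common way to nudge the model toward output that parses cleanly.
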
Once GPT responds, we validate the response. If validation fails, we resubmit the prompt, up to 5 attempts in total, until we get a valid response or give up.
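
The validate-and-retry loop can be sketched as follows. The names `validate`, `extract`, and `complete` are hypothetical, and the validation shown here (valid JSON object with exactly the expected keys) is an assumption; the source only says that "some validations" are run. The model call is passed in as a plain function so the sketch does not depend on any particular GPT client.

```python
import json
from typing import Callable, Optional

MAX_ATTEMPTS = 5  # total attempts, as described above


def validate(raw: str, expected_keys: set[str]) -> Optional[dict]:
    """Return the parsed object if it is valid, otherwise None.

    Assumed checks: the reply parses as JSON, is an object,
    and has exactly the expected keys.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != expected_keys:
        return None
    return data


def extract(
    prompt: str,
    expected_keys: set[str],
    complete: Callable[[str], str],
) -> Optional[dict]:
    """Resubmit the same prompt until a reply validates or attempts run out."""
    for _ in range(MAX_ATTEMPTS):
        result = validate(complete(prompt), expected_keys)
        if result is not None:
            return result
    return None  # give up after MAX_ATTEMPTS invalid replies


# Usage with a fake model that fails once, then returns valid JSON.
replies = iter(["not json", '{"issue": "router drops the connection"}'])
result = extract("example prompt", {"issue"}, lambda _prompt: next(replies))
```

Here the first reply fails to parse, so the prompt is resubmitted and the second reply is accepted.
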