Hi there! I am Saravanan Chandrasekaran

Contributed to Tango

AI Engineer @ AgriFood (Govt. of Canada)

I mostly use my own GitLab Server because most of my personal projects are monetized :)

🧑‍💻 Professional Summary:

I am a full-stack developer, machine learning engineer, and Android app developer with over five years of experience in financial technology and software development at TD Bank, Verizon, and Wipro. I've contributed to various projects that have made significant impacts, including:

Verizon: Led the R&D team on the "Video assist using BlueJeans" project, generating an additional $1M in revenue for Verizon. Developed agent-assist chatbots for over 75,000 retail store agents. Developed a virtual assistant with Python, React, and AWS that increased customer satisfaction by 74%.
TD Bank: Enhanced the TD Bank Android application with improved performance and user experience. Automated test cases with Selenium for Android and iOS devices.
Freelancing: Engineered a web application for CoxPHIT LLC using React, Python, MongoDB, and MySQL to integrate advanced data management for psychological health treatment analysis, reducing manual workload by 80%.

🧑‍💻 Active Contribution:

Tango: LLM-guided Diffusion-based Text-to-Audio Generation and DPO-based Alignment

🎵 🔥 🎉 🎉 We are releasing Tango 2, built upon Tango for text-to-audio generation. Tango 2 was initialized with the Tango-full-ft checkpoint and underwent alignment training using DPO on audio-alpaca, a pairwise text-to-audio preference dataset. Download the model and access the demo. The trainer is available in the tango2 directory. 🎶
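As a rough illustration of what alignment on pairwise preferences optimizes, here is a minimal sketch of the standard DPO loss for a single (preferred, rejected) pair. This is not Tango 2's trainer: the function name, the argument names, and the `beta` default are assumptions made for this example, and diffusion variants of DPO typically substitute denoising errors for the log-likelihoods used here.

```python
import math


def dpo_loss(policy_logp_win, policy_logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    Each argument is a log-likelihood of the preferred ("win") or rejected
    ("lose") sample under the trainable policy or the frozen reference model.
    """
    # How much more the policy favors the preferred sample than the reference does.
    margin = (policy_logp_win - ref_logp_win) - (policy_logp_lose - ref_logp_lose)
    # -log(sigmoid(beta * margin)), computed stably as softplus(-beta * margin).
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy matches the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# When the policy favors the preferred sample more than the reference, the loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss only depends on how the policy's preference shifts relative to the reference model, which is why training starts from an existing checkpoint and keeps a frozen copy of it.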

Quickstart on Google Colab

Colab	Info
	Tango_2_Google_Colab_demo.ipynb

Description

TANGO is a latent diffusion model (LDM) for text-to-audio (TTA) generation. TANGO can generate realistic audio, including human sounds, animal sounds, natural and artificial sounds, and sound effects, from textual prompts. We use the frozen instruction-tuned LLM Flan-T5 as the text encoder and train a UNet-based diffusion model for audio generation. We perform comparably to current state-of-the-art models for TTA across both objective and subjective metrics, despite training the LDM on a dataset 63 times smaller. We release our model, training and inference code, and pre-trained checkpoints for the research community.
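To make the "latent diffusion" part concrete, the sketch below shows the generic DDPM-style forward noising step that a denoising UNet is trained to invert. It is an illustration under stated assumptions, not TANGO's code: the linear beta schedule, its endpoints, the function names, and the flat list standing in for an audio latent are all hypothetical.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(latent, t, alpha_bar, rng):
    """Sample z_t = sqrt(a_t) * z_0 + sqrt(1 - a_t) * eps and return (z_t, eps)."""
    eps = [rng.gauss(0.0, 1.0) for _ in latent]
    a = alpha_bar[t]
    noisy = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(latent, eps)]
    return noisy, eps


schedule = linear_alpha_bar(1000)
# A toy two-element "latent"; a real audio latent would be a multi-dimensional tensor.
noisy_latent, true_noise = add_noise([1.0, -2.0], 10, schedule, random.Random(0))
```

During training, a denoiser would receive the noisy latent, the step index, and the text encoder's output, and would be optimized to predict the noise that was added; generation runs the process in reverse from pure noise.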

🌱 Interests and Hobbies:

Exploring new technologies in AI/ML and software development.
Contributing to the open-source community.

🌐 Passionate about leveraging technology to drive positive social impact and address real-world challenges.
🌱 Eager to explore new technologies and domains, and to push boundaries with innovative solutions.
🌟 Recognized for exceptional problem-solving skills and the ability to thrive in fast-paced, collaborative environments.
🔍 Seeking Fall 2024 internships