saravananbcs/saravananbcs

Contributed to Tango

GitHub stars GitHub forks

Hi there! I am Saravanan Chandrasekaran

AI Engineer @ AgriFood (Govt. of Canada)

I mostly use my own GitLab Server because most of my personal projects are monetized :)

🧑‍💻 Professional Summary:

I am a full-stack developer, machine learning engineer, and Android app developer with over five years of experience in financial technology and software development at TD Bank, Verizon, and Wipro. I've contributed to various projects that have made significant impacts, including:

  • Verizon: Led the R&D team on the "Video assist using BlueJeans" project, generating an additional $1M in revenue for Verizon. Developed agent-assist chatbots for over 75,000 retail store agents. Developed a virtual assistant with Python, React, and AWS that increased customer satisfaction by 74%.

  • TD Bank: Enhanced the TD Bank Android application with improved performance and user experience. Automated test cases with Selenium for Android and iOS devices.

  • Freelancing: Engineered a web application for CoxPHIT LLC using React, Python, MongoDB, and MySQL to integrate advanced data management for psychological health treatment analysis, reducing manual workload by 80%.

🧑‍💻 Active Contribution:


Tango: LLM-guided Diffusion-based Text-to-Audio Generation and DPO-based Alignment

🎵 🔥 🎉 🎉 We are releasing Tango 2, built upon Tango for text-to-audio generation. Tango 2 was initialized with the Tango-full-ft checkpoint and underwent alignment training using DPO on audio-alpaca, a pairwise text-to-audio preference dataset. Download the model, access the demo. The trainer is available in the tango2 directory 🎶

Quickstart on Google Colab

Colab: Open In Colab
Info: Tango_2_Google_Colab_demo.ipynb

Description

TANGO is a latent diffusion model (LDM) for text-to-audio (TTA) generation. TANGO can generate realistic audio, including human sounds, animal sounds, natural and artificial sounds, and sound effects, from textual prompts. We use the frozen instruction-tuned LLM Flan-T5 as the text encoder and train a UNet-based diffusion model for audio generation. We perform comparably to current state-of-the-art models for TTA across both objective and subjective metrics, despite training the LDM on a 63 times smaller dataset. We release our model, training, inference code, and pre-trained checkpoints for the research community.
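
For readers unfamiliar with the DPO alignment step mentioned for Tango 2, here is a minimal sketch of the generic DPO objective for a single preference pair. It is not the Tango 2 trainer: a diffusion model cannot be scored with plain log-likelihoods this way, so the real training objective likely differs in form. The function name `dpo_loss`, its parameters, and the default `beta=0.1` are hypothetical and chosen only for illustration.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) pair. Illustrative only."""
    # How much more the policy favours each sample than the frozen reference.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected

    # Positive when the policy prefers the chosen sample more than the
    # reference model does; beta (assumed value) scales the preference signal.
    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities for one preference pair.
agrees = dpo_loss(-1.0, -3.0, -2.0, -2.0)     # policy ranks the chosen sample higher
disagrees = dpo_loss(-3.0, -1.0, -2.0, -2.0)  # policy ranks the rejected sample higher
print(agrees < disagrees)
```

The loss shrinks as the policy moves probability toward the preferred sample relative to the reference model, which is the sense in which a pairwise preference dataset such as audio-alpaca can steer generation.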

🌱 Interests and Hobbies:

  • Exploring new technologies in AI/ML and software development.
  • Contributing to the open-source community.
  • 🌐 Passionate about leveraging technology to drive positive social impact and address real-world challenges.

  • 🌱 Eager to explore new technologies and domains, with a passion for pushing boundaries and finding innovative solutions.

  • 🌟 Recognized for exceptional problem-solving skills and the ability to thrive in fast-paced, collaborative environments.

  • 🔍 Seeking Fall 2024 internships

Discover more about me and feel free to connect with me here:

📫 How to reach me:

  • LinkedIn

⚡ Tech Stack

🚀 Languages

Python JavaScript Java C C++ SQL Shell Scripting

🧩 Frameworks

Spring Boot ReactJS VueJS Django Flask FastAPI Laravel Express

🗃️ Database Systems & Management

MySQL MongoDB PostgreSQL Amazon RDS Firebase

🧪 Testing

RestAssured TestNG Selenium Appium Robot Framework

🔄 CI/CD

Docker Jenkins GitHub Actions

💻 Other Tools

Postman Unity AWS OpenAPI Grafana Android Studio Git Jira Linux Microsoft Word Microsoft Excel Microsoft Outlook Microsoft PowerPoint SDLC

🖥️ Workspace

Windows Linux macOS
