SumanthRH

hmmmst

Sumanth R Hegde SumanthRH

Software Engineer, Anyscale. Intensity is all you need.

60 followers · 0 following

Anyscale
San Francisco, CA
sumanthrh.com
@sumanthrh

SkyThought Public
Forked from NovaSky-AI/SkyThought

Sky-T1: Train your own O1 preview model within $450

Python Apache License 2.0 Updated Jan 29, 2025
FastChat Public
Forked from lm-sys/FastChat

Fork of FastChat, an open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Python Apache License 2.0 Updated Jan 23, 2025
gorilla Public
Forked from ShishirPatil/gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python Apache License 2.0 Updated Jan 19, 2025
personal-website Public

My personal website, built on top of Wowchemy

SCSS Updated Jan 16, 2025
learnings Public

Learning dump that I couldn't place anywhere else

Python 1 Updated Jan 7, 2025
trl Public
Forked from huggingface/trl

Train transformer language models with reinforcement learning.

Python Apache License 2.0 Updated Dec 27, 2024
accelerate Public
Forked from huggingface/accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision

Python Apache License 2.0 Updated Dec 19, 2024
open-instruct Public
Forked from allenai/open-instruct

Python Apache License 2.0 Updated Dec 3, 2024
SumanthRH Public

Updated Nov 29, 2024
DeepSpeed Public
Forked from microsoft/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python Apache License 2.0 Updated Nov 28, 2024
verl Public
Forked from volcengine/verl

veRL: Volcano Engine Reinforcement Learning for LLM

Python Apache License 2.0 Updated Nov 28, 2024
ray Public
Forked from ray-project/ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python Apache License 2.0 Updated Nov 20, 2024
Liger-Kernel Public
Forked from linkedin/Liger-Kernel

Efficient Triton Kernels for LLM Training

Python BSD 2-Clause "Simplified" License Updated Nov 3, 2024
entropix Public
Forked from xjdr-alt/entropix

Entropy Based Sampling and Parallel CoT Decoding

TypeScript Apache License 2.0 Updated Oct 13, 2024
tokenization Public

A comprehensive deep dive into the world of tokens

Python 215 9 MIT License Updated Jun 24, 2024
finetune-benchmarking Public

Python Updated Mar 22, 2024
pygloo Public
Forked from ray-project/pygloo

Pygloo provides Python bindings for Gloo.

C++ 1 Apache License 2.0 Updated Feb 15, 2024
transformers Public
Forked from huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python Apache License 2.0 Updated Feb 6, 2024
ICL_Support_Example Public
Forked from LeeSureman/ICL_Support_Example

The official implementation of the paper "Finding Support Examples for In-Context Learning".

Python Updated Jan 31, 2024
peft Public
Forked from huggingface/peft

Fork of 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. Our implementation of IA3, a new fine-tuning method, is now part of the official Hugging Face library!

Python Apache License 2.0 Updated Jan 30, 2024
ecco Public
Forked from jalammar/ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, B…

Jupyter Notebook BSD 3-Clause "New" or "Revised" License Updated Jan 30, 2024
python-mastery Public
Forked from dabeaz-course/python-mastery

My solutions for Advanced Python Mastery (course by @dabeaz)

Python 11 1 Creative Commons Attribution Share Alike 4.0 International Updated Jan 29, 2024
nanotron Public
Forked from huggingface/nanotron

Minimalistic large language model 3D-parallelism training

Python Apache License 2.0 Updated Jan 19, 2024
cudamodelecture1 Public
Forked from gpu-mode/profiling-cuda-in-torch

Python Updated Jan 13, 2024
llmperf Public
Forked from ray-project/llmperf

LLMPerf is a library for validating and benchmarking LLMs

Jupyter Notebook Apache License 2.0 Updated Jan 12, 2024
cuda-resource-stream Public
Forked from gpu-mode/resource-stream

CUDA related news and material links

1 MIT License Updated Dec 28, 2023
TuPaTE Public
Forked from JetRunner/TuPaTE

Code for EMNLP 2022 paper "Efficiently Tuned Parameters are Task Embeddings"

Python 1 Apache License 2.0 Updated Dec 13, 2023
unsloth Public
Forked from unslothai/unsloth

5X faster 50% less memory LLM finetuning

Python Apache License 2.0 Updated Dec 1, 2023
text-to-meme Public

A Text to Meme model that can generate a full meme given user text.

Python 4 2 Updated Nov 23, 2023
ia_3_test Public
Forked from ChaoGaoUCR/ia_3_test

Fork of Chao's test with peft IA^3. Trying to get to the bottom of IA3 training errors.

Python Updated Sep 20, 2023
