SumanthRH
:shipit:
hmmmst

Highlights

  • Pro

Organizations

@anyscale

  • Sky-T1: Train your own O1 preview model within $450

    Python Apache License 2.0 Updated Jan 29, 2025
  • FastChat Public

    Forked from lm-sys/FastChat

    Fork of FastChat, an open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

    Python Apache License 2.0 Updated Jan 23, 2025
  • gorilla Public

    Forked from ShishirPatil/gorilla

    Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

    Python Apache License 2.0 Updated Jan 19, 2025
  • My personal website, built on top of Wowchemy

    SCSS Updated Jan 16, 2025
  • learnings Public

    Learning dump that I couldn't place anywhere else

    Python 1 Updated Jan 7, 2025
  • trl Public

    Forked from huggingface/trl

    Train transformer language models with reinforcement learning.

    Python Apache License 2.0 Updated Dec 27, 2024
  • 🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision

    Python Apache License 2.0 Updated Dec 19, 2024
  • Python Apache License 2.0 Updated Dec 3, 2024
  • SumanthRH Public

    Updated Nov 29, 2024
  • DeepSpeed Public

    Forked from microsoft/DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Python Apache License 2.0 Updated Nov 28, 2024
  • verl Public

    Forked from volcengine/verl

    veRL: Volcano Engine Reinforcement Learning for LLM

    Python Apache License 2.0 Updated Nov 28, 2024
  • ray Public

    Forked from ray-project/ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Python Apache License 2.0 Updated Nov 20, 2024
  • Efficient Triton Kernels for LLM Training

    Python BSD 2-Clause "Simplified" License Updated Nov 3, 2024
  • entropix Public

    Forked from xjdr-alt/entropix

    Entropy Based Sampling and Parallel CoT Decoding

    TypeScript Apache License 2.0 Updated Oct 13, 2024
  • A comprehensive deep dive into the world of tokens

    Python 215 9 MIT License Updated Jun 24, 2024
  • Python Updated Mar 22, 2024
  • pygloo Public

    Forked from ray-project/pygloo

    Pygloo provides Python bindings for Gloo.

    C++ 1 Apache License 2.0 Updated Feb 15, 2024
  • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Python Apache License 2.0 Updated Feb 6, 2024
  • The official implementation of the paper "Finding Support Examples for In-Context Learning".

    Python Updated Jan 31, 2024
  • peft Public

    Forked from huggingface/peft

    Fork of 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. Our implementation of IA3, a new fine-tuning method, is now part of the official Hugging Face library!

    Python Apache License 2.0 Updated Jan 30, 2024
  • ecco Public

    Forked from jalammar/ecco

    Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, B…

    Jupyter Notebook BSD 3-Clause "New" or "Revised" License Updated Jan 30, 2024
  • My solutions for Advanced Python Mastery (course by @dabeaz)

    Python 11 1 Creative Commons Attribution Share Alike 4.0 International Updated Jan 29, 2024
  • nanotron Public

    Forked from huggingface/nanotron

    Minimalistic large language model 3D-parallelism training

    Python Apache License 2.0 Updated Jan 19, 2024
  • Python Updated Jan 13, 2024
  • llmperf Public

    Forked from ray-project/llmperf

    LLMPerf is a library for validating and benchmarking LLMs

    Jupyter Notebook Apache License 2.0 Updated Jan 12, 2024
  • CUDA related news and material links

    1 MIT License Updated Dec 28, 2023
  • TuPaTE Public

    Forked from JetRunner/TuPaTE

    Code for EMNLP 2022 paper "Efficiently Tuned Parameters are Task Embeddings"

    Python 1 Apache License 2.0 Updated Dec 13, 2023
  • unsloth Public

    Forked from unslothai/unsloth

    5X faster, 50% less memory LLM finetuning

    Python Apache License 2.0 Updated Dec 1, 2023
  • A Text to Meme model that can generate a full meme given user text.

    Python 4 2 Updated Nov 23, 2023
  • ia_3_test Public

    Forked from ChaoGaoUCR/ia_3_test

    Fork of Chao's test with peft IA3. Trying to get to the bottom of IA3 training errors.

    Python Updated Sep 20, 2023