    Repositories list

    • OLMo

      Public
      Modeling, training, eval, and inference code for OLMo
      Python
      Apache License 2.0
      450000 Updated Oct 6, 2024
    • aimrun

      Public
      simple interface for integrating aim into MLOps frameworks
      Python
      MIT License
      0000 Updated Oct 5, 2024
    • bitlinear

      Public
      BitLinear implementation
      Python
      MIT License
      51801 Updated Oct 5, 2024
    • diffusers

      Public
      🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
      Python
      Apache License 2.0
      5.3k000 Updated Oct 1, 2024
    • organoids

      Public
      Automatic segmentation and analysis of organoids
      Python
      MIT License
      0000 Updated Sep 19, 2024
    • syntheval

      Public
      Software for evaluating the quality of synthetic data compared with real data.
      Python
      MIT License
      3800 Updated Sep 17, 2024
    • dolma

      Public
      Data and tools for generating and inspecting OLMo pre-training data.
      Python
      Apache License 2.0
      105000 Updated Sep 3, 2024
    • nanoT5

      Public
      Fast & Simple repository for pre-training and fine-tuning T5-style models
      Python
      Apache License 2.0
      72102 Updated Aug 28, 2024
    • meta library for synthetic data generation
      Python
      MIT License
      0200 Updated Aug 4, 2024
    • aim

      Public
      Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
      Python
      Apache License 2.0
      316000 Updated Aug 3, 2024
    • cramming

      Public
      Cramming the training of a (BERT-type) language model into limited compute.
      Python
      MIT License
      100000 Updated Jul 4, 2024
    • Research paper supplement and code example of using SynthEval for executing a model benchmark
      Jupyter Notebook
      MIT License
      0000 Updated Jun 10, 2024
    • snopt

      Public
      Sorting Network OPTimizer
      Python
      0000 Updated Apr 27, 2024
    • A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
      Python
      GNU Affero General Public License v3.0
      5.2k000 Updated Apr 12, 2024
    • nanoGPT

      Public
      The simplest, fastest repository for training/finetuning medium-sized GPTs.
      Python
      MIT License
      5.8k000 Updated Mar 24, 2024
    • A subprocess-based reimplementation of parts of Python's multiprocessing library
      Python
      MIT License
      0000 Updated Mar 2, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      27k000 Updated Feb 2, 2024
    • Simple script for downloading YouTube comments without using the YouTube API
      Python
      MIT License
      224000 Updated Jan 10, 2024
    • cair

      Public
      CAIR rubric for privacy metrics
      MIT License
      0100 Updated Jan 5, 2024
    • Scaling study of Synthetic Data Generation models and evaluations
      Python
      0000 Updated Dec 30, 2023
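    • We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. Meanwhile, we created a new branch to build a Tabular LLM. (We unified the interfaces at three levels: rich IFT data (such as CoT data), multiple training-efficiency methods, and multiple LLMs, to build an LLM-IFT research and usage platform that researchers can pick up easily. We welcome open-source enthusiasts to open any meaningful PR on this repo, so that together we can integrate as many LLM-related techniques as possible.)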
      Jupyter Notebook
      Apache License 2.0
      245000 Updated Sep 21, 2023
    • DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
      Python
      MIT License
      51000 Updated May 11, 2023