PyTorch native quantization and sparsity for training and inference. (Python; updated Aug 6, 2025)
A modular, accelerator-ready machine learning framework built in Go that speaks float8/16/32/64. Designed with clean architecture, strong typing, and native concurrency for scalable, production-ready AI systems. Ideal for engineers who value simplicity, speed, and maintainability.
Official code for the ICML 2025 paper "ELMO: Efficiency via Low-precision and Peak Memory Optimization in Large Output Spaces".
A library written in C for converting between float8 (8-bit minifloat numbers) and float32 (single-precision floating-point numbers) formats.