From 08a95ac5767a18c6ddc22fb9aa0f02ee4b95e52f Mon Sep 17 00:00:00 2001
From: Thien Tran
Date: Tue, 14 May 2024 01:10:24 +0000
Subject: [PATCH] add note about FP6 kernel

---
 torchao/csrc/fp6_llm/README.md | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 torchao/csrc/fp6_llm/README.md

diff --git a/torchao/csrc/fp6_llm/README.md b/torchao/csrc/fp6_llm/README.md
new file mode 100644
index 000000000..ff764cc27
--- /dev/null
+++ b/torchao/csrc/fp6_llm/README.md
@@ -0,0 +1,7 @@
+# FP6-LLM kernel
+
+This kernel is adapted from https://github.com/usyd-fsalab/fp6_llm. It performs the linear op (A @ W.T), where A is in FP16 and W is in FP6 (E3M2, with no infinities or NaNs).
+
+On most hardware, this kernel is faster than an FP16 linear for batch sizes from 1 to 128, and slower for batch sizes of 256 or larger. See https://github.com/usyd-fsalab/fp6_llm/issues/8 for a detailed discussion.
+
+See https://github.com/pytorch/ao/pull/223 for some benchmark results.
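
For illustration, here is a minimal PyTorch sketch of the semantics the README describes: decode the E3M2 codes to floats, then compute A @ W.T in FP16. This is a sketch under stated assumptions, not the kernel itself; the function names are made up, and it assumes one 6-bit code in the low bits of each uint8 element rather than the packed weight layout the CUDA kernel actually consumes.

```python
import torch

def fp6_e3m2_to_fp32(codes: torch.Tensor) -> torch.Tensor:
    """Decode E3M2 codes (1 sign, 3 exponent, 2 mantissa bits; no inf/NaN) to FP32."""
    bias = 3                                             # 2**(3-1) - 1
    sign = 1.0 - 2.0 * ((codes >> 5) & 0b1).float()      # sign bit -> +1.0 / -1.0
    exp = ((codes >> 2) & 0b111).float()
    man = (codes & 0b11).float()
    normal = (1.0 + man / 4.0) * torch.exp2(exp - bias)  # exp > 0: implicit leading 1
    subnormal = (man / 4.0) * 2.0 ** (1 - bias)          # exp == 0: subnormals and zero
    return sign * torch.where(exp > 0, normal, subnormal)

def fp6_linear_reference(A: torch.Tensor, W_fp6: torch.Tensor) -> torch.Tensor:
    """A: (M, K) FP16 activations; W_fp6: (N, K) uint8, one E3M2 code per element."""
    return A @ fp6_e3m2_to_fp32(W_fp6).half().T
```

Unlike this sketch, the actual CUDA kernel does not materialize the FP16 weight matrix: the FP6 weights are stored pre-packed and dequantized to FP16 on the fly inside the GEMM, which is presumably why it wins at small batch sizes, where the op is bound by weight memory traffic.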