From 71aa5adb848cfeaf66b06cef5edcb1c52c9a5330 Mon Sep 17 00:00:00 2001
From: sunchaesk <47470853+sunchaesk@users.noreply.github.com>
Date: Wed, 24 Jul 2024 16:42:22 -0400
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 15768080..a48ec282 100644
--- a/README.md
+++ b/README.md
@@ -187,7 +187,7 @@ How to [reproduce](benchmark/flexgen).
 
 ### Latency-Throughput Trade-Off
 The figure below shows the latency and throughput trade-off of three offloading-based systems on OPT-175B (left) and OPT-30B (right).
-FlexGen achieves a new Pareto-optimal frontier with significatnly higher maximum throughput for both models.
+FlexGen achieves a new Pareto-optimal frontier with significantly higher maximum throughput for both models.
 Other systems cannot further increase throughput due to out-of-memory.
 "FlexGen(c)" is FlexGen with compression.
 