AI-Hypercomputer · qihqi · Jun 17, 2024 · Jun 13, 2024 · FanhaiLu1 · Jun 13, 2024
diff --git a/benchmarks/summary.md b/benchmarks/summary.md
@@ -22,6 +22,8 @@ Date | Device  | dtype | batch size | cache length |max input length |max output
 ----| ------- | ------ |---------- | -------------|-----------------|------------------|----------------------
 2024-05-14 | TPU v5e-8 | bfloat16 | 512 | 2048 | 1024 | 1024 | 8700
 2024-05-14 | TPU v5e-8 | int8 | 1024 | 2048 | 1024 | 1024 | 8746
+2024-06-13 | TPU v5e-1 | bfloat16 | 1024 | 2048 | 1024 | 1024 | 4249
+
 
 ** NOTE: ** Gemma 2B uses `--shard_on_batch` flag so it's data parallel instead
 of model parallel.