Commit f980ba8

[DOCS] supported llm data update 24.4 (#27254)

kblaszczak-intel authored Oct 28, 2024
1 parent 3e9030f commit f980ba8

Showing 3 changed files with 152 additions and 27 deletions.
@@ -3,9 +3,11 @@ Most Efficient Large Language Models for AI PC

 This page is regularly updated to help you identify the best-performing LLMs on the
 Intel® Core™ Ultra processor family and AI PCs.
+The current data is as of OpenVINO 2024.4, 24 Oct. 2024.

-The tables below list key performance indicators for a selection of Large Language Models,
-running on an Intel® Core™ Ultra 7-165H based system, on built-in GPUs.
+The tables below list the key performance indicators for a selection of Large Language Models,
+running on Intel® Core™ Ultra 7-165H, Intel® Core™ Ultra 7-265V, and Intel® Core™ Ultra
+7-288V based systems, on built-in GPUs.



@@ -34,18 +36,17 @@ running on an Intel® Core™ Ultra 7-165H based system, on built-in GPUs.
 All models listed here were tested with the following parameters:

 * Framework: PyTorch
-* Model precision: INT4
 * Beam: 1
 * Batch size: 1
 .. grid-item::

-   .. button-link:: https://docs.openvino.ai/2024/_static/benchmarks_files/OV-2024.4-platform_list.pdf
+   .. button-link:: https://docs.openvino.ai/2024/_static/benchmarks_files/llm_models_platform_list_.pdf
       :color: primary
       :outline:
       :expand:

-      :material-regular:`download;1.5em` Get full system info [PDF]
+      :material-regular:`download;1.5em` Get system descriptions [PDF]

    .. button-link:: ../../_static/benchmarks_files/llm_models.csv
       :color: primary
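The parameters above describe single-stream greedy decoding (beam 1, batch size 1).
As a rough illustration of how the two KPIs in the tables can be measured, here is a
minimal sketch using the Hugging Face transformers API. It is not the harness that
produced the published numbers, and the model ID is a placeholder::

    import time

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer("Explain what an AI PC is.", return_tensors="pt")

    with torch.no_grad():
        # 1st-token latency: time to generate exactly one new token.
        start = time.perf_counter()
        model.generate(**inputs, max_new_tokens=1, do_sample=False, num_beams=1)
        first_ms = (time.perf_counter() - start) * 1000

        # Full run: generate 128 new tokens (batch size 1, greedy).
        start = time.perf_counter()
        model.generate(**inputs, max_new_tokens=128, do_sample=False, num_beams=1)
        total_s = time.perf_counter() - start

    # 2nd-token throughput: average rate over the 127 tokens after the first.
    tok_per_sec = 127 / (total_s - first_ms / 1000)
    print(f"1st token latency: {first_ms:.0f} ms")
    print(f"2nd token throughput: {tok_per_sec:.1f} tok/s")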
168 changes: 146 additions & 22 deletions docs/sphinx_setup/_static/benchmarks_files/llm_models.csv
@@ -1,22 +1,146 @@
Model name,"Throughput (tokens/sec, 2nd token)",1st token latency (ms),Max RSS memory used (MB),Input tokens,Output tokens
OPT-2.7b,"20.2",2757,7084,937,128
Phi-3-mini-4k-instruct,"19.9",2776,7028,1062,128
Orca-mini-3b,"19.2",2966,7032,1024,128
Phi-2,"17.8",2162,7032,1024,128
Stable-Zephyr-3b-dpo,"17.0",1791,7007,946,128
ChatGLM3-6b,"16.5",3569,6741,1024,128
Dolly-v2-3b,"15.8",6891,6731,1024,128
Stablelm-3b-4e1t,"15.7",2051,7018,1024,128
Red-Pajama-Incite-Chat-3b-V1,"14.8",6582,7028,1020,128
Falcon-7b-instruct,"14.5",4552,7033,1049,128
Codegen25-7b,"13.3",3982,6732,1024,128
GPT-j-6b,"13.2",7213,6882,1024,128
Stablelm-7b,"12.8",6339,7013,1020,128
Llama-3-8b,"12.8",4356,6953,1024,128
Llama-2-7b-chat,"12.3",4205,6906,1024,128
Llama-7b,"11.7",4315,6927,1024,128
Mistral-7b-v0.1,"10.5",4462,7242,1007,128
Zephyr-7b-beta,"10.5",4500,7039,1024,128
Qwen1.5-7b-chat,"9.9",4318,7034,1024,128
Baichuan2-7b-chat,"9.8",4668,6724,1024,128
Qwen-7b-chat,"9.0",5141,6996,1024,128
Topology,Precision,Input size (tokens),Max RSS memory (MB),1st latency (ms),2nd latency (ms),2nd tok/sec
opt-125m-gptq,INT4-MIXED,1024,1610.2,146,9.4,106.38
opt-125m-gptq,INT4-MIXED,32,1087.6,60.8,9.5,105.26
tiny-llama-1.1b-chat,INT4-MIXED,32,1977,85.7,20.2,49.50
tiny-llama-1.1b-chat,INT4-MIXED,1024,1940.8,367.7,20.3,49.26
tiny-llama-1.1b-chat,INT8-CW,32,1855.2,70.2,21.8,45.87
qwen2-0.5b,INT4-MIXED,1024,3029.3,226.4,22.3,44.84
qwen2-0.5b,INT8-CW,1024,3093,222,22.3,44.84
qwen2-0.5b,FP16,1024,2509.5,234.3,22.4,44.64
qwen2-0.5b,FP16,32,1933.8,146.4,22.4,44.64
tiny-llama-1.1b-chat,INT8-CW,1024,2288.3,368.6,22.9,43.67
qwen2-0.5b,INT4-MIXED,32,2670.9,115.1,23,43.48
qwen2-0.5b,INT8-CW,32,2530,157.9,24.3,41.15
red-pajama-incite-chat-3b-v1,INT4-MIXED,32,2677.3,186.1,27.9,35.84
qwen2-1.5b,INT4-MIXED,32,4515.1,179.8,28.7,34.84
qwen2-1.5b,INT4-MIXED,1024,4927.5,254.3,29.1,34.36
dolly-v2-3b,INT4-MIXED,32,2420.9,245.6,30.8,32.47
qwen2-1.5b,INT8-CW,32,4824.9,165.1,31.2,32.05
phi-2,INT4-MIXED,32,2523.5,233.9,31.5,31.75
qwen2-1.5b,INT8-CW,1024,5401.8,331.1,32,31.25
stable-zephyr-3b-dpo,INT4-MIXED,30,2816.2,151.3,32.9,30.40
red-pajama-incite-chat-3b-v1,INT4-MIXED,1020,2646.7,860.6,33,30.30
opt-2.7b,INT4-MIXED,31,2814.5,174.7,33.1,30.21
phi-2,INT4-MIXED,32,2363.6,236.6,34,29.41
stablelm-3b-4e1t,INT4-MIXED,32,3079.1,220,34,29.41
minicpm-1b-sft,INT4-MIXED,31,2971,185.1,34.1,29.33
minicpm-1b-sft,INT8-CW,31,3103.6,233.5,34.3,29.15
dolly-v2-3b,INT4-MIXED,1024,2152.3,876.6,34.7,28.82
phi-3-mini-4k-instruct,INT4-MIXED,38,2951,155.4,35.9,27.86
phi-2,INT4-MIXED,1024,2689.9,971.7,36.5,27.40
stablelm-3b-4e1t,INT4-MIXED,1024,3335.9,519.3,37.3,26.81
opt-2.7b,INT4-MIXED,937,3227.5,639.5,37.7,26.53
phi-3-mini-4k-instruct,INT4-MIXED,38,3289.7,161,37.9,26.39
gemma-2b-it,INT4-MIXED,32,4099.6,258.6,38,26.32
tiny-llama-1.1b-chat,FP16,32,3098.7,143.9,38.2,26.18
stable-zephyr-3b-dpo,INT4-MIXED,946,3548.5,453.9,38.8,25.77
tiny-llama-1.1b-chat,FP16,1024,3388.6,523,39,25.64
phi-2,INT4-MIXED,1024,2594.7,964.2,39.1,25.58
minicpm-1b-sft,FP16,31,3597.7,164.8,39.8,25.13
gemma-2b-it,INT4-MIXED,1024,5059.1,669.1,40.5,24.69
phi-3-mini-4k-instruct,INT4-MIXED,1061,3431.8,840.1,40.6,24.63
phi-3-mini-4k-instruct,INT4-MIXED,1061,3555.6,836.3,41.8,23.92
qwen2-1.5b,FP16,32,3979.4,111.8,42.5,23.53
red-pajama-incite-chat-3b-v1,INT8-CW,32,3639.9,199.1,43.6,22.94
qwen2-1.5b,FP16,1024,4569.8,250.5,44.1,22.68
dolly-v2-3b,INT8-CW,32,3727,248.2,44.5,22.47
opt-2.7b,INT8-CW,31,3746.3,175.6,44.6,22.42
stablelm-3b-4e1t,INT8-CW,32,3651.3,178,45.4,22.03
chatglm3-6b,INT4-MIXED,32,4050.3,88.1,47.4,21.10
phi-2,INT8-CW,32,3608.7,232,48.3,20.70
red-pajama-incite-chat-3b-v1,INT8-CW,1020,2951,816.6,48.4,20.66
stablelm-3b-4e1t,INT8-CW,1024,4142.8,658.7,48.5,20.62
opt-2.7b,INT8-CW,937,4019,640.7,48.8,20.49
stable-zephyr-3b-dpo,INT8-CW,30,3264.5,150.7,48.8,20.49
gemma-2b-it,INT8-CW,32,4874.7,249.4,48.9,20.45
chatglm3-6b,INT4-MIXED,32,3902.1,84.9,49.5,20.20
dolly-v2-3b,INT8-CW,1024,2931.4,865.2,49.7,20.12
gemma-2b-it,INT8-CW,1024,5834,545.4,50.7,19.72
vicuna-7b-v1.5,INT4-MIXED,32,4560.3,119.4,50.7,19.72
chatglm3-6b,INT4-MIXED,1024,4070.1,895.9,50.9,19.65
chatglm3-6b,INT4-MIXED,1024,3832.1,854.4,52,19.23
orca-mini-3b,INT4-MIXED,32,2345.5,132.8,52.2,19.16
phi-2,INT8-CW,1024,3511.6,989.7,53.1,18.83
chatglm2-6b,INT4-MIXED,32,4960.2,91.5,54.2,18.45
qwen1.5-7b-chat,INT4-MIXED,32,5936.5,195.7,54.8,18.25
stable-zephyr-3b-dpo,INT8-CW,946,3700.5,677.9,54.8,18.25
llama-2-7b-chat-hf,INT4-MIXED,32,4010.5,113.7,55.6,17.99
qwen-7b-chat,INT4-MIXED,32,7393,132.7,56.1,17.83
chatglm2-6b,INT4-MIXED,1024,5234.5,747.3,56.2,17.79
qwen2-7b,INT4-MIXED,32,7086.2,183,56.3,17.76
phi-3-mini-4k-instruct,INT8-CW,38,4574.4,132.9,56.9,17.57
llama-2-7b-gptq,INT4-MIXED,32,4134.1,120,58,17.24
chatglm3-6b-gptq,INT4-MIXED,32,4288.1,99.4,58.1,17.21
qwen2-7b,INT4-MIXED,1024,7716.4,734.9,58.3,17.15
mistral-7b-v0.1,INT4-MIXED,31,4509.3,115,58.6,17.06
codegen25-7b,INT4-MIXED,32,4211.8,136.5,59,16.95
qwen1.5-7b-chat,INT4-MIXED,1024,7007.2,792.7,60.6,16.50
chatglm3-6b-gptq,INT4-MIXED,1024,4545.4,860.3,60.9,16.42
phi-3-mini-4k-instruct,INT8-CW,1061,5087.2,1029.5,60.9,16.42
gpt-j-6b,INT4-MIXED,32,4013.5,316.1,61.1,16.37
mistral-7b-v0.1,INT4-MIXED,1007,876.5,984.4,61.7,16.21
llama-3-8b,INT4-MIXED,32,4357.1,132.8,62,16.13
llama-2-7b-chat-hf,INT4-MIXED,1024,3564.8,1163.7,62.5,16.00
qwen-7b-chat-gptq,INT4-MIXED,32,7384.1,217.8,62.9,15.90
zephyr-7b-beta,INT4-MIXED,32,5331.6,125,62.9,15.90
qwen-7b-chat,INT4-MIXED,32,6545.8,218.7,63,15.87
llama-3.1-8b,INT4-MIXED,31,5076.3,110.4,63.4,15.77
llama-3.1-8b,INT4-MIXED,31,4419,145.6,63.5,15.75
llama-2-7b-gptq,INT4-MIXED,1024,3434.2,921.6,64.4,15.53
llama-3-8b,INT4-MIXED,32,4886.7,132.3,65.4,15.29
stablelm-7b,INT4-MIXED,32,4768.4,132.1,65.5,15.27
codegen25-7b,INT4-MIXED,1024,1429.7,967.5,65.7,15.22
zephyr-7b-beta,INT4-MIXED,1024,5575.6,837.2,65.7,15.22
llama-3-8b,INT4-MIXED,32,4888.3,161.8,66.2,15.11
mistral-7b-v0.1,INT4-MIXED,31,4401.4,142.7,66.2,15.11
llama-3-8b,INT4-MIXED,1024,3782.4,1091.5,66.8,14.97
llama-3.1-8b,INT4-MIXED,31,4781.4,159.4,67,14.93
glm-4-9b,INT4-MIXED,33,6392.6,298.7,67.2,14.88
qwen-7b-chat,INT4-MIXED,1024,8472.8,1331.2,67.4,14.84
gpt-j-6b,INT4-MIXED,1024,1237.8,1638.8,68.1,14.68
llama-2-7b-chat-hf,INT4-MIXED,32,4497.4,153.2,68.7,14.56
llama-3-8b,INT4-MIXED,1024,4526.9,1060.3,69.8,14.33
mistral-7b-v0.1,INT4-MIXED,1007,3968.7,1033.1,69.9,14.31
llama-3-8b,INT4-MIXED,1024,4297.9,1041.7,70,14.29
orca-mini-3b,INT8-CW,32,3744.3,174,70.5,14.18
stablelm-7b,INT4-MIXED,1020,4402.1,1186.4,70.5,14.18
gemma-2b-it,FP16,32,5806.3,117.6,71.8,13.93
glm-4-9b,INT4-MIXED,1025,7003.5,1354.2,72.5,13.79
gemma-2b-it,FP16,1024,6804.7,490.6,73.4,13.62
stablelm-3b-4e1t,FP16,32,6217,207.5,75.2,13.30
llama-2-7b-chat-hf,INT4-MIXED,1024,4320.9,1247.7,75.8,13.19
gemma-7b-it,INT4-MIXED,32,8050.6,134.6,76.1,13.14
gemma-7b-it,INT4-MIXED,32,7992.6,146.4,76.1,13.14
qwen-7b-chat,INT4-MIXED,1024,5712.7,1144.4,77.1,12.97
stablelm-3b-4e1t,FP16,1024,6722.9,491.4,77.7,12.87
chatglm2-6b,INT8-CW,32,6856.2,111.6,78.9,12.67
opt-2.7b,FP16,31,5377.5,138,79.6,12.56
chatglm2-6b,INT8-CW,1024,7133.8,1012.1,81,12.35
red-pajama-incite-chat-3b-v1,FP16,32,5672.5,211,81.2,12.32
gemma-7b-it,INT4-MIXED,1024,9399.5,1726.7,82.2,12.17
dolly-v2-3b,FP16,32,5573,230.6,82.5,12.12
gemma-7b-it,INT4-MIXED,1024,9460,1241.2,82.7,12.09
opt-2.7b,FP16,937,4727.8,618.8,84.6,11.82
baichuan2-7b-chat,INT4-MIXED,32,5782.4,274.1,84.8,11.79
phi-2,FP16,32,5497.3,244.9,85,11.76
stable-zephyr-3b-dpo,FP16,30,5714.8,173.1,86,11.63
red-pajama-incite-chat-3b-v1,FP16,1020,5262.2,817.4,86.2,11.60
dolly-v2-3b,FP16,1024,2376.1,935.5,87,11.49
qwen-7b-chat,INT4-MIXED,32,8597.4,226.2,87.7,11.40
phi-2,FP16,1024,4063.9,969.8,89.7,11.15
chatglm3-6b,INT8-CW,32,6158.8,123.4,89.8,11.14
stable-zephyr-3b-dpo,FP16,946,5337.1,781.4,90.5,11.05
baichuan2-7b-chat,INT4-MIXED,1024,807.4,1725.7,91.8,10.89
vicuna-7b-v1.5,INT8-CW,32,7391,171.3,92.5,10.81
chatglm3-6b,INT8-CW,1024,550.7,1210.9,93.3,10.72
phi-3-mini-4k-instruct,FP16,38,8299.3,142,94.1,10.63
qwen2-7b,INT8-CW,32,9941.1,139.1,94.9,10.54
qwen-7b-chat-gptq,INT4-MIXED,1024,6545,1103.9,95.8,10.44
qwen2-7b,INT8-CW,1024,10575.1,1183,96.7,10.34
qwen-7b-chat,INT4-MIXED,1024,6777.4,1309.6,96.9,10.32
vicuna-7b-v1.5,INT8-CW,1024,8013.7,1154.6,96.9,10.32
phi-3-medium-4k-instruct,INT4-MIXED,38,8212.8,448.3,97,10.31
zephyr-7b-beta,INT8-CW,32,7888,144.8,97.4,10.27
phi-3-mini-4k-instruct,FP16,1061,8814.8,1195.7,98.7,10.13
zephyr-7b-beta,INT8-CW,1024,8136.7,1191.6,99.4,10.06
llama-2-13b-chat-hf,INT4-MIXED,32,6927.5,165.3,99.9,10.01
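
In the table above, the last column is the reciprocal of the second-token latency:
tok/sec = 1000 / latency_ms (for example, 9.4 ms gives 1000 / 9.4 ≈ 106.38 tok/sec for
opt-125m-gptq). A small sketch that checks this and lists the fastest entries, assuming
the CSV path shown in this commit and the column names used above::

    import csv

    path = "docs/sphinx_setup/_static/benchmarks_files/llm_models.csv"
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))

    # Verify that reported throughput matches 1000 / (2nd-token latency in ms).
    for row in rows:
        derived = 1000.0 / float(row["2nd latency (ms)"])
        assert abs(derived - float(row["2nd tok/sec"])) < 0.01

    # Top five configurations by second-token throughput.
    rows.sort(key=lambda r: float(r["2nd tok/sec"]), reverse=True)
    for row in rows[:5]:
        print(row["Topology"], row["Precision"], row["2nd tok/sec"])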
Binary file not shown.
