[CI] Update performance benchmark: upgrade trt-llm to r24.07, and add SGLang #7412
I think this is fixed in their 0.12 release.
I am currently using r24.07, which is paired with the TRT-LLM 0.11 release. I am having trouble upgrading to r24.08 (see the reasons in the next conversation).
Maybe use 24.08?
I got the r24.07 protobuf template-filling scripts from NVIDIA, and these scripts don't work for r24.08 right now. I confirmed with NVIDIA that there will be a test docker that can be used for benchmarking in the future, so I am planning to use r24.07 for now and switch to the test docker after its release.
Ditto
Clean up unless we need to keep them.
TGI benchmark: I prefer keeping it so that we can restore the tests once TGI supports the `--ignore-eos` flag.
TRT benchmark: I also prefer keeping it. I currently split the TRT-LLM test into a Llama 8B test and a Llama 70B test and commented out this one purely because TRT needs to compile the model, which exceeds the 1 hour 30 minutes CI time limit if I run the commented-out test directly. I expect (hopefully) that the new TRT-LLM docker will have all models for the test suite pre-compiled, so this test will no longer exceed the time limit and I can uncomment it.
Just added comments in the file to explain why the commented-out tests are kept.