[gemini] fix ci #5748
Merged
botbw merged 206 commits into hpcaitech:feature/prefetch from botbw:prefetch on May 23, 2024
Commits
Commits on Jan 11, 2024
Commits on Jan 16, 2024
Commits on Jan 17, 2024
Commits on Jan 18, 2024
Commits on Jan 19, 2024
Commits on Jan 22, 2024
Commits on Jan 24, 2024
Commits on Jan 25, 2024
Commits on Jan 26, 2024
Commits on Jan 29, 2024
Commits on Jan 30, 2024
Commits on Jan 31, 2024
Commits on Feb 2, 2024
Commits on Feb 7, 2024
- [Inference] User Experience: update the logic of default tokenizer and generation config. (hpcaitech#5337)
Commits on Feb 8, 2024
Commits on Feb 19, 2024
Commits on Feb 23, 2024
Commits on Feb 26, 2024
Commits on Feb 28, 2024
Commits on Mar 4, 2024
Commits on Mar 7, 2024
Commits on Mar 8, 2024
- Merge branch 'feature/colossal-infer' of https://github.com/hpcaitech/ColossalAI into add_gpu_launch_config
Commits on Mar 11, 2024
Commits on Mar 13, 2024
- fix rmsnorm template function invocation problem(template function partial specialization is not allowed in Cpp) and luckily pass e2e precision test (hpcaitech#5454)
Commits on Mar 14, 2024
Commits on Mar 15, 2024
Commits on Mar 19, 2024
Commits on Mar 21, 2024
Commits on Mar 25, 2024
- [Inference]Support FP16/BF16 Flash Attention 2 And Add high_precision Flag To Rotary Embedding (hpcaitech#5461)
Commits on Mar 26, 2024
Commits on Mar 28, 2024
Commits on Apr 1, 2024
Commits on Apr 2, 2024
Commits on Apr 8, 2024
Commits on Apr 9, 2024
Commits on Apr 10, 2024
Commits on Apr 11, 2024
Commits on Apr 19, 2024
Commits on Apr 30, 2024
- [Inference/Kernel] refactor kvcache manager and rotary_embedding and kvcache_memcpy oper… (hpcaitech#5663)
Commits on May 3, 2024
Commits on May 5, 2024
Commits on May 6, 2024
Commits on May 7, 2024
Commits on May 8, 2024
- [Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (hpcaitech#5432)
Commits on May 9, 2024
Commits on May 10, 2024
Commits on May 11, 2024
Commits on May 14, 2024
Commits on May 15, 2024
Commits on May 16, 2024
Commits on May 17, 2024
Commits on May 19, 2024
Commits on May 21, 2024