add openvino, trt #460

michaelfeil · 2024-11-12T06:26:15Z

openvino not working, need to use #454

openvino:

Tried openvino==2024.4 and the nightly build; both keep breaking.
Installing openvino via poetry install does not work. It has to be reinstalled via pip afterwards: uninstalling openvino and installing the same version again builds the wheel, extensions, etc. correctly.
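
The reinstall workaround above can be sketched as a small helper. This is a minimal sketch, not part of the PR: the function name `reinstall_commands` is hypothetical, and the `2024.4.0` pin is taken from the versions mentioned in this thread. It only builds the pip commands; running them is left to the caller.

```python
import sys


def reinstall_commands(package: str, version: str) -> list[list[str]]:
    """Build the pip commands that uninstall a package and reinstall the same version.

    Hypothetical helper: it only constructs the commands, it does not run them.
    """
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", package],
        pip + ["install", f"{package}=={version}"],
    ]


if __name__ == "__main__":
    # Assumption: openvino 2024.4.0 was already installed by `poetry install`.
    for cmd in reinstall_commands("openvino", "2024.4.0"):
        print(" ".join(cmd))
        # To actually execute a command: subprocess.run(cmd, check=True)
```

Using `sys.executable -m pip` keeps the reinstall inside the same interpreter (and virtualenv) that poetry populated.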

tensorrt:

using cuda12
cuda12 has tensorrt==10.2 as its only compatible version.
tensorrt==10.2 crashes on linux, as it's linked to a windows dll
tensorrt==10.3 works, but the optimum implementation has issues
onnxruntime==1.20 leaves tensorrt==10.5, but has no compatible openvino distribution.

greptile-apps

PR Summary

This PR adds OpenVINO and TensorRT optimizations to improve model inference performance across different hardware platforms.

Added OpenVINO support in /libs/infinity_emb/Docker.template.yaml with INFINITY_ENGINE="optimum" for CPU builds
Updated TensorRT support with CUDA 12.3.2 and TensorRT 10.3.0 in /libs/infinity_emb/Docker.template.yaml
Added provider-specific optimizations in /libs/infinity_emb/infinity_emb/transformer/utils_optimum.py for OpenVINO and TensorRT
Added quantized model support for OpenVINO in /libs/infinity_emb/infinity_emb/transformer/embedder/optimum.py
Updated dependencies in pyproject.toml to use OpenVINO 2024.4.0 and TensorRT 10.6.0
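
As an illustration of what provider-specific selection can look like, here is a minimal sketch. It is not the logic in `utils_optimum.py`: the `PREFERRED_PROVIDERS` table and the `pick_provider` function are hypothetical, and only the provider names are real ONNX Runtime execution provider identifiers. The list of available providers is passed in explicitly; in practice it could come from `onnxruntime.get_available_providers()`.

```python
# Hypothetical preference order per device; the first available entry wins.
PREFERRED_PROVIDERS = {
    "cuda": ["TensorrtExecutionProvider", "CUDAExecutionProvider"],
    "cpu": ["OpenVINOExecutionProvider", "CPUExecutionProvider"],
}


def pick_provider(device: str, available: list[str]) -> str:
    """Return the most preferred execution provider that is actually available."""
    for provider in PREFERRED_PROVIDERS[device]:
        if provider in available:
            return provider
    raise ValueError(f"no supported execution provider for device {device!r}")


if __name__ == "__main__":
    # Assumption: a CPU-only build where the OpenVINO provider is installed.
    print(pick_provider("cpu", ["OpenVINOExecutionProvider", "CPUExecutionProvider"]))
```

Falling back to the plain CPU or CUDA provider keeps a build usable when the optimized provider fails to load, which matches the installation problems described in the PR description.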

11 file(s) reviewed, 7 comment(s)

libs/infinity_emb/Docker.template.yaml

libs/infinity_emb/Dockerfile.trt_onnx_auto

libs/infinity_emb/infinity_emb/transformer/utils_optimum.py

libs/infinity_emb/pyproject.toml

codecov-commenter · 2024-11-12T06:37:23Z

Codecov Report

Attention: Patch coverage is 33.33333% with 12 lines in your changes missing coverage. Please review.

Project coverage is 73.02%. Comparing base (cdbe888) to head (bb34e3e).

Files with missing lines	Patch %	Lines
...nity_emb/infinity_emb/transformer/utils_optimum.py	33.33%	12 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (cdbe888) and HEAD (bb34e3e). HEAD has 1 upload less than BASE.

Uploads: BASE (cdbe888) 2, HEAD (bb34e3e) 1

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #460      +/-   ##
==========================================
- Coverage   79.23%   73.02%   -6.21%     
==========================================
  Files          42       42              
  Lines        3380     3392      +12     
==========================================
- Hits         2678     2477     -201     
- Misses        702      915     +213
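
The percentages in the diff follow directly from the hit and line counts. A short sketch of the arithmetic, using only the numbers from the table above (the helper name `coverage_percent` is illustrative, not a Codecov API):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines, rounded to two decimals."""
    return round(100 * hits / lines, 2)


base = coverage_percent(2678, 3380)  # main
head = coverage_percent(2477, 3392)  # #460
delta = round(head - base, 2)

print(base, head, delta)  # 79.23 73.02 -6.21
```

The 12 added lines account for very little of the drop; most of it comes from the 201 fewer hits, which is consistent with HEAD having one upload less than BASE.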


greptile-apps bot reviewed Nov 12, 2024

michaelfeil and others added 4 commits November 12, 2024 00:55

add openvino, trt

00115ae

Merge branch 'main' into optimum-upgrade

39ba865

remove .lock

bb34e3e

add sentencepiece

78b5b0b

michaelfeil merged commit 9206840 into main Nov 12, 2024
36 checks passed

michaelfeil deleted the optimum-upgrade branch November 12, 2024 20:16
