Sparsity-aware deep learning inference runtime for CPUs
nlp performance computer-vision inference machinelearning pruning object-detection pretrained-models quantization cpus onnx sparsification llm-inference deepsparse
-
Updated
Jul 19, 2024 - Python