DJL v0.13.0 release
DJL v0.13.0 brings the new TensorRT and Python engines, upgrades PyTorch to 1.9.0, ONNXRuntime to 1.9.0, and PaddlePaddle to 2.0.2, and introduces several new features:
Key Features
- Introduces TensorRT engine
- Introduces Python engine which allows you to run Python scripts with DJL
- Upgrades PyTorch engine to 1.9.0 with CUDA 11.1 support
- Upgrades ONNXRuntime engine to 1.9.0 with UINT8 support
- Upgrades PaddlePaddle engine to 2.0.2
- Introduces the djl-bench snap package:
  sudo snap install djlbench --classic
- Introduces dynamic batch feature for djl-serving (#1154)
- DJL serving becomes a standalone repository: https://github.com/deepjavalibrary/djl-serving (#1170)
- Allows loading ModelZoo models using a URL (#1120)
- Supports the .npy and .npz file formats (#1131)
- djl-serving is available on Docker Hub
- Publishes the d2l book Chinese translation preview (chapters 1-5): https://d2l-zh.djl.ai
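For context, .npy and .npz are NumPy's native array serialization formats (a single array per .npy file; a zip archive of named arrays per .npz file). A minimal sketch of producing files in these formats, using plain NumPy with no DJL involved; file names here are illustrative:

```python
# Sketch: generate .npy/.npz files of the kind DJL can now read (#1131).
# NumPy defines both formats, so it is the simplest way to produce them.
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()

# .npy holds a single array; .npz is a zip archive of named arrays.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)
np.save(os.path.join(tmp_dir, "single.npy"), arr)
np.savez(os.path.join(tmp_dir, "bundle.npz"),
         weights=arr, bias=np.zeros(3, dtype=np.float32))

# Round-trip to confirm the files are well-formed.
loaded = np.load(os.path.join(tmp_dir, "single.npy"))
bundle = np.load(os.path.join(tmp_dir, "bundle.npz"))
print(loaded.shape)          # (2, 3)
print(sorted(bundle.files))  # ['bias', 'weights']
```

Files like these can then be consumed on the DJL side via the new .npy/.npz support (#1131).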
Enhancements
- Introduces several new features in djl-serving:
- Improves djl-serving API to make it easy to get HTTP headers (#1134)
- Loads models on all GPUs at startup for djl-serving (#1132)
- Enables asynchronous logging for djl-serving
- Writes the djl-serving access log to a separate log file (#1150)
- Adds configuration for the number of worker threads for GPU inference (#1153)
- Improves auto-scale algorithm for djl-serving (#1149)
- Introduces several new features in djl-bench:
- Adds a command line option to djl-bench to generate NDList file (#1155)
- Adds warmup to benchmark (#1152)
- Improves djl-bench to support djl:// URLs (#1146)
- Adds support to benchmark on multiple GPUs (#1144)
- Adds support to benchmark ONNX models on GPU machines (#1148)
- Adds support to benchmark TensorRT models (#1257)
- Adds support to benchmark Python models (#1267)
- Introduces several new features in PyTorch engine:
- Supports PyTorch custom input data type with IValue (#1208)
- Introduces several new features in OnnxRuntime:
- Adds UINT8 support for OnnxRuntime (#1271)
- Introduces several new features in PaddlePaddle:
- Introduces several API improvements:
- Adds missing NDList.get(String) API (#1194)
- Adds support to directly load models from a TFHub url (#1231)
- Improves repository API to support passing argument in the URL query string (#1139)
- Avoids loading the default engine if it is not being used (#1136)
- Improves IO by adding a buffer to read/write (#1135)
- Improves NDArray.toString() debug mode performance (#1142)
- Makes GPU device detection engine specific to avoid confusion when using multiple engines (#1138)
Documentation and examples
- Adds Style Transfer example with CycleGAN (#1180)
Breaking changes
- Removes support for Apache MXNet 1.6.0
- Deprecates the Device.getDevices() API; use Engine.getDevices() instead
- Renames SimpleVocabulary to DefaultVocabulary
Bug Fixes
- Fixes broken links in the documentation
- Fixes a bug where a TensorFlow NDArray was created on CPU instead of GPU (#1279)
- Fixes default image processing pipeline (#1268)
- Fixes XGBoost NDArray multiple-read bug (#1239)
- Fixes platform matching bug (#1167)
- Fixes NullPointerException in NDArray.toString() (#1157)
- Fixes PaddlePaddle crash due to GC (#1162)
- Fixes NDArrayAdapter.getSparseFormat() unsupported bug (#1151)
- Fixes mixed device issue in multiple engine use case (#1123)
- Fixes duplicate plugin handling for djl-serving (#1108)
- Fixes XGBoost NDArray creation bug (#1109)
- Fixes runtime exception when running benchmark on ARM machines (#1107)
- Fixes unregister model regression (#1101)
Contributors
This release is thanks to the following contributors:
- Akshay Rajvanshi (@aksrajvanshi)
- Aziz Zayed (@AzizZayed)
- Elchanan Haas (@ElchananHaas)
- Erik Bamberg (@ebamberg)
- Frank Liu (@frankfliu)
- Jake Lee (@stu1130)
- Kimi MA (@kimim)
- Paul Greyson
- Qing Lan (@lanking520)
- Raymond Liu (@raymondkhliu)
- Sindhu Somasundaram (@sindhuvahinis)
- Zach Kimberg (@zachgk)