Releases · smarter-project/armnn_tflite_backend
v23.05
v23.01
The following changes have been made:
- Bug fixes for dynamically batched inference requests (see the client sketch after this list)
- Improved backend testing to verify correct thread pool counts and model config validation
- A quick patch attempts to address #5 and prevents the backend from segfaulting; however, it also means that the number of ArmNN threads is locked to the number set by the first model that requests it. A fully fledged solution will require major implementation changes, likely involving forking a separate process per tflite model managed by the backend
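As a hedged illustration of the request pattern the dynamic-batching fixes concern, the sketch below sends several concurrent batch-1 requests using the standard Triton Python HTTP client; the model name, input tensor name, and shape are placeholders:

```python
import numpy as np
import tritonclient.http as httpclient
from concurrent.futures import ThreadPoolExecutor

def infer_one(sample: np.ndarray):
    # One client per worker thread; "my_tflite_model" and "input_0" are hypothetical.
    client = httpclient.InferenceServerClient(url="localhost:8000")
    inp = httpclient.InferInput("input_0", list(sample.shape), "FP32")
    inp.set_data_from_numpy(sample)
    return client.infer("my_tflite_model", inputs=[inp])

# Fire several batch-1 requests concurrently; with dynamic_batching enabled in
# the model config, the server may fold them into a single batched execution.
samples = [np.random.rand(1, 224, 224, 3).astype(np.float32) for _ in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(infer_one, samples))
```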
v22.12
This release provides the following functionality from changes merged into the dev branch:
- Adds beta support for building and running tflite models with FlexOps (requires a custom compile with the TFLITE_ENABLE_FLEX_OPS cmake flag); see the conversion sketch after this list
- Adds GitHub CI for building and testing the backend
- Removes the tests associated with the Triton repository itself, as they were unmaintainable
- Adds useful logging of how many nodes in the tflite graph were delegated to the selected XNNPACK or ArmNN delegate
- Pins ArmNN version to v22.08 by default
- Pins TFLite version to v2.4.1 by default. Later releases do not allow selective application of the XNNPACK delegate; see tensorflow/tensorflow#56571
- Updates README for local development and testing
- FlexOps support and MaliGPU are not currently tested through CI
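For the FlexOps bullet above, here is a minimal conversion sketch (standard TensorFlow Lite converter API; "saved_model_dir" is a placeholder path) that produces a .tflite model containing Flex (Select TF) ops, which this backend can only run when compiled with TFLITE_ENABLE_FLEX_OPS:

```python
import tensorflow as tf

# Convert a SavedModel, allowing ops without native TFLite kernels to fall
# back to their TensorFlow kernels (Flex / Select TF ops).
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # ops with native TFLite kernels
    tf.lite.OpsSet.SELECT_TF_OPS,    # Flex ops executed by TF kernels
]
with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```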
Known Issues:
- ArmNN delegate thread specification does not behave as expected: if two models using ArmNN acceleration specify differing numbers of threads, the last model to load determines the number of threads used by all models. This is because the Arm Compute Library scheduler is a singleton and does not play nicely when running multiple models concurrently in the same process (see the toy sketch after this list).
- In the same vein, if you load two models, one on CPU and one on GPU, the backend will segfault.
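A toy Python sketch (not the real Arm Compute Library API; the class and method names are made up for illustration) of why a process-wide singleton makes the last-loaded model's thread count win:

```python
class Scheduler:
    """Stand-in for a process-wide scheduler singleton like the one in the Arm Compute Library."""
    _instance = None

    @classmethod
    def get(cls):
        # Every caller in the process shares this one instance.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def set_num_threads(self, n):
        self.num_threads = n

Scheduler.get().set_num_threads(4)   # model A loads and asks for 4 threads
Scheduler.get().set_num_threads(8)   # model B loads later and asks for 8
print(Scheduler.get().num_threads)   # 8 -- model A now runs with 8 threads too
```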