
Releases: smarter-project/armnn_tflite_backend

v23.05

05 May 02:32
54b33f5

This release includes a few changes:

  • Fixes an issue where the TFLite XNNPACK delegate was applied automatically, by using the correct op resolver when loading a model
  • Rolls forward the default versions of ArmNN/ACL to v23.02 and TFLite to v2.10.0
  • General housekeeping and improvements in the CMake sources

v23.01

12 Jan 21:51
a16af2d

The following changes have been made:

  • Bug fixes for dynamically batched inference requests
  • Improved backend tests that verify correct thread pool counts and model config validation
  • A quick patch that attempts to address #5 and prevents the backend from segfaulting. However, it also locks the number of ArmNN threads to the value set by the first model that requests it. A fully fledged solution will require major implementation changes, likely forking a separate process per TFLite model managed by the backend
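The set-once behaviour introduced by the quick patch can be sketched as follows. This is an illustrative stand-in, not the backend's actual code: the `Scheduler` class and its methods are hypothetical, modelling a process-wide singleton (like the Arm Compute Library scheduler) whose thread count is fixed by the first model that requests it.

```python
class Scheduler:
    """Hypothetical stand-in for a process-wide singleton scheduler."""
    _instance = None

    def __init__(self):
        self.num_threads = None

    @classmethod
    def get(cls):
        # All models in the process share this single instance.
        if cls._instance is None:
            cls._instance = Scheduler()
        return cls._instance

    def set_num_threads(self, n):
        # Quick-patch semantics: only the first request takes effect;
        # later requests are ignored rather than reconfiguring the
        # live scheduler (which is what caused the segfault).
        if self.num_threads is None:
            self.num_threads = n


def load_model(requested_threads):
    """Simulates a model load that asks for a thread count."""
    sched = Scheduler.get()
    sched.set_num_threads(requested_threads)
    return sched.num_threads


print(load_model(4))  # 4 -- first model fixes the pool size
print(load_model(8))  # 4 -- second model's request is ignored
```

A per-model thread count would require each model to own its own scheduler, which is why the notes suggest forking a separate process per model as the eventual fix.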

v22.12

20 Dec 23:12
8290f21

This release provides the following functionality from changes merged into the dev branch:

  • Adds beta support for building and running TFLite models with FlexOps (requires a custom build with the TFLITE_ENABLE_FLEX_OPS CMake flag)
  • Adds GitHub CI for building and testing the backend
  • Removes the tests tied to the Triton repository itself, as they were unmaintainable
  • Adds useful logging of how many nodes in the TFLite graph were delegated to the selected XNNPACK or ArmNN delegate
  • Pins the ArmNN version to v22.08 by default
  • Pins the TFLite version to v2.4.1 by default; later releases do not allow selective application of the XNNPACK delegate (see tensorflow/tensorflow#56571)
  • Updates the README for local development and testing
  • FlexOps support and MaliGPU are currently not covered by CI testing
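A custom build enabling the FlexOps beta might look like the following. Only the TFLITE_ENABLE_FLEX_OPS flag name comes from these release notes; the build directory layout and other commands are assumptions about a typical out-of-source CMake build.

```shell
# Hypothetical out-of-source build; only TFLITE_ENABLE_FLEX_OPS
# is taken from the release notes above.
mkdir -p build && cd build
cmake -DTFLITE_ENABLE_FLEX_OPS=ON ..
make -j"$(nproc)"
```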

Known Issues:

  • The ArmNN delegate thread specification does not behave as expected. If two models specify different numbers of threads while using ArmNN acceleration, the last model to load determines the number of threads used by all models. This is because the Arm Compute Library scheduler is a singleton and does not cope well with multiple models running concurrently in the same process.
  • In the same vein, if you load two models, one on CPU and one on GPU, the backend will segfault.