
Releases: smarter-project/armnn_tflite_backend

v23.05

05 May 02:32
54b33f5

This release includes a few changes:

  • Fixes an issue where the TFLite XNNPACK delegate was applied automatically, by using the correct op resolver when loading a model
  • Rolls forward the default versions of ArmNN/ACL to v23.02 and TFLite to v2.10.0
  • General housekeeping and improvements in the CMake sources

v23.01

12 Jan 21:51
a16af2d

The following changes have been made:

  • Bug fixes for dynamically batched inference requests
  • Improved backend tests that verify correct thread pool counts and model config validation
  • A quick patch that attempts to address #5 and prevents the backend from segfaulting. However, it also locks the number of ArmNN threads to the value set by the first model that requests it. A fully fledged solution will require major implementation changes, likely forking a separate process per TFLite model managed by the backend
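The set-once behaviour introduced by the quick patch can be sketched as follows. This is an illustrative stand-in, not the backend's actual code: the `Scheduler` class and its methods are hypothetical, modelling a process-wide singleton (like the Arm Compute Library scheduler) whose thread count is fixed by the first model that requests it.

```python
class Scheduler:
    """Hypothetical stand-in for a process-wide singleton scheduler."""
    _instance = None

    def __init__(self):
        self.num_threads = None

    @classmethod
    def get(cls):
        # All models in the process share this single instance.
        if cls._instance is None:
            cls._instance = Scheduler()
        return cls._instance

    def set_num_threads(self, n):
        # Quick-patch semantics: only the first request takes effect;
        # later requests are ignored rather than reconfiguring the
        # live scheduler (which is what caused the segfault).
        if self.num_threads is None:
            self.num_threads = n


def load_model(requested_threads):
    """Simulates a model load that asks for a thread count."""
    sched = Scheduler.get()
    sched.set_num_threads(requested_threads)
    return sched.num_threads


print(load_model(4))  # 4 -- first model fixes the pool size
print(load_model(8))  # 4 -- second model's request is ignored
```

A per-model thread count would require each model to own its own scheduler, which is why the notes suggest forking a separate process per model as the eventual fix.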

v22.12

20 Dec 23:12
8290f21

This release provides the following functionality from changes merged into the dev branch:

  • Adds beta support for building and running TFLite models with FlexOps (requires a custom build with the TFLITE_ENABLE_FLEX_OPS CMake flag)
  • Adds GitHub CI for building and testing the backend
  • Removes the tests tied to the Triton repository itself, as they were unmaintainable
  • Adds useful logging of how many nodes in the TFLite graph were delegated to the selected XNNPACK or ArmNN delegate
  • Pins the ArmNN version to v22.08 by default
  • Pins the TFLite version to v2.4.1 by default; later releases do not allow selective application of the XNNPACK delegate (see tensorflow/tensorflow#56571)
  • Updates the README for local development and testing
  • FlexOps support and MaliGPU are currently not covered by CI testing
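A custom build enabling the FlexOps beta might look like the following. Only the TFLITE_ENABLE_FLEX_OPS flag name comes from these release notes; the build directory layout and other commands are assumptions about a typical out-of-source CMake build.

```shell
# Hypothetical out-of-source build; only TFLITE_ENABLE_FLEX_OPS
# is taken from the release notes above.
mkdir -p build && cd build
cmake -DTFLITE_ENABLE_FLEX_OPS=ON ..
make -j"$(nproc)"
```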

Known Issues:

  • The ArmNN delegate thread specification does not behave as expected. If two models specify different numbers of threads while using ArmNN acceleration, the last model to load determines the number of threads used by all models. This is because the Arm Compute Library scheduler is a singleton and does not cope well with multiple models running concurrently in the same process.
  • In the same vein, if you load two models, one on CPU and one on GPU, the backend will segfault.