- Add the `documentation.yml` workflow to deploy doc pages.
- Updated `README.md` with direct link to the documentation page.
- Fix broken softmax kernel for generic platform (#2).
- Improved README with more detailed `Getting Started` section, a section listing related publications, and a list of supported platforms.
- Schedule a CI run every 6 days at 2AM CET to refresh the cache (it expires after 7 days if unused).
- Update the link of the Docker container used to run the CI to point to the container published by this repo instead of my fork.
- Add a retry-on-timeout step for large network tests. This is a temporary fix for the sporadic freeze happening at the compilation stage, see this issue.
- Add the `FloatImmediate` `AbstractType`
- Define fp64, fp32, fp16, and bf16
- Add float binding for the Adder in the Generic platform
- Add a FloatAdder test to the CI for Siracusa and Generic platforms
- Extend `testType.py` with float tests
- LIMITATION: Current LLVM compiler does not support bf16 and fp16; these types are commented out in the library header
- CMake flow for the Snitch Cluster
- Added `snitch_cluster` to Makefile
- New Snitch platform with testing application
- Testrunner for tiled and untiled execution (`testRunner_snitch.py`, `testRunner_tiled_snitch.py`)
- Minimal library with CycleCounter and utility function
- Update the Banshee commit to include a recent PR.
- Support for single-buffered tiling from L2.
- Parsers, Templates, TypeCheckers, Layers, and TCF for the newly supported operators.
- A code transformation pass to filter DMA cores or compute cores for an `ExecutionBlock`.
- A code transformation pass to profile an `ExecutionBlock`.
- Test for single kernels, both with and without tiling.
- Adds the `--debug` flag to `cargo install` when installing Banshee to get the possibility of enabling the debug prints.
- New tests for the `snitch_cluster` platform.
- Add macros to `main.c` to disable printing and testing (convenient when running RTL simulations).
- Add the possibility of changing the simulator when using the snitch-tiled test runner.
- GVSOC in the Makefile and Dockerfile
- CMake flow for GVSOC
- CI tests regarding Snitch run on GVSOC as well
- Add the RTL library to the snitch_cluster build process in the Makefile, required for GVSOC simulation
- Float Support for Constbuffer
- Simple Float GEMM on Generic and Pulp
- FP GEMM to CI
- FP GEMM Tiling on PULP
- Float bug on Testslice, CMSIS TestUtil, DivInteger
- AbstractDataType Float Bugs
- Add one new `#define OUTPUTTYPE` to `testoutput.h`
- Change `main.c` to use `OUTPUTTYPE` instead of float
- Float Template, binding and parser, test for Conv2D, LayerNorm, Div, Relu, Softmax, MaxPool, Matmul, Transpose, Gelu, Mul, Reshape, Gather, Squeeze, Padding
- CCT model test to Generic Target
- Math Lib link on Generic Target
- float infinity macro #define inf
- Signprop depends on float check and platform
- MaxPool Padding Extract Pass for float and integer
- Testinput, testoutput, and weight type cast from double to float warning
- New templates for GEMM and Softmax.
- Added GEMM and Softmax to TargetLibraries, including case for GEMM with a transposed B matrix.
- Added new CI tests for GEMM and Softmax.
- Adapted snitch Bindings and Platform files.
- Relaxed the error threshold between expected and actual values in deeploytest.
- Float Bindings, Tilers of CCT kernels for Pulp Target
- Float Convolution, MaxPool Parser, Template, Kernel with HWC layout and padding integrated
- Added tiling constraints for conv, gather, and layernorm, and existing constraints for other kernels
- profileuntiling arg
- CCT onnx tests with img size of 16 and 32
- CycleMeasure Pass for Siracusa Untiling Profiling
- GEMM Tiling Constraints: `transA` and `transB` not supported
- MatMul layer Multi-Dimensional Input Issue
- Add Layer for Broadcasted Bias
- Resolved an issue where concatenation of float32 with f caused inf errors during code generation
- CODEOWNERS file to control who is responsible for reviewing future PRs.
- A visualization of the memory allocation solution generated by Deeploy at each level of memory. I use Plotpy to generate a static `html` file and save it to the `DeeployState` directory.
- An initialization strategy for the variable in the tiling to randomize the variables related to the permutation matrix.
- New interface to `testRunner_tiled_siracusa` to control the generation of the memory allocation visualization, the memory allocation strategy, and the search strategy.
- Export a new docker container with `plotpy` as dependency.
- Removed unused `TilerAwareDeployer` class.
- Fixed a bug in the MemoryScheduler where the CP problem was solved more times than needed.