Merge branch 'master' of https://github.com/fastmachinelearning/hls4ml into pr/560
drankincms committed Jun 21, 2022
2 parents 5bddddf + aa7ce78 commit 4cd76ce
Showing 108 changed files with 4,890 additions and 809 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -10,4 +10,5 @@ my-hls-test
*.tar.gz
docs/_build
docs/autodoc/*
- hls4mlprj_*
+ hls4mlprj_*
+ *~
5 changes: 2 additions & 3 deletions Jenkinsfile
@@ -1,7 +1,7 @@
pipeline {
agent {
docker {
- image 'vivado-el7:1'
+ image 'vivado-el7:2'
args '-v /data/Xilinx:/data/Xilinx'
}
}
@@ -14,9 +14,8 @@ pipeline {
steps {
dir(path: 'test') {
sh '''#!/bin/bash --login
- conda activate hls4ml-py36
+ conda activate hls4ml-py37
pip install tensorflow
pip install git+git://github.com/google/qkeras.git@v0.8.0#egg=qkeras
pip install -U ../ --user
./convert-keras-models.sh -x -f keras-models.txt
pip uninstall hls4ml -y'''
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -4,3 +4,4 @@ graft example-prjs
graft example-models
graft test
recursive-include hls4ml/templates *
include hls4ml/backends/vivado_accelerator/supported_boards.json
34 changes: 22 additions & 12 deletions docs/api/configuration.rst
@@ -59,16 +59,21 @@ It looks like this:

.. code-block:: yaml
+ # Project section
+ OutputDir: my-hls-test
+ ProjectName: myproject
+ # Model section (Keras model)
  KerasJson: keras/KERAS_3layer.json
  KerasH5: keras/KERAS_3layer_weights.h5 #You can also use h5 file from Keras's model.save() without supplying json file.
  InputData: keras/KERAS_3layer_input_features.dat
  OutputPredictions: keras/KERAS_3layer_predictions.dat
- OutputDir: my-hls-test
- ProjectName: myproject
- Device: xcku115-flvb2104-2-i
- ClockPeriod: 5
- IOType: io_parallel # options: io_serial/io_parallel
+ # Backend section (Vivado backend)
+ Part: xcku115-flvb2104-2-i
+ ClockPeriod: 5
+ IOType: io_parallel # options: io_parallel/io_stream
  HLSConfig:
    Model:
      Precision: ap_fixed<16,6>
@@ -83,18 +88,23 @@ It looks like this:
There are a number of configuration options that you have. Let's go through them. You have basic setup parameters:


- * **OutputDir**\ : the output directory where you want your HLS project to appear
- * **ProjectName**\ : the name of the HLS project IP that is produced
  * **KerasJson/KerasH5**\ : for Keras, the model architecture and weights are stored in a ``json`` and ``h5`` file. The paths to those files are required here.
    We also support a Keras model file obtained directly from ``model.save()``. In this case you can supply just the ``h5`` file in the ``KerasH5:`` field.
  * **InputData/OutputPredictions**\ : path to your input/predictions of the model. If none is supplied, then hls4ml will create artificial data for simulation. The data used above in the example can be found `here <https://cernbox.cern.ch/index.php/s/2LTJVVwCYFfkg59>`__. We also support ``npy`` data files. We welcome suggestions on more input data types to support.
+ * **OutputDir**\ : the output directory where you want your HLS project to appear
+ * **ProjectName**\ : the name of the HLS project IP that is produced
- * **Device**\ : the particular FPGA part number that you are considering, here it's a Xilinx Virtex-7 FPGA

+ The backend-specific section of the configuration depends on the backend. You can get a starting point for the necessary settings using, for example, ``hls4ml.templates.get_backend('Vivado').create_initial_config()``.
+ For the Vivado backend the options are:

+ * **Part**\ : the particular FPGA part number that you are considering, here it's a Xilinx Virtex-7 FPGA
  * **ClockPeriod**\ : the clock period, in ns, at which your algorithm runs
- Then you have some optimization parameters for how your algorithm runs:
- * **IOType**\ : your options are ``io_parallel`` or ``io_serial`` where this really defines if you are pipelining your algorithm or not
- * **ReuseFactor**\ : in the case that you are pipelining, this defines the pipeline interval or initiation interval
- * **Strategy**\ : Optimization strategy on FPGA, either "Latency" or "Resource". If none is supplied then hl4ml uses "Latency" as default. Note that a reuse factor larger than 1 should be specified when using "resource" strategy. An example of using larger reuse factor can be found `here. <https://github.com/hls-fpga-machine-learning/models/tree/master/keras/KERAS_dense>`__
- * **Precision**\ : this defines the precsion of your inputs, outputs, weights and biases. It is denoted by ``ap_fixed<X,Y>``\ , where ``Y`` is the number of bits representing the signed number above the binary point (i.e. the integer part), and ``X`` is the total number of bits.
+ * **IOType**\ : your options are ``io_parallel`` or ``io_stream`` which defines the type of data structure used for inputs, intermediate activations between layers, and outputs. For ``io_parallel``, arrays are used that, in principle, can be fully unrolled and are typically implemented in RAMs. For ``io_stream``, HLS streams are used, which are a more efficient/scalable mechanism to represent data that are produced and consumed in a sequential manner. Typically, HLS streams are implemented with FIFOs instead of RAMs. For more information see `here <https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/pragma-HLS-stream>`__.
+ * **HLSConfig**\ : the detailed configuration of precision and parallelism, including:
+   * **ReuseFactor**\ : in the case that you are pipelining, this defines the pipeline interval or initiation interval
+   * **Strategy**\ : Optimization strategy on FPGA, either "Latency" or "Resource". If none is supplied then hls4ml uses "Latency" as default. Note that a reuse factor larger than 1 should be specified when using "resource" strategy. An example of using a larger reuse factor can be found `here <https://github.com/fastmachinelearning/models/tree/master/keras/KERAS_dense>`__.
+   * **Precision**\ : this defines the precision of your inputs, outputs, weights and biases. It is denoted by ``ap_fixed<X,Y>``\ , where ``Y`` is the number of bits representing the signed number above the binary point (i.e. the integer part), and ``X`` is the total number of bits.
    Additionally, integers in fixed precision data type (\ ``ap_int<N>``\ , where ``N`` is a bit-size from 1 to 1024) can also be used. You have a chance to further configure this more finely with per-layer configuration described below.
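
To make these options concrete, here is a minimal sketch of the same configuration expressed as a Python dictionary and passed to ``keras_to_hls`` (the call used in the profiling docs later in this commit). The key names mirror the YAML example above; the values are placeholders rather than recommendations, and the exact set of accepted keys may differ slightly between hls4ml versions.

.. code-block:: python

   import hls4ml

   # Same structure as the YAML configuration file shown above.
   config = {
       'KerasJson': 'keras/KERAS_3layer.json',
       'KerasH5': 'keras/KERAS_3layer_weights.h5',
       'OutputDir': 'my-hls-test',
       'ProjectName': 'myproject',
       'Part': 'xcku115-flvb2104-2-i',
       'ClockPeriod': 5,
       'IOType': 'io_parallel',  # or 'io_stream'
       'HLSConfig': {
           'Model': {
               'Precision': 'ap_fixed<16,6>',
               'ReuseFactor': 1,  # use > 1 together with Strategy: Resource
               'Strategy': 'Latency',
           },
       },
   }

   hls_model = hls4ml.converters.keras_to_hls(config)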

2.2 Per-Layer Configuration
33 changes: 22 additions & 11 deletions docs/api/profiling.rst
@@ -8,7 +8,7 @@ Using a low precision can help reduce the FPGA resource usage of a model, but ma

Profiling uses some extra dependencies; to install them, run ``pip install hls4ml[profiling]``. The profiling tools are provided as a ``Python`` module which you can use.

- Three types of objects can be provided: **a Keras model object**\ , **test data**\ , and an **ModelGraph object**.
+ Three types of objects can be provided: **a model object**\ , **test data**\ , and a **ModelGraph object**. The model can be Keras or PyTorch.
You will need to initialise these objects by using a trained model, loading a model from a file, and loading your data. The Keras model and data each need to be in the format that would normally allow you to run, e.g. ``model.predict(X)``.

.. code-block:: python
@@ -29,29 +29,40 @@ You will need to initialise these objects by using a trained model, loading a mo
hls_model = keras_to_hls(config)
- # produce an activation profile (ap)
- # and weights profile (wp)
- ap, wp = numerical(keras_model=model, hls_model = hls_model, X=X)
+ # produce 4 plots
+ plots = numerical(model=model, hls_model = hls_model, X=X)
plt.show()
- Calling the ``hls4ml.model.profiling.numerical`` method with these three objects provided will produce two figures as below:
+ Calling the ``hls4ml.model.profiling.numerical`` method with these three objects provided will produce four figures as below:

- .. image:: ../img/activations.png
+ .. image:: ../img/weights_keras.png
    :width: 45%
- .. image:: ../img/weights.png
+ .. image:: ../img/weights_hls4ml.png
    :width: 45%
+ .. image:: ../img/act_keras.png
+    :width: 45%
+ .. image:: ../img/act_hls4ml.png
+    :width: 45%

Plots are titled "before optimization" and "final / after optimization".
The "before optimization" plots show the distributions of the original Keras or PyTorch model, while the "after optimization" plots show the distributions of the ModelGraph.
In the example images, notice the "bn1", "bn2", "bn3" labels in the "before optimization" plots, which are missing from the "after optimization" plots.
These layers are BatchNormalization layers, which hls4ml has fused into the preceding Dense layers (labelled "fc{1,2,3}").
Because of this optimization, the weights of "fc1" of the ModelGraph are actually the product of the weights of the Keras model "fc1" with "bn1".
Similarly, the output of "fc1" of the ModelGraph should correspond to the output of the Keras model "bn1".
When optimizing precision, the data types should be chosen to work well for the "after optimization" model.
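
To make that weight comparison concrete, here is a small sketch of the standard Dense/BatchNormalization folding. The layer names in the final comment are hypothetical and simply mirror the example above; this is illustrative of the usual algebra, not hls4ml's internal code.

.. code-block:: python

   import numpy as np

   def fuse_dense_batchnorm(W, b, gamma, beta, mean, var, eps=1e-3):
       """Fold a BatchNormalization layer into the preceding Dense layer.

       y = gamma * ((x @ W + b) - mean) / sqrt(var + eps) + beta
         = x @ (W * scale) + ((b - mean) * scale + beta),  scale = gamma / sqrt(var + eps)
       """
       scale = gamma / np.sqrt(var + eps)
       # scale broadcasts over the output-feature axis of W
       return W * scale, (b - mean) * scale + beta

   # e.g. the ModelGraph's "fc1" weights should match
   # fuse_dense_batchnorm(W_fc1, b_fc1, gamma_bn1, beta_bn1, mean_bn1, var_bn1)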

Different plot styles are available with the ``plot`` keyword argument. Valid options are ``boxplot`` (default), ``histogram``\ , ``violinplot``. In the default boxplot style, each variable in the neural network is evaluated using the given test data and the distribution of (non-zero) values is shown with a box and whisker diagram.
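
For example, to render histograms instead (a sketch, assuming the same ``model``, ``hls_model`` and ``X`` objects as in the snippet above):

.. code-block:: python

   from hls4ml.model.profiling import numerical

   # same call as before, rendered as histograms rather than boxplots
   plots = numerical(model=model, hls_model=hls_model, X=X, plot='histogram')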

When different combinations of the input objects are given, different plots will be produced:

- 1) Only Keras model: only the weights profile plot will be produced, the activation profile will be ``None``. No grey boxes representing the data types will be shown.
+ 1) Only Keras or PyTorch model: only the weights profile plot will be produced, the activation profile will be ``None``. No grey boxes representing the data types will be shown.

- 2) Only ModelGraph (or ModelGraph and Keras model): only the weights profile plot will be produced, with grey boxes indicating the data types from the ModelGraph.
+ 2) Only ModelGraph (or ModelGraph and Keras or PyTorch model): two weights profile plots will be produced, with grey boxes indicating the data types from the ModelGraph. The first plot is the "before optimization" model, while the second plot is the "after optimization" model.

- 3) Keras model and data (\ ``X``\ ): both the weights profile and activation profile will be produced. No grey boxes representing the data types will be shown.
+ 3) Keras or PyTorch model and data (\ ``X``\ ): both the weights profile and activation profile will be produced. No grey boxes representing the data types will be shown.

- 4) Keras model, ModelGraph, and data: both weights and activation profiles are produced, with grey boxes indicating the data types from the ModelGraph.
+ 4) Keras or PyTorch model, ModelGraph, and data: both weights and activation profiles are produced, with grey boxes indicating the data types from the ModelGraph.

Each box shows the median and quartiles of the distribution. The grey shaded boxes show the range which can be represented with the ``hls4ml`` config file used.
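
As a concrete example of what those grey boxes cover, the ``ap_fixed<16,6>`` precision used in the configuration above spans the range computed below (a quick back-of-the-envelope sketch; ``X`` is the total bit width and ``Y`` the integer bits including the sign, as defined earlier):

.. code-block:: python

   X, Y = 16, 6                # ap_fixed<16,6>
   frac_bits = X - Y           # 10 fractional bits
   step = 2.0 ** -frac_bits    # smallest increment, ~0.000977
   lo = -(2.0 ** (Y - 1))      # most negative value: -32.0
   hi = 2.0 ** (Y - 1) - step  # most positive value: ~31.999
   print(lo, hi, step)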

2 changes: 1 addition & 1 deletion docs/command.rst
@@ -9,7 +9,7 @@ This page documents all the commands that ``hls4ml`` supports.
Overview
=========

- To start you can just type in ``hls4ml -h`` or ``hls4ml --help`` in your command line, a messege will show up like below:
+ To start you can just type in ``hls4ml -h`` or ``hls4ml --help`` in your command line, a message will show up like below:

.. code-block::
Binary file added docs/img/act_hls4ml.png
Binary file added docs/img/act_keras.png
Binary file removed docs/img/activations.png
Binary file removed docs/img/weights.png
Binary file added docs/img/weights_hls4ml.png
Binary file added docs/img/weights_keras.png
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -48,4 +48,4 @@
=================================
Detailed tutorials on how to use ``hls4ml``'s various functionalities can be found at:

- https://github.com/hls-fpga-machine-learning/hls4ml-tutorial
+ https://github.com/fastmachinelearning/hls4ml-tutorial
28 changes: 28 additions & 0 deletions docs/release_notes.rst
@@ -6,6 +6,34 @@ See `here <https://github.com/fastmachinelearning/hls4ml/releases>`__ for offici

----

**v0.6.0 / coris**

## What's Changed
* `VivadoAccelerator` backend: target `pynq-z2` and `zcu102` boards directly from hls4ml by @nicologhielmetti
* Updated `PyTorch` and `ONNX` converters by @Duchstf
* `line_buffer` Conv2D implementation for `io_stream`: reduced resource usage and latency by @Keb-L, @violatingcp, @vloncar
* Support `QConv2DBatchnorm` layer from `QKeras` by @nicologhielmetti
* Improved profiling plots - easier to compare original vs `hls4ml` converted models by @maksgraczyk
* Better derivation of data types for `QKeras` models by @jmduarte, @thesps
* Improved CI by @thesps
* More support for models with branches, skip connections, `Merge` and `Concatenate` layers by @jmduarte, @vloncar
* Support for `Dense` layers over multi-dimensional tensors by @vloncar
* Overall improvements by @vloncar, @jmduarte, @thesps, @jmitrevs & others

## New Contributors
* @siorpaes made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/424
* @jmitrevs made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/403
* @anders-wind made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/302
* @KOVI89alipes made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/318
* @maksgraczyk made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/323
* @Keb-L made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/332
* @ConsVin made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/307
* @nicologhielmetti made their first contribution in https://github.com/fastmachinelearning/hls4ml/pull/298

**Full Changelog**: https://github.com/fastmachinelearning/hls4ml/compare/v0.5.0...v0.6.0

----

**v0.5.0 / bartsia**

What's new:
11 changes: 7 additions & 4 deletions docs/setup.rst
@@ -18,6 +18,9 @@ http://www.h5py.org

https://pypi.python.org/pypi/PyYAML

**QKeras**\ : for working with quantized models

https://github.com/google/qkeras

**PyTorch**\ : for reading in Torch models

@@ -82,7 +85,7 @@ After that, you can use :code:`Vivado HLS` to synthesize the model:
#Print out the report if you want
hls4ml.report.read_vivado_report('my-hls-test')
- Done! you've built your first project using ``hls4ml`` ! To learn more about our various API functionalities, check out our tutorials `here <https://github.com/fastmachinelearning/hls4ml-tutorial>`__.
+ Done! You've built your first project using ``hls4ml``! To learn more about our various API functionalities, check out our tutorials `here <https://github.com/fastmachinelearning/hls4ml-tutorial>`__.
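
Pieced together, a minimal end-to-end run of the Python API might look like the sketch below. The function and keyword names follow the hls4ml Python API as commonly used around this release, but treat them as assumptions and check the signatures in your installed version; the model file name is a placeholder.

.. code-block:: python

   import hls4ml
   from tensorflow import keras

   model = keras.models.load_model('KERAS_3layer.h5')  # placeholder model file

   # derive a baseline HLSConfig from the Keras model
   config = hls4ml.utils.config_from_keras_model(model, granularity='model')

   hls_model = hls4ml.converters.convert_from_keras_model(
       model,
       hls_config=config,
       output_dir='my-hls-test',
       part='xcku115-flvb2104-2-i',
   )

   # run C simulation and HLS synthesis, then print the report
   hls_model.build(csim=True, synth=True)
   hls4ml.report.read_vivado_report('my-hls-test')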

If you want to configure your model further, check out our :doc:`Configuration <api/configuration>` page.

@@ -123,7 +126,7 @@ To build the HLS project, do:
hls4ml build -p my-hls-test -a
- This will create a Vivado HLS project with your model implmentation!
+ This will create a Vivado HLS project with your model implementation!

**NOTE:** For the last step, you can alternatively do the following to build the HLS project:

@@ -138,7 +141,7 @@ This will create a Vivado HLS project with your model implmentation!
vivado_hls -f build_prj.tcl "csim=1 synth=1 cosim=1 export=1"
- Setting the additional parameters to ``1`` to ``0`` disables that step, but disabling ``synth`` also disables ``cosim`` and ``export``.
+ Setting the additional parameters from ``1`` to ``0`` disables that step, but disabling ``synth`` also disables ``cosim`` and ``export``.
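
The same switches map onto the Python ``build`` call, roughly as in this sketch (keyword names are an assumption based on the ``csim``/``synth``/``cosim``/``export`` options above; check the signature in your installed version):

.. code-block:: python

   # keep C simulation and synthesis, skip co-simulation and export
   hls_model.build(csim=True, synth=True, cosim=False, export=False)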

Further help
^^^^^^^^^^^^^^^^
@@ -173,4 +176,4 @@ Existing examples
Training codes and examples of resources needed to train the models can be found `here <https://github.com/fastmachinelearning/keras-training>`__.

*
- Other examples of various HLS projects with examples of different machine learning algorithm implementations is in the directory `example-prjs <https://github.com/fastmachinelearning/hls4ml/tree/master/example-prjs>`_.
+ Other examples of various HLS projects with examples of different machine learning algorithm implementations are in the directory `example-prjs <https://github.com/fastmachinelearning/hls4ml/tree/master/example-prjs>`_.
4 changes: 2 additions & 2 deletions docs/status.rst
@@ -5,13 +5,13 @@ Status and Features
Status
========

- The latest stable release is :doc:`v0.5.0 <release_notes>`. This release brings the new `IOType: io_stream` and support for larger CNN models, see: <https://arxiv.org/abs/2101.05108>.
+ The latest stable release is :doc:`v0.6.0 <release_notes>`. This release brings the new VivadoAccelerator backend to easily target boards like pynq-z2 and zcu102, with support for more boards like Alveo planned.


Features
========

- A list of suppported ML codes and architectures, including a summary table is below. Dependences are given in the :doc:`Setup <setup>` page.
+ A list of supported ML codes and architectures, including a summary table is below. Dependencies are given in the :doc:`Setup <setup>` page.

ML code support:

2 changes: 1 addition & 1 deletion example-models
