
v2.2.0

@suehtamacv suehtamacv released this 02 Nov 09:53
· 723 commits to main since this release
a6d1088

Fixed

  • Fix typo on the build instructions of the README
  • Fix Gnuplot installation on GitHub's CI
  • The number of elements requested by the Store Unit and the Element Requester now depends on both the requested eew and the previous eew of the vector register in use
  • When the VRF is written and EMUL > 1, the eew of all the affected registers is updated
  • Memory operations can change EMUL when EEW != VSEW
  • The LSU now correctly handles bursts with a saturated length of 256 beats
  • AXI transactions on the channel opposite to the one currently in use are started only after the previous transactions have completed
  • Fix the number of elements to be requested for a vslidedown instruction
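
Two of the fixes above rest on simple arithmetic, which the following Python sketch illustrates. It is not taken from Ara's RTL. The first function applies the RVV specification's relation EMUL = (EEW / SEW) * LMUL, which is why a memory operation with EEW != VSEW can change EMUL. The second splits a request into bursts capped at 256 beats, the AXI4 limit for INCR bursts. The names `effective_lmul`, `split_into_bursts`, and `AXI_MAX_BURST_BEATS` are hypothetical, and the burst sketch ignores other AXI constraints such as the 4 KiB address boundary.

```python
from fractions import Fraction

# AXI4 allows at most 256 beats in a single INCR burst.
AXI_MAX_BURST_BEATS = 256


def effective_lmul(eew, sew, lmul):
    """Return EMUL = (EEW / SEW) * LMUL as an exact fraction.

    eew and sew are element widths in bits; lmul may be an int or a Fraction
    (for fractional LMUL such as 1/2). Hypothetical helper for illustration.
    """
    emul = Fraction(eew, sew) * Fraction(lmul)
    # The RVV specification restricts EMUL to the range 1/8..8.
    if not Fraction(1, 8) <= emul <= 8:
        raise ValueError(f"EMUL {emul} is outside the legal range 1/8..8")
    return emul


def split_into_bursts(total_beats):
    """Split a request of total_beats into burst lengths of at most 256 beats."""
    bursts = []
    remaining = total_beats
    while remaining > 0:
        beats = min(remaining, AXI_MAX_BURST_BEATS)  # saturate at the AXI4 limit
        bursts.append(beats)
        remaining -= beats
    return bursts


if __name__ == "__main__":
    # A 64-bit load while SEW = 32 and LMUL = 1 touches two registers (EMUL = 2).
    print(effective_lmul(64, 32, 1))
    # A 600-beat request needs two saturated bursts plus a shorter one.
    print(split_into_bursts(600))
```

With EMUL > 1, a single memory operation spans several vector registers, which is why the eew of every affected register has to be tracked.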

Added

  • benchmarks app to benchmark Ara
  • CI task to create roofline plots of imatmul and fmatmul, available as artifacts
  • Vector floating-point compare instructions (vmfeq, vmfne, vmflt, vmfle, vmfgt, vmfge)
  • Vector single-width floating-point/integer type-convert instructions (vfcvt.xu.f, vfcvt.x.f, vfcvt.rtz.xu.f, vfcvt.rtz.x.f, vfcvt.f.xu, vfcvt.f.x)
  • Vector widening floating-point/integer type-convert instructions (vfwcvt.xu.f, vfwcvt.x.f, vfwcvt.rtz.xu.f, vfwcvt.rtz.x.f, vfwcvt.f.xu, vfwcvt.f.x, vfwcvt.f.f)
  • Vector narrowing floating-point/integer type-convert instructions (vfncvt.xu.f, vfncvt.x.f, vfncvt.rtz.xu.f, vfncvt.rtz.x.f, vfncvt.f.xu, vfncvt.f.x, vfncvt.f.f)
  • Vector whole-register move instruction vmv<nr>
  • Vector whole-register load/store vl1r, vs1r
  • Vector load/store mask vle1, vse1
  • Whole-register instructions are also executed when vtype.vl == 0
  • Makefile option (trace=1) to generate waveform traces when running simulations with Verilator

Changed

  • Add spill register at the lane edge, to cut the timing-critical interface between the Mask unit and the VFUs
  • Increase latency of the 16-bit multiplier from 0 to 1 to cut an in-lane timing-critical path
  • Widen CVA6's cache lines
  • Implement back-to-back accelerator instruction issue mechanism on CVA6
  • Use https protocol when cloning DTC from main Makefile
  • Use https protocol for newlib-cygwin in .gitmodules
  • Cut a timing-critical path from Addrgen to Sequencer (starting an AXI transaction now takes 1 more cycle)
  • Cut a timing-critical path in the VSTU, in the calculation of the pointer to the VRF word received from the lanes
  • Create ara_system wrapper containing Ara, Ariane, and an AXI mux, instantiated from within Ara's SoC
  • Retime address calculation of the addrgen
  • Push MASKU operand muxing from the lanes to the Mask Unit
  • Reduce CVA6's default cache size
  • Update Verilator to v4.214
  • Update bender to v0.23.1