Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface to vector units #3599

Merged
merged 64 commits into from
May 17, 2024
Merged

Interface to vector units #3599

merged 64 commits into from
May 17, 2024

Conversation

jerryz123
Copy link
Contributor

@jerryz123 jerryz123 commented Mar 21, 2024

This PR adds an interface to vector units. This has been tested to support a fully compliant, not-yet-open-sourced, RVV-1.0 implementation, including support for precise traps and virtual memory.

The interface should not affect existing configs with no vector unit.

API:

  • RocketVectorUnitParams - specifies fields for controlling generation of vector unit LazyModule
  • EarlyVectorDecoder - decodes read/write ctrl bits, and allows implementations to define subsets of supported insns (non-supported will trap)
  • VectorUnit - integrates with ex/mem/wb stages of rocket, can set vstart/vxsat/fflags, can request replays, block memory operations in rocket-pipe

UArch:

  • vsets are decoded early, executed in EX stage
  • Load/store/arithmetic instructions are issued to the vector unit in ex stage along with vtype, but may be killed before wb stage
  • The implementation can request replays for backpressure, or delay retirement of younger instructions to implement precise traps
  • The implementation can access memory through a TL port or share the Rocket L1D$.

Type of change: bug report | feature request | other enhancement

Impact: no functional change | API addition (no impact on existing code) | API modification

Development Phase: proposal | implementation

Release Notes

@jerryz123 jerryz123 closed this Mar 21, 2024
@jerryz123 jerryz123 reopened this Mar 21, 2024
Copy link

linux-foundation-easycla bot commented Mar 21, 2024

CLA Missing ID CLA Not Signed

Copy link
Member

@sequencer sequencer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few thoughts from the T1's aspect, no need to block by T1, after our general release, we will try to contribute back to the upstream version of RC.

The tricky part still might be decoder, and I will find some way to align.


id_ctrl.vec := false.B
if (usingVector) {
val v_decode = rocketParams.vector.get.decoder(p)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting.

Comment on lines +851 to +857
csr.io.vector.foreach { v =>
v.set_vconfig.valid := wb_reg_set_vconfig && wb_reg_valid
v.set_vconfig.bits := wb_reg_rs2.asTypeOf(new VConfig)
v.set_vs_dirty := wb_valid && wb_ctrl.vec
v.set_vstart.valid := wb_valid && wb_reg_set_vconfig
v.set_vstart.bits := 0.U
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the CSRs, T1 just insert instruction for chaining w/ csr flushing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe how that works? I'm not sure where csr maintenance is affected by chaining.

src/main/scala/rocket/RocketCore.scala Outdated Show resolved Hide resolved
src/main/scala/tile/FPU.scala Show resolved Hide resolved
src/main/scala/tile/RocketTile.scala Show resolved Hide resolved
@DecodeTheEncoded
Copy link

DecodeTheEncoded commented Mar 26, 2024 via email

@jerryz123
Copy link
Contributor Author

Yes

@jerryz123
Copy link
Contributor Author

This is ready. The CLA failure is innocuous, the email in that commit is wrong, and it is not easy to fix it.

Copy link
Member

@sequencer sequencer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only issue we may spot is the handling of

val ex_avl = Mux(ex_ctrl.rxs1

Thanks @qinjun-li, @SharzyL for review it together.

@@ -10,6 +10,7 @@ import org.chipsalliance.cde.config.Parameters
class StoreGen(typ: UInt, addr: UInt, dat: UInt, maxSize: Int) {
val size = Wire(UInt(log2Up(log2Up(maxSize)+1).W))
size := typ
val dat_padded = dat.pad(maxSize*8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do this padding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxSize is the width of the memory word. In configurations with wide DCache, maxSize > dat, since dat is xLen.

This is needed to handle that case.

@@ -241,6 +248,7 @@ class Rocket(tile: RocketTile)(implicit p: Parameters) extends CoreModule()(p)
val ex_reg_inst = Reg(Bits())
val ex_reg_raw_inst = Reg(UInt())
val ex_reg_wphit = Reg(Vec(nBreakpoints, Bool()))
val ex_reg_set_vconfig = Reg(Bool())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T1 do set_vconfig at WB stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid pipeline bubbles due to dependency on new VL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other key requirement is for performant precise traps. The checks for memory access fault in M/W stages must occur ahead of the commit point and depend on the updated vconfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid pipeline bubbles due to dependency on new VL.

Understand, we do it in another way: for each vsetvl, it will be a phantom instruction that will be observed by following instructions. But we didn't take the precise trap into consideration.

src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/rocket/RocketCore.scala Show resolved Hide resolved
src/main/scala/tile/FPU.scala Show resolved Hide resolved
@jerryz123
Copy link
Contributor Author

Thanks @SharzyL @qinjun-li @sequencer for the detailed review, will investigate and respond

Copy link
Member

@sequencer sequencer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, feel free to merge as you wish.

@jerryz123 jerryz123 merged commit a235684 into dev May 17, 2024
25 of 26 checks passed
@jerryz123 jerryz123 deleted the ifv branch May 17, 2024 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants