AuRORA: Virtualized Accelerator Orchestration for Multi-Tenant Workloads

Quick Links

This repository corresponds to the "chipyard" submodule, where the ReRoCC AE version is implemented.

What is AuRORA

AuRORA is a novel full-stack accelerator integration methodology that enables scalable multi-accelerator deployment for multi-tenant workloads. AuRORA supports virtualized accelerator orchestration by co-designing the hardware and software stack of accelerators, which allows workloads to be adaptively bound to accelerators. AuRORA consists of ReRoCC (remote RoCC), a virtualized and disaggregated accelerator interface for many-accelerator integration, and a runtime system for adaptive accelerator management. Similar to the abstraction that virtual memory provides over physical memory, AuRORA provides an abstraction between the user's view of an accelerator and the physical accelerator instances. AuRORA's virtualized interface allows workloads to be flexibly and dynamically orchestrated onto available accelerators based on their latency requirements, regardless of where the physical accelerator instances are located. To effectively support virtualized accelerator orchestration, AuRORA delivers a full-stack solution that co-designs the hardware and software layers, with the goal of delivering scalable performance for multi-accelerator systems.

From bottom to top, the AuRORA full stack includes:

  • A low-overhead shim microarchitecture that interfaces between cores and accelerators.
  • A hardware messaging protocol between cores and accelerators that enables scalable and virtualized accelerator deployment on an SoC.
  • An ISA extension that allows user threads to interact with AuRORA hardware in a programmable fashion.
  • A lightweight software runtime that dynamically reallocates resources for multi-tenant workloads.

Please refer to our paper for details.

AuRORA Microarchitecture

The AuRORA microarchitecture consists of two components, the Client and the Manager. The Client integrates with the host general-purpose cores. It handles communication to and from disaggregated accelerators and provides the illusion of tight coupling. The Manager wraps an existing accelerator. It includes a PTW (page table walker) and an L2 TLB that are compliant with the accelerator MMU, and it implements a shadow copy of the architectural CSRs used by the accelerator MMU.

AuRORA ISA

The AuRORA ISA extension adds five instructions: rerocc_acquire and rerocc_release acquire and release an accelerator, rerocc_assign maps an acquired accelerator to an available opcode, rerocc_fence fences memory between the core and an acquired accelerator when needed, and rerocc_memrate configures memory rate partitioning. This file contains the instructions used.
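The order in which a user thread might issue these instructions can be sketched with a small software model. This is only an illustration written for this README: the `ToyCore` class, its method names, and its arguments are hypothetical stand-ins, and the operands, encodings, and failure behavior of the real instructions are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Toy model of one core's view of the accelerators (hypothetical, not the real ISA)."""
    free_accels: set                                # pool of free accelerator ids, shared by all cores
    held: set = field(default_factory=set)          # accelerators this core has acquired
    opcode_map: dict = field(default_factory=dict)  # opcode -> acquired accelerator id
    mem_rate: dict = field(default_factory=dict)    # accelerator id -> rate setting (units assumed)

    def _check_held(self, accel_id):
        if accel_id not in self.held:
            raise ValueError("accelerator must be acquired first")

    def acquire(self, accel_id):
        """Stand-in for rerocc_acquire: succeeds only if the accelerator is free."""
        if accel_id not in self.free_accels:
            return False
        self.free_accels.remove(accel_id)
        self.held.add(accel_id)
        return True

    def assign(self, opcode, accel_id):
        """Stand-in for rerocc_assign: map an acquired accelerator to an opcode."""
        self._check_held(accel_id)
        self.opcode_map[opcode] = accel_id

    def memrate(self, accel_id, rate):
        """Stand-in for rerocc_memrate: record a memory-rate setting."""
        self._check_held(accel_id)
        self.mem_rate[accel_id] = rate

    def fence(self, accel_id):
        """Stand-in for rerocc_fence: a no-op here, since the model has no memory system."""
        self._check_held(accel_id)

    def release(self, accel_id):
        """Stand-in for rerocc_release: drop all bindings and return the accelerator to the pool."""
        self._check_held(accel_id)
        self.held.remove(accel_id)
        self.opcode_map = {op: a for op, a in self.opcode_map.items() if a != accel_id}
        self.mem_rate.pop(accel_id, None)
        self.free_accels.add(accel_id)


# Two cores contend for the same pool of two accelerators.
pool = {0, 1}
core0, core1 = ToyCore(pool), ToyCore(pool)
if core0.acquire(0):
    core0.assign(opcode=1, accel_id=0)  # accelerator 0 is now reachable through opcode 1
    core0.memrate(0, rate=4)
    core0.fence(0)                      # wait for outstanding accelerator memory traffic
    core0.release(0)                    # accelerator 0 becomes available to core1
```

In this model an acquire on a busy accelerator simply returns `False`, so the caller decides whether to retry or fall back. How the real hardware reports a failed acquire is not covered by the sketch.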

SoC integration

AuRORA supports both crossbar and NoC integration for protocol transport. The transport can share the on-chip memory interconnect or be configured as a separate interconnect. Please refer to the SoC Configs for how we configured the NoC and crossbar SoCs.

AuRORA Runtime

For convenience, the AuRORA runtime is implemented in the Gemmini tests, since we use the Gemmini DNN accelerator generator for evaluation.
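To give a feel for what orchestrating workloads by latency requirement can mean, here is a minimal sketch of a greedy allocator. It is not the policy used by the AuRORA runtime or described in the paper. The function name `allocate`, the workload tuple layout, and the ideal linear-speedup assumption are all hypothetical choices made for this illustration.

```python
def allocate(workloads, num_accels):
    """Greedily hand out accelerators to the workload furthest from its latency target.

    workloads:  dict mapping name -> (est_time_on_one_accel, latency_target)
    num_accels: number of identical accelerators to distribute
    Assumes ideal linear speedup: time with k accelerators is est_time / k.
    """
    alloc = {name: 0 for name in workloads}
    if not workloads:
        return alloc

    def pressure(name):
        # Ratio of projected time to target; larger means more urgent.
        est_time, target = workloads[name]
        k = alloc[name]
        projected = est_time / k if k else float("inf")
        return projected / target

    for _ in range(num_accels):
        neediest = max(alloc, key=pressure)  # ties go to the first workload listed
        alloc[neediest] += 1
    return alloc


# Hypothetical example: "a" has a tight target, "b" a relaxed one.
shares = allocate({"a": (8.0, 2.0), "b": (4.0, 4.0)}, num_accels=4)
print(shares)  # {'a': 3, 'b': 1}
```

A real runtime would re-run a decision like this as workloads arrive and finish, and would apply it through the acquire and release instructions rather than a Python dictionary.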

Citing AuRORA

If AuRORA helps you in your research, you are encouraged to cite our paper. Here is an example BibTeX entry:

@inproceedings{
  aurora,
  title={AuRORA: Virtualized Accelerator Orchestration for Multi-Tenant Workloads},
  author={Seah Kim and Jerry Zhao and Krste Asanovic and Borivoje Nikolic and Yakun Sophia Shao},
  booktitle={IEEE/ACM International Symposium on Microarchitecture (MICRO)},
  year={2023}
}

Other Useful Resources

Using Chipyard

To learn about using Chipyard, see the Chipyard documentation site: https://chipyard.readthedocs.io/

Using FireSim

To learn about using FireSim, you can find the documentation and getting-started guide at docs.fires.im.

Using Gemmini

To learn about using Gemmini, visit the Gemmini repository.