Add parametrizable BD chain syntax and new syntax for BD configuration in sequence #1656

Merged: 35 commits into Xilinx:main from parametrizable_bd_chain, Aug 6, 2024

Collaborator

@andrej andrej commented Aug 1, 2024

Adds the following ops:

  • Intended for future general use, but for now only a lowering for the runtime sequence is available:
    • aie.bd_chain -- Define an abstract sequence of dma_bd invocations. Think of this as a 'function definition'.
  • Intended for use in the runtime sequence:
    • aiex.dma_start_bd_chain -- Materialize an abstract BD chain with concrete arguments. Think of this as an inlined 'function call'.
    • aiex.dma_configure_task -- New way of configuring a task (task = buffer descriptor configuration + channel + direction) in the sequence function that reuses the same familiar existing syntax used on the mem tiles and core memory modules.
    • aiex.dma_start_task -- Push a configured task (see above) to a task queue
    • aiex.dma_await_task -- Await a task completion token from a previously submitted task.
    • aiex.dma_free_task -- Inform the compiler that the buffer descriptor IDs belonging to a configured task can be reused after this call. dma_await_task implies dma_free_task, but a freestanding dma_free_task is still useful: it lets the programmer express when it is safe to reuse BDs even without an explicit sync (e.g. because some other, implicit synchronization already guarantees it).

Adds the following passes:

  • --aie-materialize-bd-chains: Concretize invocations of previously defined abstract BD chains. Essentially inlines BD chains.
  • --aie-assign-runtime-sequence-bd-ids: Assign buffer descriptor IDs for BDs configured inside the runtime sequence. Reuses components of the existing --aie-assign-buffer-descriptor-ids pass, adapted to the runtime sequence, which is more complicated because it allows BD IDs to be reused.
  • --aie-dma-tasks-to-npu: Takes BD configurations expressed in the pre-existing syntax, wrapped in some of the new ops above, and turns them into NPU runtime sequence instructions. Expressing BD configuration in the pre-existing syntax is more powerful than e.g. nd_dma_memcpy_nd because it can express chains of BDs.
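
To make the BD ID reuse rule concrete, here is a minimal Python sketch of how IDs could be handed out while walking a runtime sequence. It is a toy model, not the code of --aie-assign-runtime-sequence-bd-ids: the tuple encoding of operations, the function name `assign_bd_ids`, and the pool size of 4 are all assumptions made for illustration. The only behavior taken from the description above is that both an await and a free make a task's BD IDs available again.

```python
def assign_bd_ids(ops, num_bd_ids):
    """Assign BD IDs to tasks while walking a linear list of operations.

    `ops` uses a hypothetical encoding (not the real IR):
      ("configure", task_name, number_of_bds)
      ("start", task_name)
      ("await", task_name)   # implies a free of the task's BD IDs
      ("free", task_name)    # frees BD IDs without any sync
    Returns a dict mapping each task name to the BD IDs it was given.
    """
    free_ids = list(range(num_bd_ids))  # IDs currently available
    live = {}                           # task name -> IDs it still holds
    assignment = {}                     # result: task name -> assigned IDs

    for op in ops:
        kind, task = op[0], op[1]
        if kind == "configure":
            count = op[2]
            if count > len(free_ids):
                raise RuntimeError(f"out of BD IDs while configuring {task}")
            ids, free_ids = free_ids[:count], free_ids[count:]
            live[task] = ids
            assignment[task] = ids
        elif kind in ("await", "free"):
            # Both ops make the IDs reusable; a second free is a no-op.
            free_ids = sorted(free_ids + live.pop(task, []))
        elif kind != "start":
            raise ValueError(f"unknown op kind: {kind}")
    return assignment


ops = [
    ("configure", "t1", 2), ("start", "t1"),
    ("configure", "t2", 2), ("start", "t2"),
    ("await", "t1"),                    # t1's IDs become reusable here
    ("configure", "t3", 2), ("start", "t3"),
    ("free", "t2"),                     # reuse allowed without a sync
    ("await", "t3"),
]
result = assign_bd_ids(ops, num_bd_ids=4)
print(result)  # {'t1': [0, 1], 't2': [2, 3], 't3': [0, 1]}
```

In this model, t3 can only be configured because t1 was awaited first. Without the await (or a freestanding free), the pool of four IDs would be exhausted and the walk would fail.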

Discussed in #1610.

Example Use of a BD Chain

  aie.device(npu1_4col) {
    %tile_0_0 = aie.tile(0, 0)
    %tile_0_2 = aie.tile(0, 2)

    aie.bd_chain @simple_chain(%arg0: memref<8xi16>, %arg1: memref<12xi16>, %arg2: memref<8xi16>) {
      aie.dma_bd(%arg0 : memref<8xi16>, 0, 8, [<size=1, stride=0>, <size=2, stride=2>, <size=2, stride=4>, <size=2, stride=1>])
      aie.next_bd ^bd1
    ^bd1:
      aie.dma_bd(%arg1 : memref<12xi16>, 0, 12)
      aie.next_bd ^bd2
    ^bd2:
      aie.dma_bd(%arg2 : memref<8xi16>, 0, 8)
      aie.end
    }

    aiex.runtime_sequence(%arg0: memref<8xi16>, %arg1: memref<12xi16>, %arg2: memref<8xi16>) {
      %t1 = aiex.dma_start_bd_chain @simple_chain(%arg0, %arg1, %arg2) : (memref<8xi16>, memref<12xi16>, memref<8xi16>)  
                                    on (%tile_0_0, MM2S, 0) 
      aiex.dma_await_task(%t1)
    }
  }
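
The first dma_bd in the chain above carries a four-entry size/stride list. As a rough aid to reading it, the following Python sketch enumerates the element offsets such a list would visit. It assumes the first entry is the outermost loop and that strides are counted in elements; both are assumptions of this sketch, and the hardware may impose further granularity constraints that are not modeled here. The function name `access_offsets` is hypothetical.

```python
from itertools import product


def access_offsets(dims, base_offset=0):
    """Enumerate element offsets for a list of (size, stride) pairs.

    Assumption of this sketch: dims[0] is the outermost loop and
    dims[-1] the innermost, with strides measured in elements.
    """
    sizes = [size for size, _ in dims]
    strides = [stride for _, stride in dims]
    offsets = []
    # product() varies the last index fastest, i.e. the innermost loop.
    for index in product(*(range(size) for size in sizes)):
        offsets.append(base_offset + sum(i * s for i, s in zip(index, strides)))
    return offsets


# The pattern from the first dma_bd of @simple_chain.
pattern = [(1, 0), (2, 2), (2, 4), (2, 1)]
order = access_offsets(pattern)
print(order)  # [0, 1, 4, 5, 2, 3, 6, 7]
```

Under these assumptions the pattern touches each of the 8 elements of the memref exactly once, in pairs of adjacent elements.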

Example Use of Syntax Reuse for BD Configuration in the Runtime Sequence

(The more abstract BD chains lower to this in the --aie-materialize-bd-chains pass.)

  aie.device(npu1_4col) {
    %tile_0_0 = aie.tile(0, 0)
    %tile_0_2 = aie.tile(0, 2)

    aiex.runtime_sequence(%arg0: memref<8xi16>) {

      // Configure a new task: BDs + channel + direction
      %t1 = aiex.dma_configure_task(%tile_0_0, MM2S, 0) {
        aie.dma_bd(%arg0 : memref<8xi16>, 0, 8)
        aie.next_bd ^bb1
      ^bb1:
        aie.dma_bd(%arg0 : memref<8xi16>, 0, 8)
        aie.end
      }
     
      // Push the configured task to queue
      aiex.dma_start_task(%t1)

      // Awaiting a task frees previously used BD IDs and issues a sync instruction
      aiex.dma_await_task(%t1)
    }
  }
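
The relationship between the two examples above (an abstract chain plus a call site becoming a concrete configured task) can be pictured with a small Python model of the inlining step. This is only a sketch of the idea of argument substitution, not the implementation of --aie-materialize-bd-chains. The class names `BdTemplate` and `ConcreteBd`, the function `materialize`, and the buffer names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BdTemplate:
    """One dma_bd inside a chain definition; `param` names a chain argument."""
    param: str
    offset: int
    length: int


@dataclass(frozen=True)
class ConcreteBd:
    """A dma_bd after the chain argument has been replaced by a real buffer."""
    buffer: str
    offset: int
    length: int


def materialize(chain_params, chain_bds, call_args, tile, direction, channel):
    """Inline a chain 'definition' at a 'call site' by substituting arguments."""
    if len(chain_params) != len(call_args):
        raise ValueError("argument count does not match the chain definition")
    binding = dict(zip(chain_params, call_args))
    bds = [ConcreteBd(binding[bd.param], bd.offset, bd.length) for bd in chain_bds]
    # The result plays the role of a configured task: BDs + channel + direction.
    return {"tile": tile, "direction": direction, "channel": channel, "bds": bds}


# Model of @simple_chain, ignoring the size/stride list of the first BD.
params = ["arg0", "arg1", "arg2"]
chain = [BdTemplate("arg0", 0, 8), BdTemplate("arg1", 0, 12), BdTemplate("arg2", 0, 8)]

task = materialize(params, chain, ["buf_a", "buf_b", "buf_c"],
                   tile=(0, 0), direction="MM2S", channel=0)
for bd in task["bds"]:
    print(bd)
```

The channel, direction, and tile come from the call site rather than the definition, which is what allows one chain to be reused with different buffers and on different DMA channels.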

A good number of things remain to be done, but the PR is already getting large and this is a good point to merge.

Further to-dos:

  • Integrate the new operations with the Python bindings; expressing BD chains in Python is going to be the most challenging part of this.
  • Make it possible to refer to shim DMA allocations (e.g. generated from objectfifos) in the aiex.dma_start_task and aiex.dma_start_bd_chain operations.

Collaborator

@fifield fifield left a comment

Looks pretty good, most of my comments are style and formatting nits.

Review threads (all outdated and resolved):

  • include/aie/Dialect/AIE/IR/AIETargetModel.h
  • lib/Dialect/AIE/IR/AIEDialect.cpp
  • include/aie/Dialect/AIEX/IR/AIEX.td (3 threads)
  • lib/Dialect/AIEX/Transforms/AIEMaterializeBDChains.cpp (4 threads)
  • python/compiler/aiecc/main.py

@andrej andrej force-pushed the parametrizable_bd_chain branch from 910955b to 3a581fa Compare August 2, 2024 16:59
@andrej andrej force-pushed the parametrizable_bd_chain branch from 46acde0 to 50db8c5 Compare August 5, 2024 21:15
@andrej andrej force-pushed the parametrizable_bd_chain branch 2 times, most recently from 66d9b22 to 4c06939 Compare August 5, 2024 22:16
@andrej andrej force-pushed the parametrizable_bd_chain branch from 4c06939 to 31df52d Compare August 5, 2024 22:30
@andrej andrej enabled auto-merge August 6, 2024 19:32
@andrej andrej added this pull request to the merge queue Aug 6, 2024
Merged via the queue into Xilinx:main with commit 45b094b Aug 6, 2024
51 checks passed
@andrej andrej deleted the parametrizable_bd_chain branch August 6, 2024 20:31