Skip to content

Latest commit

 

History

History
766 lines (630 loc) · 26.8 KB

rif.adoc

File metadata and controls

766 lines (630 loc) · 26.8 KB

RIF Design Writeup

1. Introduction

Note
This document assumes you have basic understanding with the RISC-V Vector (RVV) Extension and the intrinsic functions associated with it. If not, please go to riscv-v-spec and rvv-intrinsic-doc.

This document records the design write-up and intents of RIF. RIF is short for RVV Intrinsic Fuzzing, as we try to generate series random test to fuzz and check if the compiler compiles C code that contains RVV C intrinsics correctly. Essentially, you may see the current RIF as a graph-based compiler that transforms the randomly generated data-flow graph into testing C code.

Before RIF, there was already a fuzz-test generator called riscv-vector-tests. It was built for fuzz-test on RVV. It served its purpose well and is an important part of the CI. In the mean time, there is some place that may be re-designed, and the trade-off from Python’s direct byte string translation makes the full test relatively time costly. All in all, we decided to write another fuzz-test generator that will complement VTG’s shortcoming(s).

1.1. Action Flow for RIF

The RIF simply goes through 3 stages. The 3 stages are:

  1. Graph generation

  2. Golden output generation

  3. Code generation

To summarize RIF’s action flow in a simple paragraph:

RIF randomly generates a bipartite data-flow graph which consists of "Operator" nodes and "Value" nodes. Then RIF will traverses through the nodes in topological order, propagating the values into computation nodes to generate expected output from execution. Finally RIF will generate a C program that executes the computations by the topological order of the data-flow graph.

You may start by looking into tool/random_gen.cpp.

1.2. VLA-Style Compute Pattern

RIF seeks to generate code that is portable, which means it should execute successfully under different VLEN settings. Therefore RIF aims to code in a VLA (Vector Length Agnostic) style.

Take example for a reduction instruction (e.g. vredsum.vs). The number of elements reduces per instruction depends on VLEN. In VLA style we will not focus on how a single instruction performs, but rather the ultimate result when we want to apply the operation to a vector of data.

// Here is a code that performs differently for different VLEN
int elementsPerInst = VLEN / SEW;
for (int i=0; i<500; i+= elementsPerInst) {
  int reductionResult = 0;
  for (int j=0; i+j<500 && j<elementsPerInst; ++j)
    reductionResult += a[i];
  VERIFY(reductionResult);
}

// Here is a code that performs indifferently for different VLEN
int elementsPerInst = VLEN / SEW;
int reductionResult = 0;
for (int i=0; i<500; i+= elementsPerInst) {
  for (int j=0; i+j<500 && j<elementsPerInst; ++j)
    reductionResult += a[i];
}
VERIFY(reductionResult);

2. Graph Generation

RIF operates on a biparitie data-flow graph (with the class Graph). As you may see in the image below, the elliptic nodes are "Operator"-s and the rectangular ones are "Value"-s.

graph

The graph is bipartite, which means that a "Value" node will not be directly connected with another "Value" node. The graph follows the use-define chain convention, which means that every "Value" is "defined" by one-and-only-one "Operator". This makes the graph a directed acyclic graph (DAG) and allows us to represent data-flow.

Note
There should still be ways to represent control-flow inside a data-flow graph so it shouldn’t be a problem. It would be more convenient to always operate on a data-flow graph.

2.1. Value Node

2.1.1. Base Class - ValueBase

The base class for Value is under include/Basic.hpp. It conains the length and connectivity information of a node. The string members are used for recognition within RIF and code generation.

There are 2 virtual methods that needs to be implemented for the class that inherits from ValueBase - generateCCode and generateData. They are for C Code Generation and Golden computation respectively. Derived classes such as Derived Class - OneD*Val, Scalar*Val, Derived Class - InitializeOp and Derived Class - *Op will implement them.

struct ValueBase {
  /* ... */

  virtual void generateCCode(std::ostream &os) = 0;
  virtual void generateData() = 0;

  const CustomValType type;
  const std::string typeID;
  const std::string id;
  const std::string dataTypeID;

  std::vector<ValueBase *> inputs;
  std::vector<ValueBase *> outputs;

  int length;
  TypeInfo *typeInfo;
};

2.1.2. Derived Class - OneD*Val, Scalar*Val

RVV instructions may operate on different bit width, so we need different derived value types for each of them. A "Value" node can either be an "One-D Value" or a "Scalar Value".

  • One-D Value: One dimensional, represents an 1-D array of data

  • Scalar Value: A single value

The owned data are held under the derived classes. You may check out the derived classes under Value.hpp. The derived-value will allocate storage space upon construction, and values will be computed later in the Section Golden computation.

#define CUSTOM_ONE_D_TYPE(CUSTOM_NAME, DATA_TYPE, DATA_WIDTH, DATA_CLASS,      \
                          MIN_VALUE, MAX_VALUE)                                \
  struct OneD##CUSTOM_NAME##Val : ValueBase {                                  \
    OneD##CUSTOM_NAME##Val(...) {                                              \
      ptr = new DATA_TYPE[length];                                             \
      raw = new uint64_t[length];                                              \
    }                                                                          \
    virtual ~OneD##CUSTOM_NAME##Val() {                                        \
      delete ptr;                                                              \
      delete raw;                                                              \
    }                                                                          \
    void generateData() override;                                              \
    DATA_TYPE *ptr;                                                            \
    uint64_t *raw;                                                             \
  };                                                                           \
  DATA_TYPE *getRawPointer(OneD##CUSTOM_NAME##Val *val);
#include "CustomValue.def"
#undef CUSTOM_ONE_D_TYPE
#undef CUSTOM_SCALAR_TYPE

There are many derived-classes and we let C MACRO generate them for us. The MACRO CUSTOM_ONE_D_TYPE and CUSTOM_SCALAR_TYPE are used under include/CustomValue.def.

// CUSTOM_ONE_D_TYPE(CUSTOM_NAME, DATA_TYPE, DATA_WIDTH, DATA_CLASS, MIN_VALUE, MAX_VALUE)
// CUSTOM_SCALAR_TYPE(CUSTOM_NAME, DATA_TYPE, DATA_WIDTH, DATA_CLASS, MIN_VALUE, MAX_VALUE)
CUSTOM_ONE_D_TYPE(Int32, int32_t, 32, SIGNED_INT, -10000000, +10000000)
CUSTOM_SCALAR_TYPE(Float32, float32_t, 32, FLOAT, 0, +20000000)

They will be fed into the MACRO definition to do under include/Value.hpp and (for derived class declaration), library/Value.cpp (for derived class member function implementation), library/Basic.cpp (for enum definition) and libraryOperator.cpp (for switch-statment code generation).

The "Value" nodes for the Graph are all derived classes. Each node represents a value computed (or initialized) and will become a variable in the generated C code. The derived value classes implement generateCCode and generateData method. generateCCode will generate strings for definition of the variable. The definitions will be outputted as global variables in the C code. The generateData will trigger the RNG generator under include/Utils.hpp by its provided data type specified from the macro definition. Please go to library/Value.cpp for more details.

2.2. Operator Node

2.2.1. Base Class - OperatorBase

A RIF "Operator" node represents an RVV C intrinsic computation. Since an "Operator" node represents an intrinsic function and a intrinsic functions will return a value, "Operator" node is derived from ValueBase. The additional information attached to OperatorBase are the input types, output type and the operator’s attribute. Since the data-flow graph follows the use-define chain, an "Operator" will only generate at most one output. It can take arbitrary numbers of input. On the other hand, the opAttr (operator' attribute) data member allows us to augment properties to the operator type, such as MaskedOperation, WideningOperation, etc…​ You can find the collection of attributes under include/Basic.hpp.

struct OperatorBase : ValueBase {
  /* ... */
  // returns 1 if fail
  int addInput(int inputIdx, ValueBase *input);
  // return 1 if fail
  int addOutput(ValueBase *output);

  const OperatorAttrT opAttr;
  const std::vector<CustomValType> inputTypes;
  const CustomValType outputType;
};

2.2.2. Derived Class - Overview

The derived operators don’t contain any extra data member, they simply override generateCCode and generateData. The class definitions of these derived class are generated through C MACRO. The MACRO CUSTOM_OP_TYPE is declared under include/CustomOperator.def.

These MACRO will be fed into Basic.cpp (for enum definition), include/Operator.hpp (for derived class definition) and library/Operator.cpp for derived class member function implementation. The derived class from OperatorBase either specializes into InitializeOp or *Op that represents an intrinsic.

2.2.3. Derived Class - InitializeOp

InitializeOp is the only special case for "Operator"-s in Graph. It serves as the operator to generate random initial value for the "Values" that don’t have a "define" after the random graph generation.

InitializeOp::generateData will not be called, as you can see it is asserted to be unreachable in the code below. RIF will look further to its define value and call *Val::generateData instead.. Please look into Golden computation for more.

// Under Operator.cpp
void InitializeOp::generateData() {
  // Please go to Graph::generateData() for the data generation call
  assert(false && "This function shall not be triggered");
}

2.2.4. Derived Class - *Op

Every *Op represents an RVV C intrinsic. *Op::generateData calls its compute header generated in Section [sec-python-golden-header], you can find them under include/autogen after you make anything that is associated with Operator.hpp. *Op::generateCCode forwards into a local function generateOperatorCode, which acts as an indirect layer for code generation. Please checkout Section Operator(Intrinsic) Computation.

#define CUSTOM_OP_TYPE(OP_TYPE, OP_ID, SEW, TYPE_CLASS, OP_ATTR, OUTPUT_TYPE,  \
                       NUM_OF_INPUTS, ...)                                     \
  void OP_TYPE##Op::generateData() { compute##OP_TYPE##Op(this); }             \
  void OP_TYPE##Op::generateCCode(std::ostream &os) {                          \
    generateOperatorCode(os, this);                                            \
  }
#include "CustomOperator.def"
#undef CUSTOM_OP_TYPE

2.3. Random Generation

RIF uses pseudo RNG. The seeds can be provided from the input argument (parsed under library/Utils.cpp::parse_opt). The seed will be held under a global variable InitialSeed (check it out under include/Utils.hpp). Otherwise there will be default values for it. The reason to have a seed is to make things reproducable.

User of the random function are the Section Initial Data Generation and Section Generating topological order.

2.4. Generating topological order

Since RIF operatates on a DAG data-flow graph, a topological order gurantees us that if we operate the nodes in such order, no data-dependency will be violated. Therefore a topological order generator is needed. Moreover we may want to reproduce such order and generate the same test file through RIF, so the topological order generator takes a seed from either the default or the input argument. You can find the topological order generator under Graph::getOperatorsInTopologicalOrder.

You may find the use of the order generator under Section Golden Computation Algorithm Sketch and Code Generation Algorithm Sketch.

2.5. Graph Generation Algorithm Sketch

The input parameters that effect the random graph generation are the following. You can find the options and how they are parsed under library/Utils.cpp.

-l, --length=DATA_LENGTH         Data length for the initial node (root)
-n, --nodes-to-gen=NODES_TO_GEN  The number of nodes to generate for the graph.
-r, --root=OPERATOR_ENUM         Initial node for the graph generation, default root is 'AddVV32'
-s, --seed=RANDOM_SEED           Seed for random number generator, default seed is '0xdeadbeef'

Here is a pseudo code for how RIF random generates the data-flow graph.

def AddUninitializedValuesAroundOp(G, op) :
  for inputType in op.inputTypes :
      Value value = createValue(inputType)
      G.addUsedValueNode(value, op)
  ValueType value = createop.outputType
  G.addDefineValueNode(value, op)

# The algorithm starts Here
def RandomGraphGeneration() :
    srand(RANDOM_SEED)

    # Declare an empty graph, the one-d values inside will be of length DATA_LENGTH
    Graph G(DATA_LENGTH)

    Operator op = createOperator(OPERATOR_ENUM)
    G.addOperator(op)
    addUninitializedValuesAroundOp(G, op)

    while size(G.ops) < NODES_TO_GEN :
        int operation = random_coinflip() # 0 / 1
        if operator == 0 : # Grows on existing value
            value = Random select an value from G
            if value is already defined by an opertaor:
                OperatorType opType = Random select a feasible operator type that uses value.ValueType
                Operator op = createOperator(opType)
                op = G.addOperator(op)
                op.addUsedValueNode(value)
                addUninitializedValuesAroundOp(G, op)
            else :
                OperatorType opType = Random select a feasible operator type that defines value.ValueType
                Operator op = createOperator(opType)
                op = G.addOperator(op)
                op.addDefinedValueNode(value)
                addUninitializedValuesAroundOp(G, op)
        else : # Create a random operator that doesn't connect to any existing connected-component
            OperatorType opType = Random select operator type except InitializeOp
            Operator op = createOperator(opType)
            op = G.addOperator(op)
            addUninitializedValuesAroundOp(G, op)

    # Add an InitializeOp for all values without a define
    for value in filter(G.values, values.define == NULL) :
        Operator op = createOperator(InitializeOp)
        op.addDefinedValueNode(value)

3. Golden computation

Golden computation is an important feature RIF provides. By computing for the exact expected output for exeuction of the generated test file, we can verify that the compiler have compiled the C code correctly. This section introduces how the data is initialized and how RIF computes them.

All nodes on the graph have a virtual function generateData that needs to be implemented (including the operators, since they also are derived from ValueBase). generateData will be called and computation will propagate through the whole graph.

3.1. Initial Data Generation

Since we are traversing in topological order, the first operators we encounter are the ones without any in-degree, which should be the InitializeOp-s. When an InitialzeOp is encountered, RIF forwards the responsibility of data generation to its defining value. The value node holds its data type so they are able to call the pseudo random number generator under include/Utils.hpp. Take the one-d values for example (check it out under library/Value.cpp).

#define CUSTOM_ONE_D_TYPE(CUSTOM_NAME, DATA_TYPE, DATA_WIDTH, DATA_CLASS,      \
                          MIN_VALUE, MAX_VALUE)                                \
  void OneD##CUSTOM_NAME##Val::generateData() {                                \
    auto length = this->length;                                                \
    auto data = getRawPointer(this);                                           \
    for (int i = 0; i < length; ++i)                                           \
      data[i] = getRandomNumber<DATA_TYPE>(MIN_VALUE, MAX_VALUE, &raw[i]);     \
  }                                                                            \
#include "CustomValue.def"
#undef CUSTOM_ONE_D_TYPE
#undef CUSTOM_SCALAR_TYPE

3.2. Python-generated Golden Header

Operator nodes except InitializeOp maps to some RVV C intrinsic API. Given that we traverse under topological order, we can assume that all input values will have their data ready for computation when RIF visits an operator. Every operator implements the generateData method.

#define CUSTOM_OP_TYPE(OP_TYPE, OP_ID, SEW, TYPE_CLASS, OP_ATTR, OUTPUT_TYPE,  \
                       NUM_OF_INPUTS, ...)                                     \
  void OP_TYPE##Op::generateData() { compute##OP_TYPE##Op(this); }             \
#include "CustomOperator.def"
#undef CUSTOM_OP_TYPE

The computeOP_TYPEOp functions are under headers auto-generated by Python code under scripts/. The scripts look into CustomOperator.def and generate a header for every MACRO definition. The headers are collected under include/autogen/, with include/AutoGenComputeOp.h collecting all the headers into a header file. You may notice that there is an inclusion within the for-loop. It is the header from Spike, please checkout Section Code Reused From riscv-isa-sim.

// include/autogen/computeAddVV32VInt32Op.h
void computeAddVV32VInt32Op(RIF::OperatorBase *op) {
  auto a = static_cast<RIF::OneDInt32Val *>(op->inputs[0]);
  auto b = static_cast<RIF::OneDInt32Val *>(op->inputs[1]);
  auto c = static_cast<RIF::OneDInt32Val *>(op->outputs[0]);

  assert(a->length == b->length && a->length == c->length);

  auto length = a->length;

  auto dataA = getRawPointer(a);
  auto dataB = getRawPointer(b);
  auto dataOut = getRawPointer(c);

  auto sew = op->typeInfo->sew.to_int();
  auto dataASew = a->typeInfo->sew.to_int(); // for index load / store only
  P.VU.vsew = sew;

  for (int i = 0; i < length; ++i) {
#include"vadd_vv.h"
  }
}

3.3. Code Reused From riscv-isa-sim

The header that is included inside the for-loop of the compute headers under include/autogen/ are the core logics of the vector function. They are logics defined by riscv-isa-sim (also called as "Spike"), which is a RISC-V ISA Simulator by SiFive. The logic headers are under riscv-isa-sim/riscv/insns.h. Reusing the header allows RIF to avoid reconstructing compute logic for vector instructions and the behaviors is already guaranteed correct because Spike is a well-tested simulator.

// riscv-isa-sim/riscv/insns/vadd_vv.h
// vadd.vv vd, vs1, vs2, vm
VI_VV_LOOP
({
  vd = vs1 + vs2;
})

Spike encapsulates the core logic with MACRO and decodes them with the header riscv-isa-sim/riscv/decode.h. Like Spike, RIF holds its own decode header include/Decode.hpp that suits our computation. Unfortunately, not all compute logic headers are well encapsulated, so RIF would have to manually override some of them. The overrided headers are placed under include/rif-local/.

// include/Decode.hpp
#define VI_VV_LOOP(BODY)                                                       \
  auto vs2 = dataA[i];                                                         \
  auto vs1 = dataB[i];                                                         \
  auto &vd = dataOut[i];                                                       \
  BODY
Note
RIF performs floating point computation with the softfloat library just like Spike.

3.4. Golden Computation Algorithm Sketch

Since RIF is operating on a data-flow graph, by traversing through the graph in topological order, we can guarantee that no data dependency is violated.

// Under library/Graph.cpp
void Graph::generateData(uint32_t seed) {
  auto ordering = getOperatorsInTopologicalOrder(seed);
  initializeRNG(seed);
  for (auto id : ordering) {
    auto op = operatorLUT[id];
    if (op->type == CustomValType::Initialize) {
      auto value = op->outputs[0];
      value->generateData();
    } else {
      uint8_t save_roundingMode = softfloat_roundingMode; // save
      op->generateData();
      softfloat_roundingMode = save_roundingMode; // restore
    }
  }
}

4. C Code Generation

The last stage of RIF is the code generation. The generated C code files can be split into 3 parts:

  1. Variable Declaration

  2. Operator(Intrinsic) Computation

  3. Result Verification

The generated C code will look like the half-pseudo code below. You may checkout expected code generation samples under the tests with CodeGen label under the folder test/.

/*
[a]        [b]
  \       /
   (add_op)
      |
     [c]
*/


#include <...>

// 1. Variable Declaration
int a[500];
int b[500];
int c[500];

// 2. Operation(Intrinsic) Computation
void init_value_for_a() {
    int tmp[] = { /* integer literals generated by RIF */ }
    for (int i=0; i<500; ++i)
      a[i] = tmp[i];
}
void init_value_for_b() {
    int tmp[] = { /* integer literals generated by RIF */ }
    for (int i=0; i<500; ++i)
      b[i] = tmp[i];
}
void operator_add() {
    // RIF will generate such vector code (half-pseudo code here)
    int counter = 500;
    int *placeholder_a = a;
    int *placeholder_b = b;
    int *placeholder_c = c;

    for (size_t vl; counter > 0; counter -= vl) {
        vl = vsetvl(counter);
        // load data
        vec_a = vector_unit_strided_load(placeholder_a, vl);
        vec_b = vector_unit_strided_load(placeholder_b, vl);

        // computation
        vec_c = vadd_vv_i32m1(vec_a, vec_b, vl);

        // store data
        vector_unit_strided_store(placeholder_c, vec_c, vl)

        // increment the placeholder
        placeholder_a += vl;
        placeholder_b += vl;
        placeholder_c += vl;
    }
}

// 3. Result Verification
int golden_init_value_for_a() { return true; }
int golden_init_value_for_b() { return true; }
int golden_operator_add() {
    int tmp[] = { /* integer literals generated by RIF */ }
    for (int i=0; i<500; ++i)
        if (c[i] != tmp[i])
            return false;

    return true;
}

int main ()
{
    // Calls computation in topological order
    init_value_for_a();
    init_value_for_b();
    operator_add();

    // Calls golden verification in topological order
    bool success = 1;
    success &= golden_init_value_for_a();
    success &= golden_init_value_for_b();
    success &= golden_operator_add();

    return success != true;
}

4.1. Variable Declaration

The variables are declared outside of the main function to avoid stack overflow. There is no requirement for ordering in varaible declaration. So a single for-loop will do the job. The code generation for these declarations are under generateCCode of the derived value nodes, you can check for details under library/Value.cpp.

#define CUSTOM_ONE_D_TYPE(CUSTOM_NAME, DATA_TYPE, DATA_WIDTH, DATA_CLASS,      \
                          MIN_VALUE, MAX_VALUE)                                \
  void OneD##CUSTOM_NAME##Val::generateCCode(std::ostream &os) {               \
    os << this->dataTypeID << " " << this->id << "[" << this->length           \
       << "];\n";                                                              \
  }
#include "CustomValue.def"
#undef CUSTOM_ONE_D_TYPE
#undef CUSTOM_SCALAR_TYPE

4.2. Operator(Intrinsic) Computation

Although the RVV instructions varies among each other, their code generation for computation still shares many components. The main components can be split into clear steps like the owns shown in the code snippet of Section C Code Generation.

This allows RIF to create a common framework for operators - CodeGenForOperator. It is a base class that extracts the common methods to code gen the computation for an operator. You can find the definition under include/Operator.hpp.

void CodeGenForOperator::generateSingleOperatorCode() {
  auto output = op->outputs[0];
  getRawPointers(op->inputs, output);
  std::string counter = CodeGenForOperator::getCounter(os, loopLength);
  CodeGenForOperator::getLoopStart(os, counter);
  {
    getVL(counter);
    std::vector<std::string> args = getIntrinsicArguments();
    auto opResult = genOpString(os, op, args, output);
    storeResult(opResult);
    incrementRawPointerByVLEN();
  }
  CodeGenForOperator::getLoopEnd(os);
}

4.3. Result Verification

Results are verified after all of them have been computed. A verification function is created for every operator on the graph (even the InitializeOp-s, they returns 1 unconditionally). For non-initialize operators, RIF will fill in the precomputed results into number literals inside the function and uses a for-loop to verify it.

Since RIF calculates floating point computation with softfloat, comparison of the data needs an adapter to transform the number literals into floating-point types. So different code generation is required for them. The code gen function for this is templatized and realized under Operator.cpp::emitOneDVerificationCode.

4.4. Code Generation Algorithm Sketch

The starting point for code generation is Graph::generateCCode.

Graph G
OutputStream os

# 1. Variable Declaration
for value in G.values :
  value->generateCCode(os)

# 2. Operator(Intrinsic) Computation
for op in G.ops :
    if op is an InitializeOp :
        op->define->generateCCode(os)
    else
        op->generateCCode(os)

# 3. Result Verification
for op in G.ops :
    op->generateVerificationCode(os)

os << "int main() {\n";
{
    # Calls computation in topological order
    for op in topological_order(G.ops) :
        os << op.getNameWithType() << "()\n";

    os << "bool success = true;\n";

    # Calls verification
    os << "success &= " << std::string("golden_") + op.getNameWithType() << "()\n";

    os << "return success;\n";
}
os << "}\n";

5. Appendix: Graphviz Generation

Generating a visualization for the graph is beneficial for debugging. You may call Graph::generateGraphViz to create a .dot file and turn it into a png file with:

dot -Tpng output.dot > output.png