[mlir][sparse] replace "sparse compiler" with "sparsifier" in doc #67082

aartbik · 2023-09-22T01:44:38Z

Rationale:
The term "sparse compiler", although dear to my heart, is often mistaken as a completely separate compiler, and not a pass within a full compiler pipeline. Therefore, we start migrating to the term "sparsifier".

llvmbot · 2023-09-22T01:45:49Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-sparse

Changes

Rationale:
The term "sparse compiler", although dear to my heart, is often mistaken as a completely separate compiler, and not a pass within a full compiler pipeline. Therefore, we start migrating to the term "sparsifier".

Full diff: https://github.com/llvm/llvm-project/pull/67082.diff

6 Files Affected:

(modified) mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h (+1-2)
(modified) mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td (+17-10)
(modified) mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td (+6-1)
(modified) mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td (+4-4)
(modified) mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h (+2-2)
(modified) mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td (+2-2)

diff --git a/mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h b/mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h
index 675c15347791921..ea0d9e2d43b74c7 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h
+++ b/mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h
@@ -190,8 +190,7 @@ enum class DimLevelType : uint8_t {
   TwoOutOfFour = 64,         // 0b10000_00
 };
 
-/// This enum defines all the storage formats supported by the sparse compiler,
-/// without the level properties.
+/// This enum defines all supported storage format without the level properties.
 enum class LevelFormat : uint8_t {
   Dense = 4,             // 0b00001_00
   Compressed = 8,        // 0b00010_00
diff --git a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td
index 19d7f599c5f7560..68ccae2257d8e43 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td
+++ b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td
@@ -106,18 +106,18 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     sparsity-agnostic representation of the computation, i.e., an implicit sparse
     representation is converted to an explicit sparse representation where co-iterating
     loops operate on sparse storage formats rather than tensors with a sparsity
-    encoding. Compiler passes that run before this sparse compiler pass need to
-    be aware of the semantics of tensor types with such a sparsity encoding.
+    encoding. Compiler passes that run before this sparsifier pass need to be aware
+    of the semantics of tensor types with such a sparsity encoding.
 
-    In this encoding, we use `dimension` to refer to the axes of the semantic tensor,
-    and `level` to refer to the axes of the actual storage format, i.e., the
+    In this encoding, we use **dimension** to refer to the axes of the semantic tensor,
+    and **level** to refer to the axes of the actual storage format, i.e., the
     operational representation of the sparse tensor in memory. The number of
     dimensions is usually the same as the number of levels (such as CSR storage format).
     However, the encoding can also map dimensions to higher-order levels (for example,
     to encode a block-sparse BSR storage format) or to lower-order levels
     (for example, to linearize dimensions as a single level in the storage).
 
-    The encoding contains a `map` that provides the following:
+    The encoding contains a map that provides the following:
 
     - An ordered sequence of dimension specifications, each of which defines:
       - the dimension-size (implicit from the tensor’s dimension-shape)
@@ -125,16 +125,17 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     - An ordered sequence of level specifications, each of which includes a required
       **level-type**, which defines how the level should be stored. Each level-type
       consists of:
+      - a **level-expression**, which defines what is stored
       - a **level-format**
       - a collection of **level-properties** that apply to the level-format
-      - a **level-expression**, which defines what is stored
 
     Each level-expression is an affine expression over dimension-variables. Thus, the
     level-expressions collectively define an affine map from dimension-coordinates to
     level-coordinates. The dimension-expressions collectively define the inverse map,
     which only needs to be provided for elaborate cases where it cannot be inferred
     automatically. Within the sparse storage format, we refer to indices that are
-    stored explicitly as `coordinates` and indices into the storage format as `positions`.
+    stored explicitly as **coordinates** and indices into the storage format as
+    **positions**.
 
     The supported level-formats are the following:
 
@@ -155,16 +156,16 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     - **high** : the upper bound is stored explicitly in a separate array
     - **block2_4** : the compression uses a 2:4 encoding per 1x4 block
 
-    In addition to the  `map`, the following two fields are optional:
+    In addition to the map, the following two fields are optional:
 
-    - The required bitwidth for `position` storage (integral offsets
+    - The required bitwidth for position storage (integral offsets
       into the sparse storage scheme).  A narrow width reduces the memory
       footprint of overhead storage, as long as the width suffices to
       define the total required range (viz. the maximum number of stored
       entries over all indirection levels).  The choices are `8`, `16`,
       `32`, `64`, or, the default, `0` to indicate the native bitwidth.
 
-    - The required bitwidth for `coordinate` storage (the coordinates
+    - The required bitwidth for coordinate storage (the coordinates
       of stored entries).  A narrow width reduces the memory footprint
       of overhead storage, as long as the width suffices to define
       the total required range (viz. the maximum value of each tensor
@@ -231,7 +232,9 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     ```
   }];
 
+  //
   // Data in sparse tensor encoding.
+  //
   let parameters = (
     ins
     // A level-type for each level of the sparse storage.
@@ -239,12 +242,16 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
       "::mlir::sparse_tensor::DimLevelType",
       "level-types"
       >: $lvlTypes,
+
     // A mapping from dimension-coordinates to level-coordinates.
     "AffineMap":$dimToLvl,
+
     // The required bitwidth for position storage.
     "unsigned":$posWidth,
+
     // The required bitwidth for coordinate storage.
     "unsigned":$crdWidth,
+
     // A slice attribute for each dimension of the tensor type.
     ArrayRefParameter<
       "::mlir::sparse_tensor::SparseTensorDimSliceAttr",
diff --git a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td
index b0e9089c3230eb5..f01957df1516439 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td
+++ b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td
@@ -25,11 +25,16 @@ def SparseTensor_Dialect : Dialect {
     means of a small sparse runtime support library.
 
     The concept of **treating sparsity as a property, not a tedious
-    implementation detail**, by letting a **sparse compiler** generate
+    implementation detail**, by letting a **sparsifier** generate
     sparse code automatically was pioneered for linear algebra by [Bik96]
     in MT1 (see https://www.aartbik.com/sparse.php) and formalized
     to tensor algebra by [Kjolstad17,Kjolstad20] in the Sparse Tensor
     Algebra Compiler (TACO) project (see http://tensor-compiler.org).
+    Please note that we started to prefer the term "sparsifier" over
+    the also commonly used "sparse compiler" terminology to refer to
+    such a pass to make it clear that the sparsifier pass is not a
+    separate compiler, but should be an integral part of any compiler
+    pipeline that is built with the MLIR compiler infrastructure.
 
     The MLIR implementation [Biketal22] closely follows the "sparse
     iteration theory" that forms the foundation of TACO.  A rewriting
diff --git a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
index 59815fc755ee5f3..e2a2c09c5e9a01c 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
+++ b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
@@ -74,7 +74,7 @@ def SparseTensor_PackOp : SparseTensor_Op<"pack", [Pure]>,
     sources; e.g., when passing two numpy arrays from Python.
 
     Disclaimer: This is the user's responsibility to provide input that can be
-    correctly interpreted by the sparse compiler, which does not perform
+    correctly interpreted by the sparsifier, which does not perform
     any sanity test during runtime to verify data integrity.
 
     TODO: The returned tensor is allowed (in principle) to have non-identity
@@ -120,7 +120,7 @@ def SparseTensor_UnpackOp : SparseTensor_Op<"unpack", [Pure, SameVariadicResultS
     unpacked MLIR sparse tensor to frontend; e.g., returning two numpy arrays to Python.
 
     Disclaimer: This is the user's responsibility to allocate large enough buffers
-    to hold the sparse tensor. The sparse compiler simply copies each fields
+    to hold the sparse tensor. The sparsifier simply copies each fields
     of the sparse tensor into the user-supplied buffer without bound checking.
 
     TODO: the current implementation does not yet support non-identity mappings.
@@ -362,7 +362,7 @@ def SparseTensor_ToSliceOffsetOp : SparseTensor_Op<"slice.offset", [Pure]>,
     Extracts the offset of the sparse tensor slice at the given dimension.
 
     Currently, sparse tensor slices are still a work in progress, and only
-    works when runtime library is disabled (i.e., running sparse compiler
+    works when runtime library is disabled (i.e., running the sparsifier
     with `enable-runtime-library=false`).
 
     Example:
@@ -389,7 +389,7 @@ def SparseTensor_ToSliceStrideOp : SparseTensor_Op<"slice.stride", [Pure]>,
     Extracts the stride of the sparse tensor slice at the given dimension.
 
     Currently, sparse tensor slices are still a work in progress, and only
-    works when runtime library is disabled (i.e., running sparse compiler
+    works when runtime library is disabled (i.e., running the sparsifier
     with `enable-runtime-library=false`).
 
     Example:
diff --git a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h
index cfc3374148f95c0..d9d6db46542a37a 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h
+++ b/mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h
@@ -127,8 +127,8 @@ class SparseTensorType {
   /// Allow implicit conversion to `RankedTensorType`, `ShapedType`,
   /// and `Type`.  These are implicit to help alleviate the impedance
   /// mismatch for code that has not been converted to use `SparseTensorType`
-  /// directly.  Once more of the sparse compiler has been converted to
-  /// using `SparseTensorType`, we may want to make these explicit instead.
+  /// directly.  Once more uses have been converted to `SparseTensorType`,
+  /// we may want to make these explicit instead.
   ///
   /// WARNING: This user-defined-conversion method causes overload
   /// ambiguity whenever passing a `SparseTensorType` directly to a
diff --git a/mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td b/mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td
index ab7fffac88d9287..d8d5dbb5ad3ce75 100644
--- a/mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td
@@ -31,7 +31,7 @@ def PreSparsificationRewrite : Pass<"pre-sparsification-rewrite", "ModuleOp"> {
 def SparsificationPass : Pass<"sparsification", "ModuleOp"> {
   let summary = "Automatically generate sparse tensor code from sparse tensor types";
   let description = [{
-    A pass that implements the core functionality of a **sparse compiler**.
+    A pass that implements the core functionality of a **sparsifier**.
     Each Linalg operation (MLIR's tensor index notation) that operates on
     sparse tensor types is converted into code in which the sparsity is
     explicit both in terms of co-iterating looping logic as well as
@@ -332,7 +332,7 @@ def SparseVectorization : Pass<"sparse-vectorization", "ModuleOp"> {
 def SparseGPUCodegen : Pass<"sparse-gpu-codegen", "ModuleOp"> {
   let summary = "Generates GPU code during sparsification";
   let description = [{
-    Enables sparse compiler to use GPU acceleration.
+    Enables the sparsifier to use GPU acceleration.
   }];
   let constructor = "mlir::createSparseGPUCodegenPass()";
   let dependentDialects = [
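
Reviewer note on the `SparseTensorAttrDefs.td` hunk: the reworded text distinguishes **coordinates** (indices that are stored explicitly) from **positions** (indices into the storage itself), and describes optional narrow bitwidths for both. A minimal sketch of that distinction for a CSR-like layout follows. Everything in it is hypothetical and for illustration only: `CsrSketch`, `toCsr`, and `at` are not MLIR types or APIs, and the template parameters `PosT` and `CrdT` merely stand in for the position and coordinate bitwidths.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CSR container for illustration only; not an MLIR type.
// PosT and CrdT play the role of the position and coordinate bitwidths.
template <typename PosT, typename CrdT>
struct CsrSketch {
  std::vector<PosT> positions;    // row r owns entries [positions[r], positions[r+1])
  std::vector<CrdT> coordinates;  // column index of each stored entry
  std::vector<double> values;     // the stored (nonzero) values
};

// Builds the sketch from a dense row-major matrix, skipping zeros.
template <typename PosT, typename CrdT>
CsrSketch<PosT, CrdT> toCsr(const std::vector<std::vector<double>> &dense) {
  CsrSketch<PosT, CrdT> csr;
  csr.positions.push_back(0);
  for (const auto &row : dense) {
    for (std::size_t j = 0; j < row.size(); ++j) {
      if (row[j] != 0.0) {
        csr.coordinates.push_back(static_cast<CrdT>(j));
        csr.values.push_back(row[j]);
      }
    }
    // One position per row boundary: the running count of stored entries.
    csr.positions.push_back(static_cast<PosT>(csr.values.size()));
  }
  return csr;
}

// Looks up element (i, j); returns 0.0 for entries that are not stored.
template <typename PosT, typename CrdT>
double at(const CsrSketch<PosT, CrdT> &csr, std::size_t i, std::size_t j) {
  for (std::size_t p = csr.positions[i]; p < csr.positions[i + 1]; ++p)
    if (csr.coordinates[p] == j)
      return csr.values[p];
  return 0.0;
}
```

Instantiating the sketch as `toCsr<std::uint8_t, std::uint8_t>(dense)` shrinks the overhead arrays, but it only stays correct while the number of stored entries and the largest column index both fit in 8 bits, which mirrors the "as long as the width suffices" caveat in the documentation text.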

jpienaar · 2023-09-22T03:15:22Z

Wouldn't "sparsifier" indicate an automatic process based on data, rather than something that is part of the authoring process?

joker-eph · 2023-09-22T04:13:15Z

I would think the same as @jpienaar personally with the new terminology, but I don't have a better suggestion either, so up to you ultimately.

aartbik · 2023-09-22T15:53:11Z

Yeah, no name is perfect or unambiguous, I am afraid, but the "compiler" part of my favorite term, "sparse compiler", has caused real confusion in the past, hence the new name for the effort.

aartbik requested review from PeimingLiu and yinying-lisa-li September 22, 2023 01:44

llvmbot added the mlir:sparse (Sparse compiler in MLIR) and mlir labels on Sep 22, 2023

aartbik added 2 commits September 21, 2023 18:48

- typos (6e8ba6b)
- minor rephrasing (3f97761)

yinying-lisa-li approved these changes Sep 22, 2023

aartbik merged commit 2e7d83d into llvm:main Sep 22, 2023

aartbik deleted the bik branch September 22, 2023 16:19

kstoimenov mentioned this pull request Sep 22, 2023

Add memcpm test kstoimenov/llvm-project#13

Closed
