add new neural types docs

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
NVIDIA · Feb 18, 2020 · 1ac0072 · 1ac0072
1 parent 2f20b70
commit 1ac0072
Show file tree

Hide file tree

Showing 10 changed files with 178 additions and 43 deletions.
diff --git a/docs/sources/source/tutorials/neuraltypes.rst b/docs/sources/source/tutorials/neuraltypes.rst
@@ -1,63 +1,166 @@
 Neural Types
 ============
 
-Neural Types are used to check input tensors to make sure that two neural modules are compatible, and catch
-semantic and dimensionality errors.
+Basics
+~~~~~~
 
-Neural Types are implemented by :class:`NeuralType<nemo.core.neural_types.NeuralType>` class which is a mapping from Tensor's axis to :class:`AxisType<nemo.core.neural_types.AxisType>`.
+All input and output ports of every neural module in NeMo are typed.
+The type system's goal is check compatibility of connected input/output port pairs.
+The type system's constraints are checked when the user connects modules with each other and before any training or
+inference is started.
 
-:class:`AxisType<nemo.core.neural_types.AxisType>` contains following information per axis:
+Neural Types are implemented with the Python class :class:`NeuralType<nemo.core.neural_types.NeuralType>` and helper
+classes derived from :class:`ElementType<nemo.core.neural_types.ElementType>`, :class:`AxisType<nemo.core
+.neural_types.AxisType>` and :class:`AxisKindAbstract<nemo.core.neural_types.AxisKindAbstract>`.
 
-* Semantic Tag, which must inherit from :class:`BaseTag<nemo.core.neural_types.BaseTag>`, for example: :class:`BatchTag<nemo.core.neural_types.BatchTag>`, :class:`ChannelTag<nemo.core.neural_types.ChannelTag>`, :class:`TimeTag<nemo.core.neural_types.TimeTag>`, etc. These tags can be related via `is-a` inheritance.
-* Dimension: unsigned integer
-* Descriptor: string
+**A Neural Type contains two categories of information:**
 
+* **axes** - represents what varying a particular axis means (e.g. batch, time, etc.)
+* **elements_type** - represents the semantics and properties of what is stored inside the activations (audio signal,text embedding, logits, etc.)
 
-To instantiate a NeuralType you should pass it a dictionary (axis2type) which will map axis to it's AxisType.
-For example, a ResNet18 input and output ports can be described as:
+
+To instantiate a NeuralType you need to pass it the following arguments: `axes: Optional[Tuple] = None,
+elements_type: ElementType = VoidType(), optional=False`. Typically, the only place where you need to instantiate
+:class:`NeuralType<nemo.core.neural_types.NeuralType>` objects are inside your module's `input_ports` and
+`output_ports` properties.
+
+
+Consider an example below. It represents an (audio) data layer output ports, used in Speech recognition collection.
 
 .. code-block:: python
 
-    input_ports = {"x": NeuralType({0: AxisType(BatchTag),
-                                    1: AxisType(ChannelTag),
-                                    2: AxisType(HeightTag, 224),
-                                    3: AxisType(WidthTag, 224)})}
-    output_ports = {"output": NeuralType({
-                                    0: AxisType(BatchTag),
-                                    1: AxisType(ChannelTag)})}
+        {
+            'audio_signal': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),
+                                             AxisType(kind=AxisKind.Time, size=None, is_list=False)),
+                                       elements_type=AudioSignal(freq=self._sample_rate)),
+            'a_sig_length': NeuralType(axes=tuple(AxisType(kind=AxisKind.Batch, size=None, is_list=False)),
+                                       elements_type=LengthsType()),
+            'transcripts': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),
+                                             AxisType(kind=AxisKind.Time, size=None, is_list=False)),
+                                      elements_type=LabelsType()),
+            'transcript_length': NeuralType(axes=tuple(AxisType(kind=AxisKind.Batch, size=None, is_list=False)),
+                                            elements_type=LengthsType()),
+        }
+
+A less verbose version of exactly the same output ports looks like this:
 
+.. code-block:: python
 
+        {
+            'audio_signal': NeuralType(('B', 'T'), AudioSignal(freq=self._sample_rate)),
+            'a_sig_length': NeuralType(tuple('B'), LengthsType()),
+            'transcripts': NeuralType(('B', 'T'), LabelsType()),
+            'transcript_length': NeuralType(tuple('B'), LengthsType()),
+        }
 
-**Neural type comparison**
 
-Two :class:`NeuralType<nemo.core.neural_types.NeuralType>` objects can be compared using ``.compare`` method.
-The result is:
+
+Neural type comparison
+~~~~~~~~~~~~~~~~~~~~~~
+
+Two :class:`NeuralType<nemo.core.neural_types.NeuralType>` objects are compared using ``.compare`` method.
+The result is from the :class:`NeuralTypeComparisonResult<nemo.core.neural_types.NeuralTypeComparisonResult>`:
 
 .. code-block:: python
 
     class NeuralTypeComparisonResult(Enum):
-      """The result of comparing two neural type objects for compatibility.
-      When comparing A.compare_to(B):"""
-      SAME = 0
-      LESS = 1  # A is B
-      GREATER = 2  # B is A
-      DIM_INCOMPATIBLE = 3  # Resize connector might fix incompatibility
-      TRANSPOSE_SAME = 4 # A transpose will make them same
-      INCOMPATIBLE = 5  # A and B are incompatible. Can't fix incompatibility automatically
+        """The result of comparing two neural type objects for compatibility.
+        When comparing A.compare_to(B):"""
+
+        SAME = 0
+        LESS = 1  # A is B
+        GREATER = 2  # B is A
+        DIM_INCOMPATIBLE = 3  # Resize connector might fix incompatibility
+        TRANSPOSE_SAME = 4  # A transpose and/or converting between lists and tensors will make them same
+        CONTAINER_SIZE_MISMATCH = 5  # A and B contain different number of elements
+        INCOMPATIBLE = 6  # A and B are incompatible
+        SAME_TYPE_INCOMPATIBLE_PARAMS = 7  # A and B are of the same type but parametrized differently
+
+
+Special cases
+~~~~~~~~~~~~~
+
+* **Void** element types. Sometimes, it is necessary to have a functionality similar to "void*" in C/C++. That, is if we still want to enforce order and axes' semantics but should be able to accept elements of any type. This can be achieved by using an instance of :class:`VoidType<nemo.core.neural_types.VoidType>` as ``elements_type`` argument.
+* **Big void** this type will effectively disable any type checks. This is how to create such type: ``NeuralType()``. The result of its comparison to any other type will always be SAME.
+* **AxisKind.Any** this axis kind is used to represent any axis. This is useful, for example, in losses where a specific loss module can be used in difference applications and therefore with different axis kinds
+
+Inheritance
+~~~~~~~~~~~
+
+Type inheritance is a very powerful tool in programming. NeMo's neural types support inheritance. Consider the
+following example below.
+
+**Example.** We want to represent the following. A module's A output (out1) produces mel-spectrogram
+signal, while module's B output produces mffc-spectrogram. We also want to a thrid module C which can perform data
+augmentation with any kind of spectrogram. With NeMo neural types representing this semantics is easy:
 
+.. code-block:: python
+
+    input = NeuralType(('B', 'D', 'T'), SpectrogramType())
+    out1 = NeuralType(('B', 'D', 'T'), MelSpectrogramType())
+    out2 = NeuralType(('B', 'D', 'T'), MFCCSpectrogramType())
+
+    # then the following comparison results will be generated
+    input.compare(out1) == SAME
+    input.compare(out2) == SAME
+    out1.compare(input) == INCOMPATIBLE
+    out2.compare(out1) == INCOMPATIBLE
+
+This happens because both ``MelSpectrogramType`` and ``MFCCSpectrogramType`` inherit from ``SpectrogramType`` class.
+Notice, that mfcc and mel spectrograms aren't interchangable, which is why ``out1.compare(input) == INCOMPATIBLE``
 
-**Special cases**
+Advanced usage
+~~~~~~~~~~~~~~
 
-* *Non-tensor* objects should be denoted as ``NeuralType(None)``
-* *Optional*: input is as optional, if input is provided the type compatibility will be checked
-* *Root* type is denoted by ``NeuralType({})``: A port of ``NeuralType({})`` type must accept NmTensors of any NeuralType:
+**Extending with user-defined types.** If you need to add your own element types, create a new class inheriting from
+:class:`ElementType<nemo.core.neural_types.ElementType>`. Instead of using built-in axes kinds from
+:class:`AxisKind<nemo.core.neural_types.AxisKind>`, you can define your own
+by creating a new Python enum which should inherit from :class:`AxisKindAbstract<nemo.core.neural_types.AxisKindAbstract>`.
+
+**Lists**. Sometimes module's input or output should be a list (possibly nested) of Tensors. NeMo's
+:class:`AxisType<nemo.core.neural_types.AxisType>` class accepts ``is_list`` argument which could be set to True.
+Consider the example below:
 
 .. code-block:: python
 
-    root_type = NeuralType({})
-    root_type.compare(any_other_neural_type) == NeuralTypeComparisonResult.SAME
+        T1 = NeuralType(
+            axes=(
+                AxisType(kind=AxisKind.Batch, size=None, is_list=True),
+                AxisType(kind=AxisKind.Time, size=None, is_list=True),
+                AxisType(kind=AxisKind.Dimension, size=32, is_list=False),
+                AxisType(kind=AxisKind.Dimension, size=128, is_list=False),
+                AxisType(kind=AxisKind.Dimension, size=256, is_list=False),
+            ),
+            elements_type=ChannelType(),
+        )
+
+In this example, first two axes are lists. That is the object are list of lists of rank 3 tensors with dimensions
+(32x128x256). Note that all list axes must come before any tensor axis.
+
+.. tip::
+    We strongly recommend this to be avoided, if possible, and tensors used instead (perhaps) with padding.
 
-See "nemo/tests/test_neural_types.py" for more examples.
+
+**Named tuples (structures).** To represent struct-like objects, for example, bounding boxes in computer vision, use
+the following syntax:
+
+.. code-block:: python
+
+        class BoundingBox(ElementType):
+            def __str__(self):
+                return "bounding box from detection model"
+            def fields(self):
+                return ("X", "Y", "W", "H")
+        # ALSO ADD new, user-defined, axis kind
+        class AxisKind2(AxisKindAbstract):
+            Image = 0
+        T1 = NeuralType(elements_type=BoundingBox(),
+                        axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=True),
+                              AxisType(kind=AxisKind2.Image, size=None, is_list=True)))
+
+In the example above, we create a special "element type" class for BoundingBox which stores exactly 4 values.
+We also, add our own axis kind (Image). So the final Neural Type (T1) represents lists (for batch) of lists (for
+image) of bounding boxes. Under the hood it should be list(lists(4x1 tensors)).
 
 
 **Neural Types help us to debug models**
@@ -76,6 +179,5 @@ For example, module should concatenate (add) two input tensors X and Y along dim
 
 A module expects image of size 224x224 but gets 256x256. The type comparison will result in ``NeuralTypeComparisonResult.DIM_INCOMPATIBLE`` .
 
-.. note::
-    This type mechanism is represented by Python inheritance. That is, :class:`NmTensor<nemo.core.neural_types.NmTensor>` class inherits from :class:`NeuralType<nemo.core.neural_types.NeuralType>` class.
+
 
diff --git a/nemo/collections/asr/data_layer.py b/nemo/collections/asr/data_layer.py
@@ -102,7 +102,12 @@ def output_ports(self):
             # 'a_sig_length': NeuralType({0: AxisType(BatchTag)}),
             # 'transcripts': NeuralType({0: AxisType(BatchTag), 1: AxisType(TimeTag)}),
             # 'transcript_length': NeuralType({0: AxisType(BatchTag)}),
-            'audio_signal': NeuralType(('B', 'T'), AudioSignal(freq=self._sample_rate)),
+            'audio_signal': NeuralType(
+                ('B', 'T'),
+                AudioSignal(freq=self._sample_rate)
+                if self is not None and self._sample_rate is not None
+                else AudioSignal(),
+            ),
             'a_sig_length': NeuralType(tuple('B'), LengthsType()),
             'transcripts': NeuralType(('B', 'T'), LabelsType()),
             'transcript_length': NeuralType(tuple('B'), LengthsType()),

diff --git a/nemo/collections/asr/las/misc.py b/nemo/collections/asr/las/misc.py
@@ -7,6 +7,7 @@
 from nemo.core.neural_types import *
 from nemo.utils.decorators import add_port_docs
 
+
 class JasperRNNConnector(TrainableNM):
     """Connector between jasper encoder and some other module, that does
     change number of channels.

diff --git a/nemo/collections/nlp/nm/data_layers/glue_benchmark_datalayer.py b/nemo/collections/nlp/nm/data_layers/glue_benchmark_datalayer.py
@@ -18,6 +18,7 @@
 from nemo.collections.nlp.nm.data_layers.text_datalayer import TextDataLayer
 from nemo.core import CategoricalValuesType, ChannelType, NeuralType, RegressionValuesType
 from nemo.utils.decorators import add_port_docs
+
 __all__ = ['GlueClassificationDataLayer', 'GlueRegressionDataLayer']
 
 

diff --git a/nemo/collections/nlp/nm/data_layers/joint_intent_slot_datalayer.py b/nemo/collections/nlp/nm/data_layers/joint_intent_slot_datalayer.py
@@ -18,6 +18,7 @@
 from nemo.collections.nlp.nm.data_layers.text_datalayer import TextDataLayer
 from nemo.core import ChannelType, NeuralType
 from nemo.utils.decorators import add_port_docs
+
 __all__ = ['BertJointIntentSlotDataLayer', 'BertJointIntentSlotInferDataLayer']
 
 

diff --git a/nemo/collections/nlp/nm/data_layers/punctuation_capitalization_datalayer.py b/nemo/collections/nlp/nm/data_layers/punctuation_capitalization_datalayer.py
@@ -18,6 +18,7 @@
 from nemo.collections.nlp.nm.data_layers.text_datalayer import TextDataLayer
 from nemo.core import ChannelType, LabelsType, NeuralType
 from nemo.utils.decorators import add_port_docs
+
 __all__ = ['PunctuationCapitalizationDataLayer']
 
 

diff --git a/nemo/collections/nlp/nm/data_layers/state_tracking_trade_datalayer.py b/nemo/collections/nlp/nm/data_layers/state_tracking_trade_datalayer.py
@@ -46,7 +46,6 @@
 from nemo.core.neural_types import ChannelType, LabelsType, LengthsType, NeuralType
 from nemo.utils.decorators import add_port_docs
 
-
 __all__ = ['MultiWOZDataLayer']
 
 

diff --git a/nemo/core/neural_types/neural_type.py b/nemo/core/neural_types/neural_type.py
@@ -54,10 +54,7 @@ def __str__(self):
                 f" elements_type: {self.elements_type.__class__.__name__}"
             )
         else:
-            return (
-                f"axes: None; "
-                f" elements_type: {self.elements_type.__class__.__name__}"
-            )
+            return f"axes: None; " f" elements_type: {self.elements_type.__class__.__name__}"
 
     def __init__(self, axes: Optional[Tuple] = None, elements_type: ElementType = VoidType(), optional=False):
         if not isinstance(elements_type, ElementType):

diff --git a/tests/core/test_infer.py b/tests/core/test_infer.py
@@ -21,8 +21,9 @@
 import nemo
 from nemo.backends.pytorch.nm import NonTrainableNM
 from nemo.core.neural_types import *
-from tests.common_setup import NeMoUnitTest
 from nemo.utils.decorators import add_port_docs
+from tests.common_setup import NeMoUnitTest
+
 
 class AddsTen(NonTrainableNM):
     def __init__(self):

diff --git a/tests/core/test_neural_types.py b/tests/core/test_neural_types.py
@@ -29,6 +29,8 @@
     NeuralTypeComparisonResult,
     SpectrogramType,
     VoidType,
+    ElementType,
+    AxisKindAbstract
 )
 from tests.common_setup import NeMoUnitTest
 
@@ -186,3 +188,28 @@ def test_any_axis(self):
         self.assertEqual(t1.compare(t2), NeuralTypeComparisonResult.SAME)
         self.assertEqual(t2.compare(t1), NeuralTypeComparisonResult.INCOMPATIBLE)
         self.assertEqual(t1.compare(t0), NeuralTypeComparisonResult.INCOMPATIBLE)
+
+    def test_struct(self):
+        class BoundingBox(ElementType):
+            def __str__(self):
+                return "bounding box from detection model"
+            def fields(self):
+                return ("X", "Y", "W", "H")
+        # ALSO ADD new, user-defined, axis kind
+        class AxisKind2(AxisKindAbstract):
+            Image = 0
+        T1 = NeuralType(elements_type=BoundingBox(),
+                        axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=True),
+                              AxisType(kind=AxisKind2.Image, size=None, is_list=True)))
+
+        class BadBoundingBox(ElementType):
+            def __str__(self):
+                return "bad bounding box from detection model"
+
+            def fields(self):
+                return ("X", "Y", "H")
+
+        T2 = NeuralType(elements_type=BadBoundingBox(),
+                        axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=True),
+                              AxisType(kind=AxisKind2.Image, size=None, is_list=True)))
+        self.assertEqual(T2.compare(T1), NeuralTypeComparisonResult.INCOMPATIBLE)
Original file line number	Diff line number	Diff line change
Expand Up		@@ -46,7 +46,6 @@
		from nemo.core.neural_types import ChannelType, LabelsType, LengthsType, NeuralType
		from nemo.utils.decorators import add_port_docs


		__all__ = ['MultiWOZDataLayer']


Expand Down