[Onnx] Fix NLL Loss tests #8971
Conversation
@AndrewZhaoLuo Maybe we can remove this function? (tvm/python/tvm/relay/frontend/onnx.py, lines 1510 to 1511 in cf439ec)
This is ready for a full review now.
Another topic of discussion is whether we should just allow negative indices to be always on. There is overhead, but I believe it is very small. I did it with an indexmod operation instead of a branch, and it increased the runtime of gather by 3% across 100 iterations with a relatively small input size of [3, 3, 3].
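For context, here is a minimal sketch of the two normalization strategies being compared, written against TVM's Python TIR API. The variable names are illustrative placeholders, not the PR's actual code:

```python
# Sketch: branch vs. indexmod normalization of a possibly-negative index.
from tvm import te, tir

idx = te.var("idx", dtype="int32")    # index that may be negative
size = te.var("size", dtype="int32")  # extent of the gathered axis

# Branch-based: add `size` only when the index is negative.
branch = tir.if_then_else(idx < 0, idx + size, idx)

# Branchless: indexmod maps any index in [-size, size) into [0, size),
# at the cost of a mod on every element (the ~3% overhead mentioned above).
wrapped = tir.indexmod(idx + size, size)
```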
@AndrewZhaoLuo, can you try profiling that with a larger input?
This failed the dynamic gather test in test_any.
Should be fixed now. In general, it seems people are inconsistent with using kwargs in the codebase.
include/tvm/topi/transform.h
Outdated
@@ -1242,6 +1244,8 @@ inline Tensor gather(const Tensor& data, int axis, const Tensor& indices,
    out_shape.push_back(indices->shape[i]);
  }

+  PrimExpr axis_size = data->shape[axis];
remove it
include/tvm/topi/transform.h
Outdated
@@ -1252,12 +1256,13 @@ inline Tensor gather(const Tensor& data, int axis, const Tensor& indices,
   Array<PrimExpr> real_indices;
   for (size_t i = 0; i < ndim_i; ++i) {
     if (i == static_cast<size_t>(axis)) {
-      real_indices.push_back(indices(indices_position));
+      PrimExpr index = indices(indices_position);
+      real_indices.push_back(index);
remove this diff
src/te/tensor.cc
Outdated
for (size_t i = 0; i < shape.size(); i++) {
  PrimExpr new_index = if_then_else(indices[i] < make_const(indices[i]->dtype, 0),
                                    indices[i] + shape[i], indices[i]);
  indices.Set(i, new_index);
Negative indices handling is also done in
tvm/python/tvm/relay/op/transform.py, lines 926 to 927 in d9fe672:
begin = _make.where(begin < cast_like(const(0), begin), begin + ishape_slice, begin)
begin = _make.where(begin >= ishape_slice, ishape_slice, begin)
tvm/include/tvm/topi/detail/strided_slice.h, lines 45 to 48 in cbe3dca:
int64_t end_range = stride < 0 ? extent - 1 : extent;
if (index < 0) {
  index += extent;
}
tvm/include/tvm/topi/detail/strided_slice.h, line 105 in cbe3dca:
PrimExpr b = begin[i] < 0 ? b_expr + idim : b_expr;
I believe there are other cases like this spread across the codebase. Maybe we should revisit all index-taking ops and centralize negative indices handling. Generally, I think people prefer not making changes down the stack.
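To illustrate the kind of centralization being suggested, here is a hedged sketch of a shared Relay-side helper. The name normalize_negative_indices and its placement are assumptions for illustration, not existing TVM API:

```python
from tvm import relay

def normalize_negative_indices(indices, extent):
    """Map indices in [-extent, extent) into [0, extent).

    Hypothetical shared helper; today each op repeats this pattern inline.
    """
    indices = relay.cast(indices, "int64")
    zero = relay.const(0, dtype="int64")
    extent = relay.const(extent, dtype="int64")
    # Where an index is negative, shift it by the axis extent.
    return relay.where(relay.less(indices, zero),
                       relay.add(indices, extent),
                       indices)
```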
Hmm, this is a good point. Personally, I think pushing down the stack is the right choice, since I expect the most basic indexing op to work with negative indices. All other operations use these basic indexing ops, so they would get this behavior for free. In our case, we add a flag to a basic indexing operation which turns on this feature.
Otherwise we'll end up with many copies of the same code everywhere.
Yeah, I agree that implementation-wise this is more convenient. Since this is a fundamental data structure change, how about we open a separate PR for negative indexing support in te::Tensor, to get opinions from more people?
I think that's a fair point. I'll refactor this to use normalize_gather_indices() in the meantime and do as you say.
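For readers following along, a frontend-level workaround of this kind conceptually looks something like the following. This is a sketch under the assumption that normalization happens in the ONNX importer before lowering; it is not the PR's verbatim code:

```python
from tvm import relay

def normalize_gather_indices(data, indices, axis):
    """Sketch: make negative gather indices non-negative in the frontend."""
    # Extent of the gathered axis, as a scalar expression.
    axis_size = relay.take(relay.shape_of(data), relay.const(axis, "int64"))
    axis_size = relay.cast(axis_size, "int64")
    indices = relay.cast(indices, "int64")
    # indices < 0  ->  indices + axis_size
    return relay.where(
        relay.less(indices, relay.const(0, "int64")),
        relay.add(indices, axis_size),
        indices,
    )
```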
OK folks, I've removed the controversial changes and implemented an alternate workaround. PTAL when you have time.
#9023 <-- discussion about simplifying negative indices handling
@masahi, can you land this one if you are OK with it?
* main: (102 commits)
  Implementation of relay_to_tir target hook (apache#8423)
  [Onnx] Fix NLL Loss tests (apache#8971)
  [Bugfix] Fix other div zero errors also in rewrite_simplify (apache#8983)
  [ONNX] enable the onnx tests after PR apache#8274 merged (apache#9019)
  [Hexagon] Disable `thread_local` on Hexagon (apache#9025)
  [Hexagon] Allow undefined symbols in libtvm_runtime.so on Hexagon (apache#9024)
  [Onnx] Add momentum (apache#9000)
  fix (apache#9021)
  [Community] @AndrewZhaoLuo -> Reviewer (apache#9020)
  [Hexagon] Implement model launcher (apache#8986)
  [Relay][Pass] Add ExtractOperators pass (apache#8996)
  [BYOC][TensorRT] Add TensorRT own int8 calibration support to TensorRT BYOC integration (apache#8808)
  [ONNX] Add Einsum converter (apache#8985)
  Add standalone_crt/ to be part of the wheel package, when available. (apache#9005)
  [Relay] Remove memory planing from LowerTEPass (apache#8974)
  [Hexagon] Treat floats as float32 when passing args to offloaded kernels (apache#9010)
  [Runtime] Pipeline Executor Initial patch. (apache#8702)
  [Hexagon] `llvm-options` attribute is an array of strings (apache#9011)
  disable cuda int8 schedule for non-cuda gpu target (apache#9014)
  [Torch] Add an option to make imported models compatible with the Relay text parser (apache#9015)
  ...
* support negative indices in gather
* move check to Tensor level indexing, gathernd
* add test, update transform.h
* remove unneeded gather
* missing gather nd change
* update tests
* proper tensor comparison
* blacking
* lint
* fix error
* turn on test
* missing test case
* revert changes
* add normalize_gather_indices
* undo change
* update
* more removing diffs
* more undoing

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
Supports negative indices for the gather and gathernd ops.
This is what caused issues #8918 and #8964. Finally, the onnx tests feed in invalid indices but mask them with ignore_index; this was not accounted for and sometimes caused failures too.
This should fix the flaky onnx tests and allow the remaining non-expanded onnx tests to be turned on.
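As a reference point, here is a small NumPy sketch (not TVM code) of the ignore_index semantics described above: targets equal to ignore_index contribute nothing to the loss, so even out-of-range target values are harmless once masked.

```python
import numpy as np

def nll_loss(logprobs, targets, ignore_index):
    """Toy NLL loss: rows are samples, columns are class log-probabilities."""
    mask = targets != ignore_index
    # Replace masked (possibly invalid) targets with a safe in-range index.
    safe_targets = np.where(mask, targets, 0)
    picked = logprobs[np.arange(len(targets)), safe_targets]
    # Masked entries contribute zero loss.
    return -(picked * mask).sum() / max(mask.sum(), 1)
```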
Testing:
Ran
pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_nllloss_NCd1d2d3_none_no_weight_negative_ii --count=100
and confirmed no failures.
Ran
pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_nllloss_NCd1d2d3_sum_weight_high_ii --count=100
and confirmed no failures.
Ran
pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_gather --count=100
and confirmed no failures.