Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Onnx] Fix NLL Loss tests #8971

Merged
merged 19 commits into from
Sep 16, 2021

Conversation

AndrewZhaoLuo
Copy link
Contributor

@AndrewZhaoLuo AndrewZhaoLuo commented Sep 8, 2021

Supports negative indices for gather and gathernd op.

This was what caused issues #8918 and #8964. Finally, the onnx tests throw invalid indices at you but mask them with ignore_index. This was not accounted for and sometimes caused failures too.

This should fix the flaky onnx test and allow for the remaining non-expanded onnx tests

Testing:
Ran pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_nllloss_NCd1d2d3_none_no_weight_negative_ii --count=100 and confirmed no failures

Ran pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_nllloss_NCd1d2d3_sum_weight_high_ii --count=100 and confirmed no failures
Ran pytest tests/python/frontend/onnx/test_forward.py::test_onnx_nodes -k test_gather --count=100 and confirmed no failures

@AndrewZhaoLuo AndrewZhaoLuo changed the title [Relay][Topi][Op][Onnx] Support negative indices in gather [WIP][Relay][Topi][Op][Onnx] Support negative indices in gather Sep 8, 2021
@masahi
Copy link
Member

masahi commented Sep 9, 2021

@AndrewZhaoLuo Maybe we can remove this function?

def normalize_gather_indices(data, indices, axis):
"""Make sure gather indicies aren't negative"""

@AndrewZhaoLuo AndrewZhaoLuo changed the title [Relay][Topi][Op][Onnx] Support negative indices in gather [Relay][Topi][Op][Onnx] Support negative indices in gather, gathernd Sep 10, 2021
@AndrewZhaoLuo
Copy link
Contributor Author

This is ready for a full review now.

@AndrewZhaoLuo
Copy link
Contributor Author

AndrewZhaoLuo commented Sep 10, 2021

Another topic of discussion is whether we should just allow negative indices to be always on. There is overhead but I believe it is very small.

I did it with an indexmod operation instead of a branch and it increased runtime of gather by 3% across 100 iterations and a relatively small input size of [3, 3, 3]

@AndrewZhaoLuo AndrewZhaoLuo changed the title [Relay][Topi][Op][Onnx] Support negative indices in gather, gathernd [Relay][Topi][Op][Onnx] Support negative indices in gather, gathernd. Fixes NLL Loss tests Sep 10, 2021
@mbrookhart
Copy link
Contributor

@AndrewZhaoLuo, can you try profiling that with a larger input?

@mbrookhart
Copy link
Contributor

This failed the dynamic gather test in test_any

@AndrewZhaoLuo
Copy link
Contributor Author

This failed the dynamic gather test in test_any

Should be fixed now. In general it seems people are inconsistent with using kwargs in the codebase.

@@ -1242,6 +1244,8 @@ inline Tensor gather(const Tensor& data, int axis, const Tensor& indices,
out_shape.push_back(indices->shape[i]);
}

PrimExpr axis_size = data->shape[axis];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove it

@@ -1252,12 +1256,13 @@ inline Tensor gather(const Tensor& data, int axis, const Tensor& indices,
Array<PrimExpr> real_indices;
for (size_t i = 0; i < ndim_i; ++i) {
if (i == static_cast<size_t>(axis)) {
real_indices.push_back(indices(indices_position));
PrimExpr index = indices(indices_position);
real_indices.push_back(index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this diff

src/te/tensor.cc Outdated
for (size_t i = 0; i < shape.size(); i++) {
PrimExpr new_index = if_then_else(indices[i] < make_const(indices[i]->dtype, 0),
indices[i] + shape[i], indices[i]);
indices.Set(i, new_index);
Copy link
Member

@masahi masahi Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negative indices handling is also done in

begin = _make.where(begin < cast_like(const(0), begin), begin + ishape_slice, begin)
begin = _make.where(begin >= ishape_slice, ishape_slice, begin)

int64_t end_range = stride < 0 ? extent - 1 : extent;
if (index < 0) {
index += extent;
}

PrimExpr b = begin[i] < 0 ? b_expr + idim : b_expr;

I believe there are other cases like this spread across the code base. Maybe we should revisit all index-taking op and centralize negative indices handling. Generally I think people prefer not making a change down the stack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this is a good point. I think pushing down the stack is the right choice personally since I expect the most basic indexing op to work with negative indices. Since all of the other operations will use these basic indexing ops we should therefore get these things for free. In our case, we add a flag to a basic indexing operation which turns on this features.

Otherwise we'll get a lot of copies of the same code everywhere.

Copy link
Member

@masahi masahi Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree that implementation-wise, this is more convenient. Since this is a fundamental data structure change, how about we open a separate PR for negative indexing support to te::Tensor, to get opinions from more people?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a fair point. I'll refactor this to use normalize_gather_indices() in the meantime and do as you say.

@AndrewZhaoLuo AndrewZhaoLuo changed the title [Relay][Topi][Op][Onnx] Support negative indices in gather, gathernd. Fixes NLL Loss tests [Onnx] Fix NLL Loss tests Sep 15, 2021
@AndrewZhaoLuo
Copy link
Contributor Author

Ok folks, I've removed the controversial changes and did an alternate work around. PTAL when you have time.

@AndrewZhaoLuo
Copy link
Contributor Author

#9023 <-- discussion about making negative indices simpler

@jroesch
Copy link
Member

jroesch commented Sep 16, 2021

@masahi can you land this one if you are OK?

@masahi masahi merged commit 02fbaf0 into apache:main Sep 16, 2021
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Sep 16, 2021
* main: (102 commits)
  Implementation of relay_to_tir target hook (apache#8423)
  [Onnx] Fix NLL Loss tests (apache#8971)
  [Bugfix] Fix other div zero errors also in rewrite_simplify (apache#8983)
  [ONNX] enable the onnx tests after PR apache#8274 merged (apache#9019)
  [Hexagon] Disable `thread_local` on Hexagon (apache#9025)
  [Hexagon] Allow undefined symbols in libtvm_runtime.so on Hexagon (apache#9024)
  [Onnx] Add momentum (apache#9000)
  fix (apache#9021)
  [Community] @AndrewZhaoLuo -> Reviewer (apache#9020)
  [Hexagon] Implement model launcher (apache#8986)
  [Relay][Pass] Add ExtractOperators pass (apache#8996)
  [BYOC][TensorRT] Add TensorRT own int8 calibration support to TensorRT BYOC integration (apache#8808)
  [ONNX] Add Einsum converter (apache#8985)
  Add standalone_crt/ to be part of the wheel package, when available. (apache#9005)
  [Relay] Remove memory planing from LowerTEPass  (apache#8974)
  [Hexagon] Treat floats as float32 when passing args to offloaded kernels (apache#9010)
  [Runtime] Pipeline Executor Initial patch. (apache#8702)
  [Hexagon] `llvm-options` attribute is an array of strings (apache#9011)
  disable cuda int8 schedule for non-cuda gpu target (apache#9014)
  [Torch] Add an option to make imported models compatible with the Relay text parser (apache#9015)
  ...
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
* support negatibve indices in gather

* move check to Tensor level indexing, gathernd

* add test, update transform.h

* remove unneeded gather

* missing gather nd change

* update tests

* proper tensor comparison

* blacking

* lint

* fix error

* turn on test

* missing test case

* revert changes

* add normalize_gather_indices

* undo change

* update

* more removing diffs

* more undoing

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
* support negatibve indices in gather

* move check to Tensor level indexing, gathernd

* add test, update transform.h

* remove unneeded gather

* missing gather nd change

* update tests

* proper tensor comparison

* blacking

* lint

* fix error

* turn on test

* missing test case

* revert changes

* add normalize_gather_indices

* undo change

* update

* more removing diffs

* more undoing

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants