Add aten::max_unpool2d and aten::max_unpool3d #856
Conversation
yucai-intel
commented
Aug 30, 2024
•
edited by xytintel
- max_unpool2d
- max_unpool2d.out
- max_unpool3d
- max_unpool3d.out
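For reference, a minimal sketch of what these ops compute, using the public ATen C++ API. It is not part of this PR and assumes an XPU-enabled build (`at::kXPU`):

```cpp
#include <ATen/ATen.h>

int main() {
  // Pool a 1x1x4x4 input and keep the argmax indices.
  auto input = at::randn({1, 1, 4, 4}, at::kXPU);
  auto [pooled, indices] = at::max_pool2d_with_indices(
      input, /*kernel_size=*/{2, 2}, /*stride=*/{2, 2});
  // max_unpool2d scatters each pooled maximum back to the position
  // recorded in `indices`; every other output element is zero.
  auto unpooled = at::max_unpool2d(pooled, indices, /*output_size=*/{4, 4});
  return 0;
}
```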
Force-pushed from b047ddd to b1087b8
src/ATen/native/xpu/MaxUnpooling.cpp
Outdated
const Tensor& indices,
IntArrayRef output_size,
Tensor& out) {
  native::xpu::max_unpooling2d_forward_template(
Rename to max_unpooling2d_forward_kernel.
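For clarity, a sketch of the renamed entry point. The parameter list here mirrors the structured-op signature and is an assumption, not the actual template:

```cpp
#include <ATen/core/Tensor.h>

namespace at::native::xpu {

// Hypothetical post-rename declaration; parameter names are illustrative.
Tensor& max_unpooling2d_forward_kernel(
    Tensor& out,
    const Tensor& self,
    const Tensor& indices,
    IntArrayRef output_size);

} // namespace at::native::xpu
```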
src/ATen/native/xpu/MaxUnpooling.cpp
Outdated
IntArrayRef stride,
IntArrayRef padding,
Tensor& out) {
  native::xpu::max_unpooling3d_forward_template(
Rename to max_unpooling3d_forward_kernel, matching the 2d case above.
is_channels_last(is_channels_last_) {}

private:
const int64_t numInputElements;
Use numInputElements_ as the private member variable name.
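A minimal sketch of the requested convention, assuming a functor shaped like the one in this diff (names are illustrative): private members take a trailing underscore, constructor parameters do not.

```cpp
#include <cstdint>

template <typename scalar_t>
struct MaxUnpooling2dForwardFunctor {
  MaxUnpooling2dForwardFunctor(
      int64_t numInputElements,
      scalar_t* input_data,
      int64_t* indices_data)
      : numInputElements_(numInputElements),
        input_data_(input_data),
        indices_data_(indices_data) {}

 private:
  const int64_t numInputElements_; // trailing underscore marks the member
  scalar_t* input_data_;
  int64_t* indices_data_;
};
```

The same applies to the other members flagged below (input_data_, indices_data_, and the second numInputElements_).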

private:
scalar_t* input_data;
int64_t* indices_data;
Same here: input_data_ and indices_data_.
output_data(output_data_) {}

private:
const int64_t numInputElements;
numInputElements_ here as well.
oH,
oW,
offsetZ);
totalZ -= 65535;
The 65535 limit is a CUDA grid-dimension restriction that we don't need to follow on XPU; just launch all totalZ slices in a single pass.
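On CUDA, gridDim.y/z are capped at 65535, which is why the upstream kernel loops, launching at most 65535 slices per pass and advancing offsetZ. SYCL nd_range sizes have no such cap, so one launch suffices. A sketch under that assumption (queue, tile size, and names are illustrative):

```cpp
#include <sycl/sycl.hpp>

// Launch one pass over all totalZ slices; no chunking loop, no offsetZ.
void launch_all_slices(sycl::queue& q, int64_t totalZ, int64_t oH, int64_t oW) {
  constexpr int64_t tile = 8;
  auto round_up = [](int64_t n, int64_t m) { return (n + m - 1) / m * m; };
  sycl::range<3> local{1, (size_t)tile, (size_t)tile};
  sycl::range<3> global{
      (size_t)totalZ, (size_t)round_up(oH, tile), (size_t)round_up(oW, tile)};
  q.parallel_for(sycl::nd_range<3>(global, local), [=](sycl::nd_item<3> item) {
    // item.get_group(0) indexes the slice directly, with no offset math.
  });
}
```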
#include <c10/util/Exception.h>
#include <comm/MemoryFormat.h>
#include <comm/SYCLHelpers.h>
#include <torch/library.h>
Do we need this header file? The include list could be:
#include <ATen/native/xpu/sycl/Atomics.h>
#include <ATen/native/xpu/sycl/BatchKernel.h>
#include <ATen/native/xpu/sycl/NumericLimits.h>
#include <c10/core/Scalar.h>
#include <c10/util/Exception.h>
#include <torch/library.h>
Please check.
#pragma GCC diagnostic push
// Avoid SYCL compiler return-type error
#pragma clang diagnostic ignored "-Wreturn-type"
#pragma GCC diagnostic ignored "-Wreturn-type"
If atomics are not needed, please remove the code above.
int64_t slice =
    (item.get_group()[0] + offsetZ_) / iT_; // input slice/feature
if (iRow < iH_ && iColumn < iW_) {
  scalar_t val = input_ptr
Please use PackedTensorAccessor64.
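A sketch of the suggested accessor-based indexing; the 4-D shape and names are illustrative, not the actual kernel:

```cpp
#include <ATen/core/Tensor.h>

// Build a PackedTensorAccessor64 on the host and index it in the kernel
// instead of doing manual stride arithmetic on a raw pointer.
template <typename scalar_t>
void bind_input(const at::Tensor& input) {
  auto acc = input.packed_accessor64<scalar_t, 4>();
  // In the SYCL kernel body, pointer math like
  //   input_ptr[((slice * iT_ + frame) * iH_ + iRow) * iW_ + iColumn]
  // becomes:
  //   scalar_t val = acc[slice][frame][iRow][iColumn];
  (void)acc;
}
```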