Fused elementwise ops #1

Conversation

HulekJakub
Owner

PR types

PR changes

Describe

zxcd and others added 30 commits March 3, 2023 23:17
* add sigmoid composite rule

* add python api

* fix code style.

* add check_prim=True

* add sigmoid fp16 unit test.

* fix code style.

* rm bf16 check_prim

* fix code style.
* matmul refactored

* fc

* SetOutMemDescWithLogicalLayoutFusesSupport

* matmul_v2

* alpha support

* group repetitive funcs

* matmul utils

* execute matmul methods

* restore registered kernel names

* split header and impl files

* remove double negatives

* increase coverage

* add onednn tests to ctest

* remove fusion logic from base matmuls
…Paddle#51101)

* fix a bug which is triggered by the lack of __class__.op_type

* remove two "self.__class__.op_type = self.op_type"
PaddlePaddle#50865)

* move DeviceContextPool to phi

* add EmplaceExternalContextFunc

* update namespace

* update cmake

* fix bugs and create context_pool_impl.h

* replace platform::is_xxx_place

* fix bugs

* update generator

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix enforce usage

* Revert "fix enforce usage"

This reverts commit 5f521f0.

* fix bugs

* rm XPUDeviceContext and CustomDeviceContext

* fix bugs

* fix context init bug

* fix bugs after merge

* fix bugs

* fix name

* fix mutable_data

* update and fix bugs

* fix bugs

* update

* fix bugs

* fix name

* fix bugs

* merge

* fix bugs

* create context_pool in phi/backends

* create context_pool in phi/backends

* fix bugs

* fix xpu bugs

* fix rocm bugs

* fix bugs

* fix bugs

* fix bugs

* fix xpu bugs

* update

* update

* fix bugs

* fix bugs
* implement floor_grad by primitive logic

* implement floor_grad by primitive logic

* Merge branch 'develop' into floor_grad
* add float16 to pixel_shuffle

* update

* update

* update solve PR-CI-Kunlun-R200-Test

---------

Co-authored-by: wqgo <1552367872@qq.com>
* add float16 to less_than

* Modify the function title

* update

* update

* update  solve PR-CI-Kunlun-KP-Build

* update codestyle
* add float16 to greater_than

* update codestyle

---------

Co-authored-by: wqgo <1552367872@qq.com>
* [Auto Parallel] Speedup the completion process

* [Auto Parallel] Skip the property of dist_context when deepcopying

* [Auto Parallel] Remove the unnecessary print
* add float16 to less_equal

* update

* update

* update codestyle

* update file name

---------

Co-authored-by: wqgo <1552367872@qq.com>
Fix the quotation mark format problem in PixelShuffle document
* Remove InterpretercoreInferShapeContext

* Fix lod errors
…e#51153)

* add bf16 fp16 type support for interpolate

* add bf16 fp16 support for interpolate in phi on cpu
…addlePaddle#50094)

* first approach

* test finished

* cpp test deleted

* CmakeList corrected

* multi_gru_seq_fuse_pass rewritten

* dummy cout deleted

* review changes

* timeout extended
…addle#50359)

* first commit.

* change host logic

* fix code bugs

* fix code error

---------

Co-authored-by: zhangbopd <1299246947@qq.com>
jinyouzhi and others added 29 commits March 9, 2023 14:13
* Add output defs for sgd kernel

* add datatype infer for sgd

* add infer logic
* add output defs for fused_adam kernel

* complete the other defs for cpu and gpu

* remove register for param_out
…dle#51262)

* register custom kernel for all type of custom device

* fix bug

* fix GetKernelInputArgDef

* fix amp bug

* fix TransToPhiPlace

* adapt interpreter_util
* support reshape test on prim and cinn

* fix mkldnn test

* polish test case
* [clear ps] del some all list

* [clear ps] for ci

* [clear ps] rm fluid emb
* fix accuracy and activation

* add adadelta

* support pad2d

* support pad

* modify exponential and linear_interp_v2

* modify meshgrid test

* add group_norm

* support some ops

* modify activation&group norm

* modify act

* reset group_norm

* modify acti

* modify pow test

* modify acti

* lint

* modify mkldnn test

* fix acti

* modify adadelta

* lint

* fix act mkldnn

* reset acti
* support elementwise_pow bfloat16

* add only_check_prim parameters in check_grad

* modify unit test

* fix floor test

* fix sigmoid bfloat16 test

* norm some ops prim test

* add uint16 for sqrt
* add softplus double grad

* use constant method
* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in evn_dict
…rk ci. (PaddlePaddle#51309)

* Add the collection and printing of kernel registry information in op benchmark ci.

* Little change to test the ci.

* Remove the redundant function.

* Move the collect of kernel registry information to the end of ci.
* add prim erf grad

* add yaml config for prim erf grad

* add math.h

* add cmath

* add math  defines

* use define math

* use define math

* define M_2_SQRTPI

* M_2_SQRTPI math

* try math.h

* fix typo

* remove pow in erf grad

* use new optest

* add fp16 fp32 test

* remove fp16 test
* move fluid.utils to paddle.utils.layers_utils

* fix error

* delete original fluid layers utils

* remove import and old utils

* remove more old utils import

* change import path of fill_constant in the layers_utils.py

* fix mistake

* fix error

* expose in __init__.py

* for comment

* when changing the ref of func is_sequence, change it to the root of is_sequence instead

* for codecheck
* where op test

* update bfloat16

* fix

* fix windows ci

* update bfloat16 data

* fix bfloat16 x

* reset

* fix randint

* add print

* add delta

* cancel print

* code style

* update review
* AMP arange & Test

* fix arange bfloat16 dtype

* update for review

* update for review2

* fix tile

* update

* fix ci

* r

* f

* fix windows ci

* update bfloat data

* fix bfloat16 input

* add print

* Update test_where_op.py

* update kernel

* del repeat

* update review
@HulekJakub HulekJakub merged commit e4be4cc into HulekJakub:Fused_Elementwise_Kernel_And_Op Mar 10, 2023