forked from PaddlePaddle/Paddle
Fused elementwise ops #1
Merged
HulekJakub
merged 124 commits into
HulekJakub:Fused_Elementwise_Kernel_And_Op
from
Silv3S:fused_elementwise_ops
Mar 10, 2023
Conversation
* add sigmoid composite rule * add python api * fix code style. * add check_prim=True * add sigmoid fp16 unit test. * fix code style. * rm bf16 check_prim * fix code style.
* matmul refactored * fc * SetOutMemDescWithLogicalLayoutFusesSupport * matmul_v2 * alpha support * group repetitive funcs * matmul utils * execute matmul methods * restore registered kernel names * split header and impl files * remove double negatives * increase coverage * add onednn tests to ctest * remove fusion logic from base matmuls
…Paddle#51101) * fix a bug which is triggered by the lack of __class__.op_type * remove two "self.__class__.op_type = self.op_type"
PaddlePaddle#50865) * move DeviceContextPool to phi * add EmplaceExternalContextFunc * update namespace * update cmake * fix bugs and create context_pool_impl.h * replace platform::is_xxx_place * fix bugs * update generator * fix bugs * fix bugs * fix bugs * fix bugs * fix bugs * fix bugs * fix bugs * fix enforce usage * Revert "fix enforce usage" This reverts commit 5f521f0. * fix bugs * rm XPUDeviceContext and CustomDeviceContext * fix bugs * fix fix context init bug * fix bugs after merge * fix bugs * fix name * fix mutable_data * update and fix bugs * fix bugs * update * fix bugs * fix name * fix bugs * merge * fix bugs * create context_pool in phi/backends * create context_pool in phi/backends * fix bugs * fix xpu bugs * fix rocm bugs * fix bugs * fix bugs * fix bugs * fix xpu bugs * update * update * fix bugs * fix bugs
* implement floor_grad by primitive logic * implement floor_grad by primitive logic * Merge branch 'develop' into floor_grad
* add float16 to pixel_shuffle * update * update * update solve PR-CI-Kunlun-R200-Test --------- Co-authored-by: wqgo <1552367872@qq.com>
* add float16 to less_than * Modify the function title * update * update * update solve PR-CI-Kunlun-KP-Build * update codestyle
* add float16 to greater_than * update codestyle --------- Co-authored-by: wqgo <1552367872@qq.com>
* [Auto Parallel] Speedup the completion process * [Auto Parallel] Skip the property of dist_context when deepcopying * [Auto Parallel] Remove the unnecessary print
* add float16 to less_equal * update * update * update codestyle * update file name --------- Co-authored-by: wqgo <1552367872@qq.com>
Fix the quotation mark format problem in PixelShuffle document
* Remove InterpretercoreInferShapeContext * Fix lod errors
…e#51153) * add bf16 fp16 type support for interpolate * add bf16 fp16 support for interpolate in phi on cpu
…addlePaddle#50094) * first approach * test finished * cpp test deleted * CmakeList corrected * multi_gru_seq_fuse_pass rewritten * dummy cout deleted * review changes * timeout extended
…addle#50359) * first commit. * change host logic * fix code bugs * fix code error --------- Co-authored-by: zhangbopd <1299246947@qq.com>
* Add output defs for sgd kernel * add datatype infer for sgd * add infer logic
* add output defs for fused_adam kernel * complete the others defs for cpu and gpu * remove register for param_out
…dle#51262) * register custom kernel for all type of custom device * fix bug * fix GetKernelInputArgDef * fix amp bug * fix TransToPhiPlace * adapt interpreter_util
* support reshape test on prim and cinn * fix mkldnn test * polish test case
* [clear ps] del some all list * [clear ps] for ci * [clear ps] rm fluid emb
* fix accuracy and activation * add adadelta * support pad2d * support pad * modify exponential and linear_interp_v2 * modify meshgrid test * add group_norm * support some ops * modify activation&group norm * modify act * reset group_norm * modify acti * modify pow test * modify acti * lint * modify mkldnn test * fix acti * modify adadelta * lint * fix act mkldnn * reset acti
* support elementwise_pow bfloat16 * add only_check_prim parameters in check_grad * modify unit test * fix floor test * fix sigmoid bfloat16 test * norm some ops prim test * add unit16 for sqrt
* add softplus double grad * use constant method
* support run haokanctr model in heterps-models * polish setup.py * polish JVM_LIB in evn_dict
…rk ci. (PaddlePaddle#51309) * Add the collect and print of kernel registry information in op benchmark ci. * Little change to test the ci. * Remove the redundant function. * Move the collect of kernel registry information to the end of ci.
* add prim erf grad * add yaml config for prim erf grad * add math.h * add cmath * add math defines * use define math * use define math * define M_2_SQRTPI * M_2_SQRTPI math * try math.h * fix typo * remove pow in erf grad * use new optest * add fp16 fp32 test * remove fp16 test
* move fluid.utils to paddle.utils.layers_utils * fix error * delete original fluid layers utils * remove import and old utils * remove more old utils import * change import path of fill_constant in the layers_utils.py * fix mistake * fix error * expose in __init__.py * for comment * when change the ref of func is_sequence, it should change to the root of is_sequence instead * for codecheck
* where op test * update bfloat16 * fix * fix windows ci * update bfloat16 data * fix bfloat16 x * reset * fix randint * add print * add delta * cancel print * code style * update review
* AMP arange & Test * fix arange bfloat16 dtype * update for review * update for review2 * fix tile * update * fix ci * r * f * fix windows ci * update bfloat data * fix bfloat16 input * add print * Update test_where_op.py * update kernel * del repeat * update review
PR types
PR changes
Describe