-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cpu: de-duplicate some of the operators and refactor #1144
base: master
Are you sure you want to change the base?
Conversation
src/ggml-cpu/unary_ops.cpp
Outdated
GGML_ASSERT( nb0 == sizeof(dst_t)); | ||
GGML_ASSERT(nb00 == sizeof(src0_t)); | ||
|
||
const auto [ir0, ir1] = get_thread_range(params, src0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully all C++ compilers support structured bindings today. If we encounter issues, we might have to return simple struct of integers.
Wait for @slaren's review before merging. Thanks! |
@ggerganov Yes, will do :) Thanks for the comments, fixed them in the latest commit. The CI runner also passes. |
I will review when I have a chance, but I may not be able to do it for a few days. |
Thanks @slaren No hurry :) @slaren @ggerganov I made some progress on the next PR (a bigger refactor). It's working after the refactor, and passes the runners on ggml-ci. Just thought of sharing the approach early, to see if you see any obvious red-flags. I'll split it into two PRs:
The operator functions file needed a very small number of cosmetic changes, like replacing direct references to At a broad level, does this approach raise any obvious red-flags for you? It seems to work on all the ggml-ci runners, and doesn't seem to skip any CPU extension. But I'll test and read the diff a few more times before submitting the PRs. Thanks |
Note: This PR only deletes lines from
ggml-cpu.c
, it does not modify any functions in it. I'm not sure why git diff tries to combine them as changes. Standarddiff
shows the correct diff ofggml-cpu.c
: https://gist.github.com/cmdr2/a76df5af311417619788e8330b1908b3This PR de-duplicates some of the easily-templatized functions in
ggml-cpu.c
. It takes inspiration frombinbcast.cu
andcommon.cuh
.add
,sub
,mul
,div
abs
,sgn
,neg
,step
,tanh
,elu
,relu
,sigmoid
,hardsigmoid
,exp
,hardswish
,sqr
,sqrt
,sin
,cos
,log
This removes the op implementation functions from
ggml-cpu.c
(around 2000 lines). As a side-effect, all the functions now support bf16 as well as non-contiguoussrc1
.The performance is the same as the current implementation. It also passes all the runners on ggml-ci, which tested non-contiguous inputs (in
SAM
) andvDSP
on Mac.