AArch64: convolution diff issue with gemm_s8u8s32 kernel #2675

Open
xiang1guo opened this issue Feb 12, 2025 · 0 comments
Labels: platform:cpu-aarch64 (codeowner: @oneapi-src/onednn-cpu-aarch64), sighting (suspicious library behavior; should be promoted to a bug when confirmed)

Background

This is a follow-up to issue 2674. It shares the same background as that issue but concerns a different test case, shown below. The case is skipped in #2168 because of failures.

test_graph_unit_dnnl_large_partition_usm_cpu(test_large_partition_execute.Int8Resnet50Stage2Block)

Summary

While analyzing the issue, I found that the failure can also be reproduced with benchdnn alone, without the graph API component.
The failing implementation is convolution,gemm_s8u8s32:ref.
See the following log:

ONEDNN_VERBOSE=1 ./tests/benchdnn/benchdnn --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=direct --dt=u8:s8:u8 --bia-dt=f32 \
  --stag=acdb --wtag=any --dtag=acdb --attr-post-ops=eltwise_relu --attr-scales=src0:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src0:common:1+dst:common:1 --attr-scratchpad=user \
  mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0
onednn_verbose,v1,info,oneDNN v3.8.0 (commit af1410c21a7455af587ae496c719ac7896d8ed95)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:4
onednn_verbose,v1,info,cpu,isa:AArch64 SVE (256 bits)
onednn_verbose,v1,info,gpu,runtime:none
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:a::f0 dst:f32::blocked:a::f0,,,8,0.0109863
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abcd::f0 dst:s8::blocked:cdba::f8:zpm1,,,8x8x1x1,0.163086
onednn_verbose,v1,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src:f32::blocked:abcd::f0 dst:s8::blocked:abcd::f0,,,8x8x1x1,0.0268555
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:s8::blocked:abcd::f0 dst:s8::blocked:cdba::f8:zpm1,,,8x8x1x1,0.0200195
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abcd::f0 dst:u8::blocked:acdb::f0,,,1x8x12x12,0.104004
onednn_verbose,v1,primitive,exec,cpu,convolution,gemm_s8u8s32:ref,forward_inference,src:u8::blocked:acdb::f0 wei:s8:a:blocked:cdba::f8:zpm1 bia:f32:a:blocked:a::f0 dst:u8::blocked:acdb::f0,attr-scratchpad:user attr-scales:src0:0:f32+dst:0:f32+wei:1:f32 attr-zero-points:src0:0:s32+dst:0:s32 attr-post-ops:eltwise_relu,alg:convolution_direct,mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0,0.177002
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:u8::blocked:acdb::f0 dst:f32::blocked:abcd::f0,,,1x8x12x12,0.0229492
[   0][DST][0:0:0:0] exp_f32:          14 exp:          14 got:          16 diff:       2 rdiff:0.142857
[   1][DST][0:0:0:1] exp_f32:        21.5 exp:          22 got:          23 diff:       1 rdiff:0.0454545
[   2][DST][0:0:0:2] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   3][DST][0:0:0:3] exp_f32:       14.75 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   4][DST][0:0:0:4] exp_f32:          19 exp:          19 got:          20 diff:       1 rdiff:0.0526316
[   5][DST][0:0:0:5] exp_f32:       11.75 exp:          12 got:          13 diff:       1 rdiff:0.0833333
[   6][DST][0:0:0:6] exp_f32:          15 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   7][DST][0:0:0:7] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   8][DST][0:0:0:8] exp_f32:           8 exp:           8 got:          10 diff:       2 rdiff:    0.25
[   9][DST][0:0:0:9] exp_f32:       20.25 exp:          20 got:          22 diff:       2 rdiff:     0.1
[COMPARE_STATS][DST]: trh=0 err_max_diff:      32 err_max_rdiff:      32 all_max_diff:      32 all_max_rdiff:      32
0:FAILED (errors:897 total:1152) __REPRO: --conv --allow-enum-tags-only=false --dir=FWD_I --dt=u8:s8:u8 --bia-dt=f32 --stag=acdb --dtag=acdb --attr-scales=src:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src:common:1+dst:common:1 --attr-post-ops=relu --attr-scratchpad=user mb1ic8ih12oc8oh12kh1ph0
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 0.01s; fill: 0.00s (52%); compute_ref: 0.00s (5%); compare: 0.00s (11%);
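
For triage, the mismatch lines in the log can be checked mechanically. The following is a minimal sketch, not part of benchdnn: it parses the ten `[DST]` lines printed above and confirms that `diff` equals `|got - exp|` and that `rdiff` equals `diff / |exp|`. The regular expression and the `parse` helper are assumptions made for illustration, and only the ten printed points are covered, not all 897 reported errors.

```python
import re

# The ten mismatch lines copied verbatim from the benchdnn log above.
LOG = """\
[   0][DST][0:0:0:0] exp_f32:          14 exp:          14 got:          16 diff:       2 rdiff:0.142857
[   1][DST][0:0:0:1] exp_f32:        21.5 exp:          22 got:          23 diff:       1 rdiff:0.0454545
[   2][DST][0:0:0:2] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   3][DST][0:0:0:3] exp_f32:       14.75 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   4][DST][0:0:0:4] exp_f32:          19 exp:          19 got:          20 diff:       1 rdiff:0.0526316
[   5][DST][0:0:0:5] exp_f32:       11.75 exp:          12 got:          13 diff:       1 rdiff:0.0833333
[   6][DST][0:0:0:6] exp_f32:          15 exp:          15 got:          16 diff:       1 rdiff:0.0666667
[   7][DST][0:0:0:7] exp_f32:        15.5 exp:          16 got:          17 diff:       1 rdiff:  0.0625
[   8][DST][0:0:0:8] exp_f32:           8 exp:           8 got:          10 diff:       2 rdiff:    0.25
[   9][DST][0:0:0:9] exp_f32:       20.25 exp:          20 got:          22 diff:       2 rdiff:     0.1
"""

# Hypothetical pattern for one mismatch line; written for this log only.
LINE = re.compile(
    r"\[\s*(\d+)\]\[DST\]\[([\d:]+)\]\s+exp_f32:\s*(\S+)\s+exp:\s*(\S+)"
    r"\s+got:\s*(\S+)\s+diff:\s*(\S+)\s+rdiff:\s*(\S+)"
)

def parse(log):
    """Return one dict per mismatch line found in the log text."""
    rows = []
    for line in log.splitlines():
        m = LINE.match(line)
        if m:
            idx, pos, exp_f32, exp, got, diff, rdiff = m.groups()
            rows.append({
                "idx": int(idx), "pos": pos, "exp_f32": float(exp_f32),
                "exp": float(exp), "got": float(got),
                "diff": float(diff), "rdiff": float(rdiff),
            })
    return rows

rows = parse(LOG)
for r in rows:
    # diff is the absolute error, rdiff the error relative to |exp|.
    assert r["diff"] == abs(r["got"] - r["exp"])
    assert abs(r["rdiff"] - r["diff"] / abs(r["exp"])) < 1e-6

# Signed offsets (got - exp) seen in the printed points.
offsets = sorted({int(r["got"] - r["exp"]) for r in rows})
print(offsets)
```

In these ten points `got` is always above `exp`, by 1 or 2. The `[COMPARE_STATS]` line reports a maximum diff of 32, so larger offsets occur among points that are not printed.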

Environment

  • system: Linux 22.04.1-Ubuntu SMP aarch64 aarch64 aarch64 GNU/Linux
  • gcc: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
  • cmake: cmake version 3.22.1

Steps to reproduce

  • Build library:
1. Set up the ACL library:
    git clone --branch v24.11.1 --depth 1 https://github.com/ARM-software/ComputeLibrary.git
    cd ComputeLibrary
    git checkout 1f3bf6bbc4a1a57b5915fc0a19b195ae53acc06d
    scons -j4 Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 os=linux arch=armv8.2-a build=native multi_isa=1 fixed_format_kernels=1 cppthreads=0 openmp=1 examples=0 validation_tests=0
2. export ACL_ROOT_DIR=/path/to/ComputeLibrary
3. Build oneDNN:
    cmake .. -DDNNL_AARCH64_USE_ACL=ON -DONEDNN_BUILD_GRAPH=ON -DDNNL_CPU_RUNTIME=OMP -DONEDNN_WERROR=ON -DDNNL_BUILD_FOR_CI=ON -DONEDNN_TEST_SET=NIGHTLY -DCMAKE_BUILD_TYPE=Debug
    make -j 4
  • Run test:
ONEDNN_VERBOSE=1 ./tests/benchdnn/benchdnn --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=direct --dt=u8:s8:u8 --bia-dt=f32 \
  --stag=acdb --wtag=any --dtag=acdb --attr-post-ops=eltwise_relu --attr-scales=src0:common:0.5+dst:common:0.5+wei:per_oc --attr-zero-points=src0:common:1+dst:common:1 --attr-scratchpad=user \
  mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0
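
The problem descriptor at the end of the command packs the whole convolution shape into one token. Below is a minimal sketch of decoding it, assuming symmetric padding and that a dilation of 0 means a dense (undilated) kernel. `parse_desc` and `out_size` are hypothetical helper names for illustration, not benchdnn APIs.

```python
import re

DESC = "mb1_ic8oc8_ih12oh12kh1sh1dh0ph0_iw12ow12kw1sw1dw0pw0"

def parse_desc(desc):
    """Split a descriptor into {name: value} pairs, e.g. 'ic8' -> {'ic': 8}."""
    return {name: int(val) for name, val in re.findall(r"([a-z]+)(\d+)", desc)}

def out_size(i, k, s, d, p):
    """Output extent along one spatial axis.

    Assumes the same padding p on both sides and dilation d counted
    from 0 (d = 0 is a dense kernel).
    """
    effective_k = (k - 1) * (d + 1) + 1
    return (i + 2 * p - effective_k) // s + 1

p = parse_desc(DESC)

src_shape = (p["mb"], p["ic"], p["ih"], p["iw"])  # logical abcd order
wei_shape = (p["oc"], p["ic"], p["kh"], p["kw"])
dst_shape = (p["mb"], p["oc"], p["oh"], p["ow"])

# The declared output sizes should agree with the computed ones.
oh = out_size(p["ih"], p["kh"], p["sh"], p["dh"], p["ph"])
ow = out_size(p["iw"], p["kw"], p["sw"], p["dw"], p["pw"])
assert (oh, ow) == (p["oh"], p["ow"])

dst_elems = 1
for dim in dst_shape:
    dst_elems *= dim

print(src_shape, wei_shape, dst_shape, dst_elems)
```

The decoded shapes match the `1x8x12x12` and `8x8x1x1` tensors in the verbose reorder lines, and the destination element count matches `total:1152` in the failure summary, so 897 errors means roughly 78% of the output points mismatch.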
xiang1guo added the platform:cpu-aarch64 and sighting labels on Feb 12, 2025
TaoLv changed the title from "Aarc64: convolution diff issue with gemm_s8u8s32 kernel" to "AArch64: convolution diff issue with gemm_s8u8s32 kernel" on Feb 12, 2025