cpu: x64: fix segfault in fused 1x1-depthwise conv
For activations in nxc layout, the intermediate buffer needs to hold the
full OC dimension. This aligns the sse41 implementation with that of avx2.
kwiersch authored and tprimak committed Mar 29, 2023
1 parent 1382605 commit f708100
Showing 1 changed file with 7 additions and 4 deletions.
11 changes: 7 additions & 4 deletions src/cpu/x64/jit_sse41_1x1_conv_kernel_f32.cpp
@@ -1,5 +1,5 @@
 /*******************************************************************************
- * Copyright 2017-2022 Intel Corporation
+ * Copyright 2017-2023 Intel Corporation
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -678,11 +678,14 @@ status_t jit_sse41_1x1_conv_kernel_f32::init_conf(jit_1x1_conv_conf_t &jcp,
         jcp.load_loop_load_step = jcp.ic * jcp.oc_block * sizeof(float);
         jcp.load_loop_iter_step = jcp.oc_block;
 
-        load_blocking = 120; // assumes the kernel is jcp.ur x 3
-        load_blocking_max = 144;
+        load_blocking = is_data_layout_nxc
+                ? jcp.load_dim
+                : 120; // assumes the kernel is jcp.ur x 3
+        load_blocking_max = is_data_layout_nxc ? jcp.load_dim : 144;
         bcast_blocking = 128; // affects load balancing across threads
         bcast_blocking_max = 192;
-        reduce_blocking = 128; // affects L1$ utilization
+        reduce_blocking = is_data_layout_nxc ? jcp.reduce_dim
+                                             : 128; // affects L1$ utilization
     } else if (jcp.prop_kind == backward_data) {
         jcp.reduce_dim = jcp.oc;
         jcp.reduce_block = jcp.oc_block;
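
The effect of the hunk above can be sketched in isolation. The snippet below is a minimal, standalone illustration and not code from the library: `conf_sketch_t`, `blocking_sketch_t` and `pick_forward_blocking` are hypothetical names, and it assumes that `load_dim` corresponds to the OC dimension in the forward branch, as the commit message suggests. Only the constants and the `is_data_layout_nxc` conditions are taken from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the two jit_1x1_conv_conf_t fields the hunk reads.
struct conf_sketch_t {
    int load_dim; // assumed to be OC in the forward branch
    int reduce_dim; // dimension being reduced over in the forward branch
};

// Hypothetical container for the local blocking variables set in the hunk.
struct blocking_sketch_t {
    int load_blocking;
    int load_blocking_max;
    int bcast_blocking;
    int bcast_blocking_max;
    int reduce_blocking;
};

// Mirrors the post-commit selection: for nxc activations the load and reduce
// blocking span the whole dimension; otherwise the previous constants apply.
blocking_sketch_t pick_forward_blocking(
        const conf_sketch_t &jcp, bool is_data_layout_nxc) {
    assert(jcp.load_dim > 0 && jcp.reduce_dim > 0);

    blocking_sketch_t b;
    b.load_blocking = is_data_layout_nxc ? jcp.load_dim : 120;
    b.load_blocking_max = is_data_layout_nxc ? jcp.load_dim : 144;
    b.bcast_blocking = 128; // unchanged by the commit
    b.bcast_blocking_max = 192; // unchanged by the commit
    b.reduce_blocking = is_data_layout_nxc ? jcp.reduce_dim : 128;
    return b;
}
```

With a hypothetical shape of `load_dim = 256` and `reduce_dim = 64`, the nxc path yields a load blocking of 256 (the full dimension) instead of 120, which is the property the commit message asks for; the non-nxc path keeps 120, 144 and 128.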