When creating a WholeMemoryEmbedding instance with memory_type=distributed, the code crashes when dtype is int64 or int32 (it works fine with fp32 and fp64).
WholeMemory failure at file=/opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu line=55: File /opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu, line 55, it != ScatterFuncIntegerInt64_dispatch2_map->end() check failed.
WholeMemory failure at file=/opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu line=55: File /opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu, line 55, it != ScatterFuncIntegerInt64_dispatch2_map->end() check failed.
WholeMemory failure at file=/opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu line=55: File /opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu, line 55, it != ScatterFuncIntegerInt64_dispatch2_map->end() check failed.
WholeMemory failure at file=/opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu line=55: File /opt/rapids/wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu, line 55, it != ScatterFuncIntegerInt64_dispatch2_map->end() check failed.
[2023-09-19 20:43:35,753] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 1203 closing signal SIGTERM
[2023-09-19 20:43:36,017] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: -6) local_rank: 0 (pid: 1202) of binary: /usr/bin/python
Environment
Version: 23.08
Source build with bash build.sh libwholegraph pylibwholegraph tests -v --allgpuarch
Additional notes:
It seems like a bug to me... For integer type of data, scatter func impl should dispatch from int types, instead of float types (HALF-FLOAT-DOUBLE), right?
🐛 Bug
When creating a WholeMemoryEmbedding instance with `memory_type=distributed`, the code would crash with `dtype=int64` or `int32` (working fine with `fp32` and `fp64`).
To Reproduce
Minimum code to reproduce:
<error messages and stack traces>
Environment
bash build.sh libwholegraph pylibwholegraph tests -v --allgpuarch
Additional notes:
It seems like a bug to me... For integer type of data, scatter func impl should dispatch from int types, instead of float types (`HALF-FLOAT-DOUBLE`), right?
wholegraph/cpp/src/wholememory_ops/functions/scatter_func_impl_integer_data_int64_indices.cu
Lines 40 to 41 in 2e963b9
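To illustrate the suspicion in the notes above, here is a minimal Python model of a type-keyed dispatch map. It is not WholeGraph code: the type lists, `build_dispatch_map`, `dispatch`, and the kernel name strings are all hypothetical, and the sketch only assumes that the real map is keyed on the data and index types. If the "integer data" map is populated from the float type list, a lookup for `int64` data misses, which would match the `it != ...->end() check failed` message in the log.

```python
from itertools import product

# Hypothetical type lists; the real registration macros may differ.
FLOAT_TYPES = ("half", "float", "double")
INTEGER_TYPES = ("int8", "int16", "int32", "int64")


def build_dispatch_map(data_types, index_types):
    """Register one placeholder scatter kernel per (data, index) type pair."""
    return {
        (data, index): f"scatter<{data}, {index}>"
        for data, index in product(data_types, index_types)
    }


def dispatch(dispatch_map, data_type, index_type):
    """Look up a kernel; a miss models the failed `it != map->end()` check."""
    key = (data_type, index_type)
    if key not in dispatch_map:
        raise LookupError(f"no scatter kernel registered for {key}")
    return dispatch_map[key]


# Suspected bug: the integer-data map is built from the float type list.
buggy_map = build_dispatch_map(FLOAT_TYPES, ("int64",))

# Suspected fix: build it from the integer type list instead.
fixed_map = build_dispatch_map(INTEGER_TYPES, ("int64",))
```

With `buggy_map`, `dispatch(buggy_map, "int64", "int64")` raises `LookupError`, while the same call on `fixed_map` finds a kernel. Whether the actual fix is this simple depends on how the registration on the referenced lines is written.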