set_value OP No FP16 Kernel BUG #50151

Closed
DrRyanHuang opened this issue Feb 1, 2023 · 4 comments

@DrRyanHuang
Member

DrRyanHuang commented Feb 1, 2023

Describe the Bug

Environment: AI Studio, Paddle 2.4

Hello! This is part of my code:

cls_targets[paddle.where(positive_indices)[0].flatten(), 
            assigned_annotations[:, -1][positive_indices].cast("int64")] = 1
>>> paddle.where(positive_indices)[0].flatten()
Tensor(shape=[1], dtype=int64, place=Place(gpu:0), stop_gradient=True,
       [54250])
>>> assigned_annotations[:, -1][positive_indices].cast("int64")
Tensor(shape=[1], dtype=int64, place=Place(gpu:0), stop_gradient=True,
       [12])
>>> cls_targets
Tensor(shape=[54402, 21], dtype=float16, place=Place(gpu:0), stop_gradient=True,
       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]])

Running this under an AMP O2 context raises the following error:

Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 793, in __setitem__
    return _setitem_impl_(self, item, value)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/variable_index.py", line 760, in _setitem_impl_
    inplace_map={"Input": "Out"})
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 4005, in append_op
    inplace_map,
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/tracer.py", line 314, in trace_op
    stop_gradient, inplace_map)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/tracer.py", line 176, in eager_legacy_trace_op
    returns = function_ptr(*arg_list, *attrs_list)
RuntimeError: (NotFound) There are no kernels which are registered in the set_value operator.
  [Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at /paddle/paddle/fluid/imperative/prepared_operator.cc:347)
  [operator < set_value > error]

Under an AMP O1 context, however, no error is raised.
Code: https://github.com/DrRyanHuang/DAL-Paddle/tree/master
In the VS Code configuration:

            "args": [
                "--dataset", "VOC",
                "--train_path", "/home/aistudio/data/data21544/PascalVOC2007/VOC2007_train_val/ImageSets/Main/trainval.txt",
                "--test_path", "/home/aistudio/data/data21544/PascalVOC2007/VOC2007_test/ImageSets/Main/test.txt",
                "--resume",
                // "--fleet",
                "--amp", "--amp_level", "O1"   // change this to O2 to reproduce the bug
            ]

Additional Supplementary Information

No response

@paddle-bot

paddle-bot bot commented Feb 1, 2023

Hi! We've received your issue and will arrange for technicians to respond as soon as possible, so please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and the error message. You can also consult the official API documentation, the FAQ, past GitHub issues, and the AI community for answers. Have a nice day!

@qili93
Contributor

qili93 commented Feb 2, 2023

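The set_value operator on the 2.4 branch does support the float16 data type, so this problem is rather strange. Please provide the following information to help locate it:

  1. The Paddle version and commit ID. Please run the following commands:
python -c "import paddle; print(paddle.__version__)"
python -c "import paddle; print(paddle.version.commit)"
  2. The complete GLOG log of the failing run. Enable GLOG when running the Python job and save the log, for example:
# Prefix the Python command with GLOG_v=6
GLOG_v=6 python train.py ... .... > paddle_debug.log 2>&1

Then please post the output of the commands above together with the generated paddle_debug.log so that we can pinpoint the problem. Thanks!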
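A log captured with GLOG_v=6 can be very large, so it may help to pull out only the lines that mention the failing operator before reading or posting it. The following is a minimal sketch using only the Python standard library; the function name `filter_log_lines` and the default search string are illustrative assumptions, not part of Paddle.

```python
import os
from typing import Iterable, List


def filter_log_lines(lines: Iterable[str], needle: str = "set_value") -> List[str]:
    """Return the lines that contain `needle`, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if needle in line]


if __name__ == "__main__":
    log_path = "paddle_debug.log"  # file name taken from the command above
    if os.path.exists(log_path):
        # errors="replace" keeps the scan going if the log has odd bytes
        with open(log_path, encoding="utf-8", errors="replace") as log_file:
            for hit in filter_log_lines(log_file):
                print(hit)
```

The full, unfiltered log is still what the maintainers asked for; the filtered view is only a convenience for a first look.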

@qili93
Contributor

qili93 commented Feb 2, 2023

Also, please provide a minimal reproducible script so that we can try to reproduce this problem internally.
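A standalone probe along the following lines might serve as a starting point for such a script. It is only a sketch under stated assumptions: the tensor shapes and index values are stand-ins for those in the report, the helper name `try_setitem` is hypothetical, and nothing in this thread confirms that it triggers the same error on a given build or device.

```python
"""Hypothetical probe for float16 tensor-indexed assignment under AMP."""

try:
    import paddle
except ImportError:  # lets the script explain itself where Paddle is absent
    paddle = None


def try_setitem(level: str) -> str:
    """Assign into a float16 tensor under the given AMP level.

    Returns "ok" on success, otherwise a short description of the failure.
    """
    if paddle is None:
        return "paddle is not installed"
    try:
        # Stand-ins for cls_targets and the two index tensors in the report.
        cls_targets = paddle.zeros([8, 21], dtype="float16")
        rows = paddle.to_tensor([5], dtype="int64")
        cols = paddle.to_tensor([12], dtype="int64")
        with paddle.amp.auto_cast(enable=True, level=level):
            cls_targets[rows, cols] = 1
        return "ok"
    except Exception as exc:  # report whatever the framework raises
        return f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    for level in ("O1", "O2"):
        print(level, "->", try_setitem(level))
```

Running it on the same GPU environment with both levels would show whether the O1/O2 difference described above also appears outside the full training script.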

@paddle-bot paddle-bot bot added status/following-up 跟进中 and removed status/new-issue 新建 labels Feb 2, 2023
@zhangting2020
Contributor

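This issue was fixed in #50340. For now, you can pick up the fix by building Paddle from the latest source yourself. The fix will also be included in the next released version.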

@paddle-bot paddle-bot bot added the status/developed 开发完成 label Feb 9, 2023
@paddle-bot paddle-bot bot closed this as completed Feb 9, 2023
@paddle-bot paddle-bot bot removed the status/following-up 跟进中 label Feb 9, 2023