In the get_bev_features function, the variable bev_embed is looked up with a string key, but ret_dict is actually a tensor. How can this be fixed? #176

Open
yuanryann opened this issue Jun 14, 2024 · 1 comment

Comments

@yuanryann

2024-06-14 09:46:00,707 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_train_epoch:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) DistSamplerSeedHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_train_iter:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook

after_train_iter:
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_val_epoch:
(NORMAL ) DistSamplerSeedHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

after_run:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

2024-06-14 09:46:00,708 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2024-06-14 09:46:00,709 - mmdet - INFO - Checkpoints will be saved to /home/com0179/AI/MapTR/work_dirs/maptr_tiny_r50_24e by HardDiskBackend.
/home/com0179/AI/MapTR/projects/mmdet3d_plugin/models/utils/grid_mask.py:114: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.)
mask = torch.from_numpy(mask).to(x.dtype).cuda()
ret_dict: tensor([[[ 0.1889, -0.1615, 2.1812, ..., -1.0996, -1.4039, 0.6895],
[ 1.4819, -0.6015, 0.8437, ..., -0.6222, -0.7078, 0.7483],
[ 0.4329, 0.2124, 1.4304, ..., -1.9791, -0.9732, 0.6492],
...,
[-0.3204, -0.4688, 0.5317, ..., -1.9080, -0.5561, 0.6536],
[ 0.4254, -0.1113, 1.2542, ..., -1.9874, -0.6516, 1.0486],
[-0.2349, 0.8355, 0.9105, ..., -1.3129, 0.1006, 1.3759]],

    [[-0.2733,  0.0749,  0.9204,  ...,  0.9150, -0.3261,  0.0139],
     [ 1.3868, -0.3957,  0.8588,  ..., -1.4051, -0.0948,  0.3878],
     [ 0.8097,  0.7675,  0.6791,  ..., -0.4050, -0.3664, -0.3884],
     ...,
     [-1.0428, -0.7296,  0.3283,  ..., -2.0839, -0.6283,  1.3728],
     [-0.5850, -0.4228,  0.1651,  ..., -1.4061, -0.2002,  0.2984],
     [-0.8431,  1.0897,  0.4802,  ..., -1.9049, -0.2679,  1.8028]],

    [[ 0.7818, -0.6220,  1.4299,  ..., -1.4584, -2.0435,  0.2221],
     [ 1.0930, -0.2832,  0.5768,  ..., -0.3528, -0.5643,  0.1527],
     [ 0.7040, -0.0652,  1.5784,  ..., -1.1005, -0.4832, -0.1628],
     ...,
     [-0.7733, -1.2431,  0.6865,  ..., -2.4375, -0.8437,  1.2103],
     [-0.0844, -0.8666,  1.0173,  ..., -1.3839, -0.5428,  0.8602],
     [-0.2918,  0.1805,  0.2343,  ..., -0.1657, -0.3963,  1.7632]],

    [[ 0.8106,  0.2636,  1.1491,  ..., -0.6950, -0.6393,  0.6001],
     [ 1.6005, -0.2310,  1.1513,  ..., -0.4952, -0.2108,  0.5619],
     [ 0.4873,  0.1370,  0.7079,  ..., -0.9651, -0.5468,  0.6746],
     ...,
     [-0.8568, -1.1599,  0.2693,  ..., -2.6332, -1.6124,  1.2802],
     [ 0.1471,  0.2384,  0.8299,  ..., -1.7544, -0.6352,  1.3663],
     [ 0.3371,  1.3895,  0.4540,  ..., -1.4025, -0.7343,  1.7416]]],
   device='cuda:0', grad_fn=<NativeLayerNormBackward>)

Traceback (most recent call last):
File "./tools/train.py", line 259, in <module>
main()
File "./tools/train.py", line 248, in main
custom_train_model(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/train.py", line 27, in custom_train_model
custom_train_detector(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 199, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
output = self.module.train_step(*inputs[0], **kwargs[0])
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 162, in forward
return self.forward_train(**kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 277, in forward_train
losses_pts = self.forward_pts_train(img_feats, lidar_feat, gt_bboxes_3d,
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 141, in forward_pts_train
outs = self.pts_bbox_head(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/dense_heads/maptr_head.py", line 254, in forward
outputs = self.transformer(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 372, in forward
ouput_dic = self.get_bev_features(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 267, in get_bev_features
if 'bev' in ret_dict:
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/_tensor.py", line 670, in __contains__
raise RuntimeError(
RuntimeError: Tensor.__contains__ only supports Tensor or scalar, but you passed in a <class 'str'>.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1712135) of binary: /home/com0179/anaconda3/envs/MapTR/bin/python3
Traceback (most recent call last):
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
elastic_launch(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:


    ./tools/train.py FAILED        

=======================================

@ADMIN-RyanLin

I ran into a similar problem. Make sure the MapTR versions match: v2 has to be run with the v2 scripts.
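
The traceback fails on `if 'bev' in ret_dict:` because `ret_dict` is a tensor there, which fits a mismatch between code that expects a dict and an encoder that returns a bare tensor. Aligning the versions as suggested above is the real fix. Purely as a diagnostic sketch, a guard that accepts both return styles might look like the following. The helper name `unpack_bev` and the optional `depth` key are assumptions made for illustration and are not confirmed to be part of MapTR.

```python
from collections.abc import Mapping


def unpack_bev(encoder_output):
    """Return (bev_embed, depth) for a dict-style or a bare encoder output.

    Hypothetical helper: the 'depth' key is an assumption for illustration.
    """
    if isinstance(encoder_output, Mapping):
        # Dict-style return: look the entries up by key.
        return encoder_output["bev"], encoder_output.get("depth")
    # Bare return (for example a tensor): use the value itself as the
    # BEV features and report that no depth is available.
    return encoder_output, None
```

Checking the type with `isinstance` avoids the `'bev' in tensor` membership test that raises the `RuntimeError` in the log above. If the guard takes the second branch in a v2 run, that is a hint that v1 and v2 code or configs are being mixed.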
