
YOLO-NAS-POSE doesn't work with GPU #1886

Closed
Daanfb opened this issue Mar 5, 2024 · 8 comments

Labels
🐑 Duplicate This issue or pull request already exists

Comments

Daanfb commented Mar 5, 2024

🐛 Describe the bug

I'm trying to run inference on an image with CUDA, but it doesn't work. On CPU it works fine.
This is my code:

from super_gradients.training import models
import torch

model = models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()

url_im = 'https://th.bing.com/th/id/R.4403419aadfdab2366f83d126328e83a?rik=KUpElyH0bLElCA&pid=ImgRaw&r=0'
prediction = model.to("cuda").predict(url_im, conf=0.5)

This is the error I get:

NotImplementedError                       Traceback (most recent call last)
Cell In[22], line 4
      1 model = models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()
      3 url_im = 'https://th.bing.com/th/id/R.4403419aadfdab2366f83d126328e83a?rik=KUpElyH0bLElCA&pid=ImgRaw&r=0'
----> 4 prediction = model.to("cuda").predict(url_im, conf=0.5)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\models\pose_estimation_models\yolo_nas_pose\yolo_nas_pose_variants.py:171, in YoloNASPose.predict(self, images, iou, conf, pre_nms_max_predictions, post_nms_max_predictions, batch_size, fuse_model, skip_image_resizing)
    153 """Predict an image or a list of images.
    154 
    155 :param images:              Images to predict.
   (...)
    161 :param skip_image_resizing: If True, the image processor will not resize the images.
    162 """
    163 pipeline = self._get_pipeline(
    164     iou=iou,
    165     conf=conf,
   (...)
    169     skip_image_resizing=skip_image_resizing,
    170 )
--> 171 return pipeline(images, batch_size=batch_size)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:118, in Pipeline.__call__(self, inputs, batch_size)
    116     return self.predict_video(inputs, batch_size)
    117 elif check_image_typing(inputs):
--> 118     return self.predict_images(inputs, batch_size)
    119 else:
    120     raise ValueError(f"Input {inputs} not supported for prediction.")

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:134, in Pipeline.predict_images(self, images, batch_size)
    131 images = load_images(images)
    133 result_generator = self._generate_prediction_result(images=images, batch_size=batch_size)
--> 134 return self._combine_image_prediction_to_images(result_generator, n_images=len(images))

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:431, in PoseEstimationPipeline._combine_image_prediction_to_images(self, images_predictions, n_images)
    426 def _combine_image_prediction_to_images(
    427     self, images_predictions: Iterable[PoseEstimationPrediction], n_images: Optional[int] = None
    428 ) -> Union[ImagesPoseEstimationPrediction, ImagePoseEstimationPrediction]:
    429     if n_images is not None and n_images == 1:
    430         # Do not show tqdm progress bar if there is only one image
--> 431         images_predictions = next(iter(images_predictions))
    432     else:
    433         images_predictions = [image_predictions for image_predictions in tqdm(images_predictions, total=n_images, desc="Predicting Images")]

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:173, in Pipeline._generate_prediction_result(self, images, batch_size)
    171 else:
    172     for batch_images in generate_batch(images, batch_size):
--> 173         yield from self._generate_prediction_result_single_batch(batch_images)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:218, in Pipeline._generate_prediction_result_single_batch(self, images)
    216         self._fuse_model(torch_inputs)
    217     model_output = self.model(torch_inputs)
--> 218     predictions = self._decode_model_output(model_output, model_input=torch_inputs)
    220 # Postprocess
    221 postprocessed_predictions = []

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\pipelines\pipelines.py:404, in PoseEstimationPipeline._decode_model_output(self, model_output, model_input)
    397 def _decode_model_output(self, model_output: Union[List, Tuple, torch.Tensor], model_input: np.ndarray) -> List[PoseEstimationPrediction]:
    398     """Decode the model output, by applying post prediction callback. This includes NMS.
    399 
    400     :param model_output:    Direct output of the model, without any post-processing.
    401     :param model_input:     Model input (i.e. images after preprocessing).
    402     :return:                Predicted Bboxes.
    403     """
--> 404     list_of_predictions = self.post_prediction_callback(model_output)
    405     decoded_predictions = []
    406     for image_level_predictions, image in zip(list_of_predictions, model_input):

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\super_gradients\training\models\pose_estimation_models\yolo_nas_pose\yolo_nas_pose_post_prediction_callback.py:73, in YoloNASPosePostPredictionCallback.__call__(self, outputs)
     70     pred_pose_scores = pred_pose_scores[topk_candidates.indices]
     72 # NMS
---> 73 idx_to_keep = torchvision.ops.boxes.nms(boxes=pred_bboxes_xyxy, scores=pred_bboxes_conf, iou_threshold=self.nms_iou_threshold)
     75 final_bboxes = pred_bboxes_xyxy[idx_to_keep]  # [Instances,]
     76 final_scores = pred_bboxes_conf[idx_to_keep]  # [Instances,]

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torchvision\ops\boxes.py:41, in nms(boxes, scores, iou_threshold)
     39     _log_api_usage_once(nms)
     40 _assert_has_ops()
---> 41 return torch.ops.torchvision.nms(boxes, scores, iou_threshold)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\_ops.py:755, in OpOverloadPacket.__call__(self, *args, **kwargs)
    750 def __call__(self, *args, **kwargs):
    751     # overloading __call__ to ensure torch.ops.foo.bar()
    752     # is still callable from JIT
    753     # We save the function ptr as the `op` attribute on
    754     # OpOverloadPacket to access it here.
--> 755     return self._op(*args, **(kwargs or {}))

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\cpu\nms_kernel.cpp:112 [kernel]
Meta: registered at /dev/null:440 [kernel]
QuantizedCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\quantized\cpu\qnms_kernel.cpp:124 [kernel]
BackendSelect: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:154 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:498 [backend fallback]
Functionalize: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:324 [backend fallback]
Named: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\NegateFallback.cpp:19 [backend fallback]
ZeroTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:86 [backend fallback]
AutogradOther: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:53 [backend fallback]
AutogradCPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:57 [backend fallback]
AutogradCUDA: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:65 [backend fallback]
AutogradXLA: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:69 [backend fallback]
AutogradMPS: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:77 [backend fallback]
AutogradXPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:61 [backend fallback]
AutogradHPU: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:90 [backend fallback]
AutogradLazy: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:73 [backend fallback]
AutogradMeta: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:81 [backend fallback]
Tracer: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:297 [backend fallback]
AutocastCPU: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:34 [kernel]
AutocastCUDA: registered at C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\csrc\ops\autocast\nms_kernel.cpp:27 [kernel]
FuncTorchBatched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:720 [backend fallback]
BatchedNestedTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:746 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:203 [backend fallback]
PythonTLSSnapshot: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:162 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:494 [backend fallback]
PreDispatch: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:166 [backend fallback]
PythonDispatcher: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:158 [backend fallback]

I tried it on Google Colab and it works fine, but on my computer it doesn't.
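
A quick way to narrow this down is to compare the CUDA builds of torch and torchvision (a minimal sketch; torchvision.version.cuda is assumed available here, and is None on CPU-only wheels):

import torch
import torchvision

# Both packages should report the same CUDA build (e.g. "11.8")
print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)
print("torchvision:", torchvision.__version__, "| built for CUDA:", torchvision.version.cuda)
print("CUDA available:", torch.cuda.is_available())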

Versions

Collecting environment information...
PyTorch version: 2.2.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Enterprise LTSC
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.17763-SP0
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1650
Nvidia driver version: 551.52
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=1992
DeviceID=CPU0
Family=198
L2CacheSize=2048
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=1992
Name=Intel(R) Core(TM) i7-10700TE CPU @ 2.00GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.23.0
[pip3] onnx==1.13.0
[pip3] onnx-graphsurgeon==0.3.12
[pip3] onnxruntime==1.13.1
[pip3] onnxsim==0.4.35
[pip3] pytorch-quantization==2.1.2
[pip3] torch==2.2.0+cu118
[pip3] torchaudio==2.2.0+cu118
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.17.0
[conda] Could not collect
super-gradients==3.6.0

BloodAxe (Contributor) commented Mar 5, 2024

Since you are using a GTX 1650 GPU, I believe your issue is related to an already reported and fixed problem. In short, this GPU is known not to work correctly with fp16.

Feel free to check #1834 and the merged PR #1881.
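
With a build that contains that PR, disabling mixed precision at predict time looks roughly like this (a sketch; "image.jpg" is a placeholder path, and fp16 is the flag introduced in #1881):

from super_gradients.training import models

model = models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").to("cuda")

# fp16=False forces full-precision inference, sidestepping fp16 issues on GPUs like the GTX 1650
prediction = model.predict("image.jpg", conf=0.5, fp16=False)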

BloodAxe added the 🐑 Duplicate label Mar 5, 2024
Daanfb (Author) commented Mar 5, 2024

I tried the command you gave in this comment, but it couldn't find that branch:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting git+https://github.com/Deci-AI/super-gradients@feature/SG-000-introduce-fp16-flag-to-predict
  Cloning https://github.com/Deci-AI/super-gradients (to revision feature/SG-000-introduce-fp16-flag-to-predict) to c:\users\e2k6\appdata\local\temp\pip-req-build-d4b37ybs
  Running command git clone --filter=blob:none --quiet https://github.com/Deci-AI/super-gradients 'C:\Users\E2K6\AppData\Local\Temp\pip-req-build-d4b37ybs'
  WARNING: Did not find branch or tag 'feature/SG-000-introduce-fp16-flag-to-predict', assuming revision or ref.
  Running command git checkout -q feature/SG-000-introduce-fp16-flag-to-predict
  error: pathspec 'feature/SG-000-introduce-fp16-flag-to-predict' did not match any file(s) known to git
  error: subprocess-exited-with-error

  × git checkout -q feature/SG-000-introduce-fp16-flag-to-predict did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git checkout -q feature/SG-000-introduce-fp16-flag-to-predict did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

BloodAxe (Contributor) commented Mar 5, 2024

Yes, that's because the PR was already merged, so that feature branch was automatically deleted. You can now install from the master branch instead of the feature branch.
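
For reference, installing straight from master with pip's VCS syntax would look something like:

pip install git+https://github.com/Deci-AI/super-gradients.git@master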

Daanfb (Author) commented Mar 5, 2024

I have done the following steps:

pip uninstall super-gradients
git clone https://github.com/Deci-AI/super-gradients.git
cd super-gradients
pip install -r requirements.txt
pip install -e .

Now I have super-gradients==3.6.0+master
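
To confirm the editable checkout is the one actually being imported, a quick check (a minimal sketch; assumes super_gradients exposes __version__, as the version string above suggests):

import super_gradients

print(super_gradients.__version__)  # expected: 3.6.0+master
print(super_gradients.__file__)     # should point into the cloned repo, not site-packages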

This is my code:

model = models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()

url_im = 'https://th.bing.com/th/id/R.4403419aadfdab2366f83d126328e83a?rik=KUpElyH0bLElCA&pid=ImgRaw&r=0'
prediction = model.to("cuda").predict(url_im, conf=0.5, fp16=False)

But I still get the same error:

NotImplementedError                       Traceback (most recent call last)
Cell In[5], line 4
      1 model = models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()
      3 url_im = 'https://th.bing.com/th/id/R.4403419aadfdab2366f83d126328e83a?rik=KUpElyH0bLElCA&pid=ImgRaw&r=0'
----> 4 prediction = model.predict(url_im, conf=0.5, fp16=False)

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\models\pose_estimation_models\yolo_nas_pose\yolo_nas_pose_variants.py:174, in YoloNASPose.predict(self, images, iou, conf, pre_nms_max_predictions, post_nms_max_predictions, batch_size, fuse_model, skip_image_resizing, fp16)
    154 """Predict an image or a list of images.
    155 
    156 :param images:     Images to predict.
   (...)
    163 :param fp16:       If True, use mixed precision for inference.
    164 """
    165 pipeline = self._get_pipeline(
    166     iou=iou,
    167     conf=conf,
   (...)
    172     fp16=fp16,
    173 )
--> 174 return pipeline(images, batch_size=batch_size)

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:120, in Pipeline.__call__(self, inputs, batch_size)
    118     return self.predict_video(inputs, batch_size)
    119 elif check_image_typing(inputs):
--> 120     return self.predict_images(inputs, batch_size)
    121 else:
    122     raise ValueError(f"Input {inputs} not supported for prediction.")

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:136, in Pipeline.predict_images(self, images, batch_size)
    133 images = load_images(images)
    135 result_generator = self._generate_prediction_result(images=images, batch_size=batch_size)
--> 136 return self._combine_image_prediction_to_images(result_generator, n_images=len(images))

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:440, in PoseEstimationPipeline._combine_image_prediction_to_images(self, images_predictions, n_images)
    435 def _combine_image_prediction_to_images(
    436     self, images_predictions: Iterable[PoseEstimationPrediction], n_images: Optional[int] = None
    437 ) -> Union[ImagesPoseEstimationPrediction, ImagePoseEstimationPrediction]:
    438     if n_images is not None and n_images == 1:
    439         # Do not show tqdm progress bar if there is only one image
--> 440         images_predictions = next(iter(images_predictions))
    441     else:
    442         images_predictions = [image_predictions for image_predictions in tqdm(images_predictions, total=n_images, desc="Predicting Images")]

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:175, in Pipeline._generate_prediction_result(self, images, batch_size)
    173 else:
    174     for batch_images in generate_batch(images, batch_size):
--> 175         yield from self._generate_prediction_result_single_batch(batch_images)

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:220, in Pipeline._generate_prediction_result_single_batch(self, images)
    218         self._fuse_model(torch_inputs)
    219     model_output = self.model(torch_inputs)
--> 220     predictions = self._decode_model_output(model_output, model_input=torch_inputs)
    222 # Postprocess
    223 postprocessed_predictions = []

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\pipelines\pipelines.py:411, in PoseEstimationPipeline._decode_model_output(self, model_output, model_input)
    404 def _decode_model_output(self, model_output: Union[List, Tuple, torch.Tensor], model_input: np.ndarray) -> List[PoseEstimationPrediction]:
    405     """Decode the model output, by applying post prediction callback. This includes NMS.
    406 
    407     :param model_output:    Direct output of the model, without any post-processing.
    408     :param model_input:     Model input (i.e. images after preprocessing).
    409     :return:                Predicted Bboxes.
    410     """
--> 411     list_of_predictions = self.post_prediction_callback(model_output)
    412     decoded_predictions = []
    413     for image_level_predictions, image in zip(list_of_predictions, model_input):

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~\Desktop\DANIEL\YOLO-NAS-POSE\super-gradients\src\super_gradients\training\models\pose_estimation_models\yolo_nas_pose\yolo_nas_pose_post_prediction_callback.py:73, in YoloNASPosePostPredictionCallback.__call__(self, outputs)
     70     pred_pose_scores = pred_pose_scores[topk_candidates.indices]
     72 # NMS
---> 73 idx_to_keep = torchvision.ops.boxes.nms(boxes=pred_bboxes_xyxy, scores=pred_bboxes_conf, iou_threshold=self.nms_iou_threshold)
     75 final_bboxes = pred_bboxes_xyxy[idx_to_keep]  # [Instances,]
     76 final_scores = pred_bboxes_conf[idx_to_keep]  # [Instances,]

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torchvision\ops\boxes.py:41, in nms(boxes, scores, iou_threshold)
     39     _log_api_usage_once(nms)
     40 _assert_has_ops()
---> 41 return torch.ops.torchvision.nms(boxes, scores, iou_threshold)

File c:\Users\E2K6\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\_ops.py:755, in OpOverloadPacket.__call__(self, *args, **kwargs)
    750 def __call__(self, *args, **kwargs):
    751     # overloading __call__ to ensure torch.ops.foo.bar()
    752     # is still callable from JIT
    753     # We save the function ptr as the `op` attribute on
    754     # OpOverloadPacket to access it here.
--> 755     return self._op(*args, **(kwargs or {}))

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].


BloodAxe (Contributor) commented Mar 5, 2024

It looks like your torchvision is inconsistent with the torch distribution you have. You have

torch==2.2.0+cu118

but your torchvision does not have the +cu118 suffix, which may be the cause of this error.
I suggest you uninstall torch and torchvision and follow the PyTorch docs on installing matching distributions of these packages.
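
A clean reinstall along those lines might look like this (an untested sketch, reusing the cu118 index from the report above):

pip uninstall -y torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118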

Daanfb (Author) commented Mar 6, 2024

The command I used to install PyTorch was this one:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

This is the command shown on the PyTorch page.

BloodAxe (Contributor) commented Mar 6, 2024

I believe this command is meant to be run in a clean environment where you don't have torchvision installed; otherwise it will most likely keep the existing torchvision.
Here is how it looks in my env, for instance:

torch==2.2.0+cu121
torchaudio==2.2.0+cu121
torchvision==0.17.0+cu121

Compare with what you have reported:

[pip3] torch==2.2.0+cu118
[pip3] torchaudio==2.2.0+cu118
[pip3] torchvision==0.17.0

As you can see, your torchvision has no CUDA suffix, and since the nms operation is implemented as a C++ CUDA extension in torchvision, this mismatch may cause an internal inconsistency at runtime. This is my best guess so far.
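
A minimal repro that takes super-gradients out of the picture: if the torchvision build is the problem, this sketch should raise the same NotImplementedError.

import torch
from torchvision.ops import nms

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")

# Raises NotImplementedError on a CPU-only torchvision build;
# returns the kept box indices on a matching CUDA build.
print(nms(boxes, scores, iou_threshold=0.5))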

Daanfb (Author) commented Mar 6, 2024

Installing with conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia works. Thanks!!

Daanfb closed this as completed Mar 6, 2024