Hello, I'm very interested in your work. I trained BoxNet successfully with Python 3.8, torch 2.0.1 and CUDA 11.7. I then wanted to fine-tune the UNet, so I set '--train_unet' to True and trained on the same devices, but I got RuntimeError: CUDA error: an illegal instruction was encountered. How can I train the UNet? Thank you.
Traceback (most recent call last):
  File "train_boxnet.py", line 619, in <module>
    trainer.fit(model, datamoule, ckpt_path=args.load_ckpt_path)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 63, in _call_and_handle_interrupt
    trainer._teardown()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _teardown
    self.strategy.teardown()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 490, in teardown
    super().teardown()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/parallel.py", line 125, in teardown
    super().teardown()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 492, in teardown
    _optimizers_to_device(self.optimizers, torch.device("cpu"))
  File "/opt/conda/lib/python3.8/site-packages/lightning_fabric/utilities/optimizer.py", line 28, in _optimizers_to_device
    _optimizer_to_device(opt, device)
  File "/opt/conda/lib/python3.8/site-packages/lightning_fabric/utilities/optimizer.py", line 34, in _optimizer_to_device
    optimizer.state[p] = apply_to_collection(v, Tensor, move_data_to_device, device)
  File "/opt/conda/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 59, in apply_to_collection
    v = apply_to_collection(
  File "/opt/conda/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 51, in apply_to_collection
    return function(data, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/lightning_fabric/utilities/apply_func.py", line 101, in move_data_to_device
    return apply_to_collection(batch, dtype=_TransferableDataType, function=batch_to)
  File "/opt/conda/lib/python3.8/site-packages/lightning_utilities/core/apply_func.py", line 51, in apply_to_collection
    return function(data, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/lightning_fabric/utilities/apply_func.py", line 95, in batch_to
    data_output = data.to(device, **kwargs)
RuntimeError: CUDA error: an illegal instruction was encountered
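A note on reading this trace: CUDA errors are reported asynchronously, so the data.to(device) frame in the teardown path may only be where the error surfaced, not where it was caused. One way to narrow it down is to re-run with CUDA_LAUNCH_BLOCKING=1 so that kernel launches are synchronous. Below is a minimal sketch of that, assuming the training command is otherwise unchanged; the helper names build_debug_env and rerun_with_blocking are hypothetical and not part of the repository.

```python
import os
import subprocess
import sys


def build_debug_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # With CUDA_LAUNCH_BLOCKING=1 each kernel launch blocks until it finishes,
    # so the Python traceback should point much closer to the failing op.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking(script_args):
    """Run `python <script_args...>` in a child process with the debug env.

    script_args is a list such as ["train_boxnet.py", "--train_unet", "True"]
    followed by whatever other arguments the run already uses.
    """
    cmd = [sys.executable, *script_args]
    return subprocess.run(cmd, env=build_debug_env())
```

Calling rerun_with_blocking with the same arguments as the failing run should be slower than normal but may produce a traceback that names the actual failing operation. Setting the variable in the shell before launching has the same effect; the wrapper only makes the step repeatable.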
Hello, I am very interested in this project and would like to test it. If it's not too much trouble, would you be willing to share the BoxNet weights with me? I'm also curious how long it takes to train BoxNet. Thank you.