immigrating to rtx 3080 error #2427

palram-vcr · 2020-11-19T12:22:53Z

system setup :
rtx 3080
python 3.7.4
windows 10
tensorflow : 2.5.0-dev20201118 (nightly build)
keras : 2.4.3
cuda : 11.0
cudnn : 8.0.4.30
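
When comparing setups like the one above, it can help to print the installed package versions programmatically. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper name `report_versions` is made up for illustration. The CUDA and cuDNN versions are not pip packages, so they still have to be read from the NVIDIA install itself.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Nightly builds are published under a different distribution name (tf-nightly).
    found = report_versions(["tensorflow", "tf-nightly", "keras"])
    for name, version in found.items():
        print(name, version or "not installed")
```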

background:
The model ran fine on the old GPU (RTX 2060). After switching to the RTX 3080, TensorFlow took a long time on some operations unrelated to the model, for example the line `tf.constant([[1.0,2.0,3.0],[4.0,5.0,6.0]])`.
It also displayed NaN for the loss values (all except the classification loss, which displayed one constant value).
The long waits were a TensorFlow version issue, fixed by upgrading to the nightly build (2.5.0-dev20201118) and the other requirements listed in the system setup above. However, the upgrade resulted in errors when running the model.
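
For anyone checking whether their own install shows the same delay, timing the same operation several times can separate a one-time startup cost from a slowdown that recurs on every call. A rough sketch: `time_call` is a hypothetical helper, and the TensorFlow part only runs if `tensorflow` can be imported.

```python
import time


def time_call(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


try:
    import tensorflow as tf
except ImportError:  # keep the sketch usable on machines without TensorFlow
    tf = None

if tf is not None:
    print("GPUs:", tf.config.list_physical_devices("GPU"))
    timings = time_call(lambda: tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
    # A first call far slower than the rest points to one-time initialization cost.
    print(["%.3fs" % t for t in timings])
```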

issue description:
When starting training, I came across the following error:

Exception has occurred: TypeError
Could not build a TypeSpec for <KerasTensor: shape=(None, None, 4) dtype=float32 (created by layer 'tf.math.truediv')> with type KerasTensor
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\type_spec.py", line 554, in type_spec_from_value
(value, type(value).__name__))
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\keras_tensor.py", line 205, in from_tensor
type_spec = type_spec_module.type_spec_from_value(tensor)
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\keras_tensor.py", line 606, in keras_tensor_from_tensor
out = keras_tensor_cls.from_tensor(tensor)
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\util\nest.py", line 672, in <listcomp>
structure[0], [func(*x) for x in entries],
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\util\nest.py", line 672, in map_structure
structure[0], [func(*x) for x in entries],
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\base_layer.py", line 871, in _infer_output_signature
keras_tensor.keras_tensor_from_tensor, outputs)
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\base_layer.py", line 824, in _keras_tensor_symbolic_call
return self._infer_output_signature(inputs, args, kwargs, input_masks)
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1093, in _functional_construction_call
inputs, input_masks, args, kwargs)
File "C:\Users\ashaf102\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\engine\base_layer.py", line 954, in __call__
input_list)
File "C:\AsafProj\DefectDetector\defect_detection\mask\mrcnn\model.py", line 1880, in build
x, K.shape(input_image)[1:3]))(input_gt_boxes)
File "C:\AsafProj\DefectDetector\defect_detection\mask\mrcnn\model.py", line 1841, in __init__
self.keras_model = self.build(mode=mode, config=config)
File "C:\AsafProj\DefectDetector\defect_detection\defectTrain.py", line 41, in __init__
self.model = modellib.MaskRCNN(mode="training", config=self.config,model_dir=self.logDir)
File "C:\AsafProj\DefectDetector\asaf_defect.py", line 28, in <module>
trainer = Train.defectTrainer()
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\Lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)

Thanks in advance

palram-vcr · 2020-11-23T11:45:52Z

hi all.

After much research, the following working setup was found:

1. Update the code from the leekunhee fork of the matterport repo to get compatibility with TF > 2.0:
   https://github.com/leekunhee/Mask_RCNN
2. Upgrade Python to version 3.8.
3. Install TensorFlow from this repo: https://github.com/fo40225/tensorflow-windows-wheel/tree/master/2.3.0/py38/CPU%2BGPU/cuda110cudnn8avx2
4. CUDA: 11.1
5. cuDNN: 8.0.5.39
6. Mask R-CNN requirements after the above installation:
   keras 2.4.3 (latest stable)
   scikit-image
   imgaug
   IPython

palram-vcr · 2020-11-24T06:34:15Z

solved

BasemE · 2020-11-25T13:25:27Z

@palram-vcr What command did you use to install TensorFlow?

palram-vcr · 2020-11-25T13:50:32Z

pip install "path to wheel file"

Good luck

BasemE · 2020-11-25T15:18:53Z

Thanks, I installed everything exactly as you did, but I get this error when I use TensorFlow:
```
>>> import tensorflow as tf
2020-11-25 07:00:12.666991: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_110.dll

>>> physical_devices = tf.config.list_physical_devices('GPU')
2020-11-25 07:00:17.026770: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-11-25 07:00:17.059313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2020-11-25 07:00:17.059502: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_110.dll
2020-11-25 07:00:17.067645: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_11.dll
2020-11-25 07:00:17.071521: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-11-25 07:00:17.073055: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-11-25 07:00:17.074451: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2020-11-25 07:00:17.078856: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_11.dll
2020-11-25 07:00:17.079548: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_8.dll
```

BasemE · 2020-11-25T15:20:29Z

@palram-vcr Did you get this error?

 dlerror: cusolver64_10.dll not found

palram-vcr · 2020-11-26T06:49:36Z

Try getting this version of the DLL from an earlier version of CUDA and putting it in the bin directory of the CUDA 11.1 install.
👍🏿

BasemE · 2020-11-27T10:05:31Z

@palram-vcr Thanks a lot. Your solution worked perfectly.
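
Before copying DLLs around, it may be worth listing which of the libraries named in the log above are actually missing from the CUDA `bin` directory. A minimal sketch using only the standard library: `missing_dlls` is a made-up helper, and the install path in the usage lines is an assumption about a default CUDA 11.1 install on Windows, so adjust it to your machine. It checks a single directory only, whereas the loader may also find libraries elsewhere on the PATH.

```python
from pathlib import Path

# Library names taken from the dso_loader messages in the log above.
EXPECTED_DLLS = [
    "cudart64_110.dll",
    "cublas64_11.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_10.dll",
    "cusparse64_11.dll",
    "cudnn64_8.dll",
]


def missing_dlls(bin_dir, names=EXPECTED_DLLS):
    """Return the names from `names` that do not exist as files in `bin_dir`."""
    bin_dir = Path(bin_dir)
    return [name for name in names if not (bin_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed default location of a CUDA 11.1 install on Windows; change as needed.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin"
    print("missing:", missing_dlls(cuda_bin))
```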

EvgeneKuklin · 2021-01-20T11:46:44Z

hi all.

after much research the following working setup was found:

1. update code from leekunhee fork of matterport repo to get compatibility with TF > 2.0
   https://github.com/leekunhee/Mask_RCNN

2. upgrade python to version 3.8

3. install tensorflow from repo : https://github.com/fo40225/tensorflow-windows-wheel/tree/master/2.3.0/py38/CPU%2BGPU/cuda110cudnn8avx2

4. CUDA :  11.1

5. cudnn:  8.0.5.39

6. mask rcnn requirements after above installation :
   keras 2.4.3 (latest stable)
   scikit-image
   imgaug
   IPython

Thank you!

Also works with:
Nvidia RTX 3090
460.89-desktop-win10-64bit-international-dch-whql
cuda_11.0.2_451.48_win10
cudnn-11.0-windows-x64-v8.0.4.30

javoweb · 2021-01-25T16:49:45Z

@palram-vcr Hello! Maybe you have a modified version of TF for Linux that is working with your setup?

palram-vcr · 2021-02-04T08:00:46Z

@javoweb I'm currently not running on a Linux machine, but you can check out the NVIDIA build of TensorFlow. That way you can leave the Mask R-CNN files unchanged. The link is:
https://docs.nvidia.com/deeplearning/frameworks/tensorflow-wheel-release-notes/tf-wheel-rel.html

javoweb · 2021-02-04T11:58:07Z

@palram-vcr Great, thanks!

DimChatz · 2021-02-04T18:10:33Z

@palram-vcr I am trying to install the TensorFlow wheel using `pip3 install path/to/folder`, but I get this error:
Defaulting to user installation because normal site-packages is not writeable
ERROR: Directory '/home/tzikos/2.3.0/py38/CPU+GPU/cuda110cudnn8avx2' is not installable. Neither 'setup.py' nor 'pyproject.toml' found
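
The error above suggests pip was given a directory, while `pip install` expects the path to the `.whl` file itself. A small sketch of locating the wheel inside a downloaded folder and building the install command; `find_wheels` and `pip_install_command` are hypothetical helper names. Note that the repository linked earlier is named tensorflow-windows-wheel, so its wheels are presumably Windows builds and may not install on a Linux machine regardless of the path.

```python
import sys
from pathlib import Path


def find_wheels(folder):
    """Return all .whl files found directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.whl"))


def pip_install_command(wheel_path):
    """Build the argument list for installing one wheel with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", str(wheel_path)]


if __name__ == "__main__":
    folder = "."  # replace with the folder that holds the downloaded wheel
    wheels = find_wheels(folder)
    if not wheels:
        print(f"no .whl file found in {folder}")
    for wheel in wheels:
        # Printed rather than executed, so the command can be reviewed first.
        print(" ".join(pip_install_command(wheel)))
```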

alcarazolabs · 2021-02-08T00:43:47Z

Thanks, it worked with TensorFlow 2.4.1. Training runs faster than with TensorFlow 1.5.0, which was used by matterport.
https://github.com/leekunhee/Mask_RCNN

eladmeir · 2021-03-10T17:24:04Z

Hey all

Glad to hear that some of you got it working; this issue is everywhere in this repo.

My question to those who have made this work: it is well known that using tensorflow.keras side by side with keras is not recommended, yet that is what this solution requires. How come?
Are you indeed using tensorflow.keras functions in combination with keras functions, or have you changed the code in some way?

Special thanks to @palram-vcr for finding the solution, and to @alcarazolabs for the update on TF 2.4.1.
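
One way to answer this for a given checkout is to scan its source for both import styles. A minimal sketch using the standard `ast` module; `keras_import_styles` is a made-up helper. It only looks at import statements, so attribute access such as `tf.keras.layers` after `import tensorflow as tf` is not detected.

```python
import ast
from pathlib import Path


def keras_import_styles(source):
    """Return which of 'keras' and 'tensorflow.keras' the source imports."""
    styles = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Include "module.name" so `from tensorflow import keras` is caught.
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == "keras" or name.startswith("keras."):
                styles.add("keras")
            elif name == "tensorflow.keras" or name.startswith("tensorflow.keras."):
                styles.add("tensorflow.keras")
    return styles


if __name__ == "__main__":
    # "mrcnn" is the package directory seen in the traceback paths; adjust as needed.
    for path in sorted(Path("mrcnn").glob("*.py")):
        styles = keras_import_styles(path.read_text(encoding="utf-8"))
        if len(styles) > 1:
            print(f"{path}: mixes {sorted(styles)}")
```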

AndySung320 · 2021-03-22T14:36:23Z

@EvgeneKuklin
@palram-vcr
Thank you very much!!
Also works with:
RTX 3060
cuda_11.0.2_451.48_win10
cudnn-11.0-windows-x64-v8.0.4.30
TF: 2.3.0
Keras: 2.4.3

changbinlu · 2021-04-19T08:51:42Z

Thank you very much!!

bigeyesung · 2021-05-06T11:43:14Z

Also works with:
RTX 3060
Python 3.8
cuda_11.1 Linux
cudnn-8.0.5
TF: 2.4.1
Keras: 2.4.3

mahaairshad · 2023-10-01T14:31:17Z

The combination that worked for me with 3060Ti:

Python 3.8
CUDA Toolkit 11.1.1
CUDNN 8.0.5.39
tensorflow 2.4
keras 2.4.3
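
For quick reference, the combinations reported as working in this thread can be collected into a small lookup. This only restates what posters wrote above and is not a tested compatibility matrix; `None` marks values a poster did not state explicitly, and `combos_for` is a made-up helper name.

```python
# Combinations reported as working in this thread; None means "not stated by the poster".
REPORTED_COMBOS = [
    {"gpu": "RTX 3080", "python": "3.8", "tensorflow": "2.3.0",
     "cuda": "11.1", "cudnn": "8.0.5.39", "keras": "2.4.3"},
    {"gpu": "RTX 3090", "python": None, "tensorflow": None,
     "cuda": "11.0.2", "cudnn": "8.0.4.30", "keras": None},
    {"gpu": "RTX 3060", "python": None, "tensorflow": "2.3.0",
     "cuda": "11.0.2", "cudnn": "8.0.4.30", "keras": "2.4.3"},
    {"gpu": "RTX 3060", "python": "3.8", "tensorflow": "2.4.1",
     "cuda": "11.1", "cudnn": "8.0.5", "keras": "2.4.3"},
    {"gpu": "RTX 3060 Ti", "python": "3.8", "tensorflow": "2.4",
     "cuda": "11.1.1", "cudnn": "8.0.5.39", "keras": "2.4.3"},
]


def combos_for(gpu):
    """Return reported combinations whose GPU name equals `gpu` (case-insensitive)."""
    wanted = gpu.strip().lower()
    return [combo for combo in REPORTED_COMBOS if combo["gpu"].lower() == wanted]


if __name__ == "__main__":
    for combo in combos_for("RTX 3060"):
        print(combo)
```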

palram-vcr closed this as completed Nov 24, 2020

RisingPhoelix mentioned this issue Jan 25, 2021

TypeError: Could not build a TypeSpec for <KerasTensor: shape=(None, None, 4) #2458

Open

konstantin-frolov mentioned this issue Mar 10, 2021

Mask RCNN using CPU instead of GPU for RTX3090 #2503

Open
