Tensorflow 2.3 #24

marcelotrevisani · 2020-10-16T09:51:34Z

Hello folks,

Do you have any news regarding tensorflow 2.3, or an estimate of when it might be available on the main channel?

marcelotrevisani · 2020-10-16T09:57:48Z

Friendly ping to @katietz, as he was the last person to modify the recipe :)

roebel · 2020-11-25T11:54:22Z

The CPU-only packages have been available for quite a while. I wonder whether there is a problem with the packages for GPU? Are these still to arrive, or has GPU support been dropped?

Thanks

tensorflow                     2.3.0 eigen_py37h189e6a2_0  pkgs/main           
tensorflow                     2.3.0 eigen_py38h71ff20e_0  pkgs/main           
tensorflow                     2.3.0 mkl_py37h0481017_0  pkgs/main           
tensorflow                     2.3.0 mkl_py38hd53216f_0  pkgs/main

npanpaliya · 2020-11-27T10:43:21Z

Please have a look at this comment: ContinuumIO/anaconda-issues#11967.

0x1997 · 2020-12-23T05:53:57Z

Any updates on tensorflow 2.4? Is it also blocked on ContinuumIO/anaconda-issues#11967?

npanpaliya · 2020-12-23T08:42:32Z

@0x1997 The project we are working on (Open-CE, as mentioned in one of the related threads by @jayfurmanek) is about to publish another release, which includes a conda recipe for TF 2.4 (both GPU and CPU). For TF's conda recipe, you can refer to https://github.com/open-ce/tensorflow-feedstock.

katietz · 2021-03-01T14:30:45Z

I updated to tensorflow 2.4.1 for linux-64. The rc binaries can be found in my private channel 'ktietz' for testing. I will continue on Windows and MacOS builds soon too.

katietz · 2021-03-01T14:33:41Z

As a side note: the new version supports eigen, mkl, and gpu variants for linux-64.

roebel · 2021-03-01T16:40:01Z

I installed from your channel and this seems to work for me with python 3.7. I just loaded tensorflow for the moment and had it report the visible devices. That worked fine. I will put it into regular use over the following days and let you know if I find anything.
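A minimal sketch of the smoke test described above (load tensorflow and report the visible devices). The helper name `visible_devices` is hypothetical; it returns None when tensorflow cannot be imported, so it is safe to run in any environment.

```python
def visible_devices():
    """Return the names of the devices TensorFlow can see, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # With no argument, list_physical_devices() reports CPUs and GPUs alike.
    return [device.name for device in tf.config.list_physical_devices()]


print(visible_devices())
```

On a working GPU build the list would be expected to contain at least one GPU entry next to the CPU entry.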

Many thanks for the update!

roebel · 2021-03-08T08:27:44Z

It mostly works fine, but there is one issue:

WARNING:tensorflow:AutoGraph could not transform <bound method PulseWaveTable._linear_lookup of <tensorflow.python.eager.function.TfMethodTarget object at 0x7f1f4d18c610>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'

I found this thread, serge-sans-paille/gast#53, which explains that the problem comes from using gast=0.4.0 while tensorflow requires gast=0.3.3.

Indeed, the gast dependency for tensorflow 2.4.1 is still 0.3.3:
https://libraries.io/pypi/tensorflow/2.4.1

while it appears you pinned it to

tensorflow-base 2.4.1 gpu_py39h29c2da4_0
----------------------------------------
file name   : tensorflow-base-2.4.1-gpu_py39h29c2da4_0.conda
name        : tensorflow-base
version     : 2.4.1
build       : gpu_py39h29c2da4_0
build number: 0
size        : 195.2 MB
license     : Apache 2.0
subdir      : linux-64
url         : https://repo.anaconda.com/pkgs/main/linux-64/tensorflow-base-2.4.1-gpu_py39h29c2da4_0.conda
md5         : aec0b7780731b25ecff1e146c646b518
timestamp   : 2021-03-01 09:39:26 UTC
dependencies: 
...
  - gast >=0.4.0,<0.4.1.0a0
...
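A quick way to tell whether an environment is affected is to check whether the installed gast still exposes `Index`, the attribute named in the warning above. This is only a sketch; the helper names are hypothetical, and the two stand-in objects merely mimic the situation with gast 0.3.3 and gast 0.4.0.

```python
import importlib
from types import SimpleNamespace


def has_index_node(gast_module):
    """True if the given gast module still exposes the `Index` node class."""
    return hasattr(gast_module, "Index")


def check_installed_gast():
    """Return True/False for the installed gast, or None if gast is missing."""
    try:
        gast = importlib.import_module("gast")
    except ImportError:
        return None
    return has_index_node(gast)


# Stand-ins that mimic the two situations described in this thread.
old_style = SimpleNamespace(Index=object)  # behaves like gast 0.3.3
new_style = SimpleNamespace()              # behaves like gast 0.4.0

print(has_index_node(old_style))
print(has_index_node(new_style))
print(check_installed_gast())
```

If `check_installed_gast()` returns False, AutoGraph in tensorflow 2.4.1 is likely to emit the warning shown above.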

jayfurmanek · 2021-03-11T22:14:06Z

Another problem here, and this is likely more of a problem with the Anaconda cudatoolkit package, is that XLA doesn't work with the gpu version.

A good test for this can be found here:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/g3doc/tutorials/jit_compile.ipynb

When running that, I get:

2021-03-11 21:50:04.976184: W tensorflow/compiler/xla/service/gpu/buffer_comparator.cc:592] Internal: ptxas exited with non-zero error code 256, output: 
Relying on driver to perform ptx compilation. 
Setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda  or modifying $PATH can be used to set the location of ptxas
This message will only be logged once.
2021-03-11 21:50:06.579105: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:70] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2021-03-11 21:50:06.579157: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:71] Searched for CUDA in the following directories:
2021-03-11 21:50:06.579168: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   ./cuda_sdk_lib
2021-03-11 21:50:06.579176: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   /usr/local/cuda-10.1
2021-03-11 21:50:06.579183: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   .
2021-03-11 21:50:06.579191: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2021-03-11 21:50:06.582894: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:324] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2021-03-11 21:50:06.583354: I tensorflow/compiler/jit/xla_compilation_cache.cc:333] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
2021-03-11 21:50:06.583775: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at xla_ops.cc:238 : Internal: libdevice not found at ./libdevice.10.bc
Traceback (most recent call last):
  File "jit_compile.py", line 42, in <module>
    train_mnist(images, labels)
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 888, in _call
    return self._stateless_fn(*args, **kwds)
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2942, in __call__
    return graph_function._call_flat(
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1918, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 555, in call
    outputs = execute.execute(
  File "/opt/conda/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_train_mnist_204]

There are two changes that could be made to the cudatoolkit package to fix this:

The libdevice.10.bc file is in the wrong location. It's shipped in $CONDA_HOME/lib when it should probably be in $CONDA_HOME/lib64 or $CONDA_HOME/nvvm/libdevice/ or $CONDA_HOME/nvvmx/libdevice/ (or all three).
The ptxas binary, which is used by the XLA compiler, is not included in the package at all and could be dropped into $CONDA_HOME/bin.

I could put up a PR against your cudatoolkit feedstock with these changes if it would be considered.
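Until the package is changed, a user-side workaround might look like the sketch below. It rests on assumptions that are not verified here: that XLA looks for libdevice under `<cuda_data_dir>/nvvm/libdevice` (as the log above suggests) and that pointing `XLA_FLAGS=--xla_gpu_cuda_data_dir=...` at such a directory is enough. The helper name `prepare_xla_cuda_dir` and the directory names are hypothetical. It does not address the missing ptxas binary.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_xla_cuda_dir(conda_prefix, target_dir):
    """Copy libdevice.10.bc into <target_dir>/nvvm/libdevice and return an XLA_FLAGS value.

    Returns None if the file is not found under <conda_prefix>/lib.
    """
    source = Path(conda_prefix) / "lib" / "libdevice.10.bc"
    if not source.is_file():
        return None
    libdevice_dir = Path(target_dir) / "nvvm" / "libdevice"
    libdevice_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, libdevice_dir / source.name)
    return f"--xla_gpu_cuda_data_dir={Path(target_dir)}"


# Demonstration against a throwaway directory that imitates the conda layout.
with tempfile.TemporaryDirectory() as tmp:
    fake_prefix = Path(tmp) / "env"
    (fake_prefix / "lib").mkdir(parents=True)
    (fake_prefix / "lib" / "libdevice.10.bc").write_bytes(b"")
    print(prepare_xla_cuda_dir(fake_prefix, Path(tmp) / "cuda"))
```

In a real environment, the environment prefix would be passed instead of the fake one, and the returned string would be assigned to the `XLA_FLAGS` environment variable before tensorflow is imported.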

katietz · 2021-03-12T10:56:03Z

Sure, a PR would be welcome!

About the gast version: I added a hotfix for it, so that all tensorflow 2.4.1 builds will have gast 0.3.3 as a dependency. The hotfix just needs to be reviewed internally.

andrewsali · 2021-04-09T08:57:06Z

@katietz any update on the gast 0.3.3 issue? It still seems that 0.4.0 is the dependency for TF 2.4.1.

katietz · 2021-04-09T14:40:49Z

I made a hotpatch for it, so gast should now resolve to 0.3.3. The recipe itself hasn't been touched for now.

andrewsali · 2021-04-12T10:47:58Z

Thanks @katietz, is there anything that needs to be done on the client (install) side to consume this repodata hotpatch?

Currently, trying to install tensorflow==2.4.1 and gast==0.3.3 together gives an error:

Package gast conflicts for:
gast==0.3.3
tensorflow==2.4.1 -> tensorflow-base==2.4.1=gpu_py37h29c2da4_0 -> gast[version='>=0.4.0,<0.4.1.0a0']
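The conflict follows directly from the lower bound of the pin: 0.3.3 is below 0.4.0, so no solver can satisfy both requests. The sketch below illustrates only that comparison; it is a simplification with hypothetical helper names, not conda's actual matching logic, and it ignores the upper bound `<0.4.1.0a0`.

```python
def release_tuple(version):
    """Turn '0.3.3' into (0, 3, 3); only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def meets_lower_bound(version, lower_bound):
    """True if version >= lower_bound, comparing numeric components in order."""
    return release_tuple(version) >= release_tuple(lower_bound)


# Lower bound taken from the pin 'gast >=0.4.0,<0.4.1.0a0' shown in the error.
print(meets_lower_bound("0.3.3", "0.4.0"))  # False
print(meets_lower_bound("0.4.0", "0.4.0"))  # True
```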

roebel · 2021-06-22T12:51:11Z

@katietz I don't quite know what to make of this. It still does not install correctly. I think the only way to handle this currently is to install tf 2.4 and then install gast 0.3.3 afterwards with pip. Is this the intended procedure?

roebel · 2021-06-22T18:39:12Z

@katietz I don't quite know what to make of this. It still does not install correctly. I think the only way to handle this currently is to install tf 2.4 and then install gast 0.3.3 afterwards with pip and the --user flag. Is this the intended procedure?

jayfurmanek mentioned this issue Mar 12, 2021

Include ptxas in cudatoolkit package AnacondaRecipes/cudatoolkit-feedstock#15

Open
