
import frozen graph with error "Input 0 of node X was passed float from Y:0 incompatible with expected float_ref." #77

Closed
pengwa opened this issue Jul 18, 2018 · 34 comments


@pengwa
Collaborator

pengwa commented Jul 18, 2018

Note: I created this issue for anybody who might come across a similar issue in the future.

When I tried to convert a frozen DCGAN inference model (trained with https://github.com/carpedm20/DCGAN-tensorflow), the error below was thrown:

------------------- start handling dcgan.pb ----------------------------
change working directory to /home/pengwang/community/tensorflow
------ summarize the frozen graph, to get the inputs and outputs name
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/tmp/frozen/dcgan.pb
Found 2 possible inputs: (name=y, type=float(1), shape=[64,10]) (name=z, type=float(1), shape=[?,100])
No variables spotted.
Found 1 possible outputs: (name=generator/Sigmoid, op=Sigmoid)
Found 7080115 (7.08M) const parameters, 0 (0) variable parameters, and 6 control_edges
Op types used: 50 Const, 23 Identity, 8 Mul, 8 Reshape, 6 AssignSub, 6 Sub, 4 ConcatV2, 3 FusedBatchNorm, 3 Relu, 2 Add, 2 BiasAdd, 2 Conv2DBackpropInput, 2 Fill, 2 MatMul, 2 Placeholder, 1 Sigmoid
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/tmp/frozen/dcgan.pb --show_flops --input_layer=y,z --input_layer_type=float,float --input_layer_shape=64,10:-1,100 --output_layer=generator/Sigmoid
------ update the inputs and outputs name to format like input_name:index
python3 /home/pengwang/community/learning/onnx/update_name_with_index.py y,z
updated input names is y:0,z:1, output names is generator/Sigmoid:0
------ start conversion; tensorflow requires that the caller program not run from the tensorflow root folder, so switch to the current user directory with cd
using tensorflow=1.9.0-rc0, onnx=1.2.1
2018-07-17 16:10:59.166646: I tensorflow/tools/graph_transforms/transform_graph.cc:318] Applying fold_batch_norms
2018-07-17 16:10:59.194705: I tensorflow/tools/graph_transforms/transform_graph.cc:318] Applying fold_old_batch_norms
Traceback (most recent call last):
 File "/home/pengwang/community/tensorflow/_python_build/tensorflow/python/framework/importer.py", line 418, in import_graph_def
   graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node generator/g_bn0/AssignMovingAvg was passed float from generator/g_bn0/moving_mean:0 incompatible with expected float_ref.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
   "__main__", mod_spec)
 File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
   exec(code, run_globals)
 File "/home/pengwang/community/tensorflow-onnx/tf2onnx/convert.py", line 100, in <module>
   main()
 File "/home/pengwang/community/tensorflow-onnx/tf2onnx/convert.py", line 80, in main
   tf.import_graph_def(graph_def, name='')
 File "/home/pengwang/community/tensorflow/_python_build/tensorflow/python/util/deprecation.py", line 432, in new_func
   return func(*args, **kwargs)
 File "/home/pengwang/community/tensorflow/_python_build/tensorflow/python/framework/importer.py", line 422, in import_graph_def
   raise ValueError(str(e))
ValueError: Input 0 of node generator/g_bn0/AssignMovingAvg was passed float from generator/g_bn0/moving_mean:0 incompatible with expected float_ref.

This is caused by the AssignSub node: its first input is expected to be a float_ref, but after freeze_graph.py processing it is a plain float. There are discussions at davidsandberg/facenet#161 and https://www.bountysource.com/issues/36614355-unable-to-import-frozen-graph-with-batchnorm.

To fix this, we need to do some extra work on the frozen graph: at a minimum, change AssignSub to Sub. See the code below as an example:

import tensorflow as tf

from tensorflow.python.platform import gfile
model_path="/tmp/frozen/dcgan.pb"

# read graph definition
f = gfile.FastGFile(model_path, "rb")
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())

# fix nodes
for node in graph_def.node:
    if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
            if 'moving_' in node.input[index]:
                node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']

# import graph into session
tf.import_graph_def(graph_def, name='')
tf.train.write_graph(graph_def, './', 'good_frozen.pb', as_text=False)
tf.train.write_graph(graph_def, './', 'good_frozen.pbtxt', as_text=True)
@pengwa pengwa closed this as completed Jul 18, 2018
@xwyf05

xwyf05 commented Sep 15, 2018

Hi, thanks for your solution to the issue!

I got the same error while freezing the graph; after changing AssignSub to Sub, AssignAdd to Add, etc., it works!
But I find the result differs between the frozen graph and the original graph. That is to say:
I give the same input to the frozen graph and the original graph, but get different results after softmax.
Could you please explain the difference between AssignSub and Sub? Or what should I do to get the same results?
Thanks

@pengwa
Collaborator Author

pengwa commented Sep 16, 2018

@xwyf05 I assume there should be no difference in the result between AssignSub and Sub. You might have to validate whether that's the cause; for example, compare the tensors just before and after the AssignSub node to see whether the difference is introduced by this op.

@xwyf05

xwyf05 commented Sep 18, 2018

Hi:

Thanks for your suggestion. But according to the results, there is a difference between AssignSub and Sub. I did not find the exact difference between them, but it seems Sub can't update the moving_mean and moving_variance correctly. Based on that assumption, I rewrote the parameters, and it works!
The rewrite follows the formula (with beta as the scale and gamma as the offset):
output = beta * (input - mean) / sqrt(var + epsilon) + gamma
       = input * (beta / sqrt(var + epsilon)) + (gamma - beta * mean / sqrt(var + epsilon))
       = input * beta_new + gamma_new
beta_new = beta / sqrt(var + epsilon)
gamma_new = gamma - beta * mean / sqrt(var + epsilon)
moving_mean_new = 0.0
moving_var_new = 1.0
epsilon_new = 0.0
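For what it's worth, this folding can be checked numerically with plain Python (no TensorFlow needed). This is just a sketch using the same convention (beta as scale, gamma as offset); note the mean term carries a beta factor:

```python
import math

# Batch-norm inference with the convention above: beta = scale, gamma = offset.
def bn(x, mean, var, beta, gamma, eps):
    return beta * (x - mean) / math.sqrt(var + eps) + gamma

# Fold the statistics into a new scale/offset pair so that
# mean=0, var=1, eps=0 reproduce the same output.
def fold(mean, var, beta, gamma, eps):
    inv_std = 1.0 / math.sqrt(var + eps)
    beta_new = beta * inv_std
    gamma_new = gamma - beta * mean * inv_std  # the mean term is scaled by beta
    return beta_new, gamma_new

mean, var, beta, gamma, eps = 0.3, 2.0, 1.5, -0.2, 1e-5
beta_new, gamma_new = fold(mean, var, beta, gamma, eps)

# The folded parameters give the same output for any input.
for x in (-1.0, 0.0, 2.5):
    assert abs(bn(x, mean, var, beta, gamma, eps)
               - bn(x, 0.0, 1.0, beta_new, gamma_new, 0.0)) < 1e-9
```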

Thanks for your reply.

@pengwa
Collaborator Author

pengwa commented Sep 18, 2018

@xwyf05 This is interesting. There is a difference in behavior between Sub and AssignSub: after an AssignSub node runs, its ref input tensor is updated in place. I guess the input of AssignSub in your graph is used by multiple consumer nodes (which may execute in parallel); once AssignSub is triggered by the first consumer, the ref data changes, and the other consumers then see the updated value.
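As a plain-Python analogy (the helper names below are made up, not TensorFlow API): AssignSub mutates its ref input in place, so every later reader of that tensor observes the new value, while Sub is a pure op that leaves its inputs untouched:

```python
# A one-element list stands in for a ref tensor shared between consumers.
moving_mean = [10.0]

def assign_sub(ref, delta):
    ref[0] -= delta          # mutates the shared state, like AssignSub
    return ref[0]

def sub(value, delta):
    return value - delta     # pure, like Sub: no state change

first_reader = assign_sub(moving_mean, 1.0)   # returns 9.0 and updates state
second_reader = moving_mean[0]                # also sees 9.0 now

pure = sub(10.0, 1.0)                         # 9.0, but moving_mean untouched
assert moving_mean[0] == 9.0 and second_reader == 9.0 and pure == 9.0
```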

@blagodurov

I was wondering, how would you replace the 'Assign' op in this case? So far I'm doing the conversion as follows, but I encounter:
Input 0 of node conv1/7x7_s2_weight/Assign was passed float from conv1/7x7_s2_weight:0 incompatible with expected float_ref

for node in graph_def.node:
    if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
            if 'moving_' in node.input[index]:
                node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']
    elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr: del node.attr['use_locking']

@pengwa
Collaborator Author

pengwa commented Sep 24, 2018

@blagodurov there is still an Assign op in your frozen graph? What node is being "ref"ed, a Const or a Placeholder? (I assume there are no variables there, right?) If so, is it possible to replace it with "Identity"?

@OneDirection9

hi, @blagodurov :

I also met the same problem as you:

ValueError: Input 0 of node x was passed float from y incompatible with expected float_ref.

Have you solved it yet? When replacing Assign with Identity (recommended by @pengwa), the following error occurs:

ValueError: NodeDef mentions attr 'validate_shape' not in Op<name=Identity; signature=input:T -> output:T; attr=T:type>; NodeDef: InceptionV3/Conv2d_1a_3x3/weights/Assign = Identity[T=DT_FLOAT, _class=["loc:@InceptionV3/Conv2d_1a_3x3/weights"], validate_shape=true, _device="/device:CPU:0"](InceptionV3/Conv2d_1a_3x3/weights, InceptionV3/Conv2d_1a_3x3/weights/Initializer/truncated_normal). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)

Thanks.

@pengwa
Collaborator Author

pengwa commented Dec 6, 2018

Note the error "Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary."

This error is typically caused by using different tf versions to generate the graph and to load the graph.

@OneDirection9

@pengwa

Thanks for your quick reply. I will try.

@blagodurov

@OneDirection9
Here is what I ended up with. Let me know what you think.

  for node in graph_def.node:
    if node.op == 'RefSwitch':
      node.op = 'Switch'
      for index in range(len(node.input)):
        if 'moving_' in node.input[index]:
          node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
      node.op = 'Sub'
      if 'use_locking' in node.attr: del node.attr['use_locking']
    elif node.op == 'AssignAdd':
      node.op = 'Add'
      if 'use_locking' in node.attr: del node.attr['use_locking']
    elif node.op == 'Assign':
      node.op = 'Identity'
      if 'use_locking' in node.attr: del node.attr['use_locking']
      if 'validate_shape' in node.attr: del node.attr['validate_shape']
      if len(node.input) == 2:
        # input0: ref: Should be from a Variable node. May be uninitialized.
        # input1: value: The value to be assigned to the variable.
        node.input[0] = node.input[1]
        del node.input[1]

@OneDirection9

@blagodurov

Thanks for your reply.

@pengwa
Collaborator Author

pengwa commented Jan 25, 2019

Today I revisited this problem with the error "Input 0 of node bilm/Assign was passed float from bilm/Variable:0 incompatible with expected float_ref." This time, "bilm/Assign"'s first input is a ref, so @blagodurov's code nicely changed the Assign op to Identity and removed the useless input. I think this should be part of the pre-processing before conversion.


@TanCari

TanCari commented Mar 1, 2019

Hi guys, the following simple graph compiles correctly on my computer:

graph_unique = tf.Graph()

with graph_unique.as_default():
    v = tf.get_variable(name='v', shape=[2], dtype=tf.float64)
    x = v[:1]
    y = v[1:]
    c = tf.add(x, y, name='c')
    gra = tf.gradients([c], [x])

When I try to write it as a stack of two graphs, though, I get an error.
I guess it is related to this post, and I'd appreciate some help.
Here is my procedure:

adr_big = ''  # please add a valid address
adr_small = ''  # please add a valid address

graph_small = tf.Graph()

with graph_small.as_default():
    v = tf.get_variable(name='v', shape=[2], dtype=tf.float64)
    x = tf.identity(v[:1], name='x')
    y = tf.identity(v[1:], name='y')
    s_small = tf.train.Saver()

with tf.Session(graph=graph_small) as sess:
    sess.run(tf.global_variables_initializer())
    s_small.save(sess, adr_small)

graph_big = tf.Graph()

with graph_big.as_default():
    a = tf.get_variable(name='a', shape=[1], dtype=tf.float64)
    b = tf.get_variable(name='b', shape=[1], dtype=tf.float64)
    c = tf.add(a, b, name='c')
    s = tf.train.Saver()

with tf.Session(graph=graph_big) as sess:
    sess.run(tf.global_variables_initializer())
    s.save(sess, adr_big)

graph_together = tf.Graph()

with graph_together.as_default():
    tf.train.import_meta_graph(adr_small + '.meta', import_scope='g_small')
    x = graph_together.get_tensor_by_name('g_small/x:0')
    y = graph_together.get_tensor_by_name('g_small/y:0')
    tf.train.import_meta_graph(adr_big + '.meta', import_scope='g_big', input_map={'a:0': x, 'b:0': y})
    c = graph_together.get_tensor_by_name('g_big/c:0')
    gra = tf.gradients([c], [x])

Tensorflow says:
[...]
InvalidArgumentError: Input 0 of node gg/a/Assign was passed double from g_small/x:0 incompatible with expected double_ref.
During handling of the above exception, another exception occurred:
[...]
ValueError: Input 0 of node gg/a/Assign was passed double from g_small/x:0 incompatible with expected double_ref.

Please notice: everything works correctly if the definition of graph_small above is replaced with the following one:

with graph_small.as_default():
    tf.get_variable(name='x', shape=[1], dtype=tf.float64)
    tf.get_variable(name='y', shape=[1], dtype=tf.float64)
    s_small = tf.train.Saver()

Thanks a lot!

@pengwa
Collaborator Author

pengwa commented Mar 2, 2019

@TanCari are you having an issue using the tf2onnx tool? If not, I think you should ask in the tf repo instead.

@WiiliamC

Hello, guys.
I have a similar error:
ValueError: Input 0 of node final_training_ops/Variable/Assign was passed float from final_training_ops/Variable:0 incompatible with expected float_ref.
and when I try to fix it with

        for node in graph_def.node:
            if node.op == 'Assign':
                node.op = 'Identity'

I find a weird error:
Instance of 'GraphDef' has no 'node' member
at graph_def.node
Does anyone know how to fix this?
Btw, I always receive this warning:
No name 'python' in module 'tensorflow'
at
from tensorflow.python.platform import gfile
But I can still run my program with it, which is pretty amazing.
Here is some information that may help:

os: Ubuntu 16.04
tensorflow version: 1.13.1
python version: both 2.7.12 and 3.5.2
IDE: VS code

So, any advice?

@nbcsm
Collaborator

nbcsm commented Mar 13, 2019

It seems to be a TF issue actually, so you may want to ask in the TF repo to get better help. :-)

@TanCari

TanCari commented Mar 15, 2019

Hi!

I posted the question in the tensorflow repository as a bug, but it got closed immediately:

tensorflow/tensorflow#26346 (comment)

A new, more concise version of my question is available here:

https://stackoverflow.com/questions/55176530/addin-variables-upstream-to-compute-hessian-matrix

@OneDirection9

Hi, @TanCari

I think it's better to list your environment, e.g. Ubuntu 16.04, tensorflow 1.13.

This can help people reproduce your problem and solve it.

Thanks.

@arvidzt

arvidzt commented May 6, 2019

with tf.Session() as sess:
    saved_model_dir = "saved_model_dir_signature"
    meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], '')
    for node in sess.graph_def.node:
      if node.op == 'RefEnter':
        node.op = 'Enter'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']
      elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr: del node.attr['use_locking']

How can I modify the graph_def inside a session? If I do it this way, the model saved by sess doesn't change RefSwitch to Switch.
Can someone tell me how to modify the graph_def in sess? Thanks.

@lucienwang1009
Collaborator

> How can I modify the graph_def inside a session? If I do it this way, the model saved by sess doesn't change RefSwitch to Switch.

You might need to re-import the graph_def with tf.import_graph_def in a new session.

@ghoshaw

ghoshaw commented May 24, 2019

In my experience, this is caused by using tf.cond in BN, so using the BN in tf.layers or tc.layers is a good choice. But note that you must use it correctly; refer to https://towardsdatascience.com/pitfalls-of-batch-norm-in-tensorflow-and-sanity-checks-for-training-networks-e86c207548c8
hope that helps!

@Freephi

Freephi commented Jun 29, 2019

Hi @blagodurov, I did as you said, and the pb file was saved successfully with batch size 128.
But when I restore the pb file and set batch size to 1, another error occurs: InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 345600 values, but the requested shape has 1474560 [[node import/Generator/Reshape ]]

@jenniferchiang

> Here is what I ended up with. Let me know what you think. [quoting @blagodurov's Assign-rewrite snippet above]

Awesome! Thanks for your excellent and amazing code!

@carlosgalvezp

@blagodurov I'm getting a similar error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node import/conv_transposed/IsVariableInitialized was passed float from import/conv_transposed/bias:0 incompatible with expected float_ref.

What's the magic bugfix for that op? :)

@akash-yadagouda

akash-yadagouda commented Jun 13, 2020

ValueError: Input 0 of node Rbn1a/cond/AssignMovingAvg was passed float from Rbn1a/cond/AssignMovingAvg/Switch:1 incompatible with expected float_ref.

I got the above error when I tried to load the .pb file in my TensorFlow code.
I solved the issue with the help of the above code.

Thank you! I had been trying to solve this problem for the last 5 days and was unable to, but your method works fine for me.
akashyadagoud@gmail.com

@guschmue
Contributor

float_ref is a variable. Basically, freezing the graph did not work completely. We rarely see models implemented like this; an example might be a dropout op whose keep_prob comes from a variable that is not initialized.

@hahadashi

@OneDirection9 @carlosgalvezp @blagodurov hi all, I get "fold_constants: Ignoring error Input 7 of node model/rnn/while was passed float from rnn/bias:0 incompatible with expected resource." when I use the transform tool on the pb model. Can you give me some advice? Thanks

@ashnaeldho

@OneDirection9 @carlosgalvezp @blagodurov hi all,
When I tried to convert the retinaface model (https://github.com/peteryuX/retinaface-tf2), I got the following error:

Input 0 of node RetinaFaceModel/cond/StatefulPartitionedCall/Switch_1 was passed float from Conv1/kernel_1:0 incompatible with expected resource.

I tried the codes mentioned above, but they don't work for me.
Can someone help me solve this?

@adong7639

> Here is what I ended up with. Let me know what you think. [quoting @blagodurov's Assign-rewrite snippet above]

Thanks, this solved my problem.

@hoaquocphan

Hi @ashnaeldho
I have the same error log as you when I tried to import the graphdef.
Do you have any solution to fix your issue?

@zj2050

zj2050 commented Aug 12, 2021

> [quoting the original post and fix script above]

I copied the code above, but running tf.import_graph_def(graph_def, name='') still gives the error: Tensorflow.InvalidArgumentError: "Input 0 of node conv1/W/Assign was passed float from conv1/W:0 incompatible with expected float_ref."

@flyzyh

flyzyh commented Nov 11, 2021

Hey guys, how can I fix the same problem using Java?
I tried to import a *.pb file and hit the same error.

@Hyrtsi

Hyrtsi commented Mar 16, 2022

There are plenty of answers already, but none of them fixed it for me, so I'd like to share how I did. Hopefully this saves someone else's time, even though my solution isn't that fancy.

My error:

Traceback (most recent call last):
 File "/home/<.....>/.venv/lib/python3.7/site-packages/tensorflow_core/python/framework/importer.py", line 501, in _import_graph_def_internal
   graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node hourglass/pre/BatchNorm/cond_1/AssignMovingAvg/Switch was passed float from hourglass/pre/BatchNorm/moving_mean:0 incompatible with expected float_ref.

The command I used:

python3 -m tf2onnx.convert \
--checkpoint ../outputs/ELG_i60x36_f60x36_n32_m2/my_checkpoints/hourglass/model-666.meta \
--output model2.onnx \
--inputs Webcam/fifo_queue_DequeueMany:1 \
--outputs hourglass/hg_2/after/hmap/conv/BiasAdd:0,upscale/mul:0,radius/out/fc/BiasAdd:0 \
--verbose

The code and model I used: https://github.com/swook/GazeML -> ELG model

My system:

  • Ubuntu 20.04
  • Python 3.7.12
  • Tensorflow 1.15
  • Tensorflow-gpu 1.15
  • tf2onnx 1.9.3
  • onnx 1.11.0
  • onnxruntime 1.10.0

Solution:

  • downgrade tensorflow-gpu to 1.14
  • downgrade tensorflow to 1.14
  • rerun the same conversion commands as above, and it works

I know it's supposed to work with tensorflow 1.15, but in my situation this is an acceptable solution.

(I tried removing the batchnorm layer entirely before realizing the solution above, but then I ran into another problem. If you are in the same situation as me but must use tf 1.15, you're better off following the other people's instructions.)

@Hyrtsi

Hyrtsi commented Apr 12, 2022

Hey guys. How to fix the same problem using java? I tried to import a *.pb file and met the same error.

@flyzyh could you share the code you used and the error message you got?
