Error loading a frozen graph ( float incompatible with float_ref ) #161

Closed
mhaghighat opened this issue Feb 15, 2017 · 26 comments

Comments

@mhaghighat
Contributor

I froze the 20170131-234652 model using the freeze_graph.py, but I cannot load it in C++.

I first read the binaryproto successfully as:

tensorflow::GraphDef graph_def;
Status load_graph_status =  ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);

But, it gives an error while creating the graph to be used for the session:

std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(tensorflow::SessionOptions()));
tensorflow::Status sessionCreateStatus = session->Create(graph_def);

The error is:

Invalid argument: Input 0 of node InceptionResnetV1/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/cond/AssignMovingAvg_1/Switch was passed float from InceptionResnetV1/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance:0 incompatible with expected float_ref.

Any ideas how to solve this problem?

Thanks in advance :)

@mhaghighat
Contributor Author

BTW, this error also happens when I load the .pb model in Python:

ValueError: graph_def is invalid at node 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/cond/AssignMovingAvg/Switch': Input tensor 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/moving_mean:0' Cannot convert a tensor of type float32 to an input of type float32_ref.

@mhaghighat mhaghighat changed the title Loading a frozen graph in C++ ( float incompatible with float_ref ) Error loading a frozen graph ( float incompatible with float_ref ) Feb 15, 2017
@Lunrot

Lunrot commented Feb 21, 2017

@mhaghighat
Contributor Author

Hi @Lunrot,
I've already tried the solution mentioned in this link, but it does not work in our case.

@Lunrot

Lunrot commented Feb 21, 2017

@mhaghighat
Does the same error occur in python?
I have not tested it (freeze & load) in C++, but there is no error in Python.

@mhaghighat
Contributor Author

Yes, it gives this error in Python:

ValueError: graph_def is invalid at node 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/cond/AssignMovingAvg/Switch': Input tensor 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/moving_mean:0' Cannot convert a tensor of type float32 to an input of type float32_ref.

@Lunrot

Lunrot commented Feb 21, 2017

my code

# Assumes sess is the active tf.Session and args holds the parsed command-line options.
saver = tf.train.import_meta_graph(os.path.join(os.path.expanduser(args.model_dir),
    'model-' + os.path.basename(os.path.normpath(args.model_dir)) + '.meta'), clear_devices=True)
tf.get_default_session().run(tf.global_variables_initializer())
tf.get_default_session().run(tf.local_variables_initializer())
saver.restore(sess, tf.train.latest_checkpoint(os.path.expanduser(args.model_dir)))

output_node_names = 'embeddings'

# for fixing the bug of batch norm: patch one GraphDef copy (gd) and reuse it below
gd = sess.graph.as_graph_def()
for node in gd.node:
    if node.op == 'RefSwitch':
        node.op = 'Switch'
        # range instead of xrange so the loop also runs on Python 3
        for index in range(len(node.input)):
            if 'moving_' in node.input[index]:
                node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']
    elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr: del node.attr['use_locking']

# pass the patched gd, not a fresh sess.graph.as_graph_def()
converted_graph_def = graph_util.convert_variables_to_constants(sess, gd, output_node_names.split(","))
tf.train.write_graph(converted_graph_def, args.output_dir, args.output_filename, as_text=False)
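
The node rewrite above can also be factored into a standalone helper. The sketch below is only an illustration, not the code merged into the repository: patch_batch_norm_nodes is a hypothetical name, and it relies only on each node exposing op, input, and attr, so it is exercised here with plain stand-in objects rather than real NodeDef messages.

```python
from types import SimpleNamespace

# Ops that update a variable in place, mapped to a plain arithmetic equivalent.
_ASSIGN_TO_ARITHMETIC = {'AssignSub': 'Sub', 'AssignAdd': 'Add'}


def patch_batch_norm_nodes(nodes):
    """Rewrite ref-typed batch-norm ops in place; return the patched node names."""
    patched = []
    for node in nodes:
        if node.op == 'RefSwitch':
            node.op = 'Switch'
            # Read the moving statistics through their '/read' identity op,
            # which yields a plain tensor instead of a reference.
            for index in range(len(node.input)):
                if 'moving_' in node.input[index]:
                    node.input[index] = node.input[index] + '/read'
            patched.append(node.name)
        elif node.op in _ASSIGN_TO_ARITHMETIC:
            node.op = _ASSIGN_TO_ARITHMETIC[node.op]
            # 'use_locking' is not an attribute of Sub/Add, so drop it.
            if 'use_locking' in node.attr:
                del node.attr['use_locking']
            patched.append(node.name)
    return patched


# Stand-in nodes (hypothetical names) that mimic the fields of a NodeDef.
demo_nodes = [
    SimpleNamespace(name='bn/cond/Switch', op='RefSwitch',
                    input=['bn/moving_mean', 'bn/cond/pred_id'], attr={}),
    SimpleNamespace(name='bn/AssignMovingAvg', op='AssignSub',
                    input=['bn/moving_mean', 'bn/mul'],
                    attr={'use_locking': False}),
    SimpleNamespace(name='conv', op='Conv2D', input=['input', 'weights'], attr={}),
]
patched_names = patch_batch_norm_nodes(demo_nodes)
```

On a real graph, the helper would be called as patch_batch_norm_nodes(gd.node), on the same gd object that is later passed to convert_variables_to_constants.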

@mhaghighat
Contributor Author

@Lunrot:

I had done the same, but the resulting protobuf was not loading successfully. The only difference between my code and yours was that I was passing sess.graph.as_graph_def() directly as the second argument of convert_variables_to_constants, as in:

output_graph_def = graph_util.convert_variables_to_constants(
                sess, sess.graph.as_graph_def(), output_node_names.split(","))

However, if I change it to the way that you've done, creating gd = sess.graph.as_graph_def(), and then:

output_graph_def = graph_util.convert_variables_to_constants(
                sess, gd, output_node_names.split(","))

there is no error anymore.

I know it sounds absurd, but this is the case!!!
The problem is solved but I'm still confused, why???

@Lunrot

Lunrot commented Feb 22, 2017

Since gd (= sess.graph.as_graph_def()) was changed during the bug fixing, gd should be used instead of sess.graph.as_graph_def().

@mhaghighat
Contributor Author

I had the sess.graph.as_graph_def() in the bug fixing loop as:

for node in sess.graph.as_graph_def().node:  
    [perform all the fixing]

but the resulting protobuf still had the issue.

Maybe, it cannot alter the nodes in the original sess.graph.as_graph_def(); so a copy of it (i.e., gd) needs to be created on which we can perform the bug fixing. Can it be right?!
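
That guess is consistent with how Graph.as_graph_def() behaves: it serializes the graph into a new GraphDef message on every call, so the loop edits one temporary copy and the later call builds another, unpatched one. The sketch below imitates that behaviour with a stand-in class (FakeGraph is hypothetical and only mimics the copy-per-call semantics), so it runs without TensorFlow.

```python
import copy


class FakeGraph:
    """Hypothetical stand-in for a graph whose as_graph_def() returns a fresh copy."""

    def __init__(self, ops):
        self._ops = ops  # internal state, never handed out directly

    def as_graph_def(self):
        # Each call builds a new, independent snapshot of the node list.
        return copy.deepcopy(self._ops)


graph = FakeGraph([{'name': 'bn/cond/Switch', 'op': 'RefSwitch'}])

# Pattern 1: patch a temporary copy, then export another copy. The fix is lost.
for node in graph.as_graph_def():
    node['op'] = 'Switch'
lost = graph.as_graph_def()

# Pattern 2: keep one copy, patch it, and export that same object.
gd = graph.as_graph_def()
for node in gd:
    node['op'] = 'Switch'
kept = gd
```

In the first pattern the exported copy still contains RefSwitch; in the second it contains Switch, which matches what was observed in this thread.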

@ugtony

ugtony commented Feb 22, 2017

@Lunrot, thanks for your code. I made the same mistake as mhaghighat did (thanks also to @mhaghighat for finding out the difference).

I noted that tf.gfile.GFile is replaced by tf.train.write_graph. What's the difference between the two functions? Can they be used for save/load interchangeably?

I found that some of the tensors' shape information is eliminated from the frozen model.
For example, the input shape was (?, 160, 160, 3) in the original model but became unknown in the frozen model. It made me unable to use tensor.get_shape() to check the input shape.

Out of curiosity, I fed the network some inputs with different shapes:
(90, 160, 160, 3), (90, 220, 280, 3), (90, 160, 160, 1)
to see if the network can work with any input shape. The last one failed. It means that an unknown shape doesn't imply that all shapes are allowed.
So now I have to hard-code a proper input size in my program, which is not so convenient.
Do you have any idea why the shape information was gone after the freezing operation?

@Lunrot

Lunrot commented Feb 22, 2017

@mhaghighat I think so.
@ugtony (90, 160, 160, 1) is probably a black and white image. If you change that image to (90,160,160,3), I think it will work correctly.

@ugtony

ugtony commented Feb 22, 2017

Hi @Lunrot,
I can understand why (90, 160, 160, 1) doesn't work and why (90, 220, 280, 3) does work for this network. But without shape information, it's not easy to use the frozen model for those who haven't seen inception_resnet_v1.py before.
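
One way to make that constraint explicit for users of the frozen model is to validate inputs before feeding them. The sketch below is a hypothetical helper (check_input_shape is not part of facenet); it encodes only what the experiments in this thread suggest: a 4-D batch with 3 channels, with height and width left free.

```python
def check_input_shape(shape, channels=3):
    """Raise ValueError unless shape looks like (batch, height, width, channels)."""
    if len(shape) != 4:
        raise ValueError('expected a 4-D shape, got %d dimensions' % len(shape))
    if shape[3] != channels:
        raise ValueError('expected %d channels, got %d' % (channels, shape[3]))
    return True


def is_valid(shape):
    """Boolean wrapper around check_input_shape for quick checks."""
    try:
        return check_input_shape(shape)
    except ValueError:
        return False


# The three shapes tried earlier in this thread.
results = [is_valid(s) for s in [(90, 160, 160, 3), (90, 220, 280, 3), (90, 160, 160, 1)]]
```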

@Lunrot

Lunrot commented Feb 22, 2017

@ugtony
The CNN model is just a classifier. It is common to perform image preprocessing (image resizing, cropping, etc.) on CNN input values.

@tengshaofeng

@mhaghighat
Hi, can you show me the arguments you pass to python freeze_graph.py?
Can you also show me the code you use to load the frozen model?

@tengshaofeng

@mhaghighat
@Lunrot
@ugtony
@rtkaleta
@scotthong
Hi all,
after converting model.ckpt to model.pb, how do I load the model from model.pb?

@ugtony

ugtony commented Feb 22, 2017

@tengshaofeng
A great tutorial is here.

@mhaghighat
Contributor Author

@tengshaofeng:
I just submitted a pull request with the updated freeze_graph.py. This is how I call the function:

python freeze_graph.py ~/models/facenet/20170131-234652 ~/models/facenet/20170131-234652/facenet.pb

For loading the protobuf graph in Python, you can use:

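import tensorflow as tf

def load_graph(frozen_graph_filename):
    # Parse the serialized GraphDef from the frozen .pb file.
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Import it into a fresh graph; node names get the default "import/" prefix.
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(
            graph_def, 
            input_map=None, 
            return_elements=None, 
            op_dict=None, 
            producer_op_list=None
        )
        
    return graph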


graph = load_graph('./facenet/facenet.pb')

For loading it in C++, you can use:

tensorflow::GraphDef graphDef;
tensorflow::ReadBinaryProto(tensorflow::Env::Default(), "facenet.pb", &graphDef);

std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(tensorflow::SessionOptions()));
tensorflow::Status sessionCreateStatus = session->Create(graphDef);

@davidsandberg
Owner

This has been fixed in #172.

@tengshaofeng

@mhaghighat
@ugtony
thanks so much.

@NicoCoallier

In my case, I had this error because I was converting all of my variables into constants. When I listed only the correct operations in output_node_names, the loading succeeded. For example: output_node_names = "Loss/predictions"
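
Freezing keeps only the nodes that the named outputs depend on, which is why a narrow output_node_names can drop training-only ops that would otherwise fail to load. The sketch below shows that pruning idea on a toy dependency map; the node names other than Loss/predictions are made up, and reachable_from is a hypothetical helper, not a TensorFlow function.

```python
from collections import deque


def reachable_from(inputs_by_node, output_node_names):
    """Return the set of nodes that the given outputs transitively depend on."""
    seen = set(output_node_names)
    queue = deque(output_node_names)
    while queue:
        name = queue.popleft()
        for parent in inputs_by_node.get(name, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Toy graph: node name -> names of the nodes it consumes (all hypothetical).
toy_graph = {
    'input': [],
    'weights': [],
    'logits': ['input', 'weights'],
    'Loss/predictions': ['logits'],
    'global_step': [],
    'global_step/Assign': ['global_step'],
}

kept = reachable_from(toy_graph, ['Loss/predictions'])
dropped = sorted(set(toy_graph) - kept)
```

Here the training-only global_step nodes end up in dropped, because nothing on the path to Loss/predictions consumes them.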

@mhaghighat
Contributor Author

@cvJie: This works for me:

tensorflow::Tensor phaseTrain(tensorflow::DT_BOOL, tensorflow::TensorShape());
phaseTrain.scalar<bool>()() = false;
std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = { 
{ "input", faceTensor } ,
{ "phase_train", phaseTrain } 
};

@arvidzt

arvidzt commented May 6, 2019

with tf.Session() as sess:
    saved_model_dir = "saved_model_dir_signature"
    meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], '')
    for node in sess.graph_def.node:
      if node.op == 'RefEnter':
        node.op = 'Enter'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']
      elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr: del node.attr['use_locking']

How can I modify the graph_def in the session? If I do it this way, the model saved by sess doesn't change from RefSwitch to Switch.
Can someone tell me how to modify the graph_def in sess? Thanks.

@Priyashbhugra

Hello

I am facing this issue

raise ValueError(str(e))
ValueError: Input 0 of node import/global_step/Assign was passed int32 from import/global_step:0 incompatible with expected int32_ref.

on the line below, while loading the frozen.pb file:

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(
            graph_def,
            input_map=None,
            return_elements=None,
            op_dict=None,
            producer_op_list=None
        )

here is my full code:
model_dir = '/home/priyash/avod/avod/data/outputs/pyramid_cars_with_aug_example/checkpoint_freeze/pyramid_cars_with_aug_example-00120000.pb'
log_dir = '/home/priyash/avod/avod/data/outputs/pyramid_cars_with_aug_example/checkpoint_freeze/logs/'

with tf.Session() as sess:
    model_filename = model_dir
    with gfile.FastGFile(model_filename, 'rb') as f:

        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        with tf.Graph().as_default() as graph:
            tf.import_graph_def(
                graph_def,
                input_map=None,
                return_elements=None,
                op_dict=None,
                producer_op_list=None
            )
        print(graph)
        #
        # g_in = tf.import_graph_def(graph_def)
        # print(g_in)

    train_writer = tf.summary.FileWriter(log_dir)
    train_writer.add_graph(sess.graph)
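
The int32_ref message most likely comes from an Assign op that still expects a variable reference after global_step was turned into a constant. A quick way to locate such leftovers before importing is to list the nodes whose op type takes a reference input. The sketch below assumes nodes expose name and op (as graph_def.node entries do); REF_OPS is a partial, hand-picked set and find_ref_ops is a hypothetical helper.

```python
from types import SimpleNamespace

# Op types that take a reference (variable) input. Not an exhaustive list.
REF_OPS = {'Assign', 'AssignAdd', 'AssignSub', 'RefSwitch', 'RefEnter'}


def find_ref_ops(nodes):
    """Return (name, op) pairs for nodes that still expect a reference input."""
    return [(node.name, node.op) for node in nodes if node.op in REF_OPS]


# Stand-in nodes with hypothetical contents; a real call would pass graph_def.node.
demo_nodes = [
    SimpleNamespace(name='global_step', op='Const'),
    SimpleNamespace(name='global_step/Assign', op='Assign'),
    SimpleNamespace(name='logits', op='MatMul'),
]
leftovers = find_ref_ops(demo_nodes)
```

Any node reported this way either needs the kind of rewrite shown earlier in this thread or should be excluded by freezing with only the inference outputs.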

@glennford49

FusedBatchNorm/Switch:1 incompatible with expected half error,
im trying to convert it to fp16 precision

@xiexie123

FusedBatchNorm/Switch:1 incompatible with expected half error,
im trying to convert it to fp16 precision

Hi, I have the same error. Did you fix it?

PythonDevMaster added a commit to PythonDevMaster/faceNet that referenced this issue Feb 26, 2024
The problem with batch norm nodes is fixed based on the following issues:
tensorflow/tensorflow#3628
davidsandberg/facenet#161
roberto-devp pushed a commit to roberto-devp/facenet that referenced this issue Sep 24, 2024
The problem with batch norm nodes is fixed based on the following issues:
tensorflow/tensorflow#3628
davidsandberg/facenet#161