SSD MobileNet train fails with Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>) #5715

anonym24 · 2018-11-07T14:57:54Z

Retraining with the faster_rcnn_inception_v2_coco_2018_01_28 model and faster_rcnn_inception_v2_pets.config works fine, but running

python3 train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

with the ssd_mobilenet_v1_coco_2018_01_28 pretrained model fails with the following error:

ERROR:tensorflow:==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'init_ops/report_uninitialized_variables/boolean_mask/GatherV2:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
  File "train.py", line 184, in <module>
    tf.app.run()
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "/media/user/DATA/tensorflow1/models/research/object_detection/legacy/trainer.py", line 415, in train
    saver=saver)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 791, in train
    should_retry = True
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 189, in wrapped
    return _add_should_use_warning(fn(*args, **kwargs))
==================================
E1107 16:45:19.883316 139764880684864 tf_logging.py:105] ==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'init_ops/report_uninitialized_variables/boolean_mask/GatherV2:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
  File "train.py", line 184, in <module>
    tf.app.run()
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "/media/user/DATA/tensorflow1/models/research/object_detection/legacy/trainer.py", line 415, in train
    saver=saver)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 791, in train
    should_retry = True
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 189, in wrapped
    return _add_should_use_warning(fn(*args, **kwargs))
==================================
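
For context, the last traceback frame shows that this message comes from TensorFlow's `tf_should_use` wrapper, which reports a returned tensor that was dropped without ever being used, so it may be a symptom rather than the root failure. Below is a simplified pure-Python sketch of that general idea. It is not TensorFlow's code: the `ShouldUse` class, its fields, and the `messages` list are hypothetical names used only for illustration, and the immediate effect of `del` assumes CPython reference counting.

```python
class ShouldUse:
    """Hypothetical stand-in for a value that is expected to be consumed."""

    def __init__(self, value, log):
        self._value = value
        self._log = log      # plain list that collects messages for this demo
        self._used = False

    def mark_used(self):
        # Explicitly silence the complaint without reading the value.
        self._used = True

    def get(self):
        # Reading the value counts as using it.
        self._used = True
        return self._value

    def __del__(self):
        # Complain only if the object is destroyed without having been used.
        if not self._used:
            self._log.append("Object was never used: %r" % (self._value,))


messages = []

ignored = ShouldUse("report_uninitialized_variables", messages)
del ignored      # dropped without being read: a message is recorded

consumed = ShouldUse("init_op", messages)
consumed.get()
del consumed     # read first: nothing is recorded

print(messages)
```

In this sketch the complaint is emitted only when the object is destroyed, which would be consistent with the message appearing late in a run, for example after training has already stopped for some other reason.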

Full log:
full_logs.zip

ssd_mobilenet_v1_coco.config

# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 30
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "/media/user/DATA/tensorflow1/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/media/user/DATA/tensorflow1/models/research/object_detection/train.record"
  }
  label_map_path: "/media/user/DATA/tensorflow1/models/research/object_detection/training/labelmap.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/media/user/DATA/tensorflow1/models/research/object_detection/test.record"
  }
  label_map_path: "/media/user/DATA/tensorflow1/models/research/object_detection/training/labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}

System information

What is the top-level directory of the model you are using: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 16.04
TensorFlow installed from (source or binary): pip tensorflow-gpu https://www.tensorflow.org/install/pip
TensorFlow version (use command below): 1.12.0
Bazel version (if compiling from source): -
CUDA/cuDNN version: 9.0/7
GPU model and memory: GTX 1060 6GB


anonym24 · 2018-11-08T10:04:36Z

I tried model_main.py instead of train.py (see #5719).

It seems to work with model_main.py for the ssd_mobilenet_v1_coco.config + ssd_mobilenet_v1_coco_2018_01_28 combination, though training is very slow and uses a lot of CPU.

But it fails for the ssd_mobilenet_v1_quantized_300x300_coco14_sync.config + ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18 combination.

anonym24 · 2018-11-09T10:00:50Z

It seems the real issues with train.py are related to OOM (see #2034 (comment)).

train.py seems to work only with a batch_size of 1.
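
As a rough illustration of that workaround, the batch size in the pipeline config can be lowered by hand or with a small script. The sketch below rewrites the first `batch_size` field in a config string; `set_batch_size` and the `sample` text are hypothetical and only mirror the layout of the config posted above.

```python
import re


def set_batch_size(config_text, batch_size):
    """Return config_text with the first `batch_size: N` field rewritten."""
    pattern = re.compile(r"^(\s*batch_size:\s*)\d+", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between the group reference
    # and the digits of the new value.
    new_text, count = pattern.subn(
        lambda match: match.group(1) + str(batch_size), config_text, count=1)
    if count == 0:
        raise ValueError("no batch_size field found in config text")
    return new_text


sample = """train_config: {
  batch_size: 24
  optimizer {
  }
}
"""

updated = set_batch_size(sample, 1)
print(updated)
```

Only the first match is changed, on the assumption that the `train_config` block comes before any other block that also sets `batch_size`.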

anonym24 · 2018-11-09T10:04:35Z

I'm trying to go back to train.py because model_main.py works poorly: it eats a lot of CPU and is very slow.

With train.py and a batch_size of 1, training at least started, but after some steps (~100-200) retraining SSD MobileNet fails:

python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

INFO:tensorflow:global step 171: loss = 14.8309 (0.360 sec/step)
I1109 12:00:57.197321  7740 tf_logging.py:115] global step 171: loss = 14.8309 (0.360 sec/step)
INFO:tensorflow:global step 172: loss = 11.7885 (0.351 sec/step)
I1109 12:00:57.549896  7740 tf_logging.py:115] global step 172: loss = 11.7885 (0.351 sec/step)
INFO:tensorflow:global step 173: loss = 12.5532 (0.369 sec/step)
I1109 12:00:57.919557  7740 tf_logging.py:115] global step 173: loss = 12.5532 (0.369 sec/step)
INFO:tensorflow:global step 174: loss = 13.3306 (0.328 sec/step)
I1109 12:00:58.248665  7740 tf_logging.py:115] global step 174: loss = 13.3306 (0.328 sec/step)
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Loss/Match/cond/mul_4', defined at:
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 290, in train
    clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
  File "C:\tensorflow1\models\research\slim\deployment\model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 205, in _create_losses
    losses_dict = detection_model.loss(prediction_dict, true_image_shapes)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 680, in loss
    keypoints, weights)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 853, in _assign_targets
    groundtruth_weights_list)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 483, in batch_assign_targets
    anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 182, in assign
    valid_rows=tf.greater(groundtruth_weights, 0))
  File "C:\tensorflow1\models\research\object_detection\core\matcher.py", line 241, in match
    return Match(self._match(similarity_matrix, valid_rows),
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 194, in _match
    _match_when_rows_are_non_empty, _match_when_rows_are_empty)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2086, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1930, in BuildCondBranch
    original_result = fn()
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 175, in _match_when_rows_are_non_empty
    tf.cast(tf.expand_dims(valid_rows, axis=-1), dtype=tf.float32))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 866, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 1131, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5358, in mul
    "Mul", x=x, y=y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I1109 12:00:58.279880  7740 tf_logging.py:115] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Loss/Match/cond/mul_4', defined at:
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 290, in train
    clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
  File "C:\tensorflow1\models\research\slim\deployment\model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 205, in _create_losses
    losses_dict = detection_model.loss(prediction_dict, true_image_shapes)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 680, in loss
    keypoints, weights)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 853, in _assign_targets
    groundtruth_weights_list)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 483, in batch_assign_targets
    anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 182, in assign
    valid_rows=tf.greater(groundtruth_weights, 0))
  File "C:\tensorflow1\models\research\object_detection\core\matcher.py", line 241, in match
    return Match(self._match(similarity_matrix, valid_rows),
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 194, in _match
    _match_when_rows_are_non_empty, _match_when_rows_are_empty)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2086, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1930, in BuildCondBranch
    original_result = fn()
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 175, in _match_when_rows_are_non_empty
    tf.cast(tf.expand_dims(valid_rows, axis=-1), dtype=tf.float32))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 866, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 1131, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5358, in mul
    "Mul", x=x, y=y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,1917] vs. [3,1]
         [[{{node Loss/Match/cond/mul_4}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 415, in train
    saver=saver)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 770, in train
    sess, train_op, global_step, train_step_kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 487, in train_step
    run_metadata=run_metadata)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Loss/Match/cond/mul_4', defined at:
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 306, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 290, in train
    clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
  File "C:\tensorflow1\models\research\slim\deployment\model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "C:\tensorflow1\models\research\object_detection\legacy\trainer.py", line 205, in _create_losses
    losses_dict = detection_model.loss(prediction_dict, true_image_shapes)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 680, in loss
    keypoints, weights)
  File "C:\tensorflow1\models\research\object_detection\meta_architectures\ssd_meta_arch.py", line 853, in _assign_targets
    groundtruth_weights_list)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 483, in batch_assign_targets
    anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
  File "C:\tensorflow1\models\research\object_detection\core\target_assigner.py", line 182, in assign
    valid_rows=tf.greater(groundtruth_weights, 0))
  File "C:\tensorflow1\models\research\object_detection\core\matcher.py", line 241, in match
    return Match(self._match(similarity_matrix, valid_rows),
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 194, in _match
    _match_when_rows_are_non_empty, _match_when_rows_are_empty)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2086, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1930, in BuildCondBranch
    original_result = fn()
  File "C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py", line 175, in _match_when_rows_are_non_empty
    tf.cast(tf.expand_dims(valid_rows, axis=-1), dtype=tf.float32))
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 866, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\math_ops.py", line 1131, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5358, in mul
    "Mul", x=x, y=y, name=name)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
    op_def=op_def)
  File "C:\Users\Admin\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,1917] vs. [3,1]
         [[node Loss/Match/cond/mul_4 (defined at C:\tensorflow1\models\research\object_detection\matchers\argmax_matcher.py:175)  = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Loss/Match/cond/one_hot, Loss/Match/cond/Cast_2)]]
         [[{{node gradients/FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_128/BatchNorm/FusedBatchNorm_grad/FusedBatchNormGrad/_1497}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2718_...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
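
The `Incompatible shapes: [2,1917] vs. [3,1]` message is an element-wise broadcasting failure: the two operands of the `Mul` op disagree in their first dimension (2 vs. 3), and neither is 1. Presumably this means the matcher saw a different number of ground-truth rows than ground-truth weights for that image, though the log alone does not confirm the cause. The sketch below only re-checks the shape rule in plain Python; `broadcast_compatible` is a hypothetical helper, not part of TensorFlow.

```python
def broadcast_compatible(shape_a, shape_b):
    """Check whether two shapes can be broadcast together element-wise."""
    # Dimensions are compared from the trailing end; a dimension that is
    # missing on one side is treated as compatible, as is a dimension of 1.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


one_hot_shape = (2, 1917)      # shape reported for Loss/Match/cond/one_hot
valid_rows_shape = (3, 1)      # shape reported for Loss/Match/cond/Cast_2

print(broadcast_compatible(one_hot_shape, valid_rows_shape))   # False
print(broadcast_compatible((3, 1917), valid_rows_shape))       # True
```

If both tensors described the same number of ground-truth rows (for example 3 and 3), the multiplication would be well defined.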

anonym24 · 2018-11-09T10:10:51Z

This seems to be related to #5391.

anonym24 · 2018-11-09T10:13:29Z

I guess the legacy train.py isn't going to be updated, so the only solution is to use model_main.py.

Madhukaran · 2020-02-20T14:25:57Z

Your serialized dataset is broken (i.e., the data is corrupt), so further training cannot build on the existing model. Delete the model and train from scratch.

EthiopianOne · 2020-06-16T04:35:14Z

Hey guys, I had the same issue, and after a long, annoying round of searching and debugging, it worked for me when I switched to model_main.py.

Get rid of that evil legacy\train.py.
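
For anyone making the same switch, here is a small sketch of how the equivalent model_main.py invocation might be assembled. The flag names (`--pipeline_config_path`, `--model_dir`, `--num_train_steps`, `--alsologtostderr`) are written from memory of the TF1 Object Detection API and should be checked against `python model_main.py --help`; `build_model_main_command` is a hypothetical helper, and the paths are the ones used earlier in this thread.

```python
import shlex


def build_model_main_command(pipeline_config_path, model_dir, num_train_steps=None):
    """Build an argv list for model_main.py (flag names are assumptions)."""
    args = [
        "python", "model_main.py",
        "--pipeline_config_path=" + pipeline_config_path,
        # model_main.py takes a model directory where train.py took --train_dir.
        "--model_dir=" + model_dir,
        "--alsologtostderr",
    ]
    if num_train_steps is not None:
        args.append("--num_train_steps=%d" % num_train_steps)
    return args


cmd = build_model_main_command("training/ssd_mobilenet_v1_coco.config", "training/")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell from the object_detection directory, or the list can be passed to `subprocess.run` directly.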

anonym24 closed this as completed Nov 8, 2018

anonym24 reopened this Nov 9, 2018

anonym24 closed this as completed Nov 9, 2018
