get the trained model to work with gcloud local predict #2

brandondutra · 2017-02-28T18:08:18Z

in /src/prediction/_model.py

s/'SERVING'/tag_constants.SERVING/g

that will help in making the output model work with gcloud local predict

nikhilk · 2017-03-08T23:49:44Z

I thought it was oddly quite painful to refer to the saved_model APIs, based on the ways they were organized... so opted for choosing the more legible literal value.

I don't think using the literal should get in the way of using gcloud... since matching strings should suffice.

I did fix the other issue where in glcoud the model would not be initialized. Perhaps that was getting in the way.

brandondutra · 2017-03-10T21:54:07Z

iris model trained from your master branch does not work with gcloud local predict.

gcloud ml-engine local predict --model-dir TOUT/model/ --text-instances iris/data/eval.csv
Traceback (most recent call last):
File "lib/googlecloudsdk/command_lib/ml/local_predict.py", line 133, in
main()
File "lib/googlecloudsdk/command_lib/ml/local_predict.py", line 128, in main
instances=instances)
File "/usr/local/google/home/brandondutra/cml-virtualenv/local/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 737, in local_predict
client = SessionClient(*load_model(model_dir))
File "/usr/local/google/home/brandondutra/cml-virtualenv/local/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 338, in load_model
model_path)
File "/usr/local/google/home/brandondutra/cml-virtualenv/local/lib/python2.7/site-packages/google/cloud/ml/session_bundle/_session_bundle.py", line 66, in load_session_bundle_from_path
meta_graph_filename)
RuntimeError: Expected meta graph file missing TOUT/model/export.meta

This is the code that fails

  try:
    from tensorflow.contrib.session_bundle import bundle_shim  # pylint: disable=g-import-not-at-top
    from tensorflow.python.saved_model import tag_constants  # pylint: disable=g-import-not-at-top
    # We expect that the customer will export saved model and use
    # tag_constants.SERVING for serving graph. This assumption also extends to
    # model server.
    session, meta_graph = (
        bundle_shim.load_session_bundle_or_saved_model_bundle_from_path(
            model_path, tags=[tag_constants.SERVING]))
  except Exception:  # pylint: disable=broad-except
    session, meta_graph = session_bundle.load_session_bundle_from_path(
        model_path) # line 338

  if session is None:
    raise PredictionError(PredictionError.FAILED_TO_LOAD_MODEL,
                          "Could not load model from %s" % model_path)

Because it got to line 338, the try failed. My guess is because of tag_constants.SERVING, which I think is 'serve'. You use 'SERVING'. If I manually edit my gcloud files to use 'SERVING', I get an error later on:
_get_interface
"map.".format(INPUTS_KEY)))
google.cloud.ml.prediction._prediction_lib.PredictionError: (5, 'Invalid value for collection: inputs. Should be a tensor alias map.')

which makes we realize we might be calling tf.add_to_collection wrong in _ff.py. The old iris sample called it like so: tf.add_to_collection(OUTPUTS_KEY, json.dumps(outputs))

I'm confused by these parts, because I am not directly calling tf.add_to_collection in the structured data package, so maybe some util function I call does this for me.

brandondutra · 2017-03-10T23:55:40Z

$ gcloud ml-engine versions create v1 --model tfx --origin gs://cloud-ml-dev_bdt/TOUT/model
Creating version (this might take a few minutes)......failed.
ERROR: (gcloud.ml-engine.versions.create) Create Version failed. Model validation failed: SavedModel must contain at least one MetaGraphDef with serve tag

brandondutra · 2017-03-11T05:08:47Z

Got gcloud local predict working with this proof of concept:

def _build_signature(inputs, outputs):
  def tensor_alias(tensor):
    local_name = tensor.name.split('/')[-1]
    return local_name.split(':')[0]

  input_map = {}
  output_map = {}
  for tensor in inputs:
    input_map[tensor_alias(tensor)] = tensor
  for tensor in outputs:
    output_map[tensor_alias(tensor)] = tensor

  outputs_ref = tf.get_collection_ref('outputs')
  inputs_ref = tf.get_collection_ref('inputs')

  del outputs_ref[:]
  del inputs_ref[:]

  return tf.saved_model.signature_def_utils.predict_signature_def(
                input_map, 
                output_map)

I also used tag_constants.SERVING.

Will research what gcloud is doing with the output collection if it is defined on Monday. This may mean we are not allowed to use the 'output' collection to pass values, so maybe use 'tenorfx_output'.

nikhilk · 2017-03-11T22:25:40Z

I was able to update just 'SERVING' to 'serve' and make things work.

Note that I actually had to change gcloud code -- for some reason the v1 version of gcloud is loading prediction_lib_beta in my setup of gcloud. Odd. However, the real test is running this on the service.

Maybe there should be a "tfx predict" as well as a CLI wrapper around tensorfx.predictions.Model.load() and .predict().

brandondutra · 2017-03-12T21:02:35Z

Getting gcloud local predict is a real test too: that's the first thing a ml-engine user will try!

Also, did your last update undo your table init work? Because gcloud predict did not work for me.

gcloud ml-engine local predict --model-dir /tmp/tensorfx/iris/csv/model --text-instances iris/data/eval.csv 
ERROR:root:expected string or buffer
Traceback (most recent call last):
  File "/Users/brandondutra/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml/local_predict.py", line 133, in <module>
    main()
  File "/Users/brandondutra/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml/local_predict.py", line 128, in main
    instances=instances)
  File "/Users/brandondutra/cml-virtualenv/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 731, in local_predict
    client = SessionClient(*load_model(model_dir))
  File "/Users/brandondutra/cml-virtualenv/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 351, in load_model
    signature = _get_legacy_signature(graph)
  File "/Users/brandondutra/cml-virtualenv/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 361, in _get_legacy_signature
    input_map, output_map = _get_interfaces(graph)
  File "/Users/brandondutra/cml-virtualenv/lib/python2.7/site-packages/google/cloud/ml/prediction/_prediction_lib.py", line 292, in _get_interfaces
    "map.".format(INPUTS_KEY)))
google.cloud.ml.prediction._prediction_lib.PredictionError: (5, 'Invalid value for collection: inputs. Should be a tensor alias map.')

Did your changes remove your table init work?

gcloud ml-engine predict  ...
{
  "error": "Prediction failed: Exception during model execution: AbortionError(code=StatusCode.FAILED_PRECONDITION, details=\"Table not initialized.\n\t [[Node: output/labels/label = LookupTableFind[Tin=DT_INT64, Tout=DT_STRING, _class=[\"loc:@output/labels/label_table/hash_table\"], _output_shapes=[[-1]], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](output/labels/label_table/hash_table, output/labels/arg_max, output/labels/label_table/hash_table/Const)]]\")"
}

nikhilk · 2017-03-13T00:46:01Z

The line to make saved model work is still there: https://github.com/TensorLab/tensorfx/blob/master/src/training/_model.py#L256

Some how I think you've got an older version of gcloud (or there is a major bug in the GA version of gcloud). The trace indicates its trying to load a model with 'inputs', and 'outputs' collection keys whose value is a json-serialized dictionary of tensor aliases to tensor names (search for _get_legacy_signature in code search). You'll find it in prediction_lib_beta.py ... and I am puzzled how that is in the path if you're using the GA command rather than the beta command.

nikhilk · 2017-03-13T05:30:40Z

FWIW, added a command line tool to do local predictions in af8becc.

Definitely need to validate with the service as the ultimate test. Hopefully it isn't trying to old models with the old json collections.

brandondutra changed the title ~~use tag_constants from tensorflow.python.saved_model~~ get the trained model to work with gcloud local predict Mar 10, 2017

nikhilk self-assigned this Mar 12, 2017

nikhilk added this to the 0.5 Release milestone Mar 12, 2017

nikhilk added issue.bug P1 labels Mar 12, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

get the trained model to work with gcloud local predict #2

get the trained model to work with gcloud local predict #2

brandondutra commented Feb 28, 2017

nikhilk commented Mar 8, 2017

brandondutra commented Mar 10, 2017

brandondutra commented Mar 10, 2017

brandondutra commented Mar 11, 2017 •

edited

Loading

nikhilk commented Mar 11, 2017

brandondutra commented Mar 12, 2017

nikhilk commented Mar 13, 2017

nikhilk commented Mar 13, 2017

get the trained model to work with gcloud local predict #2

get the trained model to work with gcloud local predict #2

Comments

brandondutra commented Feb 28, 2017

nikhilk commented Mar 8, 2017

brandondutra commented Mar 10, 2017

brandondutra commented Mar 10, 2017

brandondutra commented Mar 11, 2017 • edited Loading

nikhilk commented Mar 11, 2017

brandondutra commented Mar 12, 2017

nikhilk commented Mar 13, 2017

nikhilk commented Mar 13, 2017

brandondutra commented Mar 11, 2017 •

edited

Loading