
Cannot use vector as input struct type due to: java.lang.ClassCastException: scala.collection.convert.Wrappers$JListWrapper cannot be cast to ml.combust.mleap.tensor.Tensor #3

Open
make opened this issue Nov 30, 2018 · 7 comments


make commented Nov 30, 2018

I am trying to deploy a bundled Spark ML NaiveBayesModel with sagemaker-sparkml-serving-container.

I am running sagemaker-sparkml-serving-container with the following command:

SCHEMA='{"input":[{"name":"features","type":"double","struct":"vector"}],"output":{"name":"prediction","type":"double"}}'
BUNDLE=/tmp/naivebayes_bundle
docker run -p 8080:8080 -e SAGEMAKER_SPARKML_SCHEMA="$SCHEMA" -v $BUNDLE:/opt/ml/model sagemaker-sparkml-serving:2.2 serve

When calling /invocations with:

curl -i -H "content-type:application/json" http://localhost:8080/invocations -d '{"data":[[1.0,2.0,3.0]]}'
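(As an aside, the same request body can be built programmatically; a minimal Python sketch, independent of the container:)

```python
import json

# Build the /invocations payload: "data" carries one row of feature
# values matching the single "features" column declared in the schema.
payload = json.dumps({"data": [[1.0, 2.0, 3.0]]})
print(payload)  # {"data": [[1.0, 2.0, 3.0]]}
```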

The following error is thrown:

java.lang.ClassCastException: scala.collection.convert.Wrappers$JListWrapper cannot be cast to ml.combust.mleap.tensor.Tensor
	at ml.combust.mleap.runtime.transformer.classification.NaiveBayesClassifier$$anonfun$1.apply(NaiveBayesClassifier.scala:19) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.Row$class.udfValue(Row.scala:241) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.ArrayRow.udfValue(ArrayRow.scala:17) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.Row$class.withValues(Row.scala:225) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.ArrayRow.withValues(ArrayRow.scala:17) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1$$anonfun$apply$3$$anonfun$4.apply(DefaultLeapFrame.scala:79) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1$$anonfun$apply$3$$anonfun$4.apply(DefaultLeapFrame.scala:79) ~[sparkml-serving-2.2.jar:2.2]
	at scala.collection.immutable.Stream.map(Stream.scala:418) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1$$anonfun$apply$3.apply(DefaultLeapFrame.scala:79) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1$$anonfun$apply$3.apply(DefaultLeapFrame.scala:78) ~[sparkml-serving-2.2.jar:2.2]
	at scala.util.Success$$anonfun$map$1.apply(Try.scala:237) ~[sparkml-serving-2.2.jar:2.2]
	at scala.util.Try$.apply(Try.scala:192) ~[sparkml-serving-2.2.jar:2.2]
	at scala.util.Success.map(Try.scala:237) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1.apply(DefaultLeapFrame.scala:77) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame$$anonfun$withColumns$1.apply(DefaultLeapFrame.scala:72) ~[sparkml-serving-2.2.jar:2.2]
	at scala.util.Success.flatMap(Try.scala:231) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.DefaultLeapFrame.withColumns(DefaultLeapFrame.scala:71) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.frame.MultiTransformer$class.transform(Transformer.scala:121) ~[sparkml-serving-2.2.jar:2.2]
	at ml.combust.mleap.runtime.transformer.classification.NaiveBayesClassifier.transform(NaiveBayesClassifier.scala:13) ~[sparkml-serving-2.2.jar:2.2]
	at com.amazonaws.sagemaker.utils.ScalaUtils.transformLeapFrame(ScalaUtils.java:44) ~[sparkml-serving-2.2.jar:2.2]
	at com.amazonaws.sagemaker.controller.ServingController.processInputData(ServingController.java:176) ~[sparkml-serving-2.2.jar:2.2]
	at com.amazonaws.sagemaker.controller.ServingController.transformRequestJson(ServingController.java:118) ~[sparkml-serving-2.2.jar:2.2]

I created the bundle with the following dependencies:

org.apache.spark:spark-core_2.11:2.4.0
org.apache.spark:spark-mllib_2.11:2.4.0
ml.combust.mleap:mleap-spark_2.11:0.12.0

Kotlin code that creates the bundle:

val model = NaiveBayes()
        .setModelType("multinomial")
        .fit(data)
SimpleSparkSerializer().serializeToBundle(model, "file:/tmp/naivebayes_bundle", model.transform(data))
orchidmajumder (Contributor) commented Dec 1, 2018

Hey, thanks for using sagemaker-sparkml-serving. From the stack trace you posted, it looks like your model is returning an output of type Array instead of a single value.

Please change the schema to output an Array instead of a single value and see if that gives you a valid output. You may need to extract some information from the response depending on your underlying use case.

The schema should be changed like this:

SCHEMA='{"input":[{"name":"features","type":"double","struct":"vector"}],"output":{"name":"prediction","type":"double","struct":"array"}}'
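As a quick sanity check (a minimal Python sketch, not part of the container), the schema string must parse as valid JSON, which means every key, including struct, has to be quoted:

```python
import json

# The output column declared as an array rather than a single value;
# "struct" must be quoted like every other JSON key.
schema = ('{"input":[{"name":"features","type":"double","struct":"vector"}],'
          '"output":{"name":"prediction","type":"double","struct":"array"}}')

parsed = json.loads(schema)
print(parsed["output"]["struct"])  # array
```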

make (Author) commented Dec 3, 2018

Thanks for the fast response. Unfortunately, your suggestion doesn't fix the problem: it throws exactly the same exception and stack trace.

It seems that the input features are passed to the prediction as a JListWrapper instead of a Tensor:
https://github.com/combust/mleap/blob/master/mleap-runtime/src/main/scala/ml/combust/mleap/runtime/transformer/classification/NaiveBayesClassifier.scala#L19

orchidmajumder (Contributor) commented

It looks like your bundle was created with Spark 2.4 and MLeap 0.12.0. At this point, MLeap does not support anything beyond Spark 2.3, and this container is only tested with Spark 2.2 and MLeap 0.9.6.

Since NaiveBayes is available in Spark 2.2 as well, it will be easier for me to replicate if you can switch to Spark 2.2.1 and MLeap 0.9.6 and try to reproduce the same error.

jorgeglezlopez commented

@make I encountered the same problem you did, and after a lot of debugging I figured it out. It has nothing to do with the version of Spark or MLeap; it happens because inside DataConversionHelper, the function convertInputDataToJavaType assumes that whenever the DataStructureType is not empty or BASIC, it is an array.

Therefore, the code as it stands will never create a Vector, and it will not work with any pipeline whose entry point requires a Vector (such as any trained estimator that expects features). I fixed the code and will try to open a pull request over the weekend.
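The behavior described above can be sketched roughly like this (a minimal Python sketch; the real DataConversionHelper is Java, and the function names and return shapes here are illustrative assumptions, not the actual code):

```python
# Hypothetical sketch of the dispatch in convertInputDataToJavaType.
# Tags stand in for the Java types; "tensor" stands in for
# ml.combust.mleap.tensor.Tensor.

def convert_input_buggy(values, struct):
    # Buggy behavior: anything that is not basic becomes a plain list,
    # so a "vector" input never becomes a Tensor and the downstream
    # cast to Tensor fails with a ClassCastException.
    if struct is None or struct == "basic":
        return ("scalar", values[0])
    return ("list", list(values))  # also hit when struct == "vector"

def convert_input_fixed(values, struct):
    # Fixed behavior: "vector" gets its own branch and is wrapped in a
    # tensor-like structure before the frame is transformed.
    if struct is None or struct == "basic":
        return ("scalar", values[0])
    if struct == "vector":
        return ("tensor", tuple(values))
    return ("list", list(values))
```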

hdamani09 commented Nov 4, 2019

@jorgeglezlopez Hi there, I'm using MLeap 0.14.0 with Spark 2.4.3. I deployed a model to a SageMaker endpoint and am still facing the same issue. Do you know when the changes with the updated Docker image for Spark 2.4 support will be pushed? Thanks

timf-bonobos commented

There is a fix for this that should be merged into master #11

prashantprakash commented

I have been trying to use the latest code here and am getting a similar error.

Commands:

git clone https://github.com/aws/sagemaker-sparkml-serving-container.git

cd sagemaker-sparkml-serving-container

docker build -t sagemaker-sparkml-serving:2.4 .

docker run -p 8080:8080 -e SAGEMAKER_SPARKML_SCHEMA='{"input":[{"name":"features","type":"double","struct":"vector"}],"output":{"name":"probability","type":"double","struct":"vector"}}' -v /Users/prasprak/mldocker/open_models/mleap_model/tar/logreg/:/opt/ml/model sagemaker-sparkml-serving:2.4 serve

Note: my input is of type vector and my output is also of type vector.

For invocations:

curl -i -H "Accept: application/jsonlines;data=text" -H "content-type:application/json" -d '{"data":[[-1.0, 1.5, 1.2]]}' http://localhost:8080/invocations

java.lang.ClassCastException: scala.collection.convert.Wrappers$JListWrapper cannot be cast to ml.combust.mleap.tensor.Tensor
at ml.combust.mleap.runtime.transformer.classification.LogisticRegression$$anonfun$1.apply(LogisticRegression.scala:19) ~[sparkml-serving-2.4.jar:2.4]
at ml.combust.mleap.runtime.frame.Row$class.udfValue(Row.scala:241) ~[sparkml-serving-2.4.jar:2.4]

Also, I see that the fix made here is not merged to master. I tried pulling the branch where the fix is provided, but I get a different error with it.
