feat: Support distributed batch inferencing job on Apache Spark cluster #890
@Talador12 sorry, I haven't had a chance to fill in the issue description yet. This one is very different from #666 and #957. This ticket is about applying an ML model packaged with BentoML to a large dataset on a Spark cluster. It should work for models trained with any of the ML frameworks that BentoML supports (e.g. TensorFlow, scikit-learn, etc.). #666, by contrast, is about supporting serving of Spark MLlib models in BentoML. Note that users can already do this with BentoML & Spark today, although we want to provide a set of tools on top of the existing BentoML input adapter API to make working with Spark's data types easier.
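For reference, here is a minimal sketch of what "already possible today" might look like: loading a saved BentoML bundle inside each Spark partition and calling its inference API. The bundle path, the input/output paths, and the `predict` API name are all hypothetical placeholders, not a confirmed recipe from the maintainers:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bentoml-batch-inference").getOrCreate()
df = spark.read.parquet("s3://my-bucket/features/")  # hypothetical input path

def predict_partition(rows):
    import bentoml
    # Load the packaged model once per partition to amortize the load cost,
    # rather than once per row.
    svc = bentoml.load("/models/my_service")  # hypothetical bundle location
    for row in rows:
        # `predict` is whatever inference API the BentoService defines.
        yield (row["id"], svc.predict([row.asDict()])[0])

predictions = df.rdd.mapPartitions(predict_partition).toDF(["id", "prediction"])
predictions.write.parquet("s3://my-bucket/predictions/")  # hypothetical output
```

The per-partition load is the main thing the proposed tooling could hide from users, since getting it wrong (loading per row, or trying to broadcast an unpicklable service) is a common source of slow or failing jobs.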
I would like to pick up this work, and here is the design doc.
Have been discussing this with @xuzikun2003 @bojiang, and here's an update: we are investigating making the BentoService class/instance pickle-serializable by hooking the pickle interface into BentoML's own save and load implementation. This should allow users to create Spark UDFs from BentoML-packaged ML models more easily. Note that this is a separate effort from the design doc shared above, which describes BentoML's own batch inference API: the batch inference jobs API is a high-level API for launching and managing batch inference jobs, whereas the Spark UDF integration gives the user more flexibility when working within a Spark application.
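To make the UDF idea concrete, here is a hedged sketch of what the integration could look like if the proposed pickle support lands. `bentoml.load`, the bundle path, and the `predict` API are placeholder assumptions; this is not a released integration:

```python
import pandas as pd
import bentoml
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, 0.5), (2, 1.3)], ["id", "feature"])  # toy input

# Hypothetical: load a saved BentoService bundle on the driver.
svc = bentoml.load("/models/my_service")

@pandas_udf("double")
def predict_udf(features: pd.Series) -> pd.Series:
    # Spark pickles `svc` into this closure when shipping the UDF to
    # executors; the proposed change would route that pickling through
    # BentoML's own save/load so the capture "just works".
    return pd.Series(svc.predict(features.tolist()))

scored = df.withColumn("prediction", predict_udf(df["feature"]))
scored.show()
```

Capturing the service directly in the UDF closure, rather than reloading it manually per partition, is exactly the ergonomic win the pickle hook is meant to provide.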
This reads as an appropriate measure - Spark UDFs were made for this kind of custom code/integration. Thank you for taking the initiative on this! I will continue to follow and provide user feedback when I can :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is this on the roadmap for BentoML 1.0?
@Talador12 @alexdivet We are focusing on streaming and batching now, after building a solid foundation with 1.0. Would love to hear any feedback.