Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KLIP-2: multilingual UDFs #2553

Conversation

mitch-seymour
Copy link
Contributor

@mitch-seymour mitch-seymour commented Mar 12, 2019

Description

Design proposal to support multilingual UDFs (e.g. UDFs written in Python, JavaScript, or Ruby).

I have a working POC of multilingual UDFs here, so feel free to experiment with some of the new features suggested in this design proposal (e.g. the new CREATE OR REPLACE FUNCTION query).

Finally, I created this as KLIP-2 because KLIP-1 is under review (but not yet merged). If this proposal is merged first, I can update the KLIP number accordingly. I will also update the design-proposals/README.md once this is ready to be merged.

Testing done

N/A

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@mitch-seymour mitch-seymour requested a review from a team as a code owner March 12, 2019 14:30
@ghost
Copy link

ghost commented Mar 12, 2019

@confluentinc It looks like @mitch-seymour just signed our Contributor License Agreement. 👍

Always at your service,

clabot

Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mitch-seymour! This KLIP is super promising and I'm excited to see us make progress in getting this integrated - I completely agree with you that this will expand the potential user base for KSQL and hit on some of the bigger pain points. The way I see it, there are a few discussions that need to be resolved.

GraalVM

This is obviously the proverbial elephant in the room. Personally, I love to evangelize new tech, especially ones as promising as Graal, but running KSQL with Graal is a seemingly significant change and there are a few questions I'd like answered before KSQL officially supports it:

  • What is the performance profile of KSQL/KStreams on GraalVM when compared to Hotspot (at the end you mention it might improve)? We don't really have a good end-to-end performance test yet, and driving blind scares me just a little. Perhaps we should inquire with the KStream devs to see if they have anything (since KSQL is relatively lightweight as it stands) we could benchmark.
  • Which JVM do we recommend to users/which should be the default/should we support both VMs?
    • If we decide to support both, how do we provide a comprehensive testing story that encompasses both VMs?
    • If we decide to go forward with only Graal, then can we also replace our codehaus compiler with the Graal compiler for even non-custom UDFs?
  • What is the migration path toward running KSQL with GraalVM? It seems like a drop-in replacement, but I want to make sure that if I have two KSQL servers (e.g. one on Hotspot and one on Graal) that they are fully compatible.
  • What alternatives do we have?

Inline/Hotswap Functionality

There are a few tricky scenarios when it comes to allowing a CREATE OR REPLACE functionality during runtime:

  • What happens if you REPLACE (or DROP) a function in the middle of execution (a query is already running)?
  • An extension of the above, what happens if a function is replaced on one running instance before another one?
  • Do we need GraalVM to introduce hotswap/inlining for Java UDFs or can we leverage our codehaus compiler for that? If this is the case, maybe we want to split this out into two KLIPs (one for the syntax and Java inlining, and one for Graal + Multilingual) so that we can at least make progress on that front first and introduce most of the code changes.

As a side-note, we need to also make sure that the implementation supports all modes of execution (e.g. headless/interactive).

Syntax & Misc

With regards to syntax and compiling varying languages there are a few more open questions:

  • Can we detect/prevent malicious code? In our UDF compiler, we detect calls to something like System#exit (and a few other suspects) to make sure that UDFs are safe. There are also potential knobs and bells around allowing/disallowing access to fs/network. Can we do this for other languages?
  • Can GraalVM handle arbitrary imports (at least of standard packages) in these other languages?
  • In your proposal, the symbol $$ is used for custom code. We should make sure that we escape all custom code appropriately so that the parsing is sane.
  • We should try to stick as close to existing SQL systems as possible. It looks like SQL-Server has the following syntax (and Postgres is quite similar, though more verbose):
CREATE [ OR ALTER ] FUNCTION [ schema_name. ] function_name   
( [ { @parameter_name [ AS ][ type_schema_name. ] parameter_data_type   
    [ = default ] [ READONLY ] }   
    [ ,...n ]  
  ]  
)  
RETURNS return_data_type  
    [ WITH <function_option> [ ,...n ] ]  
    [ AS ]  
    BEGIN   
        function_body   
        RETURN scalar_expression  
    END  
[ ; ]  

@agavra agavra added the design-proposal Tag KLIP Prs with this label label Mar 12, 2019
@agavra agavra requested a review from a team March 12, 2019 17:42
@mitch-seymour
Copy link
Contributor Author

@agavra thanks for the quick response! You raise some great questions. I'm going to try to address a few at a time over the next couple of days if that's okay (my schedule is a little crazy this week). Of course, anyone else is welcome to jump in as well :)

GraalVM

  • What is the performance profile of KSQL/KStreams on GraalVM when compared to Hotspot (at the end you mention it might improve)? We don't really have a good end-to-end performance test yet, and driving blind scares me just a little. Perhaps we should inquire with the KStream devs to see if they have anything (since KSQL is relatively lightweight as it stands) we could benchmark.

I think reaching out to the KStream devs is a good idea. Understanding the performance implications at a macro level may be difficult without an end-to-end test. Do you know anyone in particular on that team who might be able to help out?

  • Which JVM do we recommend to users/which should be the default/should we support both VMs?

I think if we release this as an experimental feature, the default + supported VM should initially be HotSpot.

  • If we decide to support both, how do we provide a comprehensive testing story that encompasses both VMs?

I don't have a ton of insight into the current CI process, but from what I can tell from the Jenkins logs, tests are executed within a Docker container running oracle-jdk8. Maybe we could add a parallel stage that also runs the tests within a GraalVM container (with Python, Java, and Ruby pre-installed via gu)? Any tests we add that rely on Graal could either leverage mocks so that they run exactly the same across both VMs, or perhaps we could add a VM check for tests that rely on Graal and execute accordingly.

  • What alternatives do we have?

Nashorn was the only other alternative on my radar for multilingual UDFs.

@mitch-seymour
Copy link
Contributor Author

mitch-seymour commented Mar 18, 2019

Syntax & Misc

With regards to syntax and compiling varying languages there are a few more open questions:

  • Can we detect/prevent malicious code? In our UDF compiler, we detect calls to something like System#exit (and a few other suspects) to make sure that UDFs are safe. There are also potential knobs and bells around allowing/disallowing access to fs/network. Can we do this for other languages?

There are some safeguards available in the Graal SDK that we can leverage for safer execution. Here's an example of how we might build the Polyglot execution context in order to leverage these safeguards:

Context.Builder builder = Context.newBuilder(new String[]{"python", "ruby", "js"})
    .allowIO(true)
    .allowHostAccess(true)
    .allowNativeAccess(true);

We could create corresponding configs that allow us to set these properties on the context object. e.g.

ksql.functions.multilingual.security.io.enabled=true
ksql.functions.multilingual.security.hostaccess.enabled=true
ksql.functions.multilingual.security.nativeaccess.enabled=true

Also, to clarify the behavior of a System#exit equivalent. Doing something like this:

context.eval("python", "lambda x: __import__('sys').exit(1)");

Would result in Graal throwing a PolyglotException that we could catch and report back to the user without impacting the KSQL runtime process.

  • Can GraalVM handle arbitrary imports (at least of standard packages) in these other languages?

Yes, this is possible, though not all of the standard libraries may be available for every language yet. For example, graalpython does not currently have the full standard library available.

  • In your proposal, the symbol $$ is used for custom code. We should make sure that we escape all custom code appropriately so that the parsing is sane.

Not sure the best way to do this actually. The new antlr rules should match the script boundaries ($$) automatically, so would it be up to the user to escape any usages of $$ in their inline script?

  • We should try to stick as close to existing SQL systems as possible. It looks like SQL-Server has the following syntax (and Postgres is quite similar, though more verbose):
CREATE [ OR ALTER ] FUNCTION [ schema_name. ] function_name   
( [ { @parameter_name [ AS ][ type_schema_name. ] parameter_data_type   
    [ = default ] [ READONLY ] }   
    [ ,...n ]  
  ]  
)  
RETURNS return_data_type  
    [ WITH <function_option> [ ,...n ] ]  
    [ AS ]  
    BEGIN   
        function_body   
        RETURN scalar_expression  
    END  
[ ; ]  

Sure, looks like the main change here would be the order of the WITH clause. That should be an easy change. Unless we also want to use SQL-Server's ALTER instead of Postgres' REPLACE. Also supporting BEGIN / END should be possible, as well.

@mitch-seymour
Copy link
Contributor Author

mitch-seymour commented Mar 18, 2019

Inline/Hotswap Functionality

There are a few tricky scenarios when it comes to allowing a CREATE OR REPLACE functionality during runtime:

  • What happens if you REPLACE (or DROP) a function in the middle of execution (a query is already running)?

That's a good question. I have been thinking about this for a couple of days now and I have mixed feelings about this one. In headless mode, this seems less of an issue since clients wouldn't be able to issue a REPLACE or DROP command via the API. Ideally, the new function commands would be placed in the query file before they are needed.

In interactive mode, it seems like the safest, most predictable option is to allow any running queries to continue using the UDF instance they were using before the inline UDF was changed or dropped. Creating a new query (or restarting an existing query) would lead to a new attempt to instantiate the UDF, which would either pull in the new logic (REPLACE) or error if the function was dropped (DROP).

If we pursue this option though, we may want to be more careful about using the terms hot-reloading / hotswapping, since we'd only be hotswapping the UDF factory, and not any active UDF instances that are tied to running queries. If we decide to pursue the more magical option of automatically updating running queries, I think we'll want to be especially careful about how we handle the DROP case. Anyways, I'd love to hear others' thoughts / opinions here.

  • Do we need GraalVM to introduce hotswap/inlining for Java UDFs or can we leverage our codehaus compiler for that? If this is the case, maybe we want to split this out into two KLIPs (one for the syntax and Java inlining, and one for Graal + Multilingual) so that we can at least make progress on that front first and introduce most of the code changes.

No, we can use the existing codehaus compiler for inlining Java UDFs. I was able to get a very basic version of this working today (note: the parameter types are hardcoded for now because I got sidetracked trying to resolve some merge conflicts, and I didn't tie it into the existing classes yet). So if starting there makes more sense, I'm happy to create a separate KLIP.

As a side-note, we need to also make sure that the implementation supports all modes of execution (e.g. headless/interactive).

Yes, definitely. My POC was focused on interactive mode but I can focus on headless mode next :)

@agavra
Copy link
Contributor

agavra commented Mar 19, 2019

Thanks for the responses! Anything I didn't explicitly respond to I agree with 😄

I think reaching out to the KStream devs is a good idea. Understanding the performance implications at a macro level may be difficult without an end-to-end test. Do you know anyone in particular on that team who might be able to help out?

I'll look into it and see if I can find anything. I did look into the kafka code base and found some benchmarks that can be run (see https://github.com/apache/kafka/blob/trunk/tests/kafkatest/benchmarks/streams/streams_simple_benchmark_test.py) but I'm not sure how much that will help.

There are some safeguards available in the Graal SDK that we can leverage for safer execution.

Great! We should see how that maps to what we're currently doing and get that aligned. Answers my question as far as the KLIP goes though (maybe we should add a section on compatibility for it).

In interactive mode, it seems like the safest, most predictable option is to allow any running queries to continue using the UDF instance they were using before the inline UDF was changed or dropped. Creating a new query (or restarting an existing query) would lead to a new attempt to instantiate the UDF, which would either pull in the new logic (REPLACE) or error if the function was dropped (DROP).

This seems the most intuitive to me and simplifies things quite a bit, and I don't think we should hot-swap functions for running queries. It'd be nice to get more input from other contributors though (cc. @hjafarpour @MichaelDrogalis) and a section in the KLIP about it.

No, we can use the existing codehaus compiler for inlining Java UDFs.

Whether or not we actually need a separate KLIP (as I originally suggested) or just stage this single KLIP I'm not sure, but if this is possibility I think we should go for it. We'd be able to get some of this functionality in quickly and have time for the GraalVM discussion.

@MichaelDrogalis
Copy link
Contributor

@agavra @mitch-seymour I would prefer to take the immutable route and avoid supporting hot reloading functions. I think what you outlined for replacing functions and having only new queries pick up the new code is a good approach. With that in mind, I recommend we build the concept of versioning into functions at the outside for observability's sake.

@mitch-seymour
Copy link
Contributor Author

mitch-seymour commented Mar 20, 2019

@agavra @MichaelDrogalis thank you both for the feedback. Since we're in agreement about not re-instantiating a UDF in a running query, I can work in that direction :) Also, I agree that inline Java UDFs would be a good place to start. I'm going to rework some of the POC code to support inline Java UDFs and then submit an accompanying KLIP and PR if that sounds good to you. That will allow us to narrow down the current KLIP to just the Graal-related stuff and not the common syntax changes that are needed for both.

Oh, and once I revisit this KLIP after working on the inline Java UDFs, I can add details about the Graal safeguards for safer code execution.

@hjafarpour
Copy link
Contributor

Thanks @mitch-seymour for the great write up! I'm going to second what @MichaelDrogalis mentioned on avoiding reload of functions for running queries. Indeed, I would go one step further and would prevent altering or dropping functions that are being used in any running queries. If users want to change a function that is being used with any running query, they either should terminate those queries and restart them with the updated function or they should declare the changed function as a new function with different name. Versioning functions can help as @MichaelDrogalis mentioned but I think preventing the same function to have different behavior at the same time would help in reducing confusions and mistakes greatly!

@mitch-seymour mitch-seymour mentioned this pull request Mar 26, 2019
7 tasks
@mitch-seymour
Copy link
Contributor Author

Just an update, I just post a WIP PR for inline Java UDFs (see #2605). The inline UDFs can be versioned, but I still need to address some of the other feedback in this KLIP as well. Anyways, just wanted to share the above for visibility's sake :)

@MichaelDrogalis
Copy link
Contributor

@mitch-seymour Thanks! This is awesome stuff.

@hjafarpour That would make it pretty hard to do a rolling upgrade on a compatible change, though.

@gjones-r7
Copy link

gjones-r7 commented Sep 17, 2020

Hi. I found this pull request after seeing a Confluent blog post from 2 years ago:

https://www.confluent.io/blog/build-udf-udaf-ksql-5-0/

The next step in the ksqlDB UD(A)F roadmap is to support building UD(A)Fs with programming languages other than Java. This way, even a data scientist who only understands Python or JavaScript can write and deploy their own functions. In addition to this, there are many new and exciting features to come, so be sure to keep an eye out for the next ksqlDB releases.

Is this still considered to be on the roadmap? Many thanks.

@mitch-seymour
Copy link
Contributor Author

@gjones-r7 I created this a while back but got busy with other things unfortunately (including writing a book on ksqlDB lol). I was planning to resubmit this KLIP at some point and follow up on some of these ideas, as well as the POCs I created for this work, but realistically won't have time until maybe January. I'd love to see someone else pick this up if they have the bandwidth, otherwise I plan to revisit early next year.

Copy link

cla-assistant bot commented Nov 15, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@vvcephei
Copy link
Member

Closing stale PR. I'm sorry that we didn't get this done. Please resubmit as a new PR if you still want to get it in.

@vvcephei vvcephei closed this Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design-proposal Tag KLIP Prs with this label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants