Windoze/jdbc python #124
Conversation
Hi @windoze, we want to retrospectively follow https://github.com/linkedin/feathr/blob/main/docs/dev_guide/pull_request_guideline.md. Could you create a GitHub issue for your PR? Also let's sync first to align on the technical direction with @xiaoyongzhu
This PR addresses #102
@@ -30,6 +30,7 @@ def submit_feathr_job(self, job_name: str, main_jar_path: str, main_class_name:
arguments (str): all the arguments you want to pass into the spark job
job_tags (str): tags of the job, for example you might want to put your user ID, or a tag with certain information
configuration (Dict[str, str]): Additional configs for the spark job
properties (Dict[str, str]): Additional System Properties for the spark job
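For concreteness, a hypothetical call using the new parameter might look like the sketch below; `launcher` stands in for whatever submission client exposes `submit_feathr_job`, and all argument values here are illustrative assumptions, not taken from this PR:

```python
import os

# Hypothetical usage of the new `properties` parameter; `launcher` is an
# assumed, already-constructed Spark submission client, and every value
# below is illustrative only.
launcher.submit_feathr_job(
    job_name="feathr_jdbc_demo",
    main_jar_path="abfss://jars/feathr.jar",
    main_class_name="com.linkedin.feathr.offline.job.FeatureJoinJob",
    arguments="--join-config join.conf",
    job_tags="owner=windoze",
    configuration={"spark.executor.memory": "4g"},
    # Secrets are read from the client environment, never from config files.
    properties={"MYJDBCSOURCE_PASSWORD": os.environ["MYJDBCSOURCE_PASSWORD"]},
)
```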
What's this system property for?
I'm not sure if I understand the background of this PR, but from the associated link, I feel this PR is to expose the JDBC sources in the Python package. @windoze maybe you can add a bit more description to make it clear for the reviewers?
Please add more details to the description @windoze
This is the Python part corresponding to #101.
In #101, I added Scala code to handle JDBC sources where multiple sources need different auth credentials, but the Python client still needs this update so that users can: 1. Create a JdbcSource; 2. Pass the required parameters to the Spark job.
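A rough sketch of what step 1 could look like from the Python side; the `JdbcSource` constructor, its import path, and its parameter names are assumptions for illustration, not the final API:

```python
# Illustrative sketch only: JdbcSource and its parameters are assumed here,
# not taken verbatim from the API introduced by this PR.
from feathr import JdbcSource  # hypothetical import path

# Step 1: declare a JDBC source; `auth` selects which suffix the Spark job
# will look for (_USER/_PASSWORD for "USERPASS", _TOKEN for "TOKEN").
src = JdbcSource(
    name="myJdbcSource",
    url="jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb",
    dbtable="dbo.features",
    auth="USERPASS",
)
```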
This PR adds a new Spark job argument `--system-properties`, which is used to pass secrets from the Python client to the Spark job, as we shall not store secrets directly inside the config files. The key of each entry is the data source name with a `_USER`/`_PASSWORD` or `_TOKEN` suffix, depending on the data source auth type, and the value is taken from the current environment variables with the corresponding key on the Python client side.
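As an illustration of that naming convention, the client-side collection logic might look roughly like the following; the helper name and its exact behavior are assumptions, not code from this PR:

```python
import os

def collect_jdbc_secrets(source_names):
    """Hypothetical helper: gather <SOURCE>_USER/_PASSWORD/_TOKEN entries
    from the client's environment so they can be forwarded to the Spark
    job via --system-properties instead of being stored in config files."""
    props = {}
    for name in source_names:
        for suffix in ("_USER", "_PASSWORD", "_TOKEN"):
            key = name.upper() + suffix
            if key in os.environ:
                props[key] = os.environ[key]
    return props

# Example: with MYJDBCSOURCE_USER and MYJDBCSOURCE_PASSWORD exported,
# collect_jdbc_secrets(["myJdbcSource"]) returns
# {"MYJDBCSOURCE_USER": "...", "MYJDBCSOURCE_PASSWORD": "..."}.
```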