-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spark 3883: SSL support for HttpServer and Akka #3571
Spark 3883: SSL support for HttpServer and Akka #3571
Conversation
Can one of the admins verify this patch? |
@pwendell could you please request this pr to be tested? |
Jenkins, test this please. Jenkins, add to whitelist. |
Test build #24120 has started for PR 3571 at commit
|
Test build #24120 has finished for PR 3571 at commit
|
Test FAILed. |
@pwendell is it possible to access log file somehow? I don't know how to replicate the problems - what is the operating system Jenkins is running on? |
@jacek-lewandowski from a quick look at the diff, it seems you didn't change anything w.r.t. the configuration. In master, there's no need to add a new config file nor all the different ways of loading it - all daemons should be loading spark-defaults.conf and so you could just use SparkConf for everything like I suggested in the old PR. Did you have a chance to look at that? |
@vanzin I just rebased it and fixed so that the tests passes. I'd like to figure out why Jenkins tests fail and then continue the changes. However, I saw in Spark code that it uses SparkConf now in master and worker. |
@vanzin I'm going to finish it very soon; will you be available to help me with those failing tests? |
Test logs are gone, so an admin would have to re-trigger them. I'm a little short on time to be able to test this myself at the moment... |
@vanzin I did some changes but I'm not sure about using Spark configuration in this case. At least it can be not so clear. I mean such cases as running executors. |
It could work if |
Hi @jacek-lewandowski , Yes, that's an issue. That's why I suggested in your original PR to use Yarn's approach here: Code has moved, here's the code in master now: |
Is it any reason why |
Which property file would they load? You want them to load the property file defined when launching the application, not the default file installed on the node since (i) it may not exist or (ii) it may differ from the one used to launch the job. So while it would be great to be able to make them load a property file, it feels like a separate feature. |
54aad3d
to
71599b5
Compare
Test build #24640 has started for PR 3571 at commit
|
Test build #24640 has finished for PR 3571 at commit
|
Test FAILed. |
Ok, I think I've found a solution, which is somehow based to what you've shown me. I've pushed some changes for you to take a look before I continue. The idea for Similar approach for Initially, I wanted to use "daemon" settings as defaults ("daemon" namespace is used to configure SSL for all daemon processes, like Master and Worker). However, it could introduce some vulnerability because the application could access keystores of those daemon processes. So, in short - the administration will have to configure SSL in at least two namespaces on each node: daemon and executor. It is also recommended to configure it in "driver" namespace to provide good defaults. In the environment, where authentication is not required, all the namespaces can be configured with the same settings. What do you think @vanzin ? |
Hi @jacek-lewandowski , I think that's an improvement (assuming that you'd remove all the code dealing with the separate ssl.conf, which is still there), but there's still something I'm no seeing. How are the SSL properties propagated from the launcher to the executors? If I set different options in my launcher that do not match those in the worker's configuration, it seems that things will fail with your new code. The reason why I pointed out the Yarn code is because that's exactly what it does. The properties you set in your launch process are used by the executors. You don't need to configure anything on the edge nodes. In standalone mode, of course you need to configure the worker daemons themselves, but you don't need to manage client configuration in every worker node. Anyway, it seems you're not planning to handle Yarn in this change at all (I didn't notice any changes there?), so we can always treat that as a future enhancement. I'll let other reviewers chime in on whether they think this limitation is acceptable in standalone mode. If I misunderstood something about your code, apologies, and please correct me! Looking at just the last commit, the only comment I had was the somehow confouding use of |
@vanzin the executor configuration set on the edge nodes is considered a default and the one which is used to connect to driver to fetch the configuration set in launcher. If the launcher configuration override the settings in "executor" namespace, the new settings will be used. However, still I have some doubts about custom SSL settings on executors. Imagine the following situation: Say you want to have separate keys for executors, the executors communicate with the Master. It is not a problem to add Master public key to the trust store of the executor. However, how would you add the executor public key to the trust store of the master? You'd need to reconfigure it and restart, or I'm missing something. According to Yarn mode, I'll provide support for it. Now this is my primary task so I'll work on this ticket full time :) |
71599b5
to
ff5a776
Compare
Test build #24741 has started for PR 3571 at commit
|
Test build #24741 has finished for PR 3571 at commit
|
Test FAILed. |
@jacek-lewandowski my understanding is that you don't need the master to know the public key of every executor. The master can either disable cert validation, or have a CA cert in its trust store that can then be used by an admin to sign the user's certificate, which the user can then use in his app. If you do something like that, a default configuration for apps wouldn't work, because each app can potentially have a different certificate. I can see this being particularly desirable for long-runnning apps such as Spark Streaming. Also, it seems there are legitimate style check failures in your patch - that's why tests aren't running. |
ff5a776
to
c4838b8
Compare
SPARK-3883: Added Scala docs for SSL
Test build #26544 has started for PR 3571 at commit
|
@JoshRosen I've just added the documentation. I hope there is enough of it :) |
@jacek-lewandowski Thanks for adding docs; I'll look over them now. |
import com.typesafe.config.{Config, ConfigFactory, ConfigValueFactory} | ||
import org.eclipse.jetty.util.ssl.SslContextFactory | ||
|
||
/** SSLOptions class is a common container for SSL configuration options. It offers methods to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor style nit, but do you mind placing the /**
on its own line? All of the Java/Scaladoc comments should generally look like
/**
* comments comments comments [...]
*/
Test build #26544 has finished for PR 3571 at commit
|
Test FAILed. |
Jenkins, retest this please. |
Overall, this looks good to me, so I'm going to just wait for Jenkins to pass then pull this in for 1.3. We can fix any documentation touch-ups or other things during the QA period. |
Test build #26548 has started for PR 3571 at commit
|
Test build #26548 has finished for PR 3571 at commit
|
Test FAILed. |
I don't get why the tests failed. I only added documentation! |
Jenkins, retest this please. |
Test build #26556 has started for PR 3571 at commit
|
Test build #26556 has finished for PR 3571 at commit
|
Test PASSed. |
I've fixed the documentation formatting nits myself, so I'm going to merge this into |
SPARK-3883: SSL support for Akka connections and Jetty based file servers.
This story introduced the following changes:
Refer to #3571 for discussion and details