Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-6435] spark-shell --jars option does not add all jars to classpath #5227

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion bin/spark-class2.cmd
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,10 @@ if not "x%JAVA_HOME%"=="x" set RUNNER=%JAVA_HOME%\bin\java

rem The launcher library prints the command to be executed in a single line suitable for being
rem executed by the batch interpreter. So read all the output of the launcher into a variable.
for /f "tokens=*" %%i in ('cmd /C ""%RUNNER%" -cp %SPARK_ASSEMBLY_JAR% org.apache.spark.launcher.Main %*"') do (
set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
"%RUNNER%" -cp %SPARK_ASSEMBLY_JAR% org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
set SPARK_CMD=%%i
)
del %LAUNCHER_OUTPUT%
%SPARK_CMD%
Original file line number Diff line number Diff line change
Expand Up @@ -243,7 +243,7 @@ static String quoteForBatchScript(String arg) {
boolean needsQuotes = false;
for (int i = 0; i < arg.length(); i++) {
int c = arg.codePointAt(i);
if (Character.isWhitespace(c) || c == '"' || c == '=') {
if (Character.isWhitespace(c) || c == '"' || c == '=' || c == ',' || c == ';') {
needsQuotes = true;
break;
}
Expand All @@ -260,15 +260,14 @@ static String quoteForBatchScript(String arg) {
quoted.append('"');
break;

case '=':
quoted.append('^');
break;

default:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you run SparkLauncherSuite on Windows? I added this specifically to fix an issue with that path (see 92a9cfb).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run SparkLauncherSuite on Windows and it's OK.
If double-quotation is parsed properly, = in double-quotation is not need to be escaped.

break;
}
quoted.appendCodePoint(cp);
}
if (arg.codePointAt(arg.length() - 1) == '\\') {
quoted.append("\\");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indented too far. Why would a backslash need escaping only in the final position?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In batch, backslash \ followed by double-quotation " is parsed as a escape charactor.
For example, \" means " itself.
We face a problem in this case:
java.exe -cp "C:\path\to\directory\" ...
The closing " doesn't work properly, so we should escape only the last \.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I assumed you would escape all backslashes if they are the escape character. Is that not necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backslash is escape character only when followed by ". I assumed it's the only case when the string ends with backslash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so it's not a general escape character, got it.

}
quoted.append("\"");
return quoted.toString();
}
Expand Down
6 changes: 1 addition & 5 deletions launcher/src/main/java/org/apache/spark/launcher/Main.java
Original file line number Diff line number Diff line change
Expand Up @@ -101,12 +101,9 @@ public static void main(String[] argsArray) throws Exception {
* The method quotes all arguments so that spaces are handled as expected. Quotes within arguments
* are "double quoted" (which is batch for escaping a quote). This page has more details about
* quoting and other batch script fun stuff: http://ss64.com/nt/syntax-esc.html
*
* The command is executed using "cmd /c" and formatted in single line, since that's the
* easiest way to consume this from a batch script (see spark-class2.cmd).
*/
private static String prepareWindowsCommand(List<String> cmd, Map<String, String> childEnv) {
StringBuilder cmdline = new StringBuilder("cmd /c \"");
StringBuilder cmdline = new StringBuilder("");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you're sure about this, but we definitely don't always need cmd /c to run a windows command directly from Java? In Linux you have to have bash to interpret most of what we're used to typing on the command line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we are using cmd.exe /c in existing *.cmd is that the environment would be polluted by the executing commands.
But we already have *2.cmd and *.cmd calls *2.cmd by cmd.exe /c, so I think we don't need another cmd.exe /c when execute java.exe.
Rather, if we use cmd.exe /c, we face another problem.
We should execute like this command:
"C:\Program Files\Java\jdk1.7.0_67\bin\java.exe" -cp "C:\path\to\somewhere\;..." org.apache.spark.deploy.SparkSubmit ...
But the Launcher returns this string:
cmd.exe /c ""C:\Program Files\Java\jdk1.7.0_67\bin\java.exe" -cp "C:\path\to\somewhere\;..." org.apache.spark.deploy.SparkSubmit ..."
cmd.exe /c needs double-quoted command string, but the java.exe path and other arguments might be already double-quoted, so it the returned string is weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I trust your judgment on this. This is confined to Windows support and I believe you're closest to this code. Note that you don't need the argument "" now and there's still an indent issue above. Take a moment to see if it would be easy to add a test in SparkLauncherSuite; not sure if it is. Otherwise LGTM.

for (Map.Entry<String, String> e : childEnv.entrySet()) {
cmdline.append(String.format("set %s=%s", e.getKey(), e.getValue()));
cmdline.append(" && ");
Expand All @@ -115,7 +112,6 @@ private static String prepareWindowsCommand(List<String> cmd, Map<String, String
cmdline.append(quoteForBatchScript(arg));
cmdline.append(" ");
}
cmdline.append("\"");
return cmdline.toString();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,10 @@ public void testWindowsBatchQuoting() {
assertEquals("\"a b c\"", quoteForBatchScript("a b c"));
assertEquals("\"a \"\"b\"\" c\"", quoteForBatchScript("a \"b\" c"));
assertEquals("\"a\"\"b\"\"c\"", quoteForBatchScript("a\"b\"c"));
assertEquals("\"ab^=\"\"cd\"\"\"", quoteForBatchScript("ab=\"cd\""));
assertEquals("\"ab=\"\"cd\"\"\"", quoteForBatchScript("ab=\"cd\""));
assertEquals("\"a,b,c\"", quoteForBatchScript("a,b,c"));
assertEquals("\"a;b;c\"", quoteForBatchScript("a;b;c"));
assertEquals("\"a,b,c\\\\\"", quoteForBatchScript("a,b,c\\"));
}

@Test
Expand Down