
[SPARK-24319][SPARK SUBMIT] Fix spark-submit execution where no main class is required. #21450

Closed
wants to merge 7 commits

Conversation

@gaborgsomogyi (Contributor) commented May 29, 2018

What changes were proposed in this pull request?

After PR #20925 it is no longer possible to execute the following commands:

  • run-example
  • run-example --help
  • run-example --version
  • run-example --usage-error
  • run-example --status ...
  • run-example --kill ...

This PR re-enables execution of the commands listed above.
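For illustration only, a minimal sketch (not the actual patch) of the kind of launcher-side check this enables: arguments that never launch user code are treated as "special", so no application resource or main class is demanded. The class and method names below are hypothetical.

import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the merged code: spark-submit arguments that never
// launch user code and therefore need neither an app resource nor a main class.
final class SpecialCommandCheck {
  private static final List<String> SPECIAL_ARGS =
      Arrays.asList("--help", "--version", "--usage-error", "--kill", "--status");

  static boolean isSpecialCommand(List<String> args) {
    // A bare run-example (empty args) is handled separately by printing usage.
    return args.stream().anyMatch(SPECIAL_ARGS::contains);
  }
}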

How was this patch tested?

Existing unit tests were extended and additional ones were written.

@gaborgsomogyi (Contributor, Author)

test this please

@SparkQA commented May 29, 2018

Test build #91262 has finished for PR 21450 at commit a69850b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gaborgsomogyi (Contributor, Author)

cc @vanzin

@SparkQA commented May 29, 2018

Test build #91267 has finished for PR 21450 at commit a69850b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented May 29, 2018

This doesn't seem to be addressing the issue reported in the bug. The exact same error happens with your patch:

$ ./bin/run-example 
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
        at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:185)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:300)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:166)
        at org.apache.spark.launcher.Main.main(Main.java:86)

@gaborgsomogyi (Contributor, Author)

Updated the description to reflect the no-arg case as well.

@SparkQA commented May 30, 2018

Test build #91299 has finished for PR 21450 at commit 9052dff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -229,7 +238,7 @@
args.add(join(",", pyFiles));
}

-    if (isExample) {
+    if (isExample && requiresMainClass) {
Contributor:

In what situation will requiresAppResource be different from requiresMainClass at this point?

It seems to me they're basically the same. The only point in your code where they differ is in the no-arg constructor, which is not used by the run-example script.

Seems to me like you could rename the existing variable to isSpecialCommand or something and use it instead for both cases.

Contributor Author:

Originally I was hesitant to merge them and lose the two different meanings, but it's better this way. Added a small comment.
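For context, a hedged sketch of what the merged flag could look like; the field name isSpecialCommand follows the suggestion above, while the class and method are simplified stand-ins for SparkSubmitCommandBuilder, not the actual code.

// Simplified stand-in for SparkSubmitCommandBuilder; illustration only.
class CommandBuilderSketch {
  // True for --help, --version, --kill, --status and --usage-error: commands that
  // never launch user code, so neither an app resource nor a main class is needed.
  private boolean isSpecialCommand = false;

  void checkExampleClass(boolean isExample, String exampleClass) {
    // Only insist on the example main class when a real application will run
    // (mirrors the "if (isExample && ...)" guard in the diff above).
    if (isExample && !isSpecialCommand && exampleClass == null) {
      throw new IllegalArgumentException("Missing example class name.");
    }
  }
}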

- private void testCLIOpts(String opt) throws Exception {
-   List<String> helpArgs = Arrays.asList(opt, "driver-20160531171222-0000");
+ private void testCLIOpts(String appResource, String opt, List<String> params) throws Exception {
+   List<String> helpArgs = new ArrayList<>();
Contributor:

Since you're changing this, probably better to rename the variable since these are not help arguments.

Contributor Author:

Renamed.
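A hedged sketch of how the renamed helper might look after the change; only the signature comes from the diff above, while the body, the builder call, and the assertion are assumptions about the suite's internals (imports and the surrounding test class are omitted).

// Hypothetical sketch; signature from the diff above, body illustrative only.
private void testCLIOpts(String appResource, String opt, List<String> params) throws Exception {
  List<String> args = new ArrayList<>();
  if (appResource != null) {
    args.add(appResource);
  }
  args.add(opt);
  args.addAll(params);
  // Building the command should now succeed for --kill/--status style options
  // even though no main class or app resource is supplied.
  Map<String, String> env = new HashMap<>();
  List<String> cmd = new SparkSubmitCommandBuilder(args).buildCommand(env);
  assertFalse(cmd.isEmpty());
}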

@@ -190,6 +194,23 @@ public void testSparkRShell() throws Exception {
env.get("SPARKR_SUBMIT_ARGS"));
}

+  @Test
+  public void testExamplesRunnerNoArg() throws Exception {
@BryanCutler (Member) commented May 30, 2018

When no args are given this is what I get as the output:

$ bin/run-example
Error: Must specify a primary resource (JAR or Python or R file)
Run with --help for usage help or --verbose for debug output

It doesn't really make sense to ask for a primary resource when running an example; how about just printing usage? This is what spark-submit does with no args.

Contributor:

Right, the output should make sense also.

When the launcher library detects an invalid command it should be running spark-submit --usage-error, which should print usage with the overridden description from the run-example script. That's done in Main.java when it catches an IllegalArgumentException.

Seems that's not exactly what's happening here.

Contributor Author:

Now it prints out the usage.
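For context, a hedged sketch of the fallback path described above: it mirrors the idea in the launcher's Main.java (catch the IllegalArgumentException and fall back to a usage-printing spark-submit invocation) but is simplified and not the literal source; imports and the surrounding class are omitted.

// Simplified sketch, not the literal Main.java code: if the submit arguments are
// invalid, build a command that only prints usage instead of failing.
private static List<String> buildOrPrintUsage(
    SparkSubmitCommandBuilder builder,
    Map<String, String> env) throws IOException {
  try {
    return builder.buildCommand(env);
  } catch (IllegalArgumentException e) {
    System.err.println("Error: " + e.getMessage());
    System.err.println();
    // --usage-error is one of the "special" commands this PR lets through, so the
    // second build no longer throws; run-example overrides the usage text itself.
    return new SparkSubmitCommandBuilder(Arrays.asList("--usage-error")).buildCommand(env);
  }
}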

@SparkQA commented Jun 1, 2018

Test build #91390 has finished for PR 21450 at commit 12a5145.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Jun 4, 2018

This is still missing something.

$ ./bin/run-example --master local
Exception in thread "main" java.lang.IllegalArgumentException: Missing application resource.
        at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:241)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitArgs(SparkSubmitCommandBuilder.java:189)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildSparkSubmitCommand(SparkSubmitCommandBuilder.java:304)
        at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:170)
        at org.apache.spark.launcher.Main.main(Main.java:86)

@HyukjinKwon (Member)

ok to test

@SparkQA commented Jun 9, 2018

Test build #91598 has finished for PR 21450 at commit 12a5145.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jun 12, 2018

Test build #91698 has finished for PR 21450 at commit b03e3de.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gaborgsomogyi (Contributor, Author)

The issue seems unrelated.

[error] (kubernetes-integration-tests/*:checkstyle) java.io.FileNotFoundException: checkstyle-config.xml (No such file or directory)

Trying to rebase the changes to the latest master.

@SparkQA commented Jun 12, 2018

Test build #91700 has finished for PR 21450 at commit d5a9ca3.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • logInfo(s\"Using output committer class $
  • public class JavaPowerIterationClusteringExample
  • class GBTClassificationModel(TreeEnsembleModel, JavaClassificationModel, JavaMLWritable,
  • class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, JavaMLReadable,
  • class PrefixSpan(JavaParams):
  • public class MaskExpressionsUtils
  • case class ArrayRemove(left: Expression, right: Expression)
  • trait MaskLike
  • trait MaskLikeWithN extends MaskLike
  • case class Mask(child: Expression, upper: String, lower: String, digit: String)
  • case class MaskFirstN(
  • case class MaskLastN(
  • case class MaskShowFirstN(
  • case class MaskShowLastN(
  • case class MaskHash(child: Expression)
  • abstract class FileFormatDataWriter(
  • class EmptyDirectoryDataWriter(
  • class SingleDirectoryDataWriter(
  • class DynamicPartitionDataWriter(
  • class WriteJobDescription(
  • case class WriteTaskResult(commitMsg: TaskCommitMessage, summary: ExecutedWriteSummary)
  • case class ExecutedWriteSummary(

OptionParser parser = new OptionParser(true);
parser.parse(submitArgs);
this.requiresAppResource = parser.requiresAppResource;
if (!submitArgs.isEmpty()) {
Contributor:

Do you need this check? parser.isSpecialCommand is initialized to false, so seems like it's already handled.

Contributor Author:

Removed as it leads to the same result.
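In other words (an illustrative restatement, not the merged code): since the flag defaults to false and parsing an empty list is a no-op, the emptiness guard adds nothing.

// Illustration of why the guard is redundant; names follow the snippet above.
OptionParser parser = new OptionParser(true);
parser.parse(submitArgs);                         // no-op when submitArgs is empty
this.isSpecialCommand = parser.isSpecialCommand;  // stays false for an empty list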

* Prepare spark commands with the appropriate command builder.
* If printLaunchCommand is set then the commands will be printed to the stderr.
*/
private static List<String> buildCommand(AbstractCommandBuilder builder,
Contributor:

Please follow the Spark style for multi-line parameter lists.

Contributor Author:

Fixed.

Just for the sake of my understanding, what does "many" mean in the following? All?
"For Java code, Apache Spark follows Oracle's Java code conventions. Many Scala guidelines below also apply to Java."

Contributor:

Read that as "unless it cannot apply to Java at all, follow the Scala style".


@Test
public void testExamplesRunnerWithMasterNoMainClass() throws Exception {
testExamplesRunnerWithMasterNoMainClassEx.expect(IllegalArgumentException.class);
Contributor:

I have a slight preference for a try...catch in this case, which would also be less code, but either is fine. But I'd pick a shorter and more generic name for the rule (since it could be reused in other tests if needed later).

Contributor Author:

Renamed.
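For reference, a hedged sketch of the try...catch style mentioned above; the argument list, the builder call, and the test environment (SPARK_HOME / spark.test.home configured as in the real suite) are assumptions, not the merged test.

// Hypothetical try...catch variant of the test; assumes the suite's usual
// SPARK_HOME setup so command building gets far enough to hit argument validation.
@Test
public void testExamplesRunnerWithMasterNoMainClass() throws Exception {
  List<String> args = Arrays.asList("run-example", "--master", "local");
  try {
    new SparkSubmitCommandBuilder(args).buildCommand(new HashMap<>());
    fail("Expected IllegalArgumentException when no example class is given");
  } catch (IllegalArgumentException e) {
    // Expected: Main.java turns this into the usage output for run-example.
  }
}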

@SparkQA commented Jun 13, 2018

Test build #91763 has finished for PR 21450 at commit 8a8067e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* If printLaunchCommand is set then the commands will be printed to the stderr.
*/
private static List<String> buildCommand(
AbstractCommandBuilder builder,
Contributor:

The style is to double indent parameter lists... when in doubt, look at the rest of the code, not the style guide.
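For illustration, the double-indented continuation style being asked for; the helper below is only a sketch of what such a method could look like, not the merged code, and its body is an assumption.

// Sketch showing 4-space ("double indent") continuation for the parameter list,
// versus the 2-space block indent used for the body.
private static List<String> buildCommand(
    AbstractCommandBuilder builder,
    Map<String, String> env,
    boolean printLaunchCommand) throws IOException, IllegalArgumentException {
  List<String> cmd = builder.buildCommand(env);
  if (printLaunchCommand) {
    System.err.println("Spark Command: " + String.join(" ", cmd));
  }
  return cmd;
}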

@SparkQA commented Jun 14, 2018

Test build #91853 has finished for PR 21450 at commit 5a737d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Jun 14, 2018

Looks good. "run-example --kill blah" is kinda weird, but not worth handling that case.

Merging to master.

@asfgit closed this in 18cb0c0 on Jun 14, 2018