
[SPARK-20075][CORE][WIP] Support classifier, packaging in Maven coordinates #17416

Closed

Conversation

@srowen (Member) commented Mar 24, 2017

Do not merge

Initial attempt to add classifier, packaging support. Ivy integration still not quite right.

What changes were proposed in this pull request?

This adds group:artifact:packaging:classifier:version support to the Maven/Ivy package infrastructure available to spark-submit et al. via --packages.

How was this patch tested?

Existing tests and updated tests.

@srowen srowen changed the title [SPARK-20075][CORE][WIP [SPARK-20075][CORE][WIP] Support classifier, packaging in Maven coordinates Mar 24, 2017
@@ -941,6 +965,12 @@ private[spark] object SparkSubmitUtils {
val ri = ModuleRevisionId.newInstance(mvn.groupId, mvn.artifactId, mvn.version)
val dd = new DefaultDependencyDescriptor(ri, false, false)
dd.addDependencyConfiguration(ivyConfName, ivyConfName + "(runtime)")
if (mvn.classifier.isDefined) {
@srowen (Member, Author) commented on this diff:

CC @brkyvz and @BryanCutler (and maybe @vanzin) for a look. This does not quite work yet, but it's close. I believe it's this stanza that's not quite right, where I try to add classifier info to Ivy's description of an artifact.

For example, ./bin/spark-shell --packages edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1 works and seems to resolve the right artifact with its classifier locally, with no errors, but the jar doesn't seem to end up on the classpath.

I don't need you to debug it for me; I'm mostly wondering whether you know Ivy much better than I do (I hadn't seen it until today) and can spot an obvious error in this approach. I can't quite figure out which Ivy concept maps to Maven's "classifier".

A Member replied:

As far as I can tell, this looks ok to me. This seems like the right way to add an artifact with classifier, but I'm not certain. Does the artifact show up in the ResolveReport at the end of SparkSubmit.resolveMavenCoordinates?
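
One way to answer that question is to dump what the ResolveReport actually contains. A hypothetical debugging snippet, assuming a ResolveReport rr (as returned by ivy.resolve) is in scope; the extra attributes are where a Maven classifier would surface if Ivy kept it:

// List every resolved artifact along with its extra attributes.
rr.getAllArtifactsReports.foreach { adr =>
  val a = adr.getArtifact
  println(s"${a.getModuleRevisionId} ${a.getName}.${a.getExt} " +
    s"extraAttributes=${a.getExtraAttributes}")
}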

A Contributor replied:

I have a feeling we might be missing it when we copy the files from the local repo to the ivy cache. The way the resolution works is as follows:

  1. Ivy searches local cache
  2. Ivy searches remote repos
  3. Ivy downloads from remote repo to local repo
  4. Ivy copies the file from local repo to cache

Then the files are served from the local cache. Step 4 is called retrieval, and it happens on line 1159. I think that's what you're missing.
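
For reference, step 4 corresponds roughly to the retrieve call sketched below. This is a minimal sketch against Ivy 2.4's API, assuming ivy, a ResolveReport rr, and a packagesDirectory are in scope (not defined here). The parenthesized (-[classifier]) token is optional in Ivy retrieve patterns, which is what would keep classified and unclassified artifacts from colliding on the same file name:

import java.io.File
import org.apache.ivy.core.retrieve.RetrieveOptions

// Copy the resolved artifacts out of the Ivy cache into the jars directory;
// "(-[classifier])" expands only when an artifact actually has a classifier.
ivy.retrieve(
  rr.getModuleDescriptor.getModuleRevisionId,
  packagesDirectory.getAbsolutePath + File.separator +
    "[organization]_[artifact]-[revision](-[classifier]).[ext]",
  new RetrieveOptions().setConfs(Array("default")))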

@SparkQA commented Mar 24, 2017

Test build #75179 has finished for PR 17416 at commit 1524390.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case Array(groupId, artifactId, version) =>
  new MavenCoordinate(groupId, artifactId, version)
case Array(groupId, artifactId, packaging, version) =>
  new MavenCoordinate(groupId, artifactId, Some(packaging), None, version)
A Contributor commented on this diff:

Can a classifier be given without the packaging?

@srowen (Member, Author) replied:

It doesn't seem like it, given https://maven.apache.org/pom.html#Maven_Coordinates . If there are four dimensions, the third is packaging.
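
To make the supported shapes concrete, here is a minimal, self-contained sketch of the parsing rule under discussion; the exact MavenCoordinate fields and the parse helper are illustrative, not the PR's code:

case class MavenCoordinate(
    groupId: String,
    artifactId: String,
    packaging: Option[String],
    classifier: Option[String],
    version: String)

// group:artifact:version, group:artifact:packaging:version, or
// group:artifact:packaging:classifier:version. A classifier always
// comes with a packaging, never on its own.
def parse(coordinate: String): MavenCoordinate = coordinate.split(":") match {
  case Array(g, a, v) => MavenCoordinate(g, a, None, None, v)
  case Array(g, a, p, v) => MavenCoordinate(g, a, Some(p), None, v)
  case Array(g, a, p, c, v) => MavenCoordinate(g, a, Some(p), Some(c), v)
  case _ => throw new IllegalArgumentException(
    s"Provided Maven coordinate is invalid: $coordinate")
}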

@srowen (Member, Author) commented Mar 28, 2017

Thanks @BryanCutler, here's the current full output after trying to fix per #17416 (comment). It looks like progress, I think, but something is still missing here. Working on it...

Parsed arguments:
  master                  local[*]
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.repl.Main
  primaryResource         spark-shell
  name                    Spark shell
  childArgs               []
  jars                    null
  packages                edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file null:
  

    
Ivy Default Cache set to: /Users/srowen/.ivy2/cache
The jars for the packages stored in: /Users/srowen/.ivy2/jars
:: loading settings :: url = jar:file:/Users/srowen/Documents/Cloudera/spark/assembly/target/scala-2.11/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
edu.stanford.nlp#stanford-corenlp added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found edu.stanford.nlp#stanford-corenlp;3.4.1 in local-m2-cache
	found com.io7m.xom#xom;1.2.10 in local-m2-cache
	found xml-apis#xml-apis;1.3.03 in local-m2-cache
	found xerces#xercesImpl;2.8.0 in local-m2-cache
	found xalan#xalan;2.7.0 in local-m2-cache
	found joda-time#joda-time;2.1 in local-m2-cache
	found de.jollyday#jollyday;0.4.7 in local-m2-cache
	found javax.xml.bind#jaxb-api;2.2.7 in local-m2-cache
	found com.googlecode.efficient-java-matrix-library#ejml;0.23 in local-m2-cache
	found javax.json#javax.json-api;1.0 in local-m2-cache
downloading file:/Users/srowen/.m2/repository/edu/stanford/nlp/stanford-corenlp/3.4.1/stanford-corenlp-3.4.1-models.jar ...
	[SUCCESSFUL ] edu.stanford.nlp#stanford-corenlp;3.4.1!stanford-corenlp.jar (264ms)
downloading file:/Users/srowen/.m2/repository/com/io7m/xom/xom/1.2.10/xom-1.2.10.jar ...
	[SUCCESSFUL ] com.io7m.xom#xom;1.2.10!xom.jar (2ms)
downloading file:/Users/srowen/.m2/repository/joda-time/joda-time/2.1/joda-time-2.1.jar ...
	[SUCCESSFUL ] joda-time#joda-time;2.1!joda-time.jar (3ms)
downloading file:/Users/srowen/.m2/repository/de/jollyday/jollyday/0.4.7/jollyday-0.4.7.jar ...
	[SUCCESSFUL ] de.jollyday#jollyday;0.4.7!jollyday.jar (2ms)
downloading file:/Users/srowen/.m2/repository/com/googlecode/efficient-java-matrix-library/ejml/0.23/ejml-0.23.jar ...
	[SUCCESSFUL ] com.googlecode.efficient-java-matrix-library#ejml;0.23!ejml.jar (2ms)
downloading file:/Users/srowen/.m2/repository/javax/json/javax.json-api/1.0/javax.json-api-1.0.jar ...
	[SUCCESSFUL ] javax.json#javax.json-api;1.0!javax.json-api.jar(bundle) (1ms)
downloading file:/Users/srowen/.m2/repository/xml-apis/xml-apis/1.3.03/xml-apis-1.3.03.jar ...
	[SUCCESSFUL ] xml-apis#xml-apis;1.3.03!xml-apis.jar (2ms)
downloading file:/Users/srowen/.m2/repository/xerces/xercesImpl/2.8.0/xercesImpl-2.8.0.jar ...
	[SUCCESSFUL ] xerces#xercesImpl;2.8.0!xercesImpl.jar (4ms)
downloading file:/Users/srowen/.m2/repository/xalan/xalan/2.7.0/xalan-2.7.0.jar ...
	[SUCCESSFUL ] xalan#xalan;2.7.0!xalan.jar (6ms)
downloading file:/Users/srowen/.m2/repository/javax/xml/bind/jaxb-api/2.2.7/jaxb-api-2.2.7.jar ...
	[SUCCESSFUL ] javax.xml.bind#jaxb-api;2.2.7!jaxb-api.jar (2ms)
:: resolution report :: resolve 7773ms :: artifacts dl 295ms
	:: modules in use:
	com.googlecode.efficient-java-matrix-library#ejml;0.23 from local-m2-cache in [default]
	com.io7m.xom#xom;1.2.10 from local-m2-cache in [default]
	de.jollyday#jollyday;0.4.7 from local-m2-cache in [default]
	edu.stanford.nlp#stanford-corenlp;3.4.1 from local-m2-cache in [default]
	javax.json#javax.json-api;1.0 from local-m2-cache in [default]
	javax.xml.bind#jaxb-api;2.2.7 from local-m2-cache in [default]
	joda-time#joda-time;2.1 from local-m2-cache in [default]
	xalan#xalan;2.7.0 from local-m2-cache in [default]
	xerces#xercesImpl;2.8.0 from local-m2-cache in [default]
	xml-apis#xml-apis;1.3.03 from local-m2-cache in [default]
	:: evicted modules:
	xml-apis#xml-apis;2.0.2 by [xml-apis#xml-apis;1.3.03] in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   11  |   10  |   10  |   1   ||   10  |   10  |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	10 artifacts copied, 0 already retrieved (212755kB/187ms)
Main class:
org.apache.spark.repl.Main
Arguments:

System properties:
(SPARK_SUBMIT,true)
(spark.app.name,Spark shell)
(spark.jars,file:/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar,file:/Users/srowen/.ivy2/jars/com.io7m.xom_xom-1.2.10.jar,file:/Users/srowen/.ivy2/jars/joda-time_joda-time-2.1.jar,file:/Users/srowen/.ivy2/jars/de.jollyday_jollyday-0.4.7.jar,file:/Users/srowen/.ivy2/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar,file:/Users/srowen/.ivy2/jars/javax.json_javax.json-api-1.0.jar,file:/Users/srowen/.ivy2/jars/xml-apis_xml-apis-1.3.03.jar,file:/Users/srowen/.ivy2/jars/xerces_xercesImpl-2.8.0.jar,file:/Users/srowen/.ivy2/jars/xalan_xalan-2.7.0.jar,file:/Users/srowen/.ivy2/jars/javax.xml.bind_jaxb-api-2.2.7.jar)
(spark.submit.deployMode,client)
(spark.master,local[*])
Classpath elements:
/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar
/Users/srowen/.ivy2/jars/com.io7m.xom_xom-1.2.10.jar
/Users/srowen/.ivy2/jars/joda-time_joda-time-2.1.jar
/Users/srowen/.ivy2/jars/de.jollyday_jollyday-0.4.7.jar
/Users/srowen/.ivy2/jars/com.googlecode.efficient-java-matrix-library_ejml-0.23.jar
/Users/srowen/.ivy2/jars/javax.json_javax.json-api-1.0.jar
/Users/srowen/.ivy2/jars/xml-apis_xml-apis-1.3.03.jar
/Users/srowen/.ivy2/jars/xerces_xercesImpl-2.8.0.jar
/Users/srowen/.ivy2/jars/xalan_xalan-2.7.0.jar
/Users/srowen/.ivy2/jars/javax.xml.bind_jaxb-api-2.2.7.jar


Warning: Local jar /Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar does not exist, skipping.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/03/28 22:31:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/28 22:31:39 ERROR SparkContext: Failed to add file:/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar to Spark environment
java.io.FileNotFoundException: Jar /Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar not found
	at org.apache.spark.SparkContext.liftedTree1$1(SparkContext.scala:1820)
	at org.apache.spark.SparkContext.addJar(SparkContext.scala:1817)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:466)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:466)
	at scala.collection.immutable.List.foreach(List.scala:381)
...

@BryanCutler (Member) commented:

So is the problem that it downloads stanford-corenlp-3.4.1-models.jar but thinks it is stanford-corenlp-3.4.1.jar?

It looks like it might be possible to add the classifier to ModuleRevisionId.newInstance; have you tried just doing that instead of dd.addDependencyArtifact?

If I have some time, I'll give it a shot and see what's going on...

@srowen (Member, Author) commented Mar 30, 2017

Yeah, it clearly downloads the right models .jar file, puts it in the .ivy cache correctly (I can find it there as expected) but then spark.jars isn't set correctly:

...
(spark.jars,file:/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar)
...

Let me push some more updates, including the one you suggested; it didn't seem to change this, but I wouldn't expect it to. Still tracing down exactly what sets this value of spark.jars.

@SparkQA commented Mar 30, 2017

Test build #75389 has finished for PR 17416 at commit dec7bfb.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member, Author) commented Mar 30, 2017

The essence of the problem is that the ResolveReport from ivy.resolve no longer contains any information about the classifier. I can't see why or what needs to be configured in Ivy to make that happen.

It is possible to write some code to stitch that extra info back onto the resolved artifacts, I guess. I think that would still leave some corner cases that would not work, like depending on two artifacts that are identical except for a classifier.

@sethah (Contributor) commented Apr 14, 2017

@srowen Can you confirm what happens when the jars are not found in your local m2 cache? Do you still find the -models jar in the ivy2 cache?

@srowen (Member, Author) commented Apr 14, 2017

@sethah yeah it downloads the right models jar and copies it into the ivy cache. It just doesn't know that it's what belongs on the classpath. spark.jars doesn't get the memo and I can see why, because the ResolveReport doesn't return the fact that these artifacts have classifiers attached.

I can't figure it out; I'd be surprised if it can't work but don't know enough Ivy to fix it. I was thinking of manually mapping the input artifacts' classifiers to the resolved ones' classifiers but that won't work in the general case.

This is mostly to see whether you spot something I missed. It's a corner case, but it causes a problem for a few key artifacts that need classifiers. If it's not obvious how to continue, I think I'll shelve this for now and admit defeat.

@BryanCutler (Member) commented:

@srowen, I finally had some time to look into this and I was able to get the correct jar on the classpath. The fix was to use the code you had in the previous commit for SparkSubmit.addDependenciesToIvy, so that the extraAttributes are set with dd.addDependencyArtifact and don't need to be in the ModuleRevisionId; so it was my bad advice that probably screwed this up :<

The reason is that when the DefaultDependencyDescriptor gets resolved in DefaultModuleDescriptor.java, if there are no artifacts defined, it adds one but does not copy over the extraAttributes; that's why the resolve report doesn't know about them. But if there are artifacts (which come from addDependencyArtifact), then the extraAttributes are carried over. Wow, this is really confusing; hopefully this makes sense. See the code below: BasicResolver.getDependency(DependencyDescriptor dd, ResolveData data) calls DefaultModuleDescriptor.newDefaultInstance:

public static DefaultModuleDescriptor newDefaultInstance(ModuleRevisionId mrid,
            DependencyArtifactDescriptor[] artifacts) {
        DefaultModuleDescriptor moduleDescriptor = new DefaultModuleDescriptor(mrid, "release",
                null, true);
        moduleDescriptor.addConfiguration(new Configuration(DEFAULT_CONFIGURATION));
        if (artifacts != null && artifacts.length > 0) {
            for (int i = 0; i < artifacts.length; i++) {
                moduleDescriptor.addArtifact(DEFAULT_CONFIGURATION,
                    new MDArtifact(moduleDescriptor, artifacts[i].getName(),
                            artifacts[i].getType(), artifacts[i].getExt(), artifacts[i].getUrl(),
                            artifacts[i].getExtraAttributes()));
            }
        } else {
            moduleDescriptor.addArtifact(DEFAULT_CONFIGURATION, new MDArtifact(moduleDescriptor,
                    mrid.getName(), "jar", "jar"));
        }
        moduleDescriptor.setLastModified(System.currentTimeMillis());
        return moduleDescriptor;
    }
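
In Scala terms, the approach that survives newDefaultInstance looks roughly like the sketch below: declaring an explicit dependency artifact is what carries the extraAttributes through resolution. This is a minimal sketch against Ivy 2.4's API, with mvn and ivyConfName as in the diff above; the "classifier" extra-attribute key is an assumption, not confirmed from the PR's diff:

import scala.collection.JavaConverters._
import org.apache.ivy.core.module.descriptor.{DefaultDependencyArtifactDescriptor, DefaultDependencyDescriptor}
import org.apache.ivy.core.module.id.ModuleRevisionId

val ri = ModuleRevisionId.newInstance(mvn.groupId, mvn.artifactId, mvn.version)
val dd = new DefaultDependencyDescriptor(ri, false, false)
// An explicit artifact descriptor (rather than the one Ivy synthesizes)
// keeps the extraAttributes, so the classifier survives into the resolve report.
mvn.classifier.foreach { c =>
  dd.addDependencyArtifact(ivyConfName,
    new DefaultDependencyArtifactDescriptor(
      dd, mvn.artifactId, "jar", "jar", null, Map("classifier" -> c).asJava))
}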

I think that some other code you added in the second commit was also required, which is maybe why it didn't work for you in the first place, but give it another try. Here is the output from my test; it looks like it should work now:

bin/spark-submit --packages edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1 -v examples/src/main/python/pi.py
Using properties file: /home/bryan/git/spark/conf/spark-defaults.conf
Adding default property: spark.history.fs.logDirectory=/home/bryan/git/spark/logs/history
Adding default property: spark.eventLog.dir=/home/bryan/git/spark/logs/history
Adding default property: drill.enable_unsafe_memory_access=false
Warning: Ignoring non-spark config property: drill.enable_unsafe_memory_access=false
Parsed arguments:
  master                  local[*]
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /home/bryan/git/spark/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               null
  primaryResource         file:/home/bryan/git/spark/examples/src/main/python/pi.py
  name                    pi.py
  childArgs               []
  jars                    null
  packages                edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /home/bryan/git/spark/conf/spark-defaults.conf:
  (spark.history.fs.logDirectory,/home/bryan/git/spark/logs/history)
  (spark.eventLog.dir,/home/bryan/git/spark/logs/history)

    
Ivy Default Cache set to: /home/bryan/.ivy2/cache
The jars for the packages stored in: /home/bryan/.ivy2/jars
:: loading settings :: url = jar:file:/home/bryan/git/spark/assembly/target/scala-2.11/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
edu.stanford.nlp#stanford-corenlp added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found edu.stanford.nlp#stanford-corenlp;3.4.1 in central
downloading https://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.4.1/stanford-corenlp-3.4.1-models.jar ...
	[SUCCESSFUL ] edu.stanford.nlp#stanford-corenlp;3.4.1!stanford-corenlp.jar (118730ms)
:: resolution report :: resolve 1164ms :: artifacts dl 118732ms
	:: modules in use:
	edu.stanford.nlp#stanford-corenlp;3.4.1 from central in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   1   |   1   |   0   |   0   ||   1   |   1   |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	0 artifacts copied, 1 already retrieved (0kB/5ms)
Main class:
org.apache.spark.deploy.PythonRunner
Arguments:
file:/home/bryan/git/spark/examples/src/main/python/pi.py
/home/bryan/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-models-3.4.1.jar
System properties:
(SPARK_SUBMIT,true)
(spark.submit.pyFiles,/home/bryan/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-models-3.4.1.jar)
(spark.history.fs.logDirectory,/home/bryan/git/spark/logs/history)
(spark.files,file:/home/bryan/git/spark/examples/src/main/python/pi.py,file:/home/bryan/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-models-3.4.1.jar)
(spark.app.name,pi.py)
(spark.jars,file:/home/bryan/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-models-3.4.1.jar)
(spark.submit.deployMode,client)
(spark.eventLog.dir,/home/bryan/git/spark/logs/history)
(spark.master,local[*])
Classpath elements:
/home/bryan/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-models-3.4.1.jar

@srowen (Member, Author) commented Apr 15, 2017

Thanks @BryanCutler! So, do you mean you have a working version of this change? Feel free to open it as a PR against master or against this PR, whichever is simplest. I'd be pleased to have a look and get it in.

@BryanCutler (Member) commented:

Sure, I can PR against your branch. I didn't do anything new, just used the previous version of your addDependenciesToIvy method, and it seemed to work for me. I also tried out what happens if you ask for two artifacts with the same Maven coordinates but different classifiers, and it just picks the last one. Not sure whether it was Spark or Ivy that ignored the other, though... for example:

--packages edu.stanford.nlp:stanford-corenlp:3.4.1,edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1

Both get downloaded, but only the models jar ends up on the classpath.

@srowen (Member, Author) commented Apr 18, 2017

OK @BryanCutler, whatever's easiest for you. Whatever you have working, I can test and get it merged.

@BryanCutler (Member) commented:

@srowen, I opened srowen#1 in case you didn't see it.

@SparkQA commented Apr 18, 2017

Test build #75914 has finished for PR 17416 at commit 23a3d0d.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member, Author) commented Apr 19, 2017

Thanks @BryanCutler! Yes, that's working better. I can run, for example, ./bin/spark-shell --packages edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1 and I can see that it's now on the classpath, because I can load its files with getResourceAsStream.

There's a smaller corner case that doesn't quite seem to work: adding both a classified and an unclassified version of the same artifact, as in ./bin/spark-shell --packages edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1,edu.stanford.nlp:stanford-corenlp:3.4.1. The classified version isn't loaded in this case.

Unless you see a clean fix for that too (I don't yet, still looking), I'm still inclined to merge this as it makes a significant set of artifacts now addressable, and loading via spark-shell is a bit best-effort anyway (this problem doesn't exist for a packaged app).

I need to add a test.

@SparkQA commented Apr 19, 2017

Test build #75936 has finished for PR 17416 at commit 4126702.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member, Author) commented Apr 19, 2017

Well, it's a little different now that I've modified the tests to cover the classifier case. When the main artifact has a classifier, it doesn't find the dependency that the test sets up, even though the dependency is definitely declared correctly in the POM that's created. I'm guessing adding the DependencyArtifactDescriptor somehow prevents it from attempting resolution. Still looking into it...

@BryanCutler (Member) commented:

> There's a smaller corner case that doesn't quite seem to work: adding both a classified and an unclassified version of the same artifact.

Yeah, I'm not sure exactly why that doesn't work. I saw that it downloads them both, so I think it should work, but I didn't have time to look at where it goes wrong. I agree that could be addressed later and this would be fine to merge, pending the test.

@srowen (Member, Author) commented Apr 20, 2017

Yeah, I think it's a little bigger than that -- the test suggests that it won't load dependencies of an artifact with a classifier, which is better than nothing maybe, but still broken enough that I'm worried it is confusing to commit this.

I may noodle on this a bit more, but I couldn't make a breakthrough yesterday. Just checking whether you had any bright ideas from your investigation; if not, I may shelve this one for now.

@BryanCutler (Member) commented:

Oh, I see. I would have expected it to resolve those dependencies too, but it's hard to tell from the Ivy code what it will do. Was this part of a unit test you added here, or just some manual testing you were doing?

@srowen (Member, Author) commented Apr 24, 2017

Yeah @BryanCutler, basically I was just modifying the existing test to include an artifact with a classifier. If you add one to the first line of this test in SparkSubmitUtilsSuite, I think you'd see it:

  test("search for artifact at local repositories") {
    val main = new MavenCoordinate("my.great.lib", "mylib", "0.1")
    val dep = "my.great.dep:mydep:0.5"

It won't find the dependency then.
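
Concretely, "adding one" might look like the line below; a sketch assuming the five-field MavenCoordinate shape from this PR's diff, with a hypothetical classifier value:

// The same test's first line, with a packaging and a classifier on the
// main artifact; "tests" is just an illustrative classifier.
val main = new MavenCoordinate("my.great.lib", "mylib", Some("jar"), Some("tests"), "0.1")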

@srowen (Member, Author) commented Apr 27, 2017

Closing this for now -- happy to revive it if anyone has other ideas or wants to run with it.
