JavaModule.assembly produces invalid or corrupt jar file (was: publishLocal as altenative to jars : which are corrupt often and at random) #528

siddhartha-gadgil · 2019-01-18T04:08:37Z

I am getting a lot of corrupt jars, even though I have cleared/checked everything in sight several times. This happens to fat jars in sbt too, but there we have the alternative (for use in Jupyter, for example) of publishLocal. Can mill provide such an option?

lihaoyi · 2019-01-18T04:52:49Z

Mill already supports .publishLocal

lefou · 2019-01-19T18:56:44Z

@siddhartha-gadgil So, what exactly is your issue? What is corrupt? And how did this happen?

siddhartha-gadgil · 2019-01-20T02:12:00Z

Firstly, I have set up _publishLocal_ and this is working fine for the case I needed now (Jupyter Notebooks with Almond). On my work desktop, but not my home laptop, the output of assembly was corrupted for some modules. This meant that executable binaries would crash with "mainClass ... not found", and if I loaded the jar in ammonite using `import $cp.myjar`, subsequent commands would not find the contents. I found online that I should try `jar tf`, and this indeed confirmed corruption. Presumably some source/doc jar of a dependency is corrupted, causing the upstream corruption. In fact it would be best for me to have a thinFat jar for running, excluding the sources and docs (especially of dependencies). Is that already an option?

lefou · 2019-01-20T09:49:49Z

To be honest, I have no clue what you want to tell us. It would be helpful to be more concrete, e.g. the exact mill command line you used when mill crashed, the exact error message and, if possible, the build.sc...

Looks like you had better ask on https://gitter.im/lihaoyi/mill.

siddhartha-gadgil · 2019-01-20T11:11:16Z

I did not mean that mill crashes, but that the executable jar generated by `mill myproject.assembly` crashes, in a random and hard to diagnose (at least for me) way. The reason is almost certainly some cached corrupt jar, and probably a source/documentation jar.

My main request was for publishLocal as an alternative, which @lihaoyi pointed out is already part of mill.
A feature that would be nice is a minimal fat-jar, i.e. one excluding source and documentation dependencies, reducing the chance of corrupt jars fouling this up.
Even better would be improved diagnostics while building jars, though this may not be a mill issue.

lihaoyi · 2019-01-20T11:39:50Z

@siddhartha-gadgil do you happen to be using Mill concurrently? Mill doesn't have proper concurrency control, so while concurrent use usually works, things blow up in odd ways if tasks happen to overlap in what they are doing.

If you want diagnostics while building jars, feel free to build the jars yourself by copy-pasting the Mill code into your build.sc and adding whatever diagnostics you'd like.

siddhartha-gadgil · 2019-01-20T11:44:34Z

Quite possibly that is happening. But the main problem is probably a giant dependency on the Stanford Parser (coupled with not great internet) - random bits get corrupted in the cache, and any piece corrupted seems to make the jar unusable.

I'll try manually building by copy-pasting, with checks thrown in for corrupt jars.

lihaoyi · 2019-01-20T11:50:21Z

@siddhartha-gadgil for internet-related issues with upstream dependencies, those issues should be persistent until you clean your coursier cache in ~/.coursier. If you are seeing things behave nondeterministically without clearing that cache, it is unlikely to be internet related.

Also, if you think things are being corrupted while building jars, use `mill inspect` and `mill show` to trawl the dependency graph of the task that created the corrupted jar and look at the input jars/folders to see if they contain what you expect. This might help you narrow the corrupted jar down to the actual culprit doing the corruption.

siddhartha-gadgil · 2019-01-20T12:40:24Z

It is not non-deterministic. I have a natural experiment because of two work systems: the jar builds correctly on one and not on the other. The behaviour also changes when some dependencies change. I have tried trawling and resetting, but doing this manually is not easy, at least on a large scale.

For example, today I used publishLocal, got a crash in ammonite, deleted the corresponding coursier file and re-published to fix the error. But without the pointer from running (or some other script-based way) it would not be practical to find which dependency is corrupt.

siddhartha-gadgil · 2019-01-20T13:45:13Z

This may actually be a mill issue, but I will get more data and report. I generated a list in mill of the _upstreamAssemblyClasspath_, and checked that "jar tf _" loaded successfully in each of them. However the same command on the output gives "java.util.zip.ZipException: invalid END header (bad central directory size)"
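The per-jar check described above can also be scripted rather than run by hand. The following is only a sketch: it opens each jar with the JDK's `java.util.zip.ZipFile` instead of shelling out to `jar tf`, and the names `JarCheck`, `entryCount` and `unreadable` are made up for this illustration. The list of classpath jars is assumed to be collected elsewhere.

```scala
import java.nio.file.Path
import java.util.zip.ZipFile
import scala.util.{Failure, Try}

// Hypothetical helper, not part of Mill.
object JarCheck {
  /** Opens `jar` with the JDK zip reader and returns its entry count, or the failure. */
  def entryCount(jar: Path): Try[Int] = Try {
    val zip = new ZipFile(jar.toFile)
    try zip.size() finally zip.close()
  }

  /** Returns every jar that could not be opened, paired with the error text. */
  def unreadable(jars: Seq[Path]): Seq[(Path, String)] =
    jars
      .map(jar => jar -> entryCount(jar))
      .collect { case (jar, Failure(e)) => jar -> e.toString }
}
```

If `unreadable` comes back empty for all inputs but the assembled output still fails to open, the inputs are probably not the culprit, and the entry count of the output is worth looking at.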

siddhartha-gadgil · 2019-01-20T13:58:20Z

Sorry for the probably wrong report. From searching, it looks like the sheer size may be the issue (over 65k files) because of the large dependency. I confirmed that unzip worked fine.

lefou · 2021-01-28T10:44:30Z

FYI, it looks like the assembly target has an issue with very large assemblies and a non-empty `prependShellScript`. Here is a workaround:

override def prependShellScript: T[String] = ""
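For context, this is roughly where the override would sit in a build file. It is a minimal sketch, assuming a plain `JavaModule`; the module name `app` is only an example.

```scala
// build.sc (sketch; the module name `app` is an assumption)
import mill._, scalalib._

object app extends JavaModule {
  // An empty script means nothing is prepended to the assembly,
  // so out.jar is a plain jar rather than a self-executing file.
  override def prependShellScript: T[String] = ""
}
```

With no launcher script in front, the jar can no longer be started as `./out.jar`; it has to be run with `java -jar out/app/assembly.dest/out.jar` instead.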

Ammonite-Bot · 2021-01-28T10:48:59Z

@lefou yes I have seen that misbehaviour before, when I was building some 500mb assemblies. I worked around it by disabling the prepend shell script

lefou · 2021-01-28T10:52:12Z

So, I think we should create a test case that reproduces this issue and then fix it. After a first glance at the code, I think a proper fix may need to go into os-lib.

lefou · 2021-05-21T06:38:32Z

I recently fixed some issues with file handles left open in assembly processing (which constantly failed Windows tests, #1327). There is a small chance this issue just vanishes after that fix, too. It would be nice if someone could report whether this issue is still present with mill >= 0.9.7-9-848292.

lefou · 2021-10-01T06:13:20Z

This is probably fixed?

lefou · 2021-10-01T06:24:27Z

Please reopen or comment if you find this issue is still valid!

lefou · 2023-07-05T14:36:07Z

This issue is still present. See #2650 for a reproduction.

Reproduction of issues * #528 * #2650

…ng (#3140)

This is an attempt to fix the issue of invalid assembly files, which occurs when the following two conditions are met:

* The JAR file has more than `65535` ZIP entries
* The JAR has a prepended shell script

The issue was reported and analyzed in the following tickets:

* #528
* #2650

This issue also hits other build tools, but it seems Mill is the only tool that enables shell script prepending by default. Since there is no real fix available, we simply try to detect the issue after the fact and fail the assembly task with an actionable error message.

To make the fix binary compatible, I had to deprecate the `upstreamAssembly` target in favor of the new `upstreamAssembly2` target, which also returns the added ZIP entry count. Since this can be a behavioral change when users have overridden the `upstreamAssembly` target, I also added some warning messages which will detect this at runtime and provide actionable help.

Pull request: #3140
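The two conditions named above can be checked by hand on an existing build. The sketch below is not Mill's implementation (the PR text says Mill tracks the entry count while assembling); it only illustrates the check with JDK classes. `AssemblyLimits` and its members are hypothetical names, and `entryCount` is assumed to be called on a jar that has nothing prepended.

```scala
import java.nio.file.{Files, Path}
import java.util.zip.ZipFile

// Hypothetical helper, not part of Mill.
object AssemblyLimits {
  /** Entry-count threshold named in the PR description. */
  val MaxEntries = 65535

  /** Entry count of a plain jar, i.e. one without a prepended script. */
  def entryCount(plainJar: Path): Int = {
    val zip = new ZipFile(plainJar.toFile)
    try zip.size() finally zip.close()
  }

  /** True if the file does not start with the "PK" ZIP signature,
    * for example because a launcher script was put in front of it. */
  def hasPrependedData(file: Path): Boolean = {
    val in = Files.newInputStream(file)
    try {
      val first = in.read()
      val second = in.read()
      !(first == 'P'.toInt && second == 'K'.toInt)
    } finally in.close()
  }

  /** Both conditions from the PR text: too many entries and a non-empty script. */
  def atRisk(entries: Int, prependScript: String): Boolean =
    entries > MaxEntries && prependScript.nonEmpty
}
```

For example, `AssemblyLimits.atRisk(70000, "#!/bin/sh")` is `true`, while the same count with an empty script is `false`, which matches the workaround of setting `prependShellScript` to `""`.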

abbadmus · 2024-05-31T03:04:18Z

why do i keep getting this error

./out/app/assembly.dest/out.jar
Error: Could not find or load main class MyApp
Caused by: java.lang.ClassNotFoundException: MyApp

lefou · 2024-05-31T07:14:12Z

> why do i keep getting this error

You probably have an assembly that is too large and hit an issue with the JVM/JDK. The latest Mill snapshots and the upcoming version 0.11.8 will detect this and recommend a fix.

In the meantime, just use the workaround from #528 (comment).

lihaoyi closed this as completed Jan 18, 2019

lefou added the invalid This issue is invalid or lacks required information label Apr 18, 2019

lefou reopened this Jan 28, 2021

lefou changed the title ~~publishLocal as altenative to jars : which are corrupt often and at random~~ JavaModule.assembly produces invalid or corrupt jar file (was: publishLocal as altenative to jars : which are corrupt often and at random) Jan 28, 2021

lefou added bug The issue represents an bug and removed invalid This issue is invalid or lacks required information labels Jan 29, 2021

lefou mentioned this issue Jan 29, 2021

Push all bytes when reading from a file channel com-lihaoyi/os-lib#50

Merged

lefou closed this as completed Oct 1, 2021

lefou mentioned this issue Oct 1, 2021

Document prependShellScript (was: Assembly JAR produces invalid zip listing) #580

Closed

lefou mentioned this issue Jul 4, 2023

assembly error #2650

Closed

lefou reopened this Jul 5, 2023

lefou added the workaround-available label Jul 5, 2023

lefou mentioned this issue Jul 5, 2023

Analyze assembly error - Reproduction of #528 and #2650 #2655

Closed

lefou added a commit that referenced this issue Jul 5, 2023

Analyze assembly error - Reproduction of #528 and #2650

1b12362

Reproduction of issues * #528 * #2650

lefou linked a pull request Jul 5, 2023 that will close this issue

Analyze assembly error - Reproduction of #528 and #2650 #2655

Closed

lefou added a commit that referenced this issue May 3, 2024

Analyze assembly error - Reproduction of #528 and #2650

e98c4da

Reproduction of issues * #528 * #2650

lefou mentioned this issue May 3, 2024

Detect assemblies with too many entries to fail shell script prepending #3140

Merged

lefou added this to the 0.11.8 milestone May 6, 2024

lefou linked a pull request May 6, 2024 that will close this issue

Detect assemblies with too many entries to fail shell script prepending #3140

Merged

lefou closed this as completed May 16, 2024
