-
-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect assemblies with too many entries to fail shell script prepending #3140
Conversation
This PR is based on #2655
val problematicEntryCount = 65535 | ||
if ( | ||
prependScript.isDefined && | ||
(upstream.addedEntries + created.addedEntries) > problematicEntryCount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's really worth keeping track of the number of entries incrementally? v.s. creating the assembly first, scanning it, and then failing if it has too many entries. If the performance of scanning the assembly is acceptable (milliseconds to tens of milliseconds?) then that would save us a bunch of book-keeping passing around addedEntries
values, and a bunch of churn in replacing upstreamAssembly
with upstreamAssembly2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably depends. When we scan the jar right after it's written, it should be in the file system cache of any reasonable OS and scanning should be fast. On the other side, just remembering the added count is quasi for free, esp. when we assume, that the upstream-assembly is the larger portion of the assembly and keeps stable.
I think I want to experiment what the result looks like and how it performs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issues with scanning afterwards:
- Scanning and counting entries is slow. Scanning the jar of test case
noExe.large
took almost 3 seconds (75516 entries), although we just wrote it - Java's
ZipInputStream
andJarInputStream
aren't able to find anyZipEntry
in a prependend jar making scanning after packaging non-trivial - Shelling out to some OS-installed zip tools doesn't seem right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This is an attempt to fix the issue of invalid assembly files which occurs when the following two conditions are met:
65535
ZIP entriesThe issue was reported and analyzed in the following tickets:
This issue also hits other build tools, but it seems Mill is the only tools which automatically enables shell script prepending by default.
Since there is no real fix available, we simply try to detect the issue after the fact and fail the assemble task with a actionable error message.
To make the fix binary compatible, I had to deprecated the
upstreamAssembly
target in favor to the newupstreamAssembly2
target, which returns also the added ZIP entry count. Since this can be a behavioral change when users have overridden theupstreamAssembly
target, I also added some warning messages with will detect this at runtime and provide actionable help.