-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Java API mismatch when building Gerrit and JGit from source on 0.16.1 #6027
Comments
This is #5888, which is fixed in 0.17rc1. We're adding better regression test coverage in this area. |
I'm not sure I'm following, the problem you are linked to explicitly passing
but we don't do that. And another specific problem here, that the wrong byte code generation only affects sources compiled form external repository (jgit) and not the main repository (gerrit). How that? |
No, the problem I linked to is that default toolchain in 0.16.1 compiles against the JDK 9 APIs. ByteBuffer#mark is a covariant override that was added in JDK 9, and does not exist on JDK 8.
Does the main repository include any code that calls I think the external repository thing is a red herring, I'd expect the same behaviour if you compiled that code directly in your main repo. |
Ah, thanks for clarifying, that makes sense! But then, this is a pretty much show stopper for many Bazel users, so that I'm surprised, that we don't see 0.16.2 release that fixed that major breakage ASAP. |
Can you explain Gerrit Code Review and JGit communities, why we are not seeing Patch Release for 0.16.1 for such severe breakage in Bazel latest release, leaving us without a fix for weeks/months? (Of course, we can build Bazel from source or use 0.17.rc1, but I think the appropriate reaction here is to patch latest release, or alternatively, as adequate reaction immediately promote 0.17rc1 as stable release). |
Uh, this is bad. I also thought this was #5888, but it's a different issue: #5888 came about because we did not use any bootclasspath with a JDK 8 282b883 fixes this (turns out, @meisterT and @buchgr got hit by this on our continuous integration), but it was not cherry-picked into 0.16.1 because that release was pushed on Aug 13 and the bug was discovered on Aug 16. @davido , can you check if 0.17rc1 fixes this? If it doesn't, we have an even bigger problem. |
Yes, 0.17-rc1 fixes this. (Refer to the Gerrit mailing list discussion from here, where @davido reports the same - reproducible with 0.16.1, not with 0.17-rc1.)
I guess this only affects developers working on JGit-Gerrit, basically. And if older Bazel versions are not affected, this is not a big deal, I think? For me, being a bit of a newbie in Bazel (and Java, and Gerrit development), it was more like "I'm totally puzzled" and wasted a few hours until @davido saved me. 🤷♂️ 😄 |
That's good news! (I also classified mentally this as "not a big deal", since we haven't heard about this from anyone else other than @davido , thus my surprise about his classification of "severe" -- is this especially painful for JGit/Gerrit for some reason?) |
Even though, JGit is built with Maven, we support Bazel driven build toolchain for this project [1]. Today we know, that 0.16.1 and (earlier Bazel stable releases?) produces broken byte code that cannot be run with default Java 8. This is enough reason for me to conduct a patch release. Not to mention, that @gertvdijk and myself spent hours investigating this breakage reported here, without linking that to broken Bazel distribution. I find it disappointing, that 282b883 was fixed 14 days ago, and today, we still don't have a working latest Bazel release that fixed that severe problem (yes, producing broken byte code is severe problem to me). Moreover I have rejected the upgrade of our CI to Bazel 0.16.1 release: [2] after learning that this Bazel release is inherently broken.
We can handle this and wait a couple of days for stable 0.17 release. |
Expected date of release for 0.17 is September 6, but it will depend on release blocking bugs and cherrypicks that are needed. If you want to request a cherrypick, please ask it on #5059. |
Thanks, that would be fine. |
That's still #5888. If no If we didn't use any bootclasspath at all the compilation would fail with |
I hit the same problem when running JGit tests using bazel 0.16.1: $ git clone https://git.eclipse.org/jgit/jgit I installed bazel 0.17.0rc1, this fixes the problem |
Yikes - that's a much simpler case to run into this issue. Does that make a valid point to mark it "severe", instead "not a big deal"? @lberki |
bazel is a secondary build system for jgit which is mostly used by those also working on Gerrit. Hence I think we can live with 0.17.0rc1 for now and update to 0.17.0 when you release it next week. |
@cushon : indeed, this is a special case of #5888 . @gertvdijk: the issue is in fact severe in terms of how broken we are, but it does not meet the bar of "widespread breakage" for a patch release, especially if @laurentlb's estimate is correct about 0.17 being out next week. @laurentlb: WDYT? I'm somewhat worried about #6035 throwing a wrench in the works, but it we release 0.17 on September 6 and it works for JGit/Gerrit, let's not do a patch release. @davido @gertvdijk : how feasible is it to not update to 0.16.1 for the time being? |
I don't have a big voice in this (being just a random newbie-Bazel/Gerrit developer running into it), but I can also live with no new 0.16.x release for the time being. |
@lberki As I said in my previous comment we can handle that and wait for 0.17. What is worring me, is that there was no appropriate communication from Bazel-Java-Team side and that the severity of #5888 and the consequences it cause (apparently only Bazel code is affected?) were underestimated. Since minute 1. after you have confirmed, that 0.16.1 that was released in the wild, and under some circumstances (it doesn't matter how rare they are), will produce broken byte code for your users, I would expect that one of two things happen:
(We are all developers here and we are all making mistakes and we fix them later and that's all fine. It's about communication about severe breakages in stable releases that should be improved and avoid your users spending hours investigation in vane -- for known and fixed but not yet released problems.) |
I think the disconnect here is that your expectations about how breakages should be handled are not in line with our ideas about the same. Which is, of course, because there aren't clear guidelines (or at least I don't know about them...) and which I why I started this thread here: https://groups.google.com/d/msg/bazel-discuss/wYocSoGTJOg/aJnD4ZqUCAAJ It's not strictly about handling breakages, but about dealing with incompatible changes we need to make and these issues are related to each other; we currently do incompatible changes without much thought, which, of course, means that we are unable to treat breakages as automatic showstoppers because, well, they are expected to some degree. I don't think that's a tenable position forever, but it a balance needs to be struck between backwards compatibility and the ability to remove functionality we regret having. |
Yes, I'm aware of this thread and I agree that there should be an agreement on the process of how experimental flags and features are discontinued or promoted and renamed to stable flags and features. Still, I have a strong impression that we are talking past each other here. I'm talking about the outcome of
Do you agree that the second failure is much more painful, because it is much harder to track down, understand and fix? To me that problem is fundamentally different from discontinuation policy for experimental flags and features and as such requires adequate reaction when discovered in stable Bazel releases (as mentioned in my previous comment). |
For what it's worth, this affected ~everyone in Gerrit Code Review team at Google. We didn't report it because we didn't realize it was a bazel bug (actually we didn't understand that until today). |
This is a duplicate of #5888. Please feel free to close it. |
Check the bazel version using the checker from bazel_skylib, and require at least version 0.17.1 which is the minimum version that does not suffer from the Java API mismatch issue [1]. The implementation is borrowed from the Gerrit project. [1] bazelbuild/bazel#6027 Change-Id: I224250088a1f5072fcaa3ec81228f4d2cb8cb389 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Check the bazel version using the checker from bazel_skylib, and require at least version 0.17.1 which is the minimum version that does not suffer from the Java API mismatch issue [1]. The implementation is borrowed from the Gerrit project. [1] bazelbuild/bazel#6027 Change-Id: I224250088a1f5072fcaa3ec81228f4d2cb8cb389 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
To simplify developing process with Gerrit's main dependency JGit, we support building gerrit with building JGit from source instead of consuming the binary artifacts from Central or from Gerrit's Google storage bucket.
When building Gerrit in that way on Bazel 0.16.1 generated byte code for JGit is screwed up and the final artifact cannot be run on Java 8 any more. Note, that this error doesn't show up any more in 0.17rc1, but I would like to make sure, we understand the problem (and the fix). Besides that, 0.16.1 is the latest Bazel stable release so that I think it worth reporting.
Also note, that if we consume JGit from Central and not build it from source, from an external repository on Bazel 0.16.1, then all is fine, and final artifact can be run on Java 8. This is the stack trace we are seeing:
To reproduce:
And run:
Now try to set-up a gerrit site by running:
and accept all suggested defaults would fail with the stack trace mentioned above.
FWIW: OP was reported on repo-discuss mailing list: [1].
My environment:
[1] https://groups.google.com/d/topic/repo-discuss/d_qGIW0PnpU/discussion
//CC @cushon
The text was updated successfully, but these errors were encountered: