-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bringup Java 11 b04 - implement j.l.String.valueOfCodePoint() #1466
Conversation
Change looks reasonable to me. @fjeremic Any comments on the string compression part? |
Rather than manual verification, please add the automated test. Did you manually test with string compression enabled? |
Manually verified with both command line options Will work on automated tests tomorrow. |
if (enableCompression) { | ||
string = new String(compressedAsciiTable[codePoint], 0, 1, true); | ||
} else { | ||
string = new String(decompressedAsciiTable[codePoint], 0, 1, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use:
Constructor here instead? The one currently being used will have to perform many of the checks again. Similarly for all other assignments to string
in this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String(byteArray, coder)
works as well, will update the PR accordingly.
String string; | ||
if ((codePoint < Character.MIN_CODE_POINT) || (codePoint > Character.MAX_CODE_POINT)) { | ||
/*[MSG "K0800", "Invalid Unicode code point"]*/ | ||
throw new IllegalArgumentException(com.ibm.oti.util.Msg.getString("K0800")); //$NON-NLS-1$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to put case as the last else
case as it occurs least often. The JIT will have to do less work when block shuffling and if it every got it wrong we wouldn't pay the performance penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the next check for <= 255
need to expand then? It would also need to validate codePoint > MIN_CODE_POINT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will following checking help JIT performance?
if ((codePoint < Character.MIN_CODE_POINT) {
// error handling
} else if (codePoint <= 255) {
} else if (codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
} else if (codePoint <= Character.MAX_CODE_POINT) {
} else {
// error handling
}
This saves the less often check (codePoint > Character.MAX_CODE_POINT)
to the last.
ceaaaf3
to
7e7b607
Compare
@fjeremic the PR is ready for another review with following updates:
@llxia please arrange a test team review. Additional note: three project files failed Copyright check however this kind of files don't have copyright header either such as |
// Save less often check (codePoint > Character.MAX_CODE_POINT) last | ||
if (codePoint < Character.MIN_CODE_POINT) { | ||
/*[MSG "K0800", "Invalid Unicode code point"]*/ | ||
throw new IllegalArgumentException(com.ibm.oti.util.Msg.getString("K0800")); //$NON-NLS-1$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include the string or the the code point in the exception? It will make it much easier to debug one of these issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, now an exception looks like:
Standalone tests:
Caused by: java.lang.IllegalArgumentException: Invalid Unicode code point - -1
at java.lang.String.valueOfCodePoint(java.base@11-internal/String.java:4589)
Caused by: java.lang.IllegalArgumentException: Invalid Unicode code point - 1114112
at java.lang.String.valueOfCodePoint(java.base@11-internal/String.java:4589)
Automated tests:
FAILED: testValueOfCodePoint
java.lang.AssertionError: Unicode code point mismatch at 0 expected [1] but found [0]
at org.testng.Assert.fail(Assert.java:96)
at org.testng.Assert.failNotEquals(Assert.java:776)
at org.testng.Assert.assertEqualsImpl(Assert.java:137)
at org.testng.Assert.assertEquals(Assert.java:118)
at org.testng.Assert.assertEquals(Assert.java:652)
at org.openj9.test.java_lang.Test_String.testValueOfCodePoint(Test_String.java:49)
try { | ||
String str = (String) method.invoke(null, i); | ||
Assert.assertEquals(i, str.codePointAt(0), "Unicode code point mismatch at " + i); | ||
} catch (IllegalAccessException | IllegalArgumentException | InvocationTargetException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test cases shouldn't catch unexpected exceptions. Let the test code throw them and the test harness will (usually) print the relevant stacktrace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, removed catching of unexpected exceptions.
@renfeiw , could you review the test change? Thanks. |
static String valueOfCodePoint(int codePoint) { | ||
String string; | ||
// Save less often check (codePoint > Character.MAX_CODE_POINT) last | ||
if (codePoint < Character.MIN_CODE_POINT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vouch to say this is also an uncommon path and should be moved with the else
path. Do we agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check that before checking } else if (codePoint <= 255) {
or we need to move it into the check for <= 255
. Either way, its got to be first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but merging the two checks has the benefit of smaller footprint (since the first and the last blocks throw the same exception they can be merged) and less code to jump over to get to the 0 <= codePoint <= 255
case. i.e.
Current:
if (x < 0) {
X;
} else if (x < 55) {
Y;
} else if (x < 888) {
Z;
} else {
X;
}
can be simplified to:
if (x >= 0 && x < 55) {
Y;
} else if (x >= 0 && x < 888) {
Z;
} else {
X;
}
Which both reduces footprint and makes the lest often taken case the last (X;
). Though we do technically introduce another check. The JIT should do the right thing here though in whichever way it is written. I suppose if the JIT does not do this then we have a JIT bug we need to address. I'm fine with the code being either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly dislike the duplication of the second form. If we can't optimize the first form reasonably then we're going to have problems with most user code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you are suggesting code like following:
if (codePoint >= Character.MIN_CODE_POINT && codePoint <= 255) {
...
} else if (codePoint >= Character.MIN_CODE_POINT && codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
...
} else if (codePoint >= Character.MIN_CODE_POINT && codePoint <= Character.MAX_CODE_POINT) {
...
} else {
error handling
}
This introduces extra code and confusion (later reader need lots of explanation about the rational of this code logic). I don't think any performance gain here worth the potential issue later.
Now I feel this code should go back to original structure, i.e.,
if ((codePoint < Character.MIN_CODE_POINT) || (codePoint > Character.MAX_CODE_POINT)) {
error handling
} else if (codePoint <= 255) {
...
} else if (codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
...
} else {
...
}
It is clear and easy to understand. Can we modify it to improve performance later if this becomes a hot spot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I hadn't thought about the extra checks introduced in my original comment. I'm fine either keeping as is or reverting back to your latest suggestion @JasonFengJ9. Apologies for the churn!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, updated PR as latest suggestion above.
test/Java11andUp/build.xml
Outdated
|
||
<target name="build" > | ||
<if> | ||
<equals arg1="${JAVA_VERSION}" arg2="SE110"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to change this conditional check to "not SE80 && not SE90 && not SE100". This makes the build.xml to work for SE120 and up without any change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, changed the condition to not ^SE(8|9|10)0$$
.
To enable SE110, we need to make an update in testKitGen. (#1495) |
The PR has been updated to address review comments. |
*/ | ||
static String valueOfCodePoint(int codePoint) { | ||
String string; | ||
// Save less often check (codePoint > Character.MAX_CODE_POINT) last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment is no longer accurate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment in question has been removed.
// Character.MIN_CODE_POINT - 1 & Character.MAX_CODE_POINT + 1 | ||
try { | ||
method.invoke(null, Character.MIN_CODE_POINT - 1); | ||
method.invoke(null, Character.MAX_CODE_POINT + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second method.invoke will never run as the first show throw. We need this to be two tests or to have it written like:
try {
method.invoke(null, Character.MIN_CODE_POINT - 1);
fail()
} catch (IllegalArgumentException | InvocationTargetException e) {}
try {
method.invoke(null, Character.MIN_CODE_POINT - 2);
fail()
} catch (IllegalArgumentException | InvocationTargetException e) {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test with suggested format above.
Please have another look.
Implemented j.l.String.valueOfCodePoint(); Added tests to verify that String.valueOfCodePoint(codePoint).codePointAt(0) equals codePoint for all valid Unicode code point. Signed-off-by: Jason Feng <fengj@ca.ibm.com>
} | ||
// Verify that IllegalArgumentException is thrown for Character.MIN_CODE_POINT - 1. | ||
try { | ||
method.invoke(null, Character.MIN_CODE_POINT - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs an explicit Assert.fail()
call or the method could succeed and pass the test when we know it should throw the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, added Assert.fail(message)
.
} | ||
// Verify that IllegalArgumentException is thrown for Character.MAX_CODE_POINT + 1. | ||
try { | ||
method.invoke(null, Character.MAX_CODE_POINT + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs an explicit Assert.fail()
call or the method could succeed and pass the test when we know it should throw the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed as well. Please have another look, thanks.
Jenkins test sanity plinux |
Running one builds worth of sanity to ensure that this is correctly ignored by older Java levels. |
Bringup
Java 11 b04
- implementj.l.String.valueOfCodePoint()
Implemented
j.l.String.valueOfCodePoint()
, manually verified that the output ofString.valueOfCodePoint(codePoint).codePointAt(0)
equalscodePoint
for all Unicode code point.closes: #1439
Reviewer @DanHeidinga @fjeremic
Signed-off-by: Jason Feng fengj@ca.ibm.com