src: operator[] checks bounds in debug mode #16002
Conversation
@@ -44,7 +44,9 @@ class Vector {
   // Access individual vector elements - checks bounds in debug mode.
   T& operator[](size_t index) const {
 #ifdef DEBUG
     CHECK(index < length_);
I'd keep the check - if it's worth doing in debug mode, it's almost always worth doing in release mode too.
I probably should have updated the comment in 1b7372f. Can you do that now?
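For readers following along, here is a minimal, self-contained sketch of the debug-only bounds check under discussion. It is not Node's actual code: `SketchVector` and `SketchCheck` are hypothetical stand-ins for `Vector` and the `CHECK` macro, and the sketch assumes the check is gated on a `DEBUG` define as in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a CHECK-style assertion: abort on failure.
inline void SketchCheck(bool condition) {
  if (!condition) std::abort();
}

// Hypothetical, simplified version of the Vector class from the diff.
template <typename T>
class SketchVector {
 public:
  SketchVector(T* data, std::size_t length, bool is_forward)
      : start_(data), length_(length), is_forward_(is_forward) {}

  std::size_t length() const { return length_; }

  // Bounds are verified only when DEBUG is defined; otherwise the
  // check is compiled out entirely and the access is unchecked.
  T& operator[](std::size_t index) const {
#ifdef DEBUG
    SketchCheck(index < length_);
#endif
    return start_[is_forward_ ? index : (length_ - index - 1)];
  }

 private:
  T* start_;
  std::size_t length_;
  bool is_forward_;
};

// Usage example: in-range accesses behave the same in both build modes.
inline void SketchVectorUsage() {
  int data[] = {10, 20, 30};
  SketchVector<int> forward(data, 3, true);
  SketchVector<int> reverse(data, 3, false);
  assert(forward[0] == 10);
  assert(reverse[0] == 30);  // v[0] is the end of the memory range
}
```

The trade-off debated in this thread is visible here: without `DEBUG`, an out-of-range index is never caught, while an unconditional check costs a comparison on every access.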
Benchmarks:
$ ./node benchmark/compare.js --new ./node --old ./node-1b358f1fde0e --runs 10 --filter buffer-indexof.js --set iter=0.1 --set encoding=utf8 buffers | Rscript benchmark/compare.R
[00:03:58|% 100| 1/1 files | 20/20 runs | 30/30 configs]: Done
improvement confidence p.value
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="@" -3.05 % 4.074023e-01
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="10x" 1.36 % 7.158332e-01
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="aaaaaaaaaaaaaaaaa" 13.98 % *** 6.245506e-14
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="Alice" -1.62 % 6.286298e-01
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="among mad people" 15.27 % *** 4.234457e-15
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="found it very" 13.04 % *** 7.519944e-15
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="Gryphon" 0.55 % 4.549318e-01
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="</i> to the Caterpillar" 12.76 % *** 5.213227e-11
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="--l" 6.29 % *** 9.629246e-08
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="neighbouring pool" 12.84 % *** 1.342501e-13
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="Ou est ma chatte?" 10.85 % *** 1.118890e-12
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="Panther" 3.29 % ** 1.133720e-03
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="Soo--oop" 13.48 % *** 7.043781e-17
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="SQ" 1.16 % 3.090357e-01
buffers/buffer-indexof.js iter=0.1 type="buffer" encoding="utf8" search="venture to go near the house till she had brought herself down to" 12.52 % *** 6.585226e-13
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="@" 0.36 % 9.254180e-01
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="10x" 0.90 % 7.551025e-01
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="aaaaaaaaaaaaaaaaa" 13.83 % *** 4.085834e-14
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="Alice" 0.71 % 8.635437e-01
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="among mad people" 15.92 % *** 1.473361e-15
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="found it very" 13.29 % *** 2.628669e-14
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="Gryphon" 1.35 % 8.332586e-02
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="</i> to the Caterpillar" 13.32 % *** 1.593228e-14
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="--l" 5.36 % *** 3.708626e-06
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="neighbouring pool" 12.99 % *** 4.543519e-15
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="Ou est ma chatte?" 9.79 % *** 5.845900e-12
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="Panther" 2.93 % ** 4.257264e-03
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="Soo--oop" 13.75 % *** 4.978218e-15
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="SQ" 2.38 % * 3.816994e-02
buffers/buffer-indexof.js iter=0.1 type="string" encoding="utf8" search="venture to go near the house till she had brought herself down to" 11.67 % *** 9.017317e-12 |
CI: https://ci.nodejs.org/job/node-test-commit/13134/ This should be ready. |
@bnoordhuis ... is your objection here a hard objection to this landing? #16002 (comment) |
The CHECK makes sense to me even in release mode, irrespective of what the performance numbers are. This is the minimum check required to make sure buggy accesses do not corrupt other parts of memory ( a wrong large index can cause negative numbers to |
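To make the memory-safety concern concrete, the sketch below isolates the reverse-direction offset `length_ - index - 1` used by `operator[]`. The helper name `ReverseOffset` is hypothetical. Because `size_t` is unsigned, an out-of-range index does not yield a negative value; the subtraction wraps around to a very large offset instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the reverse-direction offset
// computed inside operator[] when is_forward_ is false.
inline std::size_t ReverseOffset(std::size_t length, std::size_t index) {
  // Unsigned arithmetic wraps modulo 2^N instead of going negative.
  return length - index - 1;
}

inline void ReverseOffsetDemo() {
  // In-range indices map onto [0, length).
  assert(ReverseOffset(4, 0) == 3);
  assert(ReverseOffset(4, 3) == 0);
  // Out-of-range index: 4 - 10 - 1 wraps to SIZE_MAX - 6.
  assert(ReverseOffset(4, 10) == SIZE_MAX - 6);
}
```

Indexing `start_` with such an offset reads or writes far outside the buffer, which is the failure a bounds check is meant to stop.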
@jasnell Not a hard objection but like @gireeshpunathil says, removing it from release builds is arguably a reduction in robustness. I had a working hypothesis that the CHECK inhibits inlining. I haven't had time to investigate that fully but the patch below brings performance about halfway between what we have now and with the check removed:
diff --git a/src/string_search.cc b/src/string_search.cc
index 326fba7c4a..f939bdc0e9 100644
--- a/src/string_search.cc
+++ b/src/string_search.cc
@@ -7,5 +7,9 @@ int StringSearchBase::kBadCharShiftTable[kUC16AlphabetSize];
int StringSearchBase::kGoodSuffixShiftTable[kBMMaxShift + 1];
int StringSearchBase::kSuffixTable[kBMMaxShift + 1];
+void CheckIndex(size_t index, size_t length) {
+ CHECK(index < length);
+}
+
} // namespace stringsearch
} // namespace node
diff --git a/src/string_search.h b/src/string_search.h
index 73e90f5873..9ff8e34f75 100644
--- a/src/string_search.h
+++ b/src/string_search.h
@@ -13,6 +13,7 @@
namespace node {
namespace stringsearch {
+void CheckIndex(size_t index, size_t length);
// Returns the maximum of the two parameters.
template <typename T>
@@ -42,9 +43,9 @@ class Vector {
// In the latter case, v[0] corresponds to the *end* of the memory range.
size_t forward() const { return is_forward_; }
- // Access individual vector elements - checks bounds in debug mode.
+ // Access individual vector elements.
T& operator[](size_t index) const {
- CHECK(index < length_);
+ CheckIndex(index, length_); // Out-of-line to encourage method inlining.
return start_[is_forward_ ? index : (length_ - index - 1)];
}
|
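The out-of-line idea in the patch above can be sketched in a single file as follows. This is an illustration only: `CheckIndexSketch` and `OutOfLineCheckedVector` are hypothetical names, `std::abort()` stands in for `CHECK`, and in a real project the function definition would live in a separate `.cc` file so the compiler cannot inline it into the accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Declaration only, as it would appear in a header. Keeping the body
// elsewhere leaves the accessor small: a call plus the element load.
void CheckIndexSketch(std::size_t index, std::size_t length);

// Hypothetical forward-only vector view with an always-on check.
template <typename T>
class OutOfLineCheckedVector {
 public:
  OutOfLineCheckedVector(T* data, std::size_t length)
      : start_(data), length_(length) {}

  T& operator[](std::size_t index) const {
    CheckIndexSketch(index, length_);  // failure path lives out of line
    return start_[index];
  }

 private:
  T* start_;
  std::size_t length_;
};

// Definition. In a real build this would sit in its own translation
// unit; in this single-file sketch the compiler may still inline it.
void CheckIndexSketch(std::size_t index, std::size_t length) {
  if (index >= length) std::abort();
}

// Usage example: valid accesses pass through the check unchanged.
inline void OutOfLineCheckedVectorUsage() {
  int data[] = {1, 2, 3};
  OutOfLineCheckedVector<int> v(data, 3);
  assert(v[0] == 1);
  assert(v[2] == 3);
}
```

Whether this actually helps inlining depends on the compiler and build flags, so any gain would need to be confirmed with the benchmark shown earlier in the thread.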
@bnoordhuis - what if we further customize this check (fully inline and reduce to its simplest form) - such as:
? |
I managed to get performance within 2% of this PR with the patch below:
diff --git a/src/string_search.h b/src/string_search.h
index 73e90f5873..65be49f1a4 100644
--- a/src/string_search.h
+++ b/src/string_search.h
@@ -42,9 +42,9 @@ class Vector {
// In the latter case, v[0] corresponds to the *end* of the memory range.
size_t forward() const { return is_forward_; }
- // Access individual vector elements - checks bounds in debug mode.
+ // Access individual vector elements.
T& operator[](size_t index) const {
- CHECK(index < length_);
+ if (index >= length_) node::Abort();
return start_[is_forward_ ? index : (length_ - index - 1)];
}
|
@bnoordhuis - thanks, Preceded with a one liner on the lines of: |
I'm fine with @bnoordhuis's suggestion. |
@GitHubTracey would you be so kind and update the PR according to @bnoordhuis suggestion? |
Hi Ruben,
I haven't been near my laptop since the event (I get very little computer
time when I'm home - though I code during the day). I can certainly take a
look when I get back tonight.
Tracey
I mean, I would still prefer the patch in its current form. Performing the bounds check for every single access seems excessive – also, in other places we do this the same way, e.g. in #15077. I’d really prefer to work towards making Node’s tests pass in debug mode (and I have a few V8 CLs and Node PRs merged/open), and then having a CI machine that exercises all the debug mode checks… |
@addaleax - so what is your view on protecting the object from memory corruption? |
Hey All, |
@gireeshpunathil Phrasing the question like that certainly is a bit suggestive :) Basically, my stance is:
So, this is why I would feel comfortable with landing this as it is, and why I would prefer that over the other options. |
Anna, I did not mean any provocation, apologies if the words sounded so. Your points on the vitality and adequate test coverage of the code section sound good to me, and I am sufficiently convinced that its removal is safe while improving performance. |
@gireeshpunathil I didn’t read anything as provocative here :) |
So can we have consensus on this PR please! As I flipped views, I take the onus to take this forward to graceful completion. |
Yes, I think it’s okay to land this in the next few days if there are no explicit objections |
Landed in 838eca2 🎉 @GitHubTracey Thanks for your second commit here! |
PR-URL: #16002 Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Timothy Gu <timothygu99@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net>
Thanks, Anna! And thanks again for sharing your story at Node.js
Interactive. It really had a big impact on me when you showed your first
commit into Node.js. A reminder that we all need to start somewhere - and
since you are a respected contributor/reviewer, it was encouraging to see
the steps you took to get to where you are.
Cheers!
Tracey
@vsemozhetbyt Hm – not sure, I don’t think anything can be done about that at this point? |
@addaleax I am not sure, but see #16360 (comment) @GitHubTracey Could you add this local email to your GitHub account? Let us see if something will be updated after this. |
@vsemozhetbyt Yeah, it’s my bad I didn’t ask for the name, but why would we want to require authors to add email addresses to github? |
If the addresses in the local git settings and the GitHub settings are not in sync, the commit is not associated with the GitHub user and the user is not promoted to Contributor after landing. |
I don't mind verifying... weird though cause I signed in the other day and
it had a check mark. Maybe I need to verify my comment as well. I'll update
that later today.
Yeah, right – if it’s about that, I think we should make clear that this only affects how Github displays things, there’s no technical need to provide any data to Github. |
Commit association is properly updated, but not the Contributor status, unfortunately. Let's hope this will be fixed with next contributions from @GitHubTracey! |
PR-URL: nodejs/node#16002 Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Timothy Gu <timothygu99@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net>
operator[] restrict check bounds to debug mode
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
src