-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for checkpoint...rollback API to clean up leaked scopes #8516
Conversation
94c5b8d
to
b4ed541
Compare
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 5 unstable metrics. Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.041 s) : 0, 1041040
Total [baseline] (8.662 s) : 0, 8661661
Agent [candidate] (1.039 s) : 0, 1039319
Total [candidate] (8.658 s) : 0, 8657712
section iast
Agent [baseline] (1.17 s) : 0, 1170370
Total [baseline] (9.235 s) : 0, 9234898
Agent [candidate] (1.17 s) : 0, 1169542
Total [candidate] (9.281 s) : 0, 9281467
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.171 s) : 0, 1171092
Total [baseline] (9.207 s) : 0, 9207491
Agent [candidate] (1.172 s) : 0, 1171712
Total [candidate] (9.238 s) : 0, 9237777
section iast_TELEMETRY_OFF
Agent [baseline] (1.166 s) : 0, 1165574
Total [baseline] (9.251 s) : 0, 9251346
Agent [candidate] (1.166 s) : 0, 1165947
Total [candidate] (9.231 s) : 0, 9230753
gantt
title insecure-bank - break down per module: candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (716.655 ms) : 0, 716655
BytebuddyAgent [candidate] (716.532 ms) : 0, 716532
GlobalTracer [baseline] (239.075 ms) : 0, 239075
GlobalTracer [candidate] (239.2 ms) : 0, 239200
AppSec [baseline] (55.202 ms) : 0, 55202
AppSec [candidate] (55.112 ms) : 0, 55112
Remote Config [baseline] (693.488 µs) : 0, 693
Remote Config [candidate] (691.795 µs) : 0, 692
Telemetry [baseline] (14.451 ms) : 0, 14451
Telemetry [candidate] (12.832 ms) : 0, 12832
section iast
BytebuddyAgent [baseline] (836.201 ms) : 0, 836201
BytebuddyAgent [candidate] (835.848 ms) : 0, 835848
GlobalTracer [baseline] (230.357 ms) : 0, 230357
GlobalTracer [candidate] (230.211 ms) : 0, 230211
IAST [baseline] (22.816 ms) : 0, 22816
IAST [candidate] (22.71 ms) : 0, 22710
AppSec [baseline] (56.776 ms) : 0, 56776
AppSec [candidate] (56.581 ms) : 0, 56581
Remote Config [baseline] (612.084 µs) : 0, 612
Remote Config [candidate] (610.69 µs) : 0, 611
Telemetry [baseline] (8.602 ms) : 0, 8602
Telemetry [candidate] (8.674 ms) : 0, 8674
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (836.803 ms) : 0, 836803
BytebuddyAgent [candidate] (836.63 ms) : 0, 836630
GlobalTracer [baseline] (230.612 ms) : 0, 230612
GlobalTracer [candidate] (231.002 ms) : 0, 231002
IAST [baseline] (22.722 ms) : 0, 22722
IAST [candidate] (22.866 ms) : 0, 22866
AppSec [baseline] (56.755 ms) : 0, 56755
AppSec [candidate] (56.91 ms) : 0, 56910
Remote Config [baseline] (603.2 µs) : 0, 603
Remote Config [candidate] (604.974 µs) : 0, 605
Telemetry [baseline] (8.639 ms) : 0, 8639
Telemetry [candidate] (8.763 ms) : 0, 8763
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (832.989 ms) : 0, 832989
BytebuddyAgent [candidate] (832.691 ms) : 0, 832691
GlobalTracer [baseline] (230.287 ms) : 0, 230287
GlobalTracer [candidate] (230.309 ms) : 0, 230309
IAST [baseline] (23.823 ms) : 0, 23823
IAST [candidate] (22.279 ms) : 0, 22279
AppSec [baseline] (54.454 ms) : 0, 54454
AppSec [candidate] (56.537 ms) : 0, 56537
Remote Config [baseline] (609.02 µs) : 0, 609
Remote Config [candidate] (608.244 µs) : 0, 608
Telemetry [baseline] (8.477 ms) : 0, 8477
Telemetry [candidate] (8.61 ms) : 0, 8610
Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.046 s) : 0, 1046055
Total [baseline] (10.575 s) : 0, 10574653
Agent [candidate] (1.047 s) : 0, 1047225
Total [candidate] (10.5 s) : 0, 10499579
section appsec
Agent [baseline] (1.187 s) : 0, 1186535
Total [baseline] (10.789 s) : 0, 10788602
Agent [candidate] (1.183 s) : 0, 1183073
Total [candidate] (10.75 s) : 0, 10750475
section iast
Agent [baseline] (1.171 s) : 0, 1171374
Total [baseline] (10.979 s) : 0, 10978931
Agent [candidate] (1.172 s) : 0, 1171718
Total [candidate] (10.976 s) : 0, 10975509
section profiling
Agent [baseline] (1.257 s) : 0, 1257359
Total [baseline] (10.883 s) : 0, 10883254
Agent [candidate] (1.259 s) : 0, 1259056
Total [candidate] (10.874 s) : 0, 10874318
gantt
title petclinic - break down per module: candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (721.375 ms) : 0, 721375
BytebuddyAgent [candidate] (721.636 ms) : 0, 721636
GlobalTracer [baseline] (241.186 ms) : 0, 241186
GlobalTracer [candidate] (241.065 ms) : 0, 241065
AppSec [baseline] (55.54 ms) : 0, 55540
AppSec [candidate] (55.647 ms) : 0, 55647
Remote Config [baseline] (684.339 µs) : 0, 684
Remote Config [candidate] (711.564 µs) : 0, 712
Telemetry [baseline] (12.166 ms) : 0, 12166
Telemetry [candidate] (13.113 ms) : 0, 13113
section appsec
BytebuddyAgent [baseline] (737.253 ms) : 0, 737253
BytebuddyAgent [candidate] (734.507 ms) : 0, 734507
GlobalTracer [baseline] (237.258 ms) : 0, 237258
GlobalTracer [candidate] (236.641 ms) : 0, 236641
IAST [baseline] (21.62 ms) : 0, 21620
IAST [candidate] (21.763 ms) : 0, 21763
AppSec [baseline] (177.185 ms) : 0, 177185
AppSec [candidate] (176.447 ms) : 0, 176447
Remote Config [baseline] (667.502 µs) : 0, 668
Remote Config [candidate] (657.483 µs) : 0, 657
Telemetry [baseline] (8.349 ms) : 0, 8349
Telemetry [candidate] (9.004 ms) : 0, 9004
section iast
BytebuddyAgent [baseline] (836.931 ms) : 0, 836931
BytebuddyAgent [candidate] (836.491 ms) : 0, 836491
GlobalTracer [baseline] (230.225 ms) : 0, 230225
GlobalTracer [candidate] (230.434 ms) : 0, 230434
IAST [baseline] (23.715 ms) : 0, 23715
IAST [candidate] (23.13 ms) : 0, 23130
AppSec [baseline] (56.178 ms) : 0, 56178
AppSec [candidate] (57.194 ms) : 0, 57194
Remote Config [baseline] (609.048 µs) : 0, 609
Remote Config [candidate] (616.606 µs) : 0, 617
Telemetry [baseline] (8.762 ms) : 0, 8762
Telemetry [candidate] (8.827 ms) : 0, 8827
section profiling
BytebuddyAgent [baseline] (708.216 ms) : 0, 708216
BytebuddyAgent [candidate] (708.745 ms) : 0, 708745
GlobalTracer [baseline] (348.409 ms) : 0, 348409
GlobalTracer [candidate] (349.91 ms) : 0, 349910
AppSec [baseline] (54.855 ms) : 0, 54855
AppSec [candidate] (54.266 ms) : 0, 54266
Remote Config [baseline] (664.505 µs) : 0, 665
Remote Config [candidate] (667.937 µs) : 0, 668
Telemetry [baseline] (8.83 ms) : 0, 8830
Telemetry [candidate] (8.972 ms) : 0, 8972
ProfilingAgent [baseline] (96.025 ms) : 0, 96025
ProfilingAgent [candidate] (96.103 ms) : 0, 96103
Profiling [baseline] (96.051 ms) : 0, 96051
Profiling [candidate] (96.127 ms) : 0, 96127
LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 16 unstable metrics. Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section baseline
no_agent (379.36 µs) : 360, 399
. : milestone, 379,
iast (513.984 µs) : 492, 536
. : milestone, 514,
iast_FULL (731.78 µs) : 710, 754
. : milestone, 732,
iast_GLOBAL (562.426 µs) : 540, 585
. : milestone, 562,
iast_HARDCODED_SECRET_DISABLED (511.841 µs) : 490, 534
. : milestone, 512,
iast_INACTIVE (470.328 µs) : 448, 492
. : milestone, 470,
iast_TELEMETRY_OFF (499.969 µs) : 478, 522
. : milestone, 500,
tracing (457.79 µs) : 436, 479
. : milestone, 458,
section candidate
no_agent (379.508 µs) : 360, 399
. : milestone, 380,
iast (514.545 µs) : 493, 536
. : milestone, 515,
iast_FULL (729.729 µs) : 708, 752
. : milestone, 730,
iast_GLOBAL (553.823 µs) : 532, 575
. : milestone, 554,
iast_HARDCODED_SECRET_DISABLED (516.11 µs) : 493, 539
. : milestone, 516,
iast_INACTIVE (460.838 µs) : 440, 482
. : milestone, 461,
iast_TELEMETRY_OFF (504.217 µs) : 481, 527
. : milestone, 504,
tracing (458.532 µs) : 438, 479
. : milestone, 459,
Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section baseline
no_agent (1.365 ms) : 1345, 1384
. : milestone, 1365,
appsec (1.723 ms) : 1699, 1746
. : milestone, 1723,
appsec_no_iast (1.72 ms) : 1695, 1745
. : milestone, 1720,
code_origins (1.68 ms) : 1646, 1714
. : milestone, 1680,
iast (1.521 ms) : 1497, 1544
. : milestone, 1521,
profiling (1.526 ms) : 1503, 1549
. : milestone, 1526,
tracing (1.486 ms) : 1461, 1512
. : milestone, 1486,
section candidate
no_agent (1.376 ms) : 1358, 1395
. : milestone, 1376,
appsec (1.715 ms) : 1691, 1739
. : milestone, 1715,
appsec_no_iast (1.731 ms) : 1707, 1755
. : milestone, 1731,
code_origins (1.68 ms) : 1647, 1714
. : milestone, 1680,
iast (1.52 ms) : 1495, 1544
. : milestone, 1520,
profiling (1.508 ms) : 1485, 1530
. : milestone, 1508,
tracing (1.491 ms) : 1467, 1515
. : milestone, 1491,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics. Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section baseline
no_agent (1.471 ms) : 1460, 1483
. : milestone, 1471,
appsec (2.323 ms) : 2280, 2366
. : milestone, 2323,
iast (2.116 ms) : 2060, 2171
. : milestone, 2116,
iast_GLOBAL (2.165 ms) : 2109, 2221
. : milestone, 2165,
profiling (2.41 ms) : 2243, 2577
. : milestone, 2410,
tracing (1.966 ms) : 1923, 2009
. : milestone, 1966,
section candidate
no_agent (1.47 ms) : 1458, 1481
. : milestone, 1470,
appsec (2.343 ms) : 2298, 2387
. : milestone, 2343,
iast (2.121 ms) : 2065, 2177
. : milestone, 2121,
iast_GLOBAL (2.162 ms) : 2107, 2218
. : milestone, 2162,
profiling (1.989 ms) : 1943, 2034
. : milestone, 1989,
tracing (1.945 ms) : 1903, 1987
. : milestone, 1945,
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.48.0-SNAPSHOT~3efa8ff025, baseline=1.48.0-SNAPSHOT~71cbb065b9
dateFormat X
axisFormat %s
section baseline
no_agent (14.974 s) : 14974000, 14974000
. : milestone, 14974000,
appsec (15.154 s) : 15154000, 15154000
. : milestone, 15154000,
iast (18.146 s) : 18146000, 18146000
. : milestone, 18146000,
iast_GLOBAL (17.905 s) : 17905000, 17905000
. : milestone, 17905000,
profiling (15.033 s) : 15033000, 15033000
. : milestone, 15033000,
tracing (15.068 s) : 15068000, 15068000
. : milestone, 15068000,
section candidate
no_agent (15.324 s) : 15324000, 15324000
. : milestone, 15324000,
appsec (15.131 s) : 15131000, 15131000
. : milestone, 15131000,
iast (18.783 s) : 18783000, 18783000
. : milestone, 18783000,
iast_GLOBAL (18.336 s) : 18336000, 18336000
. : milestone, 18336000,
profiling (14.962 s) : 14962000, 14962000
. : milestone, 14962000,
tracing (15.035 s) : 15035000, 15035000
. : milestone, 15035000,
|
* @deprecated This should only be used when scopes have leaked onto the scope stack that cannot | ||
* be cleaned up by other means. | ||
*/ | ||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this generally happen on known conditions? Or is it something that would need investigation?
If it is the later, I'd suggest one or both of the following:
- Emit a telemetry metric
- Emit a telemetry log with a stacktrace attached (possibly only the first time it happens in an application, to avoid performance issues).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's known to happen for the Akka/Pekko ActorCell and Mailbox instrumentations. My understanding is that this is something inherent to how they work and is not fixable via additional instrumentation. (There are tests that fail if the scope cleanup is removed from those instrumentations)
So adding telemetry isn't warranted in this case, because we know this happens under known conditions (basically the conditions recreated in the existing tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instrumentation update looks good 👍 but I failed to understand the API in the first place 😓 -- might me my English skills 🤷 That should not be blocking, I can stamp if you need to move on refactoring.
* be cleaned up by other means. | ||
*/ | ||
@Deprecated | ||
public static void markActive() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think I get the reason, I had hard time understanding the API -- that might be as I'm not native English speaker 🤷 but markActive
make me think of "I will mark something as active". Instead, it's marking the active scope for reset, where reset is implicit as scopes have multiple flags like the async propagation.
And resetActive() is doing close() calls (so why it's called reset
and what's the difference with close
is not immediately clear), and it does not apply to active (but up to active excluded).
Am I the only one struggling here? What about markActiveForCleanUp()
and cleanUpUntilActive()
or something? Not sure it fits perfectly with the other scope flag either (ie setAsyncPropagationEnabled()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inspiration was from similar mark
and reset
operations on streams
an alternative would be checkpointActive
and restoreActive
?
I'd like to keep the ...Active
suffix to make clear this is operating on the active scope - first to mark/checkpoint/record it and then to reset/restore/rollback
( also cleanUpUntilActive
doesn't capture that we're effectively restoring the scope that was marked by the first method )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe checkpointActive
and rollbackActive
would be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inspiration was from similar mark and reset operations on streams
Like for nio Buffer? That makes sense!
It did not get the analogy in the first place. It might be the Javadoc that puts the emphasis on what will happen for the scope hierarchy rather than the overall purpose (restoring to the marked state).
( also cleanUpUntilActive doesn't capture that we're effectively restoring the scope that was marked by the first method )
That shows that I did not catch it from the Javadoc in the first place 😓
Don't change the API, that was me and my English. Thanks for the clarification ❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with checkpointActiveForRollback
and rollbackActiveToCheckpoint
and did a bit of cleanup to make it more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good for platform 👍
IDM might want to have a look too
b4ed541
to
3efa8ff
Compare
markActive
...resetActive
to clean up leaked scopescheckpointActiveForRollback
...rollbackActiveToCheckpoint
to clean up leaked scopes
checkpointActiveForRollback
...rollbackActiveToCheckpoint
to clean up leaked scopes
What Does This Do
Provides a well-defined approach for managing cleanup of scopes when we know that scopes are leaking, but aren't able to address that via instrumentation.
Usage:
checkpointActiveForRollback();
before starting work that might leak scopesrollbackActiveToCheckpoint();
to roll back to the last checkpointed scopeMotivation
Avoids use of
activeScope()
to implement ad-hoc clean-up of leaked scopes.Additional Notes
This should only be used as a last resort, when the usual scope management isn't enough.
Contributor Checklist
type:
and (comp:
orinst:
) labels in addition to any usefull labelsclose
,fix
or any linking keywords when referencing an issue.Use
solves
instead, and assign the PR milestone to the issueJira ticket: APMAPI-956