Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate startup perf regression in #performance bot #101083

Closed
deepak1556 opened this issue Jun 26, 2020 · 14 comments
Closed

Investigate startup perf regression in #performance bot #101083

deepak1556 opened this issue Jun 26, 2020 · 14 comments
Assignees
Labels
debt Code quality issues engineering VS Code - Build / issue tracking / etc.

Comments

@deepak1556
Copy link
Collaborator

Refs #97638 (comment)

@jrieken lets start with tracing chrome and our own js code in the main process to eliminate any app specific issues.

  • Chrome startup trace - code-insiders <path-to-vscode> <path-to-vscode-package.json> --trace-startup --trace-startup-duration=20 , this will run the trace for 20seconds and will generate a file chrometrace.log. Perform this for both code and code-insiders.

  • App main process JS profiling - code-insiders <path-to-vscode> <path-to-vscode-package.json> --inspect-brk and visit chrome://inspect > open devtools for node, this will automatically attach the debugger. take cpu profile until the app finishes the startup loading. Attach the logs from both code and code-insiders

If both these logs can't identify the problem then we can go down a more sophisticated yet effective ETW tracing.

@deepak1556 deepak1556 added the engineering VS Code - Build / issue tracking / etc. label Jun 26, 2020
@deepak1556 deepak1556 added this to the On Deck milestone Jun 26, 2020
@jrieken jrieken modified the milestones: On Deck, June 2020 Jun 30, 2020
@jrieken
Copy link
Member

jrieken commented Jul 1, 2020

There you go. This is today insider which running on electron 7 and last insiders using 8

@deepak1556
Copy link
Collaborator Author

Some observations from the profiles,

Electron 8 :

Screen Shot 2020-07-06 at 3 25 37 PM

I couldn't find a equivalent notifyReady call in Electron 7 but for consistency the crash reporter in the render gets initialized via ipc to the main just before the workbench is finished, comparing the timings

Screen Shot 2020-07-06 at 4 03 09 PM

Electron 7 :

Screen Shot 2020-07-06 at 4 03 21 PM

Comparing the readiness of shared process

Electron 8 :

Screen Shot 2020-07-06 at 4 05 37 PM

Electron 7
Screen Shot 2020-07-06 at 4 05 28 PM
:

@deepak1556
Copy link
Collaborator Author

Electron 8 numbers don't reflect the actual poor startup numbers we are seeing on the performance channel, @jrieken the test seems to be done on a fresh user data directory, should we try checking numbers from the user data directory of the perf bot ?

@jrieken
Copy link
Member

jrieken commented Jul 7, 2020

@jrieken the test seems to be done on a fresh user data directory,

Unsure about that, maybe some random changes that we also sometimes see with the bot. Or the fact that this is run manually, e.g with screen on and not triggered by the task scheduler? Also, do we have a clue how the user data dir would impact this? This would be bad because it affects all users, right?

@deepak1556
Copy link
Collaborator Author

Unsure about that, maybe some random changes that we also sometimes see with the bot.

The tests are done on the user data dir C:\Users\Administrator\Desktop\electron8-slow\data-dir , is this one the bot always uses ?

Also, do we have a clue how the user data dir would impact this? This would be bad because it affects all users, right?

this is unlikely the issue here, chromium is pretty cautious about blocking calls that can happen on the UI thread during startup, most of the file IO operations will be delegated as tasks to a threadpool https://chromium.googlesource.com/chromium/src.git/+/master/docs/threading_and_tasks.md. If something weird goes on here the chrome tracing profile data will reveal it. I don't see such a case with the provided traces.

e.g with screen on and not triggered by the task scheduler?

this is an interesting one, can you share the task scheduler script used, I would like to test on my machine.

@deepak1556
Copy link
Collaborator Author

deepak1556 commented Jul 7, 2020

@bpasero shared the task scheduler script on slack just now, will play with it tomorrow and get back.

@deepak1556
Copy link
Collaborator Author

Or the fact that this is run manually, e.g with screen on and not triggered by the task scheduler?

Imported the scheduler script, only adjustment was the trigger action being set to workstation lock. The final numbers were equally good as the numbers seen previously on my machine.

@jrieken can you get a chrome trace from running the perf script via the task scheduler, also lets try using the default data dir used by the perf bot

Perf Canary.zip

@bpasero
Copy link
Member

bpasero commented Jul 8, 2020

Not sure if this helps, but I just noticed a new electron trace event category in the release notes for 8.4.0: https://github.com/electron/electron/releases/tag/v8.4.0

Added tracing instrumentation to WebContents IPC methods, available with the contentTracing category "electron". #24307

@deepak1556
Copy link
Collaborator Author

deepak1556 commented Jul 8, 2020

@bpasero the screenshots with category electron I posted in #101083 (comment) are based on those events. They only trace the electron ipc calls

@jrieken
Copy link
Member

jrieken commented Jul 13, 2020

@deepak1556 The zip are traces for 1eda45d590d0bc9e8db4ac610705f9b804b2c9df (exploration). One run triggered manual, e.g starting the perf-script from powershell, and once triggered by task scheduler - which then doesn't render a UX. That might help drill into this

traces.zip

@deepak1556
Copy link
Collaborator Author

deepak1556 commented Jul 14, 2020

Thanks for the traces,

with Electron 8 task scheduler run there are repeated blocks of RenderThreadImpl::EstablishGpuChannelSync calls blocking the renderer executions, which mostly means the GPU process has been crashing constantly on the system and the renderer is failing to establish a channel.

Electron 8:

Screen Shot 2020-07-13 at 11 51 59 PM

Electron 7:

Screen Shot 2020-07-13 at 11 52 48 PM

@deepak1556
Copy link
Collaborator Author

I am not sure if this is a bug in chromium trying to connect with so many retries, I was under the impression it would disable gpu after certain failed tries but doesn't seem to be.

@jrieken as a next step,

  • can you get the output of --verbose, most likely I will need to provide a debug build to get more useful logs from GPU
  • run the perf bot with --disable-gpu
  • Get a trace with --disable-gpu

@jrieken
Copy link
Member

jrieken commented Jul 17, 2020

Closing this as we know that this is happening only in the "run vscode without UX" case. The underlying cause is unknown tho

@jrieken jrieken closed this as completed Jul 17, 2020
@deepak1556
Copy link
Collaborator Author

I would like to investigate the underlying issue, certainly seems like a bug in chromium or if its the predefined behavior would be good to understand why, can we keep this open for debt work ?

@deepak1556 deepak1556 added debt Code quality issues and removed electron-8-update labels Jul 20, 2020
@deepak1556 deepak1556 modified the milestones: July 2020, August 2020 Jul 20, 2020
@deepak1556 deepak1556 reopened this Jul 20, 2020
@jrieken jrieken removed their assignment Aug 10, 2020
@deepak1556 deepak1556 removed this from the October 2020 milestone Oct 30, 2020
@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
debt Code quality issues engineering VS Code - Build / issue tracking / etc.
Projects
None yet
Development

No branches or pull requests

3 participants