Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

React Native Benchmark Suite #186

Closed
tido64 opened this issue Jan 16, 2020 · 20 comments
Closed

React Native Benchmark Suite #186

tido64 opened this issue Jan 16, 2020 · 20 comments
Labels
🗣 Discussion This label identifies an ongoing discussion on a subject

Comments

@tido64
Copy link

tido64 commented Jan 16, 2020

Introduction

We are currently blind to any regressions that don't directly translate to pixels on the screen. Performance regressions, in particular, isn't always straight forward to measure as different platforms and devices may yield wildly different results depending on how you do the measurements.

The Core of It

A suite of benchmarks and tools should be run on all PRs, and on every commit to master, to catch regressions (or improvements) compared to a baseline.

What we should measure:

  • App Size
    • Measure the app size and working set of a clean react-native init app or RNTester.
  • Performance Regressions
    • React initialization time
      • Time to interaction (TTI)
      • Native module initialization
      • JS parsing
      • JS execution
    • Number of thread jumps during initialization
    • Blocked threads, and how long they were blocked for
    • Memory usage
      • Working set
  • Memory Leaks
    • Static analysis and sanitizers, e.g. run RNTester with MemorySanitizer
    • Linters, e.g. Clang-Tidy
    • Leverage online code analysis services such as LGTM
    • Essentially, look to tools and automation to help us find bugs, memory leaks, and potential security issues.

How do we report this?

  • Make comments on PRs (and commits?) commenting on the delta change for all metrics.
  • Every night (or every commit on master) there should be a benchmark run to keep track of performance over time. The report should be publicly available, similar to the Firefox Performance Dashboard.

Suggested Timeline

Task Estimate Assignee Issue #
Implement PR bot for posting comments from CI agents 1 week @tido64 1 2 3
Nightly CI builds 1 week @sweggersen 1
Add reporting for performance regressions compared to master 1 week @tido64 1 2
Add reporting for app bundle size 1 week @tido64 1
Add app memory usage tests 1 week @tido64 1
Add reporting for app memory usage 1 week
Implement performance markers for React TTI 2 weeks
Implement performance markers for thread jumps/blocks 2 weeks
RNTester run with MemorySanitizer 2 weeks @HeyImChris
RNTester run with UndefinedBehaviorSanitizer 2 weeks @HeyImChris
Add reporting of static analysis results 1 week
Implement perf dashboards ???
Figure out whether we need a device farm or alternate solutions ???

Discussion points

  • Thoughts?
  • Missing metrics?
  • Milestones/timeline?

cc @sweggersen, @danrydholm

@hramos hramos added the 🗣 Discussion This label identifies an ongoing discussion on a subject label Jan 16, 2020
@kelset
Copy link
Member

kelset commented Jan 17, 2020

Hey @tido64 that sounds like a great idea!

I'll be working more closely on the topic of benchmarking & tooling from April but until then I think that starting this conversation is 🔥

I think that we should try to come up with at least a baseline of what is "acceptable" (ex. 200ms for a full-round from click to visual change), maybe on a range of different devices (ex. android+old, android+new, ios+old, ios+new).

The one benchmark that I can think of missing here is the "app bundling" benchmark, which could underline suboptimal (or improvements) of the native-side settings of the mobile app project; at least, this was an area I managed to find a few easy wins the last times I've done some performances improvements for a prod app I was working on -> https://gist.github.com/kelset/2eb61161a68d1ab35337d1eaa9a05e78

@kelset
Copy link
Member

kelset commented Jan 17, 2020

Was totally forgetting, but @Kudo did a lot of work in the benchmarking space -> https://github.com/Kudo/react-native-js-benchmark

@tido64
Copy link
Author

tido64 commented Jan 21, 2020

@kelset That's great to hear. Do you have some sort of plan for what you want to do? Regarding the baseline, I like the idea of having goals. I was thinking more of having a diff against master but we should have both. Can you explain what you mean by "app bundling", and how it affects performance?

@Kudo How much work do you think it will take to bring this to iOS as well?

@oblador
Copy link
Member

oblador commented Jan 21, 2020

I'm interested in this problem space and can help out 👍

@kelset
Copy link
Member

kelset commented Jan 22, 2020

Do you have some sort of plan for what you want to do?

So the plan is a bit hand-wavey still but I'll be working 6 weeks full time on a react-native performances open source book (both from perspective of how to measure, finding the baselines, bottlenecks, etc and how to act upon those). So for your proposal I think that I'll be able to help you out with the sides of baseline & benchmarking (at least). But I'll only start in April AFAIK.

Regarding the baseline, I like the idea of having goals. I was thinking more of having a diff against master but we should have both.

Just for me to better understand - with diff you mean like a report of the "delta" between commits? Like, "this commit has made the TTI 30 ms slower"?

Can you explain what you mean by "app bundling", and how it affects performance?

So, by app bundling I mean the process that both iOS and Android need to perform to generate the .ipa/.apk. I think that by keeping an eye on this too we could make sure that whenever native-side configurations are touched, they are not leading to less optimal processes (ex. let's say a commit changes the compiler optimization, if we check the build time we could detect if it's actually improving the overall time VS making it more slower, etc).

Another side effect of this is that it would help detect easy wins in terms of the native-side project configuration: let's say that we never realised that we can use the App Bundling split feature of Android (can't recall the precise name of it atm, the one that only bundles for the precise architecture)(I may be missremembering completely btw, my brain is ded atm). By having a baseline (ex. a fresh new Android app takes 2 mins to generate the apk) we could detect that our config, because it was first created let's say 3 years ago, it's x10 slower because we didn't update that particular config to take advantage of the new native side tooling. And the side effect of this is that then the app will be built more efficiently for the device it will run against, leading to then better "in-app" perfs.

@tido64
Copy link
Author

tido64 commented Jan 23, 2020

Just for me to better understand - with diff you mean like a report of the "delta" between commits? Like, "this commit has made the TTI 30 ms slower"?

Correct. The baseline should the branch you're merging against, most commonly master branch. That way, we catch perf regressions before they get merged.

So, by app bundling I mean the process that both iOS and Android need to perform to generate the .ipa/.apk. I think that by keeping an eye on this too we could make sure that whenever native-side configurations are touched, they are not leading to less optimal processes (ex. let's say a commit changes the compiler optimization, if we check the build time we could detect if it's actually improving the overall time VS making it more slower, etc).

I'm not sure if this makes sense. If we misconfigured React Native to be built with with -Og instead of -Os, we will improve build time but performance will be worse for sure. Maybe it makes more sense to open up the bundles and see what's in there? E.g. for Android, we can check whether you've created bundles for each supported architecture, and that all the .so files only include code for the target architecture. The checks are more explicit and there will be less confusion.

@kelset
Copy link
Member

kelset commented Jan 23, 2020

I see - makes sense!

Ok then let's leave the app bundling out of this... there's plenty already anyway ;)

@tido64
Copy link
Author

tido64 commented Jan 30, 2020

I've added a table of tasks and estimates. Let me know if I missed anything. Also feel free to assign yourself to any of them 😉

@elicwhite
Copy link

Thanks for adding the table! The measurement that we get asked for a lot internally is what the “working set” is of React Native. All the people that ask for that are ex Microsoft :-). We aren’t fully sure how we are supposed to measure that, and there is no one better to help us get that set up than you! :-)

I’d like to make sure that is a part of this as that will help us get long awaited answers for our VPs asking for it.

@shergin
Copy link

shergin commented Jan 30, 2020

BTW, we have a couple of synth performance test in new Fabric infra, and we (I) plan to add a couple more that will cover the most performance-critical (hot paths) aspects of the UI part of the framework (prop parsing, diffing, layout, nodes cloning and so on). We use Google Benchmark for that.

The challenging (unexplored) part of that is running that continuously on CI and running that on ARM CPUs to get actual results for architectures we care about (x86 is usually fast enough).

@tido64
Copy link
Author

tido64 commented Feb 5, 2020

@shergin That's awesome! How much work is it to get them pushed to the open source repo? I think it's a good starting point regardless. We can figure out how to get non-x86 results after we have them running continuously. I've added a task for it.

@TheSavior Updated the table. I think we have a team internally who have solved this. I can hear with them.

@Kudo
Copy link
Member

Kudo commented Feb 5, 2020

react-native-js-benchmark currently focused at JS engine comparison but I think it could support to change as CI regression monitoring.

For iOS support, the most difficult part is about running on real devices including the developer certificates and code signing.
Fortunately, we are focusing on CI regression, testing by simulators would be enough.
I could add iOS simulators support from my very limited free time, maybe 1.5 weeks estimation.

@danrydholm
Copy link

From the MS side, we're going to look into funding CI using AppCenter which has all the real devices we would need for iOS and Android.

@Kudo
Copy link
Member

Kudo commented Feb 11, 2020

Please let me clarify if my understanding is correct.
Does the benchmark focus at regression across different builds?
What I did for react-native-js-benchmark is a fixed pre-configured project that is not for regression.

I had a similar work which to use RN-cli to generate a pure project and patch project to test my V8 integration.
These scripts are written in Python and things like that I should rewrite a JavaScript version.

@tido64
Copy link
Author

tido64 commented Feb 11, 2020

@Kudo, I think it makes sense to just report the numbers in the first iteration. We can add diffing in the next iteration. @sweggersen had some ideas around how to do this.

@Kudo
Copy link
Member

Kudo commented Feb 12, 2020

Got it, will start to scratch some works accordingly.

hramos pushed a commit to hramos/react-native that referenced this issue Feb 12, 2020
Summary:
Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion.

## Changelog

[Internal] [Added] - Report size of app bundles on PRs
Pull Request resolved: facebook#28019

Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully.

Differential Revision: D19859187

Pulled By: hramos

fbshipit-source-id: 48ba25903356b219135716f989a4a3c05140abfb
hramos pushed a commit to hramos/react-native that referenced this issue Feb 13, 2020
Summary:
Pull Request resolved: facebook#28041

Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion.

## Changelog

[Internal] [Added] - Report size of app bundles on PRs
Pull Request resolved: facebook#28019

Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully.

Reviewed By: cpojer

Differential Revision: D19859187

Pulled By: hramos

fbshipit-source-id: 0eb510287ba2001895d66d038fc1e7b8a722d9a8
facebook-github-bot pushed a commit to facebook/react-native that referenced this issue Feb 13, 2020
Summary:
Pull Request resolved: #28041

Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion.

## Changelog

[Internal] [Added] - Report size of app bundles on PRs
Pull Request resolved: #28019

Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully.

Reviewed By: cpojer

Differential Revision: D19859187

Pulled By: hramos

fbshipit-source-id: 3920dc60e6fd073928388e6ae52fc2ba1bc745ac
@grabbou
Copy link
Member

grabbou commented Feb 17, 2020

This is really good idea! Looks like the nightly builds are already taken, but if you @sweggersen need any help from the CI/releases perspective, please let me know!

My main question is how are we planning on broadcasting regressions that happen on master? PRs are one thing, but I think that the additional benefit is being able to track and double-test the diffs that are being synced from Facebook every once in a while.

I am not sure if there's mechanism for that yet in place, but being able to notify early of CI failures (e.g. this commit has broken CI on master or this commit has caused certain regressions) could prevent situations where we are stuck with red CI for a long time and breakages just keep adding up.

facebook-github-bot pushed a commit to facebook/react-native that referenced this issue Feb 27, 2020
Summary:
Note: I'm currently testing this, so do not review just yet.

Adding a nightly CI build in preparation for further work on [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.)

Currently it will only run the job `nightly_job` which will simply run `$ echo "Nightly build run"`. The plan is to do nightly checks and performance status on the React Native repo, and create a dashboard where we can see performance changes over time. More on this in the proposal: [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.)

## Changelog

[General] [Added] - Add nightly CI build
Pull Request resolved: #28066

Test Plan: Individual jobs will have their own test plan.

Reviewed By: sebmck

Differential Revision: D20008434

Pulled By: hramos

fbshipit-source-id: a5e8aad616c28bfad8f3455b5ebadf7a7d174a55
@elicwhite
Copy link

What's the next step for the the bot that is posting about the app sizes? I'm excited for the future plans but as-is, it's a bit...loud.

Is it possible to get the bot to delete all of its previous comments in a PR when it posts. Check out this PR for a particular example where since there are lots of changes, most of the comments are from the bot. facebook/react-native#27908

If it is going to be a while before the bot is able to start commenting with deltas which are a bit more actionable, does it make sense to temporarily disable the bot? I'm not sure the absolute numbers being shared on PRs are that helpful currently.

@tido64
Copy link
Author

tido64 commented Feb 29, 2020

@TheSavior: We've just merged some improvements to the bot. It should now update its posts instead. Let me know if it's still too loud, and we'll figure something out.

@Kudo
Copy link
Member

Kudo commented Mar 3, 2020

I was busy for the past weeks and sorry not meet my original plan.
Anyway, I figured out it would be a good fit to leverage existing react-native-cli run-ios/run-android commands and the plugin mechanism.
Here is what I did: https://github.com/Kudo/react-native-cli-plugin-benchmark
So far there are three commands:

  • react-native get-appsize-ios
  • react-native get-appsize-android
  • react-native measure-ios

Thank for @grabbou to have the awesome plugin support in cli.
This is my first Typescript experience and also learn a lot from cli's code.

osdnk pushed a commit to osdnk/react-native that referenced this issue Mar 9, 2020
Summary:
Pull Request resolved: facebook#28041

Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion.

## Changelog

[Internal] [Added] - Report size of app bundles on PRs
Pull Request resolved: facebook#28019

Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully.

Reviewed By: cpojer

Differential Revision: D19859187

Pulled By: hramos

fbshipit-source-id: 3920dc60e6fd073928388e6ae52fc2ba1bc745ac
osdnk pushed a commit to osdnk/react-native that referenced this issue Mar 9, 2020
Summary:
Note: I'm currently testing this, so do not review just yet.

Adding a nightly CI build in preparation for further work on [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.)

Currently it will only run the job `nightly_job` which will simply run `$ echo "Nightly build run"`. The plan is to do nightly checks and performance status on the React Native repo, and create a dashboard where we can see performance changes over time. More on this in the proposal: [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.)

## Changelog

[General] [Added] - Add nightly CI build
Pull Request resolved: facebook#28066

Test Plan: Individual jobs will have their own test plan.

Reviewed By: sebmck

Differential Revision: D20008434

Pulled By: hramos

fbshipit-source-id: a5e8aad616c28bfad8f3455b5ebadf7a7d174a55
@kelset kelset closed this as completed Jun 10, 2021
@react-native-community react-native-community locked and limited conversation to collaborators Jun 10, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
🗣 Discussion This label identifies an ongoing discussion on a subject
Projects
None yet
Development

No branches or pull requests

9 participants