React Native Benchmark Suite #186

tido64 · 2020-01-16T17:59:29Z

Introduction

We are currently blind to any regressions that don't directly translate to pixels on the screen. Performance regressions, in particular, isn't always straight forward to measure as different platforms and devices may yield wildly different results depending on how you do the measurements.

The Core of It

A suite of benchmarks and tools should be run on all PRs, and on every commit to master, to catch regressions (or improvements) compared to a baseline.

What we should measure:

App Size
- Measure the app size and working set of a clean react-native init app or RNTester.
Performance Regressions
- React initialization time
  - Time to interaction (TTI)
  - Native module initialization
  - JS parsing
  - JS execution
- Number of thread jumps during initialization
- Blocked threads, and how long they were blocked for
- Memory usage
  - Working set
Memory Leaks
- Static analysis and sanitizers, e.g. run RNTester with MemorySanitizer
- Linters, e.g. Clang-Tidy
- Leverage online code analysis services such as LGTM
- Essentially, look to tools and automation to help us find bugs, memory leaks, and potential security issues.

How do we report this?

Make comments on PRs (and commits?) commenting on the delta change for all metrics.
Every night (or every commit on master) there should be a benchmark run to keep track of performance over time. The report should be publicly available, similar to the Firefox Performance Dashboard.

Suggested Timeline

Task	Estimate	Assignee	Issue #
Implement PR bot for posting comments from CI agents	1 week	@tido64	1 2 3
Nightly CI builds	1 week	@sweggersen	1
Add reporting for performance regressions compared to master	1 week	@tido64	1 2
Add reporting for app bundle size	1 week	@tido64	1
Add app memory usage tests	1 week	@tido64	1
Add reporting for app memory usage	1 week
Implement performance markers for React TTI	2 weeks
Implement performance markers for thread jumps/blocks	2 weeks
RNTester run with MemorySanitizer	2 weeks	@HeyImChris
RNTester run with UndefinedBehaviorSanitizer	2 weeks	@HeyImChris
Add reporting of static analysis results	1 week
Implement perf dashboards	???
Figure out whether we need a device farm or alternate solutions	???

Discussion points

Thoughts?
Missing metrics?
Milestones/timeline?

cc @sweggersen, @danrydholm

The text was updated successfully, but these errors were encountered:

kelset · 2020-01-17T14:12:16Z

Hey @tido64 that sounds like a great idea!

I'll be working more closely on the topic of benchmarking & tooling from April but until then I think that starting this conversation is 🔥

I think that we should try to come up with at least a baseline of what is "acceptable" (ex. 200ms for a full-round from click to visual change), maybe on a range of different devices (ex. android+old, android+new, ios+old, ios+new).

The one benchmark that I can think of missing here is the "app bundling" benchmark, which could underline suboptimal (or improvements) of the native-side settings of the mobile app project; at least, this was an area I managed to find a few easy wins the last times I've done some performances improvements for a prod app I was working on -> https://gist.github.com/kelset/2eb61161a68d1ab35337d1eaa9a05e78

kelset · 2020-01-17T15:05:25Z

Was totally forgetting, but @Kudo did a lot of work in the benchmarking space -> https://github.com/Kudo/react-native-js-benchmark

tido64 · 2020-01-21T13:55:21Z

@kelset That's great to hear. Do you have some sort of plan for what you want to do? Regarding the baseline, I like the idea of having goals. I was thinking more of having a diff against master but we should have both. Can you explain what you mean by "app bundling", and how it affects performance?

@Kudo How much work do you think it will take to bring this to iOS as well?

oblador · 2020-01-21T18:02:46Z

I'm interested in this problem space and can help out 👍

kelset · 2020-01-22T17:44:37Z

Do you have some sort of plan for what you want to do?

So the plan is a bit hand-wavey still but I'll be working 6 weeks full time on a react-native performances open source book (both from perspective of how to measure, finding the baselines, bottlenecks, etc and how to act upon those). So for your proposal I think that I'll be able to help you out with the sides of baseline & benchmarking (at least). But I'll only start in April AFAIK.

Regarding the baseline, I like the idea of having goals. I was thinking more of having a diff against master but we should have both.

Just for me to better understand - with diff you mean like a report of the "delta" between commits? Like, "this commit has made the TTI 30 ms slower"?

Can you explain what you mean by "app bundling", and how it affects performance?

So, by app bundling I mean the process that both iOS and Android need to perform to generate the .ipa/.apk. I think that by keeping an eye on this too we could make sure that whenever native-side configurations are touched, they are not leading to less optimal processes (ex. let's say a commit changes the compiler optimization, if we check the build time we could detect if it's actually improving the overall time VS making it more slower, etc).

Another side effect of this is that it would help detect easy wins in terms of the native-side project configuration: let's say that we never realised that we can use the App Bundling split feature of Android (can't recall the precise name of it atm, the one that only bundles for the precise architecture)(I may be missremembering completely btw, my brain is ded atm). By having a baseline (ex. a fresh new Android app takes 2 mins to generate the apk) we could detect that our config, because it was first created let's say 3 years ago, it's x10 slower because we didn't update that particular config to take advantage of the new native side tooling. And the side effect of this is that then the app will be built more efficiently for the device it will run against, leading to then better "in-app" perfs.

tido64 · 2020-01-23T12:05:09Z

Just for me to better understand - with diff you mean like a report of the "delta" between commits? Like, "this commit has made the TTI 30 ms slower"?

Correct. The baseline should the branch you're merging against, most commonly master branch. That way, we catch perf regressions before they get merged.

So, by app bundling I mean the process that both iOS and Android need to perform to generate the .ipa/.apk. I think that by keeping an eye on this too we could make sure that whenever native-side configurations are touched, they are not leading to less optimal processes (ex. let's say a commit changes the compiler optimization, if we check the build time we could detect if it's actually improving the overall time VS making it more slower, etc).

I'm not sure if this makes sense. If we misconfigured React Native to be built with with -Og instead of -Os, we will improve build time but performance will be worse for sure. Maybe it makes more sense to open up the bundles and see what's in there? E.g. for Android, we can check whether you've created bundles for each supported architecture, and that all the .so files only include code for the target architecture. The checks are more explicit and there will be less confusion.

kelset · 2020-01-23T17:09:09Z

I see - makes sense!

Ok then let's leave the app bundling out of this... there's plenty already anyway ;)

tido64 · 2020-01-30T11:20:02Z

I've added a table of tasks and estimates. Let me know if I missed anything. Also feel free to assign yourself to any of them 😉

elicwhite · 2020-01-30T16:17:50Z

Thanks for adding the table! The measurement that we get asked for a lot internally is what the “working set” is of React Native. All the people that ask for that are ex Microsoft :-). We aren’t fully sure how we are supposed to measure that, and there is no one better to help us get that set up than you! :-)

I’d like to make sure that is a part of this as that will help us get long awaited answers for our VPs asking for it.

shergin · 2020-01-30T18:27:58Z

BTW, we have a couple of synth performance test in new Fabric infra, and we (I) plan to add a couple more that will cover the most performance-critical (hot paths) aspects of the UI part of the framework (prop parsing, diffing, layout, nodes cloning and so on). We use Google Benchmark for that.

The challenging (unexplored) part of that is running that continuously on CI and running that on ARM CPUs to get actual results for architectures we care about (x86 is usually fast enough).

tido64 · 2020-02-05T13:01:42Z

@shergin That's awesome! How much work is it to get them pushed to the open source repo? I think it's a good starting point regardless. We can figure out how to get non-x86 results after we have them running continuously. I've added a task for it.

@TheSavior Updated the table. I think we have a team internally who have solved this. I can hear with them.

Kudo · 2020-02-05T16:08:31Z

react-native-js-benchmark currently focused at JS engine comparison but I think it could support to change as CI regression monitoring.

For iOS support, the most difficult part is about running on real devices including the developer certificates and code signing.
Fortunately, we are focusing on CI regression, testing by simulators would be enough.
I could add iOS simulators support from my very limited free time, maybe 1.5 weeks estimation.

danrydholm · 2020-02-05T16:15:50Z

From the MS side, we're going to look into funding CI using AppCenter which has all the real devices we would need for iOS and Android.

Kudo · 2020-02-11T04:29:24Z

Please let me clarify if my understanding is correct.
Does the benchmark focus at regression across different builds?
What I did for react-native-js-benchmark is a fixed pre-configured project that is not for regression.

I had a similar work which to use RN-cli to generate a pure project and patch project to test my V8 integration.
These scripts are written in Python and things like that I should rewrite a JavaScript version.

tido64 · 2020-02-11T12:51:09Z

@Kudo, I think it makes sense to just report the numbers in the first iteration. We can add diffing in the next iteration. @sweggersen had some ideas around how to do this.

Kudo · 2020-02-12T07:18:13Z

Got it, will start to scratch some works accordingly.

Summary: Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion. ## Changelog [Internal] [Added] - Report size of app bundles on PRs Pull Request resolved: facebook#28019 Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully. Differential Revision: D19859187 Pulled By: hramos fbshipit-source-id: 48ba25903356b219135716f989a4a3c05140abfb

Summary: Pull Request resolved: facebook#28041 Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion. ## Changelog [Internal] [Added] - Report size of app bundles on PRs Pull Request resolved: facebook#28019 Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully. Reviewed By: cpojer Differential Revision: D19859187 Pulled By: hramos fbshipit-source-id: 0eb510287ba2001895d66d038fc1e7b8a722d9a8

Summary: Pull Request resolved: #28041 Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion. ## Changelog [Internal] [Added] - Report size of app bundles on PRs Pull Request resolved: #28019 Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully. Reviewed By: cpojer Differential Revision: D19859187 Pulled By: hramos fbshipit-source-id: 3920dc60e6fd073928388e6ae52fc2ba1bc745ac

…-community/discussions-and-proposals#186.

grabbou · 2020-02-17T02:04:47Z

This is really good idea! Looks like the nightly builds are already taken, but if you @sweggersen need any help from the CI/releases perspective, please let me know!

My main question is how are we planning on broadcasting regressions that happen on master? PRs are one thing, but I think that the additional benefit is being able to track and double-test the diffs that are being synced from Facebook every once in a while.

I am not sure if there's mechanism for that yet in place, but being able to notify early of CI failures (e.g. this commit has broken CI on master or this commit has caused certain regressions) could prevent situations where we are stuck with red CI for a long time and breakages just keep adding up.

Summary: Note: I'm currently testing this, so do not review just yet. Adding a nightly CI build in preparation for further work on [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.) Currently it will only run the job `nightly_job` which will simply run `$ echo "Nightly build run"`. The plan is to do nightly checks and performance status on the React Native repo, and create a dashboard where we can see performance changes over time. More on this in the proposal: [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.) ## Changelog [General] [Added] - Add nightly CI build Pull Request resolved: #28066 Test Plan: Individual jobs will have their own test plan. Reviewed By: sebmck Differential Revision: D20008434 Pulled By: hramos fbshipit-source-id: a5e8aad616c28bfad8f3455b5ebadf7a7d174a55

elicwhite · 2020-02-29T08:01:50Z

What's the next step for the the bot that is posting about the app sizes? I'm excited for the future plans but as-is, it's a bit...loud.

Is it possible to get the bot to delete all of its previous comments in a PR when it posts. Check out this PR for a particular example where since there are lots of changes, most of the comments are from the bot. facebook/react-native#27908

If it is going to be a while before the bot is able to start commenting with deltas which are a bit more actionable, does it make sense to temporarily disable the bot? I'm not sure the absolute numbers being shared on PRs are that helpful currently.

tido64 · 2020-02-29T08:52:35Z

@TheSavior: We've just merged some improvements to the bot. It should now update its posts instead. Let me know if it's still too loud, and we'll figure something out.

Kudo · 2020-03-03T16:42:29Z

I was busy for the past weeks and sorry not meet my original plan.
Anyway, I figured out it would be a good fit to leverage existing react-native-cli run-ios/run-android commands and the plugin mechanism.
Here is what I did: https://github.com/Kudo/react-native-cli-plugin-benchmark
So far there are three commands:

react-native get-appsize-ios
react-native get-appsize-android
react-native measure-ios

Thank for @grabbou to have the awesome plugin support in cli.
This is my first Typescript experience and also learn a lot from cli's code.

Summary: Pull Request resolved: facebook#28041 Report size of app bundles on PRs. See [React Native Benchmark Suite](react-native-community/discussions-and-proposals#186) for further discussion. ## Changelog [Internal] [Added] - Report size of app bundles on PRs Pull Request resolved: facebook#28019 Test Plan: PRs should start seeing comments from a bot with app bundle sizes, given that they got built successfully. Reviewed By: cpojer Differential Revision: D19859187 Pulled By: hramos fbshipit-source-id: 3920dc60e6fd073928388e6ae52fc2ba1bc745ac

Summary: Note: I'm currently testing this, so do not review just yet. Adding a nightly CI build in preparation for further work on [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.) Currently it will only run the job `nightly_job` which will simply run `$ echo "Nightly build run"`. The plan is to do nightly checks and performance status on the React Native repo, and create a dashboard where we can see performance changes over time. More on this in the proposal: [React Native: Benchmark Suite](react-native-community/discussions-and-proposals#186.) ## Changelog [General] [Added] - Add nightly CI build Pull Request resolved: facebook#28066 Test Plan: Individual jobs will have their own test plan. Reviewed By: sebmck Differential Revision: D20008434 Pulled By: hramos fbshipit-source-id: a5e8aad616c28bfad8f3455b5ebadf7a7d174a55

hramos added the 🗣 Discussion This label identifies an ongoing discussion on a subject label Jan 16, 2020

tido64 mentioned this issue Feb 11, 2020

Report size of app bundles on PRs facebook/react-native#28019

Closed

hramos mentioned this issue Feb 12, 2020

Report size of app bundles on PRs (#28019) facebook/react-native#28041

Closed

sweggersen added a commit to sweggersen/react-native that referenced this issue Feb 14, 2020

Add nightly ci build, in preparation for further work on react-native…

bb44110

…-community/discussions-and-proposals#186.

sweggersen mentioned this issue Feb 14, 2020

Add nightly build facebook/react-native#28066

Closed

tido64 mentioned this issue Mar 2, 2020

iOS: Measure startup memory footprint facebook/react-native#28215

Closed

kelset closed this as completed Jun 10, 2021

react-native-community locked and limited conversation to collaborators Jun 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

React Native Benchmark Suite #186

React Native Benchmark Suite #186

tido64 commented Jan 16, 2020 •

edited

Loading

kelset commented Jan 17, 2020

kelset commented Jan 17, 2020

tido64 commented Jan 21, 2020

oblador commented Jan 21, 2020

kelset commented Jan 22, 2020

tido64 commented Jan 23, 2020

kelset commented Jan 23, 2020

tido64 commented Jan 30, 2020

elicwhite commented Jan 30, 2020

shergin commented Jan 30, 2020

tido64 commented Feb 5, 2020 •

edited

Loading

Kudo commented Feb 5, 2020

danrydholm commented Feb 5, 2020

Kudo commented Feb 11, 2020

tido64 commented Feb 11, 2020

Kudo commented Feb 12, 2020

grabbou commented Feb 17, 2020

elicwhite commented Feb 29, 2020

tido64 commented Feb 29, 2020 •

edited

Loading

Kudo commented Mar 3, 2020

This issue was moved to a discussion.

This issue was moved to a discussion.

React Native Benchmark Suite #186

React Native Benchmark Suite #186

Comments

tido64 commented Jan 16, 2020 • edited Loading

Introduction

The Core of It

Suggested Timeline

Discussion points

kelset commented Jan 17, 2020

kelset commented Jan 17, 2020

tido64 commented Jan 21, 2020

oblador commented Jan 21, 2020

kelset commented Jan 22, 2020

tido64 commented Jan 23, 2020

kelset commented Jan 23, 2020

tido64 commented Jan 30, 2020

elicwhite commented Jan 30, 2020

shergin commented Jan 30, 2020

tido64 commented Feb 5, 2020 • edited Loading

Kudo commented Feb 5, 2020

danrydholm commented Feb 5, 2020

Kudo commented Feb 11, 2020

tido64 commented Feb 11, 2020

Kudo commented Feb 12, 2020

grabbou commented Feb 17, 2020

elicwhite commented Feb 29, 2020

tido64 commented Feb 29, 2020 • edited Loading

Kudo commented Mar 3, 2020

This issue was moved to a discussion.

tido64 commented Jan 16, 2020 •

edited

Loading

tido64 commented Feb 5, 2020 •

edited

Loading

tido64 commented Feb 29, 2020 •

edited

Loading