[Bug]: splitCssText causes degraded performance when recording #1603

Closed
1 task done
guntherjh opened this issue Dec 6, 2024 · 6 comments

@guntherjh
Contributor

guntherjh commented Dec 6, 2024

Preflight Checklist

  • I have searched the issue tracker for a bug report that matches the one I want to file, without success.

What package is this bug report for?

rrweb

Version

v2.0.0-alpha.18

Expected Behavior

rrweb should not have a significant impact on an application/site when recording data.

Actual Behavior

Whenever markCssSplits is called, we have observed significantly degraded performance caused by splitCssText maxing out the JS heap.

Steps to Reproduce

Reproduction steps with browser extensions:
In the wild, we were seeing this issue with customers using Chrome extensions that inject CSS into the page. We've seen it with customers using the Grammarly browser extension and the Sider AI browser extension. To reproduce the issue this way:

  1. Install one of these two extensions.
  2. Open a page that has rrweb recording data.
  3. In the case of Grammarly, you just need to click in an input in a form and that should trigger the extension to start injecting code that will trigger the issue. For Sider AI, you need to open the extension and a side bar will appear that triggers the issue.

Simple reproduction steps (not using a browser extension):

  1. Serve the following HTML file using a simple HTTP server. For example, I used http-server.
<html>
<head>
    <!-- test.css can contain anything; its contents are not important for this test -->
    <link href="test.css" rel="stylesheet">
    <!-- benchmark.css is the CSS file used for benchmarking in /packages/rrweb-snapshot/test/css, -->
    <!-- chosen because it's pretty large -->
    <link href="benchmark.css" rel="stylesheet">
    <title>Home</title>
</head>
<body>
    <textarea></textarea>
    <script type="module">
        // This is the record.js file from /packages/record/dist created after running yarn && yarn build:all in the root of the repo
        // commented out for now
        // import { record } from './record.js';

        // The second <link> on the page is benchmark.css
        const stylesheet = document.getElementsByTagName('link')[1].sheet;

        let cssString = '';
        // Check if the stylesheet is loaded
        if (stylesheet && stylesheet.cssRules) {
            // Get all the CSS rules as a string
            for (let i = 0; i < stylesheet.cssRules.length; i++) {
                cssString += stylesheet.cssRules[i].cssText;
            }
        }

        const recorder = record({
            emit(event) {
                console.log(event);
            },
        });

        const newStyleEl = document.createElement('style');
        document.body.appendChild(newStyleEl);
        // Adds two Text nodes to the <style> element so that markCssSplits gets executed.
        newStyleEl.appendChild(document.createTextNode(cssString));
        newStyleEl.appendChild(document.createTextNode(cssString));
    </script>
</body>
</html>
  2. Open the URL where the HTML file is being served. For my example that was http://127.0.0.1:8080.
  3. Using Google Chrome, start taking a performance profile.
  4. Uncomment this line // import { record } from './record.js'; and refresh the page.
  5. Stop recording the performance profile. Note that this seems to be enough to lock up the browser, so you may need to refresh to get the performance profile to finish loading.
  6. You should see something akin to the following in the performance profile:
    [Screenshot: Chrome performance profile, 2024-12-06 3:12 PM]

Testcase Gist URL

No response

Additional Information

Just for reference, here is the commit that added splitCssText

This issue seems to be related, but looks to be more of a problem on the playback side (unless I am misunderstanding 😅 ).

Benchmark
IDK how helpful this would be, but I created a simple benchmark for splitCssText in /packages/rrweb-snapshot/test/stringify-stylesheet.bench.ts

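// Imports are assumed here; the existing bench file may already provide the first three.
import { bench, describe } from 'vitest';
import * as fs from 'fs';
import * as path from 'path';
// Assumption: splitCssText is exported from the package's utils module.
import { splitCssText } from '../src/utils';

describe('splitCssText', () => {
  const style: HTMLStyleElement = document.createElement('style');
  const style2: HTMLStyleElement = document.createElement('style');

  const cssText = fs.readFileSync(
    path.resolve(__dirname, './css/benchmark.css'),
    'utf8',
  );
  // style holds the CSS in a single text node
  style.textContent = cssText;
  // style2 ends up with three text nodes, each holding a full copy of the CSS
  style2.textContent = cssText;
  style2.appendChild(document.createTextNode(cssText));
  style2.appendChild(document.createTextNode(cssText));
  bench(
    'splitCssText',
    () => {
      splitCssText(cssText, style);
    },
    { time: 1000 },
  );
  bench(
    'splitCssText triggers nested for loops',
    () => {
      splitCssText(cssText, style2);
    },
    { time: 1000 },
  );
});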

Here are the results I got:
[Screenshot: benchmark results, 2024-12-06 3:25 PM]

Please let me know if there are any additional details I should provide. TY!

@guntherjh guntherjh added the bug Something isn't working label Dec 6, 2024
@guntherjh
Contributor Author

Update on the reproduction steps above. I have also observed that when a style element contains a large amount of CSS, the CSS gets split into multiple child text nodes of the style element. This looks to happen at around 60-70kB of CSS in Chrome. There doesn't seem to be a defined limit (at least not one that I can find). In the case of the Grammarly browser extension, the injected web component contains a style element with 104kB of CSS that gets split across two child text nodes (the first containing 65.6kB and the second containing the remaining 38.3kB). I also observed a web component on a customer's site that had a style element containing what appeared to be the entire Font Awesome CSS library. This got split across multiple child text nodes in a similar manner.
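
To check whether a given style element has been split this way, something like the following could be run against it. This is only a minimal sketch: `childTextSizes`, `TextLike` and `StyleLike` are hypothetical names introduced for illustration, and the structural types exist only so the snippet runs outside a browser (a real HTMLStyleElement should satisfy them).

```typescript
// Minimal structural types so the sketch runs outside a browser.
// These names are hypothetical, not part of rrweb.
interface TextLike {
  textContent: string | null;
}
interface StyleLike {
  childNodes: ArrayLike<TextLike>;
}

// Returns the length (in UTF-16 code units) of each child node's text.
// More than one entry means the CSS is spread over several child nodes.
function childTextSizes(style: StyleLike): number[] {
  const sizes: number[] = [];
  for (let i = 0; i < style.childNodes.length; i++) {
    sizes.push((style.childNodes[i].textContent ?? '').length);
  }
  return sizes;
}

// Stand-in for a <style> element whose CSS sits in two text nodes.
const fakeStyle: StyleLike = {
  childNodes: [
    { textContent: 'a{color:red}' },
    { textContent: 'b{margin:0}' },
  ],
};

console.log(childTextSizes(fakeStyle)); // → [12, 11]
```

In a browser console the same function could be called with `document.querySelector('style')` or with a style element inside a shadow root.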

@eoghanmurray
Contributor

I've a PR with a fix in #1615

While the benchmark demonstrates a huge difference in performance between the two cases, there's no actual improvement in the benchmark after the PR as it wasn't demonstrating the real pathological performance; the difference in the two bench timings was just reflecting the necessary call to `normalizeCssString` and accessing of childNodes content.

I've instead made a pathological case out of your test html file, and added an iter_limit so that we 'early out' if we are iterating too much.
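
For readers unfamiliar with the 'early out' idea, here is a rough sketch of an iteration-limited search. It is not the code from the PR: `findUniquePrefixIndex` and its `iterLimit` parameter are hypothetical, and the search strategy (grow a prefix of the chunk until it occurs exactly once) is only an assumption about the general shape of the problem.

```typescript
// Hypothetical helper, not rrweb's implementation.
// Grows a prefix of `chunk` until it occurs exactly once in `haystack`,
// and gives up (returns -1) once `iterLimit` iterations have been spent.
function findUniquePrefixIndex(
  haystack: string,
  chunk: string,
  iterLimit = 100,
): number {
  let iterations = 0;
  for (let len = 1; len <= chunk.length; len++) {
    if (++iterations > iterLimit) {
      return -1; // early out: stop instead of scanning indefinitely
    }
    const prefix = chunk.substring(0, len);
    const first = haystack.indexOf(prefix);
    if (first === -1) {
      return -1; // this prefix is absent, so longer ones will be too
    }
    if (haystack.indexOf(prefix, first + 1) === -1) {
      return first; // the prefix is unique: this is the split point
    }
  }
  return -1;
}

// 'a' and 'a{' are ambiguous; 'a{y' is unique and starts at index 6.
console.log(findUniquePrefixIndex('a{x:1}a{y:2}', 'a{y:2}')); // → 6
// With a limit of 2 iterations the search gives up first.
console.log(findUniquePrefixIndex('a{x:1}a{y:2}', 'a{y:2}', 2)); // → -1
```

The point of the limit is that the caller always gets an answer in bounded time and can fall back to some cheaper strategy when the answer is -1.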

Thanks so much for the bug report!

@guntherjh
Contributor Author

@eoghanmurray Thanks for taking a look! TBH I didn't have a ton of time to analyze the benchmark I put up so what you described makes perfect sense.

@hipporello

Hi,

I was testing this after your fix and it seems that splitCssText still hurts performance severely. It freezes the screen for 10 secs. May I ask why we need this method?

@eoghanmurray
Contributor

@hipporello details are in #1437 and I've just done a writeup at
https://rrweb.slack.com/archives/C0614SW58TW/p1738067819837089?thread_ts=1737734377.235029&cid=C0614SW58TW

It's a correctness thing so for most sites it should be fine to bypass the function (return an array with a single element containing the entire css text)
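
The bypass described above can be sketched as a small wrapper. This is an illustration under assumptions, not a patch for rrweb: `withBypass`, `Splitter` and `maxLength` are hypothetical names, and the splitter's signature is simplified to take only the CSS text.

```typescript
// Hypothetical types and names, for illustration only.
type Splitter = (cssText: string) => string[];

// Wraps a splitter so that CSS longer than `maxLength` is returned as a
// single-element array (the bypass), skipping the expensive split.
function withBypass(split: Splitter, maxLength: number): Splitter {
  return (cssText) =>
    cssText.length > maxLength ? [cssText] : split(cssText);
}

// Stand-in splitter used only to show the wrapper's behaviour.
const naiveSplit: Splitter = (css) => css.split(';');
const guarded = withBypass(naiveSplit, 5);

console.log(guarded('a;b')); // → ['a', 'b']
console.log(guarded('aaaa;bbbb')); // → ['aaaa;bbbb']
```

Passing a `maxLength` of 0 bypasses the split for any non-empty CSS, which matches the "single element containing the entire css text" behaviour, at the cost of whatever correctness the split provides (see #1437).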

I'll need to take another look at where exactly the problem is arising, any hints into the exact loop where it's stuck would be appreciated.

@eoghanmurray
Contributor

@hipporello you were right and I managed to cover further pathological cases in #1640

eoghanmurray added a commit that referenced this issue Feb 6, 2025
Fixes a browser 'lock up' at record time due to a presence of large amounts of css in <style> elements, which are split over multiple text nodes, which triggers the new code added in #1437 (see that PR for full explanation of why this all exists).  #1437 was not written with performance in mind as it was believed to be an edge case, but things like Grammarly browser extension (#1603) among other scenarios were triggering pathological behavior, some of which was solved in #1615.
See also #1640 (comment) for further discussion.

* Fix the case when there are multiple matches and we end up not finding a unique one - just go with the best guess when there are many splits by looking at the previous chunk's size
* Also add '0px' -> '0' stylesheet normalization, which also fixes the sample problem in a different way
* Add new test and modify it so that it can trigger a failure in the absence of the '0px' normalization; there may be other unknown ways of triggering a similar bug, so ensure that the primary 'best guess' method doesn't suffer a regression
* Leverage the 'best guess' method so that we can quit after 100 iterations trying to find a unique substring; hopefully this bit along with the `iterLimit` already added will prevent any future pathological cases.
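
As a rough illustration of two of the ideas in the bullets above (the '0px' -> '0' normalization, and choosing a 'best guess' among several matches using the previous chunk's size), here is a minimal sketch. It is not the code from the commit: `normalizeCss`, `allIndexesOf` and `bestGuessIndex` are hypothetical helpers, and the exact normalization rules are assumptions.

```typescript
// Hypothetical normalization: rewrite zero lengths '0px' as '0', then drop
// whitespace, so two serializations of the same CSS compare equal.
// The lookbehind keeps values such as '10px' or '1.0px' untouched.
function normalizeCss(css: string): string {
  return css.replace(/(?<![\d.])0px\b/g, '0').replace(/\s+/g, '');
}

// Every index at which `needle` occurs in `haystack`.
function allIndexesOf(haystack: string, needle: string): number[] {
  const found: number[] = [];
  if (needle === '') return found; // avoid looping forever on an empty needle
  let ix = haystack.indexOf(needle);
  while (ix !== -1) {
    found.push(ix);
    ix = haystack.indexOf(needle, ix + 1);
  }
  return found;
}

// When there are several matches, pick the one closest to where the split is
// expected to be, e.g. the end of the previous chunk based on its size.
function bestGuessIndex(candidates: number[], expected: number): number {
  if (candidates.length === 0) return -1;
  return candidates.reduce((best, ix) =>
    Math.abs(ix - expected) < Math.abs(best - expected) ? ix : best,
  );
}

console.log(normalizeCss('a { top: 0px; left: 10px }')); // → 'a{top:0;left:10px}'
console.log(bestGuessIndex(allIndexesOf('abcabcabc', 'abc'), 4)); // → 3
```

Because the guess needs no unique match, it gives the search a fallback to return when an iteration limit is reached.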

Failing example extracted from large files identified by Paul D'Ambra (Posthog) ... see comment from MartinWorkfully: PostHog/posthog-js#1668
gnpaone added a commit to Midpath-Software/rrweb that referenced this issue Feb 7, 2025
* Fix up the 'should replace the existing DOM nodes on iframe navigation with `isAttachIframe`' test (rrweb-io#1636)

- it was working for me when the test was run in isolation (`-t` option), but when the entire cross-origin-iframes test was run, the change of iframe contents didn't seem to happen in time

* [chore]: Update actions/upload-artifact to v4 (rrweb-io#1643)

* update actions/upload-artifact to v4

---------

Co-authored-by: Eoghan Murray <eoghan@getthere.ie>

* Fix a code path where masking could be skipped on textareas (rrweb-io#1599)

* Fixes rrweb-io#1596

* [chore] Cache yarn packages for CI (rrweb-io#1646)

* [chore] Cache yarn packages for CI

* Cache yarn in release.yml

* [chore] Update deprecated download artifact on CI (rrweb-io#1647)

* I'm merging even though ESLint is still failing in Github Actions as I believe it's running actions _without_ this PR applied yet

* Fix env puppeteer error in cross-origin-iframes.test.ts (rrweb-io#1629)

* chore(ci): track bundle size (rrweb-io#1630)

* chore(ci): track bundle size

---------

Co-authored-by: pauldambra <pauldambra@users.noreply.github.com>

* Fix adapt css with split (rrweb-io#1600)

Fix for rrweb-io#1575 where postcss was raising an exception

* adapt the entire CSS as a whole in one pass with postcss, rather than adapting each split part separately
* break up the postcss output again and assign to individual text nodes (kind of inverse of splitCssText at record side)
* impose an upper bound of 30 iterations on the substring searches to preempt possible pathological behavior
* add tests to demonstrate the scenario and prevent regression

More technical details:
* Fix algorithm; checks against `ix_end` within loop were incorrect when `ix_start` was bigger than zero.  
* Fix that length check against wrong array was causing 'should record style mutations with multiple child nodes and replay them correctly' test to fail. 
Note on last point: I haven't looked into things more deeply than that the test was complaining about missing .length after `replayer.pause(1000);`

* Warn instead of fail on exceptions thrown from postcss (rrweb-io#1580)

* postcss was introduced in rrweb-io#1458 for use within adaptCssForReplay
* rrweb-io#1600 fixes the main case where invalid css could be introduced when valid css from the output of `sheet.cssRules` was split according to how it was split across text nodes of the <style>
* the guard introduced here is still useful as we likely in future will switch to capturing the raw stylesheet contents (both <style> and <link>), at which point we will be much less confident of getting valid css

* Fix splitCssText again (rrweb-io#1640)

Fixes a browser 'lock up' at record time due to a presence of large amounts of css in <style> elements, which are split over multiple text nodes, which triggers the new code added in rrweb-io#1437 (see that PR for full explanation of why this all exists).  rrweb-io#1437 was not written with performance in mind as it was believed to be an edge case, but things like Grammarly browser extension (rrweb-io#1603) among other scenarios were triggering pathological behavior, some of which was solved in rrweb-io#1615.
See also rrweb-io#1640 (comment) for further discussion.

* Fix the case when there are multiple matches and we end up not finding a unique one - just go with the best guess when there are many splits by looking at the previous chunk's size
* Also add '0px' -> '0' stylesheet normalization, which also fixes the sample problem in a different way
* Add new test and modify it so that it can trigger a failure in the absence of the '0px' normalization; there may be other unknown ways of triggering a similar bug, so ensure that the primary 'best guess' method doesn't suffer a regression
* Leverage the 'best guess' method so that we can quit after 100 iterations trying to find a unique substring; hopefully this bit along with the `iterLimit` already added will prevent any future pathological cases.

Failing example extracted from large files identified by Paul D'Ambra (Posthog) ... see comment from MartinWorkfully: PostHog/posthog-js#1668

* fix: move patch function into utils to improve bundling (rrweb-io#1631)

* fix: move patch function into utils to improve bundling

---------

Co-authored-by: pauldambra <pauldambra@users.noreply.github.com>
Co-authored-by: Justin Halsall <Juice10@users.noreply.github.com>

---------

Co-authored-by: Eoghan Murray <eoghan@getthere.ie>
Co-authored-by: Kevin Townsend <11738094+kevinatown@users.noreply.github.com>
Co-authored-by: Justin Halsall <Juice10@users.noreply.github.com>
Co-authored-by: Paul D'Ambra <paul@posthog.com>
Co-authored-by: pauldambra <pauldambra@users.noreply.github.com>
Co-authored-by: John Henry Gunther <jguntherenator@gmail.com>