Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Haste2 #599

Merged
merged 41 commits into from
Feb 16, 2016
Merged

Haste2 #599

merged 41 commits into from
Feb 16, 2016

Conversation

cpojer
Copy link
Member

@cpojer cpojer commented Nov 7, 2015

This pull request is used to track the implementation of node-haste2.

Summary
  • Implemented node-haste2, a re-implementation of Facebook's haste dependency system, based on Facebook's react-native packager. We now create the entire haste-map upfront and pass it to the worker threads instead of creating node-haste instances in each worker.
  • Changed the core architecture of Jest to abstract the HasteModuleLoader into a ModuleLoader and HasteResolver piece. Added a Test class so we aren't creating instances of a TestRunner in every worker. The new code should bring a lot of clarity but there is also a lot more to be done.
  • Performance improvements
    • fixed startup time of Jest, both on cold and warm start. node_modules are now lazily analyzed.
    • fixed require-resolution time for relative requires. Loading modules without haste is now much faster and uses more caching.
  • New implementation of --onlyChanged that should work much better and find the correct set of related tests to run.
  • Numerous bugfixes and cleanups all around Jest.
Performance Improvements
  • The Relay test suite (GitHub) now takes 20s vs. 60-100s previously (on 13” MBP)
  • The react-native test suite now takes 8 seconds instead of ~70s (on 13” MBP)
  • Starting up Jest on react-native (fb internal) now takes 1 second vs. ~10-15 seconds from before.
  • FB's frontend codebase test run was already heavily optimized but there is ~10 % win down to ~97s from about 107s.

This diff/PR forms the basis for all the work on Jest that I'm planning to do in the near term and will set up Jest for further success. It should be much easier to iterate on Jest from now on given that we have replaced the biggest legacy system and I can finally focus on solving some of the community problems.

@cpojer
Copy link
Member Author

cpojer commented Nov 7, 2015

The new module execution code is a little bit more ugly but it is super-hot code and yields a 10 sec perf win on www (~8.5 % of the entire runtime).

@cpojer
Copy link
Member Author

cpojer commented Nov 12, 2015

omg it works.

@browniefed
Copy link
Contributor

#dope

@cpojer
Copy link
Member Author

cpojer commented Nov 14, 2015

Outstanding list of TODOs:

  • Add cache and fileWatcher to actually make this fast
  • Only construct one moduleLoader in TestRunner; re-use HasteResolver
  • Make 'HasteResolver' class configurable
  • Test this at FB and fix issues that come up

Future release:

  • Support NODE_PATH + traversal up the tree for node_modules so we can drop resolve

None of these should be blockers for merging this PR except for testing this at FB (no point in building this if it doesn't work on all of our tests ). I will try to run it on Monday and fix minor issues as they come up (assuming it won't blow up completely :) ). The first for TODO items are blockers for tagging a new version of jest itself, of course.

@jsdf
Copy link
Contributor

jsdf commented Nov 14, 2015

Correct me if I'm wrong, but it looks like the (soon to be pluggable) module resolver still can only provide filepaths to resolved modules, and the file contents are actually loaded from disk by the lib/transform module.

From the perspective of being able to plug-in different module transformation+bundling pipelines (eg. webpack, browserify), with the goal of reducing the amount of duplicated config/machinery required to use Jest, it would be great to have the option of providing module sources as the output of the resolver, to be used, instead of only providing filepaths.

In both browserify and webpack, resolving and transforming of modules is interleaved, and transforms/loaders are async, so the easiest place to plug them in would probably be in place of HasteResolver, but also providing the transformed module source text along with the resolved module filepath.

@jsdf
Copy link
Contributor

jsdf commented Nov 14, 2015

All the other stuff looks good though 😄

@cpojer
Copy link
Member Author

cpojer commented Nov 16, 2015

@jsdf thanks for the review! What would the ideal API look like to you? I take it webpack/browserify pre-transform code and then resolve them?

I can imagine to merge the preprocessor and the resolver: resolving the dependencies for a path will pre-transform the file and apply the transform recursively for every dependency before resolving its dependencies. Now that there is only a single entry point for transform (there used to be three!) this should be easy to change. All that it needs is a map from real path to the cache path, which should be achievable using the getCacheKey method in the preprocessor.

@cpojer
Copy link
Member Author

cpojer commented Nov 16, 2015

High level description of all the changes:

  • replaced node-haste with node-haste2
  • major cleanups in HasteModuleLoader, TestRunner and jest.js
  • fixed a bunch of bugs.
  • Split up HasteModuleLoader into two classes, the new class is HasteResolver which will soon receive more functionality.

Here is a breakdown of all the changes from oldest to latest:

cpojer/jest@89481fb...cpojer:5f82cc75872ae02e97fcfa5f41039f8f6d6e3686

  • The first few commits are cleanups. They remove some cruft in the test runner and inline the module execution. By not using objects and Object.keys when executing a module we get a slight speedup. Note that this code will move out of HasteModuleLoader in the future and I just consolidated this convoluted piece of code

cpojer@e1faf1b

  • This diff is inconsequential. getJestRuntimeForTest was removed in a later commit in the stack.

cpojer@620ea9e

  • This diff cleans up a bunch of minor things. Most notably I noticed that the disposed variable wasn't useful at all. I'm now using the global to check if a context should be disposed.

cpojer@f518a92

  • Slightly unrelated change but we were reporting the wrong number of total test suites (we didn't include the failed ones, this was annoying during the refactors)

cpojer/jest@9a4e8d0...cpojer:000d0d9d6c53b64e5c25b9e2018c3035c8f50a4b

  • This is the collection of diffs with all the changes that replace node-haste with node-haste2. At the end I changed all objects in the loader from {} to Object.create(null).

cpojer@422ab88

  • This diff reorganizes all the methods in HasteModuleLoader, the file is now readable. I sorted it similar to how classes are organized in node-haste, with the public interface on top. The one change I made was to not only create coverage data for direct dependents. This doesn't seem useful and isn't worth mutating the config object over. I can bring this back in the future if we need it but I think ignoring node_modules is enough. I froze the config object in the next diff: cpojer@1b06d9e which will allow further optimizations in the future.

cpojer@92bcc17

  • Just a code cleanup and moved the code to ES2015

cpojer@5fd0c88

  • I noticed jest --onlyChanged never properly worked in the jest repo itself. This fixes the command for repos where <rootDir> doesn't equal the git root. The big algorithmic change in this diff is that instead of looking at all the changed files and resolving their dependents recursively up to a test file, we are now looking at all the recursive dependencies of all the test files and intersect them with the changed files. Let's bring back the old algorithm (and support for it in node-haste) if this proves to be too slow internally at FB.

@jsdf
Copy link
Contributor

jsdf commented Nov 16, 2015

@cpojer your description of the combined preprocessor and resolver sounds right. For the webpack/browserify use case, the ideal API would be (most of which is already there):

  • The module resolver class should be instantiated once and reused between test files, to facilitate in-memory caching of work done.
  • There should be a getDependencies method which is called with a test file before it is run by the test runner. This method should return a promise of a dependency map for that test file.
  • The resulting dependency map can include preprocessed source contents. Jest would use these sources when executing modules. In this case, either the Jest preprocessor implementation wouldn't be used, or alternatively if it is used by the custom module resolver implementation then there should be a version of the transform function which would be able to take (source, filepath) as arguments rather than loading the source file from disk itself. I'm not entirely sure how useful it would be to combine Jest preprocessing with eg. webpack preprocessing, but it might be useful for people who want to write a custom resolver which only changes the way files are resolved, while using the built in Jest preprocessing implementation.

@cpojer
Copy link
Member Author

cpojer commented Dec 7, 2015

Ok we are back in business. I just rebased this branch on top of all the work that went into jest since I started on this diff. Everything up until "Use Object.create(null) in HasteResolver." was reviewed by @amasad about two weeks ago. I'll pull in the minor cleanups from the beginning of this PR into jest 0.8.1 and once I'm done with the jasmine2 conversion at FB I'll pick this one back up.

@cpojer cpojer mentioned this pull request Dec 7, 2015
@cpojer
Copy link
Member Author

cpojer commented Feb 17, 2016

I've published this PR as 0.9.0-fb1 for anyone who would like to try it out.

@SomniumDigital
Copy link

Gone from 70 tests in 26 tests suites running in ~21secs to ~9secs (after warming up)
Great work @cpojer go and get yourself a beer 👍🍻😀

@iamdustan
Copy link
Contributor

👏👏

@irvinebroque
Copy link

Working great for us, also appears to have fixed #632 👏

@irudoy
Copy link
Contributor

irudoy commented Feb 19, 2016

I'm running tests with jest.runCLI() via Gulp, and it still takes a lot of time. Also, when tests have been passed, gulp task doesn't exit and loading CPU at 80%. If I run it with npm test, all is ok. @cpojer

@cpojer
Copy link
Member Author

cpojer commented Feb 19, 2016

@irudoy can you try passing watchman: false to your jest run?

@irudoy
Copy link
Contributor

irudoy commented Feb 19, 2016

@cpojer it didn't help

@cpojer
Copy link
Member Author

cpojer commented Feb 19, 2016

If you are willing to add tests to Jest that point out this issue you are seeing with your integration or sahre a test suite with a repro, I can try to look into this.

@irudoy
Copy link
Contributor

irudoy commented Feb 19, 2016

@cpojer
Copy link
Member Author

cpojer commented Feb 19, 2016

Thanks! Note that 0.9 is not out yet and you should be safe using 0.8. Would you mind opening a separate issue on this repo with the details of this issue and what you'd expect to work that doesn't any more? :)

@irudoy
Copy link
Contributor

irudoy commented Feb 19, 2016

@cpojer okay, here it is #722 :)

@cpojer
Copy link
Member Author

cpojer commented Feb 19, 2016

Published jest@next again as 0.9.0-fb2 which should resolve the watcher issues.

ghost pushed a commit that referenced this pull request Feb 22, 2016
Summary:This fixes unmock resolution for node_modules when using npm3. The way this is solved is by checking whether the parent dependency is unmocked and both modules are within a `node_modules` folder.

This has taken a while, mainly because it required #599 to be merged. I also added some more caching that makes test runs even faster. The react-native test suite now completes in about 6s (down from 7.5-8s). There might be more in the near future ;)

This fixes #554, #730, facebook/react#5183, facebook/relay#832 and will be part of 0.9.0. I'll publish 0.9.0-fb3 today with this fix.
Closes #732

Differential Revision: D2959610

fb-gh-sync-id: c374b7a2bcdfddf768905356a08948d9156eb028
shipit-source-id: c374b7a2bcdfddf768905356a08948d9156eb028
@jrubins
Copy link

jrubins commented Mar 10, 2016

Performance improvements from v0.8.2 to v0.9.0 were SIGNIFICANT for us.

The following is output from a Jenkins build with v0.8.2:
screen shot 2016-03-10 at 5 40 18 pm

And the following is output from a Jenkins build after upgrading to v0.9.0:
screen shot 2016-03-10 at 5 40 50 pm

(Note: Coverage was turned on for both runs. The upgrade was added along with additional tests in a feature branch which is why the test suite numbers went up.)

Really appreciate the dedication to improving the performance and look forward to future improvements!

@cpojer
Copy link
Member Author

cpojer commented Mar 10, 2016

Ohhh, there is an open task to improve caching when coverage is being used. If anyone wants to work on that, it should improve test runs with coverage by 20-50 %.

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.