-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimization: cache results between TypeScript projects #182
Conversation
Towards #175 Previously, scip-typescript didn't cache anything at all between TypeScript projects. This commit implements an optimization so that we now cache the results of loading source files and parsing options. Benchmarks against the sourcegraph/sourcegraph repo indicate this optimization consistently speeds up the `index` command in all three multi-project repositories that I tested it with. - sourcegraph/sourcegraph: ~30% from ~100s to ~70s - nextautjs/next-auth: ~40% from 6.5s to 3.9 - xtermjs/xterm.js: ~45% from 7.3s to 4.1s For every repo, I additionally validated that the resulting index.scip has identical checksum before and after applying this optimization. Given these promising results, this new optimization is enabled by default, but can be disabled with the option `--no-global-cache`. *Test plan* Manually tested by running `scip-typescript index tsconfig.all.json` in the sourcegraph/sourcegraph repository. To benchmark the difference for this PR: - Checkout the code - Run `yarn tsc -b` - Go to the directory of your project - Run `node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js` - Copy the "optimized" index.scip with `cp index.scip index-withcache.scip` - Run `node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js --no-global-caches` - Validate the checksum is identical from the optimized output `shasum -a 256 *.scip`
src/ProjectIndexer.ts
Outdated
if (!hostCopy.getParsedCommandLine) { | ||
return undefined | ||
} | ||
if (cache.parsedCommandLines.has(fileName)) { | ||
return cache.parsedCommandLines.get(fileName) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand all the cases here; is it possible that the cache has a hit for this even though hostCopy.getParsedCommandLine
is missing? (when is getParsedCommandLine
missing?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding your comment, but we guard against hostCopy.getParsedCommandLine
at the top, so even if the cache has a hit we return undefined when hostCopy.getParsedCommandLine
is undefined.
src/ProjectIndexer.ts
Outdated
const fromCache = cache.sources.get(fileName) | ||
if (fromCache !== undefined) { | ||
return fromCache | ||
} | ||
const result = hostCopy.getSourceFile(fileName, languageVersion) | ||
if (result !== undefined) { | ||
cache.sources.set(fileName, result) | ||
} | ||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Maybe add a comment here or near the cache definition which says "It is safe use the file name as a key here because we assume that a source file only ever belongs to a single project, and is only ever type-checked with a single language version".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I added an additional check to additionally only cache when languageVersion
is unchanged (excluding a setExternalModuleIndicator
field that is a function that is never the same. I also added a comment explaining in more details. I also discovered that getSourceFile
has two optional parameters that we're now properly forwarding.
Very nice! |
@@ -24,7 +24,6 @@ function checkIndexParser( | |||
|
|||
// defaults | |||
checkIndexParser([], { | |||
progressBar: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this get removed? Shouldn't there be an extra globalCaches: true
here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention it in the description but this PR also disables the fancy progress bar by default. I didn't measure any performance difference with it enabled/disabled, but I found it distracting while testing projects. scip-typescript runs >99% of the time in CI where you should disable it by default anyways and I'm guessing most people are not doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review! 🙏🏻
@@ -24,7 +24,6 @@ function checkIndexParser( | |||
|
|||
// defaults | |||
checkIndexParser([], { | |||
progressBar: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention it in the description but this PR also disables the fancy progress bar by default. I didn't measure any performance difference with it enabled/disabled, but I found it distracting while testing projects. scip-typescript runs >99% of the time in CI where you should disable it by default anyways and I'm guessing most people are not doing it.
src/ProjectIndexer.ts
Outdated
if (!hostCopy.getParsedCommandLine) { | ||
return undefined | ||
} | ||
if (cache.parsedCommandLines.has(fileName)) { | ||
return cache.parsedCommandLines.get(fileName) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding your comment, but we guard against hostCopy.getParsedCommandLine
at the top, so even if the cache has a hit we return undefined when hostCopy.getParsedCommandLine
is undefined.
src/ProjectIndexer.ts
Outdated
const fromCache = cache.sources.get(fileName) | ||
if (fromCache !== undefined) { | ||
return fromCache | ||
} | ||
const result = hostCopy.getSourceFile(fileName, languageVersion) | ||
if (result !== undefined) { | ||
cache.sources.set(fileName, result) | ||
} | ||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I added an additional check to additionally only cache when languageVersion
is unchanged (excluding a setExternalModuleIndicator
field that is a function that is never the same. I also added a comment explaining in more details. I also discovered that getSourceFile
has two optional parameters that we're now properly forwarding.
Towards #175
Previously, scip-typescript didn't cache anything at all between TypeScript projects. This commit implements an optimization so that we now cache the results of loading source files and parsing options. Benchmarks against the sourcegraph/sourcegraph repo indicate this optimization consistently speeds up the
index
command in all three multi-project repositories that I tested it with.For every repo, I additionally validated that the resulting index.scip has identical checksum before and after applying this optimization. Given these promising results, this new optimization is enabled by default, but can be disabled with the option
--no-global-cache
.Test plan
Manually tested by running
scip-typescript index tsconfig.all.json
in the sourcegraph/sourcegraph repository. To benchmark the difference for this PR:yarn tsc -b
node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js
cp index.scip index-withcache.scip
node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js --no-global-caches
shasum -a 256 *.scip