-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory efficiency for used names collection and other Relations building #764
Conversation
def toImmutable[K, V]( | ||
map: java.util.HashMap[K, (K, mutable.Builder[V, immutable.HashSet[V]])] | ||
): immutable.HashMap[K, immutable.HashSet[V]] = { | ||
val builder = immutable.HashMap.newBuilder[K, immutable.HashSet[V]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scala/scala#8726 is the 2.12.x backport of efficient immutable.HashMap$Builder
(thanks, @mkeskells!). With that, or with 2.13, this step is step is quite efficient.
I considered an alternative of just doing an unsafeToImmutableMap
wrapper, but that is a somewhat more fragile if the users of the built Relation
start to call map operations on it.
It would be great if we had a richer immutable.HashMap.Builder
that supported key getOrElseUpdate
on the map being built, as well as a find transformValues
step. Then, we could build up the desired structure from the begining and do a second pass that just mutates the data in the leaves.
internal/zinc-core/src/main/scala/sbt/internal/inc/IncrementalCompile.scala
Outdated
Show resolved
Hide resolved
21f554c
to
130309b
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A light-touch/non-paranoid LGTM.
internal/zinc-core/src/main/scala/sbt/internal/inc/IncrementalCompile.scala
Outdated
Show resolved
Hide resolved
e47b500
to
07b336f
Compare
This comment has been minimized.
This comment has been minimized.
Note to self: the benchmarks are unlikely to show much benefit on 2.12.11. Run on 2.12.x-SNAPSHOT instead. In that build:
|
d44f509
to
3715290
Compare
6830263
to
f7e1812
Compare
666b3ea
to
31d3263
Compare
- Build the maps in the used name relation in a mutable map, then use a HashMap builder in a second pass to build the resulting maps. In Scala 2.12.12+ or 2.13.0+, use of the builder is fastest as the builder mutates the structure internally. - Intern the keys and values, so we don't have a separate instance of `UsedName("toString", EnumSet.of(UseScope.Default)` for each call site. - Factor this into RelationBuilder, and use this in ProtobufReaders where I recently made similar opts. ``` [info] All tests passed. [info] Passed: Total 17, Failed 0, Errors 0, Passed 17 [success] Total time: 149 s (02:29), completed 28/04/2020 3:06:32 PM sbt:zinc Root> ```
31d3263
to
8d49d9c
Compare
@retronym Hey! :) I'm concerned that with this change we may experience concurrency hazards when compiling with Hydra (such as visibility issues), as the underlying |
Yep, that's totally fine with me. Looking at this again, I see I did synchronizes writes to the |
You might want a different performance tradeoff (preferring less potential for contention and worrying less about allocation pressure). In that case you could create a second implementation of |
Yep, that's it. Problem is that swapping the As the changes in this PR are motivated by (allocation) performance, what's the order of magnitude of the savings obtained using the current implementation? (I suspect you have been seeing a meaningful difference in large code bases, haven't you? If so, could you share some figures) By the way, would you have a benchmark that we can use to see if we can tweak the implementation in a way that would make it work with Hydra too, and at no meaningful penalty for vanilla Scala? |
@retronym ping :) |
@dotta We've now got a way to avoid building relations (bi-directional maps) for used names. So we can go back to the lock-free data structures: https://github.com/sbt/zinc/pull/995/files#r677157060 |
then use a HashMap builder in a second pass to build the
resulting maps. In Scala 2.12.12+ or 2.13.0+, use of the
builder is fastest as the builder mutates the structure
internally.
instance of
UsedName("toString", EnumSet.of(UseScope.Default))
for each call site.
RelationBuilder,
and use this inProtobufReaders
where I recently made similar opts.