String deduplication #1269
Apart from the serialisation memory issue, which comes with its own set of problems, this LGTM.
I approve too. With large files I have seen a reduction of 15G in RAM: on a previous file set (netflow and logs on day2) I saw 55G of Raphtory RAM usage, and with these changes the graph uses 40.7G of RAM.
* eliminate a lot of arc clones
* replace String by ArcStr (wrapped Arc<str>) for cheap clone and to make it possible to support string deduplication
* test string deduplication
* implement string deduplication for property values
* clean up warnings
* expose meta data in core ops and minor cleanup
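To illustrate why an `Arc<str>`-backed string type makes deduplication possible, here is a minimal sketch of an interning pool. It is not the code from this PR: `Interner`, `intern`, and `len` are hypothetical names, and plain `Arc<str>` stands in for `ArcStr`. The idea is that cloning an `Arc<str>` only bumps a reference count, so equal strings can share one heap allocation.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interning pool (an assumption for illustration,
/// not Raphtory's actual implementation).
#[derive(Default)]
pub struct Interner {
    pool: HashSet<Arc<str>>,
}

impl Interner {
    /// Return a shared handle for `s`. A new allocation happens only
    /// the first time a given value is seen.
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        // `Arc<str>: Borrow<str>`, so the set can be probed with a `&str`.
        if let Some(existing) = self.pool.get(s) {
            // Cheap clone: increments the reference count, copies no bytes.
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        self.pool.insert(Arc::clone(&fresh));
        fresh
    }

    /// Number of distinct strings currently stored.
    pub fn len(&self) -> usize {
        self.pool.len()
    }
}
```

Interning the same value twice returns handles for which `Arc::ptr_eq` is true, which is one way a test could check that deduplication is effective.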
* take property insertion apart and put it back together again
* fix tests that were testing broken behaviour
* remove `"_id"` from properties and change stray ints to floats in python tests
* fix warnings
* String deduplication (#1269)
  * eliminate a lot of arc clones
  * replace String by ArcStr (wrapped Arc<str>) for cheap clone and to make it possible to support string deduplication
  * test string deduplication
  * implement string deduplication for property values
  * clean up warnings
  * expose meta data in core ops and minor cleanup
* fix rebase issues and clean up warnings
* dubious warning fix
* attribute does not work, warning is still there
* simplify edge addition and deletion
* No more spin-locking for adding edges (instead get locks in consistent order)
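The last item, taking locks in a consistent order instead of spin-locking, can be sketched as follows. This is an illustration under assumptions, not the PR's code: `Node`, `lock_pair`, and `add_edge` are hypothetical names, and nodes are assumed to live in a slice of `Mutex`es indexed by node id. Because every thread locks the lower index first, two threads adding `a -> b` and `b -> a` concurrently cannot deadlock.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical per-node adjacency storage (an assumption for illustration).
#[derive(Default)]
pub struct Node {
    pub out: Vec<usize>,
    pub inc: Vec<usize>,
}

/// Lock the source and destination slots, always acquiring the lower
/// index first. Returns `(src_guard, dst_guard)`; the second guard is
/// `None` for a self-loop, where only one lock is needed.
pub fn lock_pair<'a, T>(
    slots: &'a [Mutex<T>],
    src: usize,
    dst: usize,
) -> (MutexGuard<'a, T>, Option<MutexGuard<'a, T>>) {
    if src == dst {
        return (slots[src].lock().unwrap(), None);
    }
    let (lo, hi) = if src < dst { (src, dst) } else { (dst, src) };
    // Consistent global order: lower index first, then higher.
    let lo_guard = slots[lo].lock().unwrap();
    let hi_guard = slots[hi].lock().unwrap();
    if src < dst {
        (lo_guard, Some(hi_guard))
    } else {
        (hi_guard, Some(lo_guard))
    }
}

/// Record the edge `src -> dst` on both endpoints while holding both locks.
pub fn add_edge(nodes: &[Mutex<Node>], src: usize, dst: usize) {
    let (mut s, d) = lock_pair(nodes, src, dst);
    s.out.push(dst);
    match d {
        Some(mut d) => d.inc.push(src),
        // Self-loop: the source guard also covers the destination.
        None => s.inc.push(src),
    }
}
```

Compared with a try-lock-and-retry loop, ordered acquisition blocks rather than spins, so a thread waiting for a contended node does not burn CPU.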
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change? If yes is this documented?
No, the changes are completely transparent to the user-facing APIs.
How was this patch tested?
Existing tests still pass, and a test was added to check that the deduplication is effective.