-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Motif improvements #1273
Motif improvements #1273
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #1273 +/- ##
==========================================
- Coverage 59.58% 57.92% -1.67%
==========================================
Files 170 173 +3
Lines 18064 18544 +480
==========================================
- Hits 10763 10741 -22
- Misses 7301 7803 +502
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good but I think you are missing a trick which should speed up the multi-delta computation. The results should be nested as a motif for a small delta is still a motif for large delta. If you were to compute the results across all values of delta in one loop instead of restarting from scratch for each new delta value (and ensure that the deltas are sorted), you should be able to save on a lot of comparison operations.
deltas | ||
.into_iter() | ||
.map(|delta| { | ||
let mut star_count = init_star_count(evv.degree()); | ||
star_count.execute(&events, delta); | ||
star_count.return_counts() | ||
}) | ||
.collect::<Vec<[usize; 24]>>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are redoing the count separately for all the deltas here and in other places but the results are nested as a motif for small delta is still there for large delta, right? Should be possible to use that to speed up the computation a lot as you can save many comparisons.
* sort tiebreaker for same time events * prepare for master pull * code is neater but slower * improved sorting for motifs algorithm * random shuffling for comparison * sort by time src dst version * sort by time then file line * motifs now take multiple deltas * clean unused imports * expose to python * fixed wrong type * ran formatter * killed the sort by options * fix py method signature * fmt
What changes were proposed in this pull request?
functionality for running multiple deltas on motif counts in a single run. also for choosing tie breaking mechanism from time + index and random reordering within block
Why are the changes needed?
before this, doing multiple deltas separately incurred a lot of wasted time in identifying static motifs over and over for each delta choice, this is hopefully a bit more optimised for that.
Does this PR introduce any user-facing change? If yes is this documented?
no, the original algorithm can be used before (it calls a special case of the new more general algo).
How was this patch tested?
single delta test still works.
Issues
Are there any further changes required?
Some parts seem not very optimal e.g. the sorting function is decided in the top level algorithm but then each different motif count has to do a match on the sorting type given (this matching then is done for every vertex), this was the only way I could get it working but feels like there should be a better way. Also the code feels a bit ugly. If anyone has some suggestions on short hand +/or optimisation, that would be amazing!