
Cache external URI calls #58

Merged: glasnt merged 6 commits into master from demo/full_comment_cache on Jan 7, 2016
Conversation

@glasnt (Collaborator) commented on Nov 29, 2015

I've rearranged the external calls, including the pagination calls, so that I can memoise the JSON response of each API call, keyed by its unique URL.

The cache dict is stored as a local JSON file so it can be recalled on subsequent runs of the script.

Additionally, this captures all external calls, so re-runs of the script make no external calls (testable by disabling the local network after a full successful run and re-running).
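A minimal sketch of the scheme, assuming the `requests` library (the file name `cache_file.json` appears later in this thread; the helper name `get_json` is illustrative, not the PR's actual code):

```python
import json
import os

import requests

CACHE_FILE = "cache_file.json"  # local JSON store, reloaded on each run

# Load whatever a previous run left behind.
if os.path.exists(CACHE_FILE):
    with open(CACHE_FILE) as f:
        _cache = json.load(f)
else:
    _cache = {}

def get_json(url):
    """Return the JSON response for url, memoised by the URL itself."""
    if url not in _cache:
        _cache[url] = requests.get(url).json()
        with open(CACHE_FILE, "w") as f:
            json.dump(_cache, f)  # persist, so re-runs need no network
    return _cache[url]
```

Because each page of a paginated response has its own URL (`?page=2`, and so on), keying on the full URL caches the pagination calls for free.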

Thanks to @aurynn, @tveastman and @ncoghlan for the tips to get this happening :)

OMG this is an awesome bit of python that I had no idea you could use before

This creates a function that stores the output of the function `get_code_commentors` for an entire repo call. This means that repeated calls do not have to re-fetch commentors.

Results are saved to file in JSON format, so as not to introduce yet another key-value store.

This isn't as effective as it could be, because it stores the full result as a list of people. To be properly effective, results should be stored on a per-pull/issue basis, so as to allow incremental checks (e.g. 5000 tokens used against a repo with 5000+ issues means multiple runs spaced out; storing the results of previous runs builds up a store of data, so the final run completes 'instantly' rather than having to use any requests), as sketched below.
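A hedged sketch of that per-issue variant, reusing `get_json` from the sketch above (`issue_count` and the URL shape are assumptions for illustration, not the PR's actual code):

```python
def get_code_commentors_incremental(repo, issue_count):
    """Per-issue caching: each issue's comments cost at most one API
    request across *all* runs, so interrupted runs still make progress."""
    people = set()
    for n in range(1, issue_count + 1):
        url = "https://api.github.com/repos/%s/issues/%d/comments" % (repo, n)
        for comment in get_json(url):  # cache hit if any prior run fetched it
            people.add(comment["user"]["login"])
    return people
```

Once every issue URL is in the store, a final run touches the network zero times and just assembles the set.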

Thanks to @aurynn, @tveastman and @ncoghlan for their assistance :D
@glasnt (Collaborator, Author) commented on Nov 29, 2015

Resolves #49

@SvenDowideit (Contributor) commented:

I'm kinda wary - if I grok the code right, you're caching without checking whether the cache is invalid (i.e. if there are more comments etc. now).

That said, it totally depends on what the goals for this codebase, and specifically for this PR, are - if it's saving on API calls during development for now, then 👍

@glasnt (Collaborator, Author) commented on Nov 30, 2015

@SvenDowideit yeah, this is a rudimentary fix for now. Proper cache invalidation would be a Big Thing(tm) and would have to be introduced alongside #34.

'Mutating cache': running octohatrack.py with a sufficiently full cache_file.json for a repo was resulting in an ever-growing cache. This was reproducible even when the script was running on a machine with no network access.

The results for a pull's comments were increasing on subsequent runs: a result with 3 comments would grow to 6, then 9.

Fact: for a PR in GitHub, the content of its comments appears at both the pulls/ID/comments and issues/ID/comments endpoints. The counts for these will be the same.

Speculation: the results for the issues were being appended to the results for the pulls, hence an additive rather than multiplicative increase in the results.

I have no idea *how* it was mutating, but removing the `+=` of the results (looping through them once instead), and processing the results of the two memoised calls separately, means the cache no longer mutates.
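A plausible mechanism, as a minimal reproduction (illustrative only, not the PR's actual code): a memoised function hands back a reference to the cached list, so `+=` on the "returned" value extends the cached object in place.

```python
_cache = {}

def memoise(fn):
    def wrapper(key):
        if key not in _cache:
            _cache[key] = fn(key)
        return _cache[key]  # a reference to the cached list, not a copy
    return wrapper

@memoise
def get_comments(key):
    return ["a", "b", "c"]

results = get_comments("pulls/1/comments")
results += get_comments("issues/1/comments")  # same content, second endpoint
# += is list.extend, so the cached pulls entry itself now holds 6 items;
# persist it to disk and the next run grows it to 9, and so on.
print(get_comments("pulls/1/comments"))  # ['a', 'b', 'c', 'a', 'b', 'c']
```

Returning a copy (`list(_cache[key])`) from the memoiser would also avoid this; the PR instead stops `+=`-ing onto memoised results.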
@SvenDowideit (Contributor) commented:

oh go on, mergify it!

glasnt added a commit that referenced this pull request Jan 7, 2016
glasnt merged commit 4439d31 into master on Jan 7, 2016
glasnt deleted the demo/full_comment_cache branch on March 5, 2016