Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync improvement addressing #787 #821

Merged
merged 38 commits into from
Feb 17, 2021
Merged

Conversation

KevinMenden
Copy link
Contributor

@KevinMenden KevinMenden commented Jan 8, 2021

PR checklist

  • This comment contains a description of changes (with reason)
  • CHANGELOG.md is updated
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

This PR implements what is discussed in issue #787

The code adjusts the nf-core syncfunction in the following way:

  • checks for any PRs from branches named nf-core-template-merge-* and closes them if found
  • updates the TEMPLATE branch and pushes it to origin (as before)
  • creates a new branch from the updated TEMPLATE branch called nf-core-template-merge-<nf-core-version>
  • makes a PR from this new branch to dev(or whatever branch was specified as the from_branch)

It does not try to update any existing PRs anymore, because essentially all PRs from any nf-core-template-merge-* branches are closed anyway.
If the new merge branch already exists, this will cause an error, so I will still have to address this.

I'm sure there are more things to add or do differently, so I'm just putting this as draft for now 👍

@KevinMenden KevinMenden added automation WIP Work in progress labels Jan 12, 2021
@codecov
Copy link

codecov bot commented Jan 12, 2021

Codecov Report

Merging #821 (f5ac071) into dev (95b9d6f) will decrease coverage by 1.82%.
The diff coverage is 15.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #821      +/-   ##
==========================================
- Coverage   75.02%   73.20%   -1.83%     
==========================================
  Files          22       22              
  Lines        2755     2784      +29     
==========================================
- Hits         2067     2038      -29     
- Misses        688      746      +58     
Impacted Files Coverage Δ
nf_core/sync.py 52.48% <15.18%> (-23.04%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95b9d6f...dc2148f. Read the comment docs.

@KevinMenden KevinMenden marked this pull request as ready for review January 12, 2021 09:33
@KevinMenden
Copy link
Contributor Author

Okay so now additionally, closed PRs get updates on their titles and bodies, to indicate why they have been closed.

Also, instead of failing when the merge branch already exists, we will create a new branch with suffix -2, or a higher number if that exists as well, as suggested by @ewels .

The pipeline creation checks don't pass because the GITHUB_AUTH_TOKEN is not found during the tests. The error is caused by:
assert os.environ.get("GITHUB_AUTH_TOKEN", "") != ""
I have the suspicion that this is not related to the new code, but just wasn't invoked in the old version, as no PR was created there.

@KevinMenden
Copy link
Contributor Author

I guess this can only be fixed by changeing the create-lint-wf.yml action and adding a GITHUB_AUTH_TOKEN environment variable.

Is there a secret that we can use for this?

@KevinMenden KevinMenden added Ready for Review and removed WIP Work in progress labels Jan 14, 2021
@ewels ewels linked an issue Jan 14, 2021 that may be closed by this pull request
@KevinMenden
Copy link
Contributor Author

Solved the issue with the token by just using the ${{ secrets.GITHUB_TOKEN }} as token.

Copy link
Member

@Zethson Zethson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very very minor nits.
LGTM

nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
.github/workflows/create-lint-wf.yml Outdated Show resolved Hide resolved
@KevinMenden
Copy link
Contributor Author

Thanks for the review @Zethson !

@KevinMenden
Copy link
Contributor Author

Alright so I fixed the issue with the assert statements. Also surrounded the function close_open_template_merge_pull_requests with a try-except block, which complains if it didn't work but doesn't throw an exception. The remaining code should still work if any open template-merge PRs can't be closed.

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass looks great! A few thoughts and comments but nothing major.

Given how terrified I am about touching this code, this is awesome work! 🚀

CHANGELOG.md Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated
Comment on lines 291 to 299
# Close existing PR
pr_title = "Important! Template update for nf-core/tools v{} Closed because outdated!".format(
nf_core.__version__
)
pr_body_text = (
"A new release of the main template in nf-core/tools has just been released. "
"This automated pull-request attempts to apply the relevant updates to this pipeline.\n\n"
"This pull-request is outdated and has been closed. A new pull-request has been created instead."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these overwrite the main title and body of the open PR? I wonder if it's better to just add a comment to the PR instead of editing this..

(Bonus points if we can do it after creating the new PR and post a link to the PR that closed it..)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 okay I'll see if how to do that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linting code used to post GitHub comments:

tools/nf_core/lint.py

Lines 1473 to 1525 in 9b3eec9

def github_comment(self):
"""
If we are running in a GitHub PR, try to post results as a comment
"""
if os.environ.get("GITHUB_TOKEN", "") == "":
log.debug("Environment variable GITHUB_TOKEN not found")
return
if os.environ.get("GITHUB_COMMENTS_URL", "") == "":
log.debug("Environment variable GITHUB_COMMENTS_URL not found")
return
try:
headers = {"Authorization": "token {}".format(os.environ["GITHUB_TOKEN"])}
# Get existing comments - GET
get_r = requests.get(url=os.environ["GITHUB_COMMENTS_URL"], headers=headers)
if get_r.status_code == 200:
# Look for an existing comment to update
update_url = False
for comment in get_r.json():
if comment["user"]["login"] == "github-actions[bot]" and comment["body"].startswith(
"\n#### `nf-core lint` overall result"
):
# Update existing comment - PATCH
log.info("Updating GitHub comment")
update_r = requests.patch(
url=comment["url"],
data=json.dumps({"body": self.get_results_md().replace("Posted", "**Updated**")}),
headers=headers,
)
return
# Create new comment - POST
if len(self.warned) > 0 or len(self.failed) > 0:
r = requests.post(
url=os.environ["GITHUB_COMMENTS_URL"],
data=json.dumps({"body": self.get_results_md()}),
headers=headers,
)
try:
r_json = json.loads(r.content)
response_pp = json.dumps(r_json, indent=4)
except:
r_json = r.content
response_pp = r.content
if r.status_code == 201:
log.info("Posted GitHub comment: {}".format(r_json["html_url"]))
log.debug(response_pp)
else:
log.warning("Could not post GitHub comment: '{}'\n{}".format(r.status_code, response_pp))
except Exception as e:
log.warning("Could not post GitHub comment: {}\n{}".format(os.environ["GITHUB_COMMENTS_URL"], e))

I removed it because the linting runs on the PR base, so didn't have sufficient authentication when the PR came from someone's fork. But here we have full auth permissions and it's coming from a branch on the same repo so it should work fine I think..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to resurrect that code and put it to some use again! 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that's nice!! thanks 👌

nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
KevinMenden and others added 2 commits February 12, 2021 13:33
Co-authored-by: Phil Ewels <phil.ewels@scilifelab.se>
nf_core/sync.py Outdated Show resolved Hide resolved
@KevinMenden
Copy link
Contributor Author

Okay I think I got it to work so that it adds a comment to the closed PR instead of changing the body/title.
Needs some more work but I gotta run. Will look into it over the weekend or on monday 👍

@KevinMenden
Copy link
Contributor Author

Okay I had to move the function for closing PRs behind the one for creating one, and now am trying to assure that we don't close the PR that we have just created.

For some reason though, it closes this PR even when explicitly not checking for PRs between this branch and the base. I double checked 5 times now but I can't find the issue ... will come back to this later.

The tests are also failing for another issue that I will track down after this. This whole API thing is just so annyoing to test :/

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more minor comments.. I'll see if I can have a stab at working on this PR a bit myself to help.

.github/workflows/create-lint-wf.yml Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
nf_core/sync.py Show resolved Hide resolved
nf_core/sync.py Outdated Show resolved Hide resolved
@ewels
Copy link
Member

ewels commented Feb 16, 2021

This whole API thing is just so annyoing to test :/

Tell me about it, this is why I hate working with this code! You can only really properly test when you actually do a release and it tries to sync everything. Then if (when) it breaks you have to wait another x months for a new release (unless it broke badly). It was even worse trying to test the GitHub Actions workflow that runs this (this code used to run on all pipelines but it was a nightmare so I split out that parallelisation into an Actions workflow so that the Python code can focus on just one repo).

This is why it has the ability to run locally as I have spent quite a lot of time manually doing syncs to solve problems caused by minor bugs. It's also why it spits out complete API return values at every opportunity (which get saved into a GitHub Actions artefact file that can be downloaded) to help with debugging, as a lot of the problems were very difficult to figure out.

At least we have nf-core/test-pipeline now I guess - is that where you have been testing this?

These tests are getting to be more effort to write than the code they're testing. Mocking up an entire functional GitHub API response for multiple calls is a little too much..
@ewels
Copy link
Member

ewels commented Feb 16, 2021

The tests are also failing for another issue that I will track down after this.

The tests were failing because they mocked up specific API call responses that were expected at the time, which are now out of date. They were getting so tricky to write and so brittle to changes that I just deleted them.

@ewels
Copy link
Member

ewels commented Feb 16, 2021

Ok, after a lot of trial and error with https://github.com/nf-core/testpipeline (apologies to anyone subscribed to that repo) I think I got it working.

I also found that the code was closing the PR that had just been opened, and I ended up refactoring it a little to just pull in all PRs from the API in one go then loop through doing the filtering in the Python code directly. I still found that it was trying to close old PRs that were already closed and ignoring newer open ones though, until I realised that it was the goddam requests_cache caching the API calls and so working on stale data 🤦🏻 Once I added a call to clear the cache before running the sync, everything was good.

On the plus side, not only is the code working now, but it has even better logging and debugging output 😆

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM to merge now 👍🏻

@KevinMenden
Copy link
Contributor Author

KevinMenden commented Feb 17, 2021

Caching 🤦 yeah that explains why it closed PRs that were already closed ....

Oh man thanks a lot for going through this I was too annoyed to touch it for a couple of days :D

Will have a proper look - really want to know now how all of this works.

Oh yeah I've also been abusing my local copy of the testpipeline. Got a lot of PRs those days I think :D

@ewels
Copy link
Member

ewels commented Feb 17, 2021

Sounds good 👍🏻 Feel free to merge when you're happy, shout if you have questions.

I wonder if we get rid of the caching and instead manually cache the conda results if that's the only thing we really wanted it for. We're already doing some manual caching for the nextflow config call which is quite slow:

tools/nf_core/utils.py

Lines 95 to 102 in 8c68650

if cache_basedir and cache_fn:
cache_path = os.path.join(cache_basedir, cache_fn)
if os.path.isfile(cache_path):
log.debug("Found a config cache, loading: {}".format(cache_path))
with open(cache_path, "r") as fh:
config = json.load(fh)
return config
log.debug("No config cache found")

But low priority for now as everything seems to be working ok, and I'm not 100% sure if there are any other web requests that are being significantly helped by this (I'm looking at you, GitHub API rate limit).

@ewels ewels added this to the 1.13 milestone Feb 17, 2021
@KevinMenden KevinMenden merged commit bdba529 into nf-core:dev Feb 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync: Make a branch for merging
3 participants