Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ingest: use new augur curate commands #70

Merged
merged 7 commits into from
Jul 18, 2024

Conversation

j23414
Copy link
Contributor

@j23414 j23414 commented Jul 17, 2024

Description of proposed changes

Updates the curate rule to use the new augur curate commands.
Updates ingest/vendored to pull in latest changes including nextstrain/ingest#45.

Includes additional changes to rename transform to curate to conform the pathogen-repo-guide
Includes additional change to remove reverse and institution columns since they are blank and seemingly unused.

Related issue(s)

Mirroring changes in nextstrain/pathogen-repo-guide#53
Resolves #67

Checklist

  • Checks pass
  • Confirmed only diff is the added period to <first author> et al. and the removal of the reverse and institution columns.

@j23414 j23414 linked an issue Jul 17, 2024 that may be closed by this pull request
2 tasks
@j23414 j23414 requested a review from a team July 17, 2024 15:48
Copy link
Contributor

@genehack genehack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: if you use git mv to rename files, rather than delete/re-add, I think there's better history and change tracking.

@j23414
Copy link
Contributor Author

j23414 commented Jul 17, 2024

Ohh interesting, I did use git mv in 5690058 but perhaps the subsequent changes muddied the history.

@tsibley
Copy link
Member

tsibley commented Jul 17, 2024

Tiny nit: if you use git mv to rename files, rather than delete/re-add, I think there's better history and change tracking.

git mv is exactly the same operation as rm + add. Tracking of moves/copies is done entirely by content similarity. There are config options to tweak how "hard" git tries to identify moves/copies; they're pretty conservative by default and it's nice to tune them upwards sometimes. I've also noticed that github.com's diff sometimes has worse move/copy detection than what I have configured locally.

@genehack
Copy link
Contributor

git mv is exactly the same operation as rm + add.

TIL!

'authors',
'institution'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking

FWIW, institution can be populated by adding submitter-affiliation=institution to the field_map config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looking around the code configs, I decided to add a functionally populated institution column in 644faa9

Vendored scripts for the curate rule have been ported to `augur curate`
and are available starting from Augur 25.0.0.¹

¹ <https://github.com/nextstrain/augur/releases/tag/25.0.0>
Separate out selecting metadata columns into a rule to match the pathogen repo guide
subrepo:
  subdir:   "ingest/vendored"
  merged:   "258ab8c"
upstream:
  origin:   "https://github.com/nextstrain/ingest"
  branch:   "main"
  commit:   "258ab8c"
git-subrepo:
  version:  "0.4.6"
  origin:   "https://github.com/ingydotnet/git-subrepo"
  commit:   "110b9eb"
@j23414 j23414 force-pushed the 67-ingest-use-new-augur-curate-commands branch from 9ca3015 to 644faa9 Compare July 18, 2024 15:44
@@ -8,7 +8,7 @@ ncbi_taxon_id:
general: "11250"

# Params for the transform rulegeneral
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably want to update the comment too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, good catch, will update

@j23414 j23414 merged commit 211ba44 into master Jul 18, 2024
3 checks passed
@j23414 j23414 deleted the 67-ingest-use-new-augur-curate-commands branch July 18, 2024 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ingest: use new augur curate commands
4 participants