Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submission: phylogram #212

Closed
15 of 19 tasks
shaunpwilkinson opened this issue Apr 24, 2018 · 40 comments
Closed
15 of 19 tasks

Submission: phylogram #212

shaunpwilkinson opened this issue Apr 24, 2018 · 40 comments

Comments

@shaunpwilkinson
Copy link

Summary

  • What does this package do? (explain in 50 words or less):

The phylogram R package is a tool for for developing phylogenetic trees as deeply-nested lists known as "dendrogram" objects. It provides functions for importing and exporting trees in parenthetic text format, as well as several tools for command-line tree manipulation and conversion between "dendrogram" and "phylo" class objects.

  • Paste the full DESCRIPTION file inside a code block below:
Package: phylogram
Type: Package
Title: Dendrograms for Evolutionary Analysis
Version: 2.0.1.9000
Authors@R: person("Shaun", "Wilkinson", role = c("aut", "cre"),
                     email = "shaunpwilkinson@gmail.com")
Author: Shaun Wilkinson [aut, cre]
Maintainer: Shaun Wilkinson <shaunpwilkinson@gmail.com>
Description: Contains functions for importing and exporting 'dendrogram' 
    objects in parenthetic text format, as well as several 
    functions for command-line tree manipulation.
License: GPL-3
LazyData: TRUE
URL: http://github.com/shaunpwilkinson/phylogram
BugReports: http://github.com/shaunpwilkinson/phylogram/issues
Encoding: UTF-8
Depends:
    R (>= 3.0.0),
    methods,
    stats
Imports:
    ape (>= 4.0),
Suggests:
    knitr,
    rmarkdown,
    testthat
RoxygenNote: 6.0.1
VignetteBuilder: knitr
  • URL for the package (the development repository, not a stylized html page):

https://github.com/shaunpwilkinson/phylogram

  • Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

[e.g., "data extraction, because the package parses a scientific data file format"]

Data extraction, the package contains a text parser for trees in the Newick parenthetic text format, and functions for converting between dendrogram and phylo object types.

  •   Who is the target audience and what are scientific applications of this package?  

Biologists and bioinformaticists wishing to access the many analytical and tree-visualization functions available for dendrogram objects in R.

There are other packages that can parse Newick strings to create phylo objects (e.g. ape, ggtree) but none that we are aware of that enable two-way conversion between phylo and dendrogram objects.

  •   If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
  • I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI: 10.5281/zenodo.1227670
    • (Do not submit your package separately to JOSS)
  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
    • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no gaurantee that your manuscript willl be within MEE scope.)
    • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

No errors or warnings.

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

Tal Galili (talgalili)
Zuguang Gu (jokergoo)
Michael Krabbe Borregaard (mkborregaard)
Guangchuang Yu (GuangchuangYu)

@sckott
Copy link
Contributor

sckott commented Apr 30, 2018

thanks for your submission @shaunpwilkinson - will get to editor checks tomorrow morning

@sckott
Copy link
Contributor

sckott commented May 1, 2018

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks for your submission @shaunpwilkinson !!

  • You can put a ropensci review badge in your README

[![](https://badges.ropensci.org/212_status.svg)](https://github.com/ropensci/onboarding/issues/212)

  • Goodpractice output for you and reviewers to consider, for reviewers to consider:
── GP phylogram ───────────

It is good practice towrite unit tests for all functions, and all package code in general. 82% of code lines are covered by test cases.

    R/ladder.R:33:NA
    R/ladder.R:34:NA
    R/ladder.R:35:NA
    R/ladder.R:36:NA
    R/ladder.R:37:NA
    ... and 63 more linesnot use "Depends" in DESCRIPTION, as it can cause name clashes, and poor interaction with other packages. Use "Imports" instead.avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R/prune.R:66:1
    R/prune.R:96:1
    R/read.R:74:1
    R/read.R:151:1
    R/write.R:32:1
    ... and 2 more linesavoid sapply(), it is not type safe. It might return a vector, or a list, depending on the input data. Consider using vapply() instead.

    R/ladder.R:35:21
    R/prune.R:43:21
    R/read.R:113:25
    R/read.R:114:26
    R/read.R:158:21
    ... and 3 more linesnot import packages as a whole, as this can cause name clashes between the imported packages. Instead, import only the specific functions you need.not use exportPattern in NAMESPACE. It can lead to exporting functions unintendedly. Instead, export functions that constitute the external API of
    your package.

Seeking reviewers now 🕐


Reviewers: @benjward @wcornwell
Due date: 2018-05-31

@shaunpwilkinson
Copy link
Author

Thanks very much @sckott I will get onto those good-practice edits asap! S

@sckott
Copy link
Contributor

sckott commented May 10, 2018

sorry for delay, still trying to find 2 reviewers 🕐

@sckott
Copy link
Contributor

sckott commented May 11, 2018

Reviewers assigned:

Reviewers: @benjward @wcornwell
Due date: 2018-05-31

@wcornwell
Copy link

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 3 ---

Review Comments

Thanks for this package. I didn't know anything really about the dendrogram class in R before looking into this. And it does look like a useful thing for the R-phylo-universe to have ways to go back and forth between the edge matrix style and the nested list style representation. There might be interesting applications in between taxonomy and phylogenies, and it's always good to have a full suite of tools to efficiently translate between data structures. As taxonomy moves to Non-linnaean data sets, the nested list structure might become increasingly useful.

I thought I would get to this quickly, and just do some of my normal research with this new tool and see how it goes. The tree that we have been using for the last several years is available (here)[https://datadryad.org//resource/doi:10.5061/dryad.63q27]. It's a ~30,000 tip tree for angiosperms, and typically we cut it down to a subset for whatever the task at hand is, but when I tried to load it using read.dendrogram, it still wasn't finished loading after 10 minutes. (The tree loads in less than a second using ape::read.tree). So then I did a little investigation. First I checked the size of the dendrogram objects using this code:

evolve.tree <- function(size){
  object.size(diversitree::tree.bd(pars=c(1,0),max.taxa=size))
}

evolve.dendro <- function(size){
  object.size(phylogram::as.dendrogram(diversitree::tree.bd(pars=c(1,0),max.taxa=size)))
}

x <- seq(2, 200, 10)
df <- data.frame(ape.size=purrr::map_dbl(x,evolve.tree), 
                 dendro.size=purrr::map_dbl(x,evolve.dendro))


library(ggplot2)
ggplot(df, aes(x=ape.size,y=dendro.size)) + geom_point()+theme_bw()

It turns out dendrogram objects are about 5 times as big as phylo objects, which isn't a big deal.

Then I checked read in speeds:

read_tree_dendro<-function(size){
  ape::write.tree(diversitree::tree.bd(pars=c(1,0),max.taxa=size),"twoh.tre")
  system.time({phylogram::read.dendrogram("twoh.tre")})[3]
}

read_tree_ape <- function(size){
  ape::write.tree(diversitree::tree.bd(pars=c(1,0),max.taxa=size),"twoh.tre")
  system.time({ape::read.tree("twoh.tre")})[3]
}

set.seed(42)
x <- seq(100,3000,150)
df <- data.frame(phylo_size=x,
               ape.speed=purrr::map_dbl(x, read_tree_ape),
               dendro.speed=purrr::map_dbl(x, read_tree_dendro))


ggplot(df, aes(x=ape.speed,y=dendro.speed,col=phylo_size)) + geom_point() + theme_bw()
ggplot(df, aes(x=phylo_size,y=dendro.speed,col=ape.speed)) + geom_point() + theme_bw()

So I guess there is something non-linear about the performance of read.dendrogram relative to both ape::read.tree and tree size. ape::read.tree seems to be linear and about two orders of magnitude faster (although with some mysterious variation sometimes).

I tried to read the read.dendrogram code, but I couldn't really intuit where the performance issues are. There are some magic numbers in the code, which might be one place to start. Also I have trouble reading the nested functions, but that might be because of my original training in C--I know this style is common in many languages. But for the big picture, it's worth figuring out the best way to optimise from a higher level. I afraid I'm not very knowledgable about about how R allocates memory to create deeply nested lists, but it's very likely that there are people around Ropensci who are, and who may be willing to give constructive advise on optimisation?

Phylogenies are getting bigger all the time: there was a 300,000 tip tree for plants published this week. And there are some giant bacterial trees out also. So for the functionality of this package, designing and optimising the code for at least medium sized trees is pretty crucial.

other notes:

  • I'm sure there are some use cases for comparative methods where a nested list will be more useful than an edge matrix. Be nice to have a worked example in the vignette. I would also be worth comparing the performance of ape::drop.tip with phylogram::prune.

  • Reading in problematic newick strings is kind of a nightmare because there are so many flavors of errors, but there are probably some cases where read.dendrogram returns a warning, when it really should give an informative error message rather than the current behavior which is to return a warning and a garbled tree.

library(phylogram)
newick <- "(A:0.1,B:0.2,C:0.3),D:0.4:0.5);"
x <- read.dendrogram(text = newick)
## Warning in read.dendrogram(text = newick): Possibly incomplete tree parse
x
## 'dendrogram' leaf 'D:0.4', at height 0

and also

newick <- "(A:0.1,(B:0.2,)C:0.3,D:0.4);"
x <- read.dendrogram(text = newick)
## Warning in read.dendrogram(text = newick): Possibly incomplete tree parse
x
## 'dendrogram' leaf 'A', at height 0

@wcornwell
Copy link

figures from above

unnamed-chunk-19-1

unnamed-chunk-20-1

unnamed-chunk-20-2

@sckott
Copy link
Contributor

sckott commented May 17, 2018

thanks for your review @wcornwell !

@shaunpwilkinson
Copy link
Author

Thank you very much for your review @wcornwell I'll make it a priority to optimize the parser to better handle those bigger trees and add more informative error messages for when the tree parse fails. Will aim to get these and your other suggestions seen to in the next week or two. Thanks again for the feedback!

@shaunpwilkinson
Copy link
Author

Hi @sckott, I have now made the changes suggested by @wcornwell, including a major overhaul of the read.dendrogram and as.dendrogram.phylo functions which are now much more efficient. Just wondering am I okay to push the changes to GitHub or would you prefer I leave the repo alone until the other review comes in?

@sckott
Copy link
Contributor

sckott commented May 24, 2018

@shaunpwilkinson great, thanks. Can you please include a response here to @wcornwell suggestions, it doesn't have to be exactly point by point, but a summary at least.

We don't really have rules about how changes are made with respect to the two different reviewers, but I think it's a good idea to wait for the 2nd review to come in.

@shaunpwilkinson
Copy link
Author

Thanks @sckott, will do, and no problem to hold off pushing the changes until the second review is finished.

@TransGirlCodes
Copy link

TransGirlCodes commented May 29, 2018

Package Review

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: ~3 hours.


Review Comments

This package will certainly be useful to convert the specific tree classes in
the "ape" family of packages of Paradis et al. into the R dendrogram class and back,
especially since R and other packages contain functionality that works on trees,
such as general hierarchical clustering and classification techniques, that both
statisticians and biologists alike are interested in.

Checking, installation, and testing

👍 devtools::check, devtools::spell_check, and devtools::test all return
with no notes, warnings, errors, or obvious typographical errors.
I do not find any errors that may have been missed on the author's system.

The coverage of the source code is good:

phylogram Coverage: 85.18%
R/read.R: 76.26%
R/prune.R: 79.73%
R/write.R: 94.74%
R/utils.R: 94.87%
R/asdendrogram.R: 100.00%
R/asphylo.R: 100.00%
R/ladder.R: 100.00%
Suggestion

Coverage could be improved slightly in read.R and prune.R.

In particular, it would be good to add to the test suite some cases testing that
errors are indeed thrown, when they are expected to be thrown - by giving
package functions deliberately incorrect input, and using testthat to catch
the errors with expect_error. This would be good to ensure the Newick reading
is operating as expected and rejecting input when it should, as Newick can be
a problematic format due to its flexibility.

Useage

Going through the examples in the vignettes was simple and all examples ran for me.

I won't comment on performance as that has already been addressed in previous
posts. All my examples and files are also not so large that performance is an issue.
In my experience with different tree representations, a nested data structure such as a dendrogram is less efficient for some operations cf. to a contiguous block of data due to spatial locality. The nested data structures also generally require more space.
However, a nested structure is faster and easier to code operations involving
structural edits - prunes, re-grafts and other rearrangements. So it is great
that this package will allow the user to choose the desired tree representation based on need.

Suggestions

In the vignette, I think following the instructions on manipulating trees directly i.e.:

x <- list(1, list(2, 3))
## attach "leaf" and "label" attributes to leaf nodes
attr(x[[1]], "leaf") <- TRUE
attr(x[[2]][[1]], "leaf") <- attr(x[[2]][[2]], "leaf") <- TRUE
attr(x[[1]], "label") <- "A"
attr(x[[2]][[1]], "label")
..... and so on ....

Is important to demonstrate how the trees can be built up from a vanilla R linked
data structure, but this style becomes difficult to read quite quickly for non-trivial
trees.

I think that immediately following this, the vignette could demonstrate how to
build the same tree, but by using functions that define a set of verbs
common to phylogentics (like prune, graft, drop.tips etc.), on a set of nodes.

I think that if this was added it would make users much more comfortable with manipulating trees in a much more semantic way.

In the future - and this is not limited to this package, it would be cool to see
functions such as ladderize (ape) and ladder (phylogram) standardized to a
consistent name, perhaps even defined as one function, but with multiple class methods.

Concluding

This is a nice and simple R package, which adds compatibility between a few different
classes from a set of related, but sometimes difficult to connect packages.
My suggestions are minor improvements to the package that I think would improve the submission, but it is already very useable.

Thank you for this package 😄

@sckott
Copy link
Contributor

sckott commented May 29, 2018

thanks for your review @benjward !

@shaunpwilkinson all reviews in ✅

@shaunpwilkinson
Copy link
Author

Thanks so much for your review @benjward . @sckott I'm in the process of incorporating the edits and updating the vignette now - and should have the response done either today or tomorrow. Thanks again all!

@sckott
Copy link
Contributor

sckott commented May 30, 2018

thanks!

@shaunpwilkinson
Copy link
Author

Hi @sckott, I have now pushed the changes to GitHub and will release a new version of the package on CRAN shortly. We are pleased that the reviewers found the package useful, and are very happy with the improvements made to the package and vignette. Our responses to the reviewer comments/suggestions are listed below.


@wcornwell and @benjward thank you both so much for the extremely helpful reviews and advice on improving the package. The detailed write-ups you provided were an immense help for optimizing the package functions and making vignette more user friendly. We carried out a fairly major overhaul on the package (see commit # 5b69d36), that involved writing fast(er) new functions for converting between dendrogram and phylo objects, and getting rid of the old Newick text parser that suffered from some major performance issues and problems with the non-standard parenthetic formatting semantics. We also added more unit tests and generally improved the error management throughout the package. Point by point responses to each of the specific suggestions are listed below. Once again, that you both very much for all of your effort on this so far. We are delighted with the improvements to the package, and hope that the community will find it useful.

Review 1 - @wcornwell

Comment: There is something non-linear about the performance of read.dendrogram relative to both ape::read.tree and tree size. ape::read.tree seems to be linear and about two orders of magnitude faster … for the functionality of this package, designing and optimising the code for at least medium sized trees is pretty crucial.

Response: The previous version of read.dendrogram used regular expression pattern matching to read in Newick strings and recursively create nested-list dendrogram objects. In hindsight this was was always going to be much slower than ape’s well-optimized read.tree, which calls compiled C code to read in the strings. So we reluctantly decided to jettison the parser, focusing instead on optimizing the as.dendrogram.phylo function, which read.dendrogram now wraps to convert Newick strings into dendrograms via an intermediate “phylo” object. This resulted in a significant speedup in read.dendrogram - e.g. reading in the linked vascular plant tree in around ~ 6 seconds. This is still fairly slow in comparison to ape::read.tree, and we will work on optimizing it further in the future, but I guess we will always be limited with dendrogram objects in terms of speed and memory efficiency, particularly in comparison to matrix-based trees. Nevertheless, we think this overhaul has improved the package immensely, and we appreciate both reviewers wisely pointing it out as a speed bottleneck most in need of attention.

Comment: I'm sure there are some use cases for comparative methods where a nested list will be more useful than an edge matrix. Be nice to have a worked example in the vignette.

Response: Great idea, we have now included an example in the vignette on how to convert between phylo and dendrogram objects to create tanglegrams using the dendextend package. This seems to be a commonly encountered issue (see threads here, here , and here ), so I think this constitutes a really important addition to the vignette.

Comment: It would also be worth comparing the performance of ape::drop.tip with phylogram::prune.

Response: Yes we agree, and have added a couple of sentences to the vignette as follows: Leaf nodes and internal branching nodes can be removed using the function prune, which identifies and recursively deletes nodes based on pattern matching of “label” attributes. This is slower than the ape function drop.tip, but offers the benefits of versatile string matching using regular expressions, and the ability to remove inner nodes (and by extension all of their subnodes) that feature matching “label” attributes.

Comment: Reading in problematic newick strings is kind of a nightmare because there are so many flavors of errors, but there are probably some cases where read.dendrogram returns a warning, when it really should give an informative error message rather than the current behavior which is to return a warning and a garbled tree.

Response: This was another factor in our decision to decommission the parser and instead focus on extending ape’s (much better) read.tree function to enable dendrogram capability. Given the amount of work Paradis and others have put into ape::read.tree it seemed a bit like reinventing the wheel developing and maintaining a separate Newick parser for dendrograms.

Review 2- @benjward

Comment: Coverage could be improved slightly in read.R and prune.R. In particular, it would be good to add to the test suite some cases testing that errors are indeed thrown, when they are expected to be thrown - by giving package functions deliberately incorrect input, and using testthat to catch the errors with expect_error. This would be good to ensure the Newick reading is operating as expected and rejecting input when it should, as Newick can be a problematic format due to its flexibility.

Response: We have now expanded the unit tests to cover 95% of the code (including 100% for read.R and 96% for prune.R) and have also added an expect_error for multiPhylo objects. We expect that any other text-formatting issues will now be picked up by ape::read.tree.

Comment: In the vignette, I think following the instructions on manipulating trees … is important to demonstrate how the trees can be built up from a vanilla R linked data structure, but this style becomes difficult to read quite quickly for non-trivial trees. I think that immediately following this, the vignette could demonstrate how to build the same tree, but by using functions that define a set of verbs common to phylogentics (like prune, graft, drop.tips etc.), on a set of nodes. I think that if this was added it would make users much more comfortable with manipulating trees in a much more semantic way.

Response: This is a really good point, and would certainly help users get started with some basic manipulation of dendrogram objects. We have now rearranged the vignette to demonstrate the ground-up tree building towards the end (after the basic reading, writing and conversion functions) and added a new example showing how to build and manipulate the same basic tree object with the package functions. We think this adds a valuable contribution to the tutorial.

Comment: In the future - and this is not limited to this package, it would be cool to see
functions such as ladderize (ape) and ladder (phylogram) standardized to a consistent name, perhaps even defined as one function, but with multiple class methods.

Response: Yes we agree, and did consider adding a new method to ladderize for dendrogram objects, but unfortunately there was a clash with one of the functions in the dendextend package and so we decided to keep them separate. Of course we are happy to take advice on this!

@sckott
Copy link
Contributor

sckott commented May 31, 2018

thanks @shaunpwilkinson for the responses to reviewers. @wcornwell @benjward are you happy with the responses to your comments?

@TransGirlCodes
Copy link

@sckott Absolutely 👍

@sckott
Copy link
Contributor

sckott commented Jun 8, 2018

thx @benjward

@sckott
Copy link
Contributor

sckott commented Jun 8, 2018

@wcornwell are you happy with the changes made?

@wcornwell
Copy link

sounds good! @sckott

@sckott
Copy link
Contributor

sckott commented Jun 11, 2018

@shaunpwilkinson both reviewers are on board with changes. I'll do a quick peek ...

@sckott
Copy link
Contributor

sckott commented Jun 11, 2018

@shaunpwilkinson one question:

  • looks like your NAMESPACE file is created by hand. is that correct? any reason not to create it using devtools::document() or similar? Seems less error prone that way, but maybe you have a good reason?

@shaunpwilkinson
Copy link
Author

Hi @sckott I spent ages trying to get the NAMESPACE right using roxygen tags and devtools::document but gave up in the end, since I couldn't figure out how to get it to export methods for the as.dendrogram and as.phylo generics correctly. Also I remember seeing a comment from Dirk Eddelbuettel a while back recommending against automatic generation of the NAMESPACE, so I figured I'd just write it manually. I'm happy to change it if you prefer but might need some help with the roxygen tags!

@sckott
Copy link
Contributor

sckott commented Jun 12, 2018

Nah, seems good as is.


Approved! Thanks again for your submission @shaunpwilkinson

  • Please transfer the package to ropensci- I've invited you to a team within our ropensci organization. After you accept you should be able to move it. You'll be made admin once you do.* Make sure to
  • Make sure to
    • add rOpenSci footer to README
      [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
    • Change any needed links, such those for CI badges
    • Travis CI should update to the new location automatically - you may have to update other CI systems manually
  • In a new .github folder in the root of the pkg, add a CONTRIBUTING.md file, an issue template file, and a PR template file, see https://github.com/ropensci/dotgithubfiles/ for examples, you can copy them directly and edit as you wish
  • We're starting to roll out software metadata files to all ropensci pkgs via the Codemeta initiative, see https://github.com/ropensci/codemetar/#codemetar for how to include it in your pkg, after installing the pkg - should be easy as running codemetar::write_codemeta() in the root of your pkg
  • For the MEE paper, go ahead now and submit your manuscript to MEE if you haven't already. They tell me there's an option to say the software has been reviewed by ropensci in their submission template
  • For JOSS:
    • After you transfer the repo, i'll add a Zenodo webhoook to your repo. After that, generate a new release associated with a git tag in your repo. This doesn't mean you have to submit to CRAN with that git tag/release, its just to get a Zenodo DOI - and you can put a zenodo badge in your readme, then after each git tag it will generate a new Zenodo release
    • Then submit your paper.md to JOSS at http://joss.theoj.org/papers/new

@shaunpwilkinson
Copy link
Author

That's great news - thanks so much @sckott !

I'll get to those tasks today.

Quick question - I'm planning to submit the paper to JOSS and wondering if it will be a problem if the paper.md file and package vignette are the same? I figured it didn't make much sense to write a whole separate vignette, since the paper includes all of the background and examples necessary to get started with the package. But I'm aware that some journals won't consider a manuscript if it is similar to a package vignette that is already available online (I had one knocked back from MEE for this reason recently).

@sckott
Copy link
Contributor

sckott commented Jun 13, 2018

@shaunpwilkinson good question about JOSS, i don't know. @noamross do you know answer to the above? whether JOSS cares if the paper is also somewhere else in another form?

@shaunpwilkinson
Copy link
Author

Hi @sckott I can't see an invite to join a team within the ropensci organization - any chance you could point me in the direction of where I should be looking? Perhaps I'm just being blind.

@sckott
Copy link
Contributor

sckott commented Jun 18, 2018

sorry about that, you should have gotten it now

@shaunpwilkinson
Copy link
Author

No worries, almost there, I've managed to join the organization but when I go to transfer the repo I get the error message You don’t have the permission to create repositories on ropensci

@sckott
Copy link
Contributor

sckott commented Jun 20, 2018

woops, sorry. Github recently changed how permissions work for transferring repos to organizations.

can you add me as admin to your repo - then i'll transfer to ropensci - then you can take it from there?

@shaunpwilkinson
Copy link
Author

No prob, I've just added you as a collaborator - I'm assuming that's the same thing..

@sckott
Copy link
Contributor

sckott commented Jun 20, 2018

thanks! not quite.

screen shot 2018-06-19 at 5 06 22 pm

i need to be able to get to Settings tab, so i need admin access

@shaunpwilkinson
Copy link
Author

Hmm, I don't seem to have that option, I just had a look at the GitHub help pages (https://help.github.com/articles/permission-levels-for-a-user-account-repository/#collaborator-access-on-a-repository-owned-by-a-user-account) and it looks like there are only two permission levels, owner and collaborator, so I have just transferred the ownership to you

@sckott
Copy link
Contributor

sckott commented Jun 20, 2018

thanks! and just moved to ropensci, and you have admin access on it, let me know if you don't have access

@sckott
Copy link
Contributor

sckott commented Jun 25, 2018

@shaunpwilkinson

Are you interested in doing a blog post for our blog https://ropensci.org/blog/ ? either a short-form intro to it (https://ropensci.org/technotes/) or long-form post with more narrative about its development (https://ropensci.org/blog/). If so, we'll have our community manager @stefaniebutland get in touch with you on that

@shaunpwilkinson
Copy link
Author

Hi @sckott and @stefaniebutland, yes I'd be more than happy to, I'll write up a technote asap.

Also just FYI the latest version of the package is on CRAN now, and JOSS paper is online too (http://joss.theoj.org/papers/10.21105/joss.00790)

@stefaniebutland
Copy link
Member

Glad to hear you'll contribute a technote @shaunpwilkinson. Will tweet it from @ropensci to get more eyes on your work.

Here's info on how to submit via pull request https://github.com/ropensci/roweb2#contributing-a-blog-post. Difference in YAML:

categories: technote

Since Tech Notes can be published any time, you can choose the date, but please submit about a week in advance so we can review. Happy to answer any questions that come up.

@sckott sckott closed this as completed Jun 26, 2018
@stefaniebutland
Copy link
Member

Forgot to mention, please add these tags to YAML (among others)

  • onboarding
  • review
  • r

And be sure that the tech note gives an example of a cool thing you can do with the package, rather than reiterating the README for example.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants