Suggestion: Autolink urls on man generated for DESCRIPTION. #1265
Comments
Indeed it is not parsed as md, because it is not supposed to be md. IDK if there is a good solution here. FWIW one workaround is to avoid using |
I understand the choice of not parsing as md but I think it would make sense to convert URLs and |
This would require some extra manipulation in |
Hi, as per the Checklist for CRAN submissions, I think those special urls are just |
Yes, it seems to be all. Maëlle identified the source for this feature and I cannot see anything else: ropensci/roweb3#56 (comment)
So out of curiosity, I made a small analysis of the
Full reprex
library(stringr)
library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.1.2
library(tidyr, warn.conflicts = FALSE)
#> Warning: package 'tidyr' was built under R version 4.1.2
cran <- tools::CRAN_package_db()
cran_mod <- cran %>%
  mutate(date_pack = as.Date(str_split_fixed(Packaged, " ", 2)[, 1])) %>%
  select(Package, date_pack)
extract_urls <- str_extract_all(cran$Description,
  # Regex can be improved ...
  regex("<(\\S*?)>"),
  simplify = TRUE
) %>%
  as_tibble() %>%
  bind_cols(cran_mod, .) %>%
  filter(V1 != "")
#> Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
#> Using compatibility `.name_repair`.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
paste0(
  "Number of packages with <pattern> in Description: ",
  nrow(extract_urls),
  " out of ", nrow(cran), " (",
  round(100 * nrow(extract_urls) / nrow(cran), 2),
  "%)"
)
#> [1] "Number of packages with <pattern> in Description: 8148 out of 19073 (42.72%)"
# Analyse patterns
allurls <- extract_urls %>%
  pivot_longer(
    cols = -c(Package, date_pack),
    values_to = "url"
  ) %>%
  # Remove blanks, etc.
  filter(url != "" & !is.na(url))
# Total urls enclosed by <>
nrow(allurls)
#> [1] 13531
# Split by pattern, I would use : and . for splitting
allurls <- allurls %>%
  mutate(
    split = gsub(".", ".|",
      gsub(":", ":|", url, fixed = TRUE),
      fixed = TRUE
    ),
    domain = str_split_fixed(split, "\\|", n = 2)[, 1]
  )
alldomains <- allurls %>%
  group_by(domain) %>%
  summarise(
    max_date = max(date_pack, na.rm = TRUE),
    n = n()
  ) %>%
  arrange(desc(n))
alldomains <- alldomains %>%
  mutate(
    porc = round(100 * n / sum(alldomains$n), 3),
    cumporc = cumsum(porc)
  )
head(alldomains, 10)
#> # A tibble: 10 x 5
#>    domain  max_date       n   porc cumporc
#>    <chr>   <date>     <int>  <dbl>   <dbl>
#>  1 <doi:   2022-03-30  7862 58.1      58.1
#>  2 <https: 2022-03-30  2902 21.4      79.6
#>  3 <arXiv: 2022-03-30   940  6.95     86.5
#>  4 <DOI:   2022-03-29   869  6.42     92.9
#>  5 <http:  2022-03-29   785  5.80     98.7
#>  6 <ISBN:  2022-03-15    38  0.281    99.0
#>  7 <arxiv: 2022-03-15    37  0.273    99.3
#>  8 <isbn:  2022-02-20    15  0.111    99.4
#>  9 <10.    2020-12-05    10  0.074    99.5
#> 10 <doi.   2020-07-28    10  0.074    99.5
Created on 2022-03-30 by the reprex package (v2.0.1)
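Since `<doi:`, `<https:`, `<http:` and `<arXiv:` cover most of the cases in the table above, a conversion step could be sketched roughly as follows. This is only an illustration in base R: `autolink_description()` is a hypothetical helper name, not an existing roxygen2 function, and it assumes the target is Rd markup (`\doi{}`, `\url{}`, `\href{}{}`). It does not escape characters that are special in Rd (such as `%`), and it ignores the rarer `<ISBN:` tags.

```r
# Hypothetical helper (not part of roxygen2): convert CRAN-style <...> tags
# found in the Description field into Rd link markup.
autolink_description <- function(x) {
  # <doi:10.xxxx/yyy> (any case)  ->  \doi{10.xxxx/yyy}
  x <- gsub("<doi:(\\S+?)>", "\\\\doi{\\1}", x,
    ignore.case = TRUE, perl = TRUE
  )
  # <arXiv:ID> (any case)  ->  \href{https://arxiv.org/abs/ID}{arXiv:ID}
  x <- gsub("<arxiv:(\\S+?)>",
    "\\\\href{https://arxiv.org/abs/\\1}{arXiv:\\1}", x,
    ignore.case = TRUE, perl = TRUE
  )
  # <http://...> or <https://...>  ->  \url{...}
  x <- gsub("<(https?://\\S+?)>", "\\\\url{\\1}", x, perl = TRUE)
  x
}

# Usage with a made-up Description text
desc <- "Methods from Smith (2020) <doi:10.1000/xyz>; see <https://example.org>."
cat(autolink_description(desc), "\n")
```

The three substitutions are kept separate so that each tag type can be adjusted or dropped independently. Tags that match none of the patterns are left untouched.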
Thanks for the investigation! Do you also want to do a PR? 😄
Hi,
I am producing the Rd file of my packages with the following:
And it is fine; however, I find that the text in the Description field of my DESCRIPTION is not autolinked, as it is in the rest of my documents (maybe it is not being parsed as .md?).
Would you be open to exploring this? I prepared a reprex to check how the same text is parsed differently depending on whether it is placed in a regular .R file or in the DESCRIPTION file:
Created on 2021-10-29 by the reprex package (v2.0.1)