Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce duplication of docs #1676

Merged
merged 12 commits into from
May 23, 2024
Merged

reduce duplication of docs #1676

merged 12 commits into from
May 23, 2024

Conversation

olivroy
Copy link
Collaborator

@olivroy olivroy commented May 21, 2024

Here is a script I used to detect duplication. It is not polished at all, but thought I'd share

  • reduce documentation as much as possible in format_vec
    • important to note that 2 inheritparams can be supplied. Order is important, which is why you will see that I used for vec_fmt_bytes 1. look in fmt_bytes (bytes related text), then look in vec_fmt_number to retrieve the vec_ related help text.
  • gt and gt_preview together
  • gt_group
  • All saving related functions inherit from gtsave.

Let me know if you'd rather have some of this the other way around. I tried going in the same direction as what is called.

like vec_fmt is calling fmt, so vec_fmt docs inheriting fmt doc.

i.e. have the fmt_* inheriting the vec_fmt_*() instead.

Got tired of doing this, I may do another round of similar edits in the future.

Here is the script I created to help identify potential duplication.

library(roxygen2)
pkgload::unload("gt")
all_blocks <- 
  suppressMessages(
    roxygen2::parse_package()
)
pkgload::load_all()
library(tidyverse)

find_block_name <- function(x) {
  if (!is.null(x$object$topic)) {
    name <- x$object$topic[1]
  } else {
    name <- roxygen2::block_get_tag_value(x, "name")
  }
}

all_blocks <- set_names(all_blocks, purrr::map_chr(all_blocks, \(x) find_block_name(x)))

# params 
params_docs <- all_blocks |> 
  purrr::map(\(x) roxygen2::block_get_tags(x, "param")) |> 
  purrr::discard(\(x) identical(x, list())) |> 
  unlist(recursive = FALSE)

# catching all noRd tags
undocumented_fns <- all_blocks |> 
  purrr::map(\(x) roxygen2::block_get_tag_value(x, "noRd")) |> 
  purrr::discard(\(x) identical(x, list())) |> 
  unlist(recursive = FALSE) |> 
  names()

d_param <- purrr::map(
    params_docs,
    "val"
  ) |> 
  tibble::enframe(name = "id") |> 
    tidyr::unnest_wider(col = value) |> 
    dplyr::rename(
      param = name
    )
file_vec <- purrr::map_vec(
  params_docs,
  \(x) fs::path(normalizePath(x$file, winslash = "/"))
) |> 
  enframe(
    name = "id_1",
    value = "file"
  )
file_line_vec <- purrr::map_int(
  params_docs,
  "line"
) |> 
  tibble::enframe(
    name = "id_2",
    value = "line"
  )

## inherit params
inherit_params_docs <- all_blocks |> 
  purrr::map(\(x) roxygen2::block_get_tags(x, "inheritParams")) |> 
  purrr::discard(\(x) identical(x, list())) |> 
  unlist(recursive = FALSE)

d_inherit_param <- purrr::map_chr(
  inherit_params_docs,
  "val"
) |> 
  tibble::enframe(name = "id", value = "inherit_doc_from")




## final data for analysis.


params_df <- d_param |>
  bind_cols(
    file_line_vec,
    file_vec
  ) |> 
  select(-id_1, -id_2) |> 
  mutate(
    # strip duplicate name
    fn = stringr::str_remove(id, "\\d+$"),
    .before = "id"
  ) |> 
  left_join(
    d_inherit_param, by = c(fn = "id")
  ) |> 
  relocate(id, .after = last_col()) |> 
  filter(!fn %in% undocumented_fns)

# docs is inherited for 21 function from fmt_number
params_df |> 
  count(inherit_doc_from, sort = TRUE) |> 
  gt()

duplicate_docs_without_inherits <- params_df |> 
  filter(is.na(inherit_doc_from)) |> 
  select(-inherit_doc_from) |>
  # remove leading doc since inheritparams understands .data and data as the same thing
  mutate(param = stringr::str_remove(param, "^\\.")) |> 
  filter(
    n() > 1, # keep duplicates only for now (maybe modify regex to find approximate identical)
    .by = c(param, description)
  ) |> 
  dplyr::filter(
    # ... and .list are likely to be specific to each function 
    #!param %in% c("..", "list"),
    !fn %in% c(
      "fmt_number", # base for most functions
      "vec_fmt_number", # base for vec_ functions
      "fmt_datetime", # base for fmt_date and fmt_time
      "vec_fmt_datetime",  # base for vec_fmt_date and vec_fmt_time
      "gtsave", # base for all saving related functions
      "gt" # the base function.
    )
  )
  
# data, context, x, locale are the most common parameters that could inherit but don't
duplicate_docs_without_inherits |> 
  count(param, sort = TRUE)

duplicate_docs_without_inherits |> 
  summarise(
    n = n_distinct(param),
    .by = fn
  ) |> 
  arrange(desc(n))

duplicate_docs_without_inherits

# As a first step, I added some `@inheritParams` tag where none was there
duplicate_docs_without_inherits |> 
  arrange(param, description)


# As a second step, I am trying to see if some params could be inherited.
params_df |> 
  filter(
    n() > 1,
    .by = c(file, line)
  )


params_df |> 
  filter(
    fn == "fmt_integer"
  )

params_df |> 
  filter(
    param == "rows",
    description == reuseme::extract_cell_value(
      params_df, 
      var = description, # could be named columns
      filter = c(fn == "fmt_number" & param == "rows"), # could be named rows
      length = 1 
    )
  )

params_df |> 
  filter(
    n() > 1,
    .by = c(param, description)
  ) |> 
  arrange(description, fn, param) |> 
  select(-id, -line)

@olivroy
Copy link
Collaborator Author

olivroy commented May 21, 2024

I'd recommend looking at the resulting diff in Rd for review + logic of inheritance. We may be better off inheriting from a less chunky function for performance...

@rich-iannone
Copy link
Member

The changes I pushed here remove some explanation specific to certain functions only. The one for the currency arg is a special case where vec_fmt_currency() isn’t concerned with a gt() call.

@rich-iannone rich-iannone merged commit 456c022 into rstudio:master May 23, 2024
12 checks passed
Copy link
Member

@rich-iannone rich-iannone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants