Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to remove comments? #334

Closed
wlandau opened this issue Jan 30, 2018 · 13 comments
Closed

Option to remove comments? #334

wlandau opened this issue Jan 30, 2018 · 13 comments

Comments

@wlandau
Copy link

wlandau commented Jan 30, 2018

For my use case, I need to remove comments from code too. Desired behavior:

style_text(text = "x=a+1 # unstyled", ...)
## x <- a + 1
@wlandau wlandau changed the title Option to strip comments? Option to remove comments? Jan 30, 2018
@lorenzwalthert
Copy link
Collaborator

HI @wlandau, thanks for this suggestion.
This seems like not really a "non-invasive" option 😊, so we may need to think for a bit whether we want to make it an option in the tidyverse style guide. However, I think you can easily create a style guide that has this property. I created one for you here, so if you need a quick solution, you can incorporate the code below in your package (with proper name space handling, I am sure you know...).

library(styler)
tidyverse_style_with_comments_removed <- function() {
  remove_comments <- function(pd) {
    is_comment <- pd$token == "COMMENT"
    pd <- pd[!is_comment,]
    pd
  }
  
  tidyverse_with_comments_removed <- tidyverse_style()
  tidyverse_with_comments_removed$token$remove_comments <- remove_comments
  tidyverse_with_comments_removed
}

style_text(
  "a <- 3 # hi
  # there
  2
  list(
  # 
  hi
  )
  ", transformers = tidyverse_style_with_comments_removed()
)
#> a <- 3
#> 2
#> list(
#>   hi
#> )

Created on 2018-01-30 by the reprex package (v0.1.1.9000).

If you instead want comments to be replaced with the empty string, this seems a bit more complicated due to line breaks, but I think I managed to find a solution for this too, check out the penultimate commit in the aforementioned branch.
I haven't done extensive testing so please do so if you want to use this functionality in production.
I think this issue shows nicely how we can leverage stylers flexibility to create a new rule with just a few lines of additional code.

We may consider adding this feature at some point in some form to styler, I am just not sure if it should be a styling option on the same level as whether or not you want to modify just spaces or also indention since it has (potentially detrimental) effects if used without version control.

@wlandau
Copy link
Author

wlandau commented Jan 30, 2018

Thanks, @lorenzwalthert! I really appreciate your consideration, especially given that this is probably not a common use case for the tidyverse.

My use of styler would be to standardize workflow commands in the drake package. There, the purpose is not to make code more human-readable, but to map different variations of the same code to the same string. That way, if the user changes a target's command from "dataset=make_dataset()+1; munge(dataset)" to "dataset <- make_dataset() + 1; munge_dataset(dataset) # comment", drake will ignore the changes as trivial, and will not rebuild the dataset next time make() is called.

Currently, I am using formatR::tidy_source() to do this, but I am looking to phase it out in favor of styler. formatR seems older, and maybe it's more stable, but styler is more enthusiastically supported and it may make this functionality stronger.

@lorenzwalthert
Copy link
Collaborator

Ok, great. So is it ok for you to implement that style guide yourself in the drake package along the lines outlined above? styler was created with flexibility in mind, but we have not really decided on how to handle "third-party" style guides, i.e. whether we want to add them to styler or whether we want people to distribute them in their own packages. However, with respect to your use case, I am not sure if styler is all you need. Consider this

library(styler)
style_text("a \nb")
#> a
#> b
style_text("a \n\nb")
#> a
#> 
#> b

Created on 2018-01-30 by the reprex package (v0.1.1.9000).

It's the same code but it does not give the same styling because styler does not remove the line breaks here. I think what you want to check is whether, given all comments are dropped, the code evaluates to the same (i.e. the expression is the same). Hence, you probably want to parse the code and check for identity

library(styler)
(one_line_break <- style_text("a \nb"))
#> a
#> b
(two_line_breaks <- style_text("a \n\nb"))
#> a
#> 
#> b
identical(
  parse(text = one_line_break, keep.source = FALSE), 
  parse(text = two_line_breaks, keep.source = FALSE)
)
#> [1] TRUE

Created on 2018-01-30 by the reprex package (v0.1.1.9000).

Obviously this does not check for the values of a and b but that's probably not required if I understand you correctly.

@wlandau
Copy link
Author

wlandau commented Jan 30, 2018

Yes, I think if I do go with styler for drake, I will implement the style guide.

You bring up a great point about line breaks. That's one area where I still rely on formatR::tidy_source(..., blank = FALSE).

drake:::standardize_command("a\nb")
## [1] "{\n a\nb \n}"
drake:::standardize_command("a\n\nb")
## [1] "{\n a\nb \n}"

@lorenzwalthert
Copy link
Collaborator

lorenzwalthert commented Jan 30, 2018

You could even rely on parse() alone. styler will just help you with a few edge cases because it can also format tokens, so some expressions that don't parse to the same will be parsed to the same after sending them through styler:

  • a = 3 vs. a <- 3
  • if (X) {"hi}" vs if (X) "hi"
  • etc.

@lorenzwalthert
Copy link
Collaborator

lorenzwalthert commented Jan 30, 2018

As far as blank lines go, I think that might be worth a separate issue.

@lorenzwalthert
Copy link
Collaborator

Reference: #335

@wlandau
Copy link
Author

wlandau commented Jan 30, 2018

Yes, I did rely on parse() at one point. One of the reasons I originally moved to formatR::tidy_source() was to account for the edge cases you just pointed out. The other reason was a rare but egregious parsing error that I cannot recall in detail at the moment.

@lorenzwalthert
Copy link
Collaborator

Ok. We had a few parsing problems with parse(), e.g. #100 or #187.

@lorenzwalthert
Copy link
Collaborator

Removing comments is also implemented in oneliner. Maybe we can make it an option of the tidyverse style guide? @krlmlr what do you think? Should it be an option of it or is it too radical for a style guide like the tidyverse style guide to remove all comments?

@krlmlr
Copy link
Member

krlmlr commented Apr 15, 2018

Normalizing code (with removing comments) may be useful in other cases too. I don't mind if we support this, given that it seems easy to implement.

@lorenzwalthert
Copy link
Collaborator

This is a candidate for https://github.com/lorenzwalthert/styler.yours.

@lorenzwalthert
Copy link
Collaborator

Here is a style guide that does just that: remove comments: https://github.com/lorenzwalthert/styler.nocomments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants