Unicode changed to symbol in usage section #592

Closed
brry opened this issue Feb 17, 2017 · 5 comments
Labels
bug (an unexpected problem or unintended behavior), utf8 🌏, wip (work in progress)

Comments

brry commented Feb 17, 2017

Roxygen2 6.0.1 changes Unicode escapes in function defaults.
In the usage section, rather than keeping \Uxxxx, it writes the symbol that the escape represents.
On Windows with a German locale, this causes roxygenise to stop with an encoding error when it is run repeatedly.
Here is some code to reproduce the error.

devtools::create("dummypack", list(License="GPL-2"))
devtools::check("dummypack") # no errors, warnings or notes

cat("
  #' Some function
  #' @importFrom graphics plot
  #' @param b Some label
  a <- function(b = '7\\U{00B0}C') {plot(1, main=b)}
", file="dummypack/R/a.R")

devtools::document("dummypack") # works
#Updating dummypack documentation
#Loading dummypack
#Writing NAMESPACE
#Writing a.Rd

devtools::document("dummypack")  # error!!!
#Updating dummypack documentation
#Loading dummypack
#Error in gsub("\n", "\r\n", contents, fixed = TRUE) : 
#  input string 1 is invalid UTF-8

berryFunctions::tryStack(devtools::document("dummypack"))
# devtools::document -> withr::with_envvar -> force -> roxygen2::roxygenise -> 
# unlist -> lapply -> FUN -> roclet_output -> roclet_output.roclet_rd -> 
# mapply -> write_if_different -> same_contents -> gsub -> 
# gsub("\n", "\r\n", contents, fixed = TRUE)
# [1] "Error in gsub(    [...]

unlink("dummypack/man/a.Rd")
devtools::document("dummypack") # works
devtools::document("dummypack") # error
unlink("dummypack/man/a.Rd")
devtools::check("dummypack") # fails with gsub UTF8 error as well

rd <- readLines("dummypack/man/a.Rd")
rd[7] <- "a(b = \"7\\U{00B0}C\")"
writeLines(rd[-1], "dummypack/man/a.Rd")

devtools::document("dummypack") # works
devtools::document("dummypack") # works
devtools::check("dummypack") # works fine
 
unlink("dummypack", recursive = TRUE)

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

Author

brry commented Feb 22, 2017

If I set Sys.setlocale(locale="C") at the very beginning of the code block above, document gives no warnings, but check points out mismatches, because Roxygen writes <U+00B0> instead of \U{00B0} in the Rd file; see #452.
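
To make the representations in this thread concrete, here is a minimal sketch in base R. The parser resolves `\U{00B0}` to the degree sign itself, so the string no longer contains the escape. The helper `escape_non_ascii()` is hypothetical (it is not part of roxygen2 or devtools); it only illustrates how a non-ASCII character could be turned back into the ASCII-only `\U{xxxx}` form that the usage section was expected to keep.

```r
# The escape is resolved when the code is parsed:
# the string holds the degree sign, not the text "\U{00B0}".
x <- "7\U{00B0}C"
nchar(x)      # 3
utf8ToInt(x)  # 55 176 67  (176 is 0x00B0)

# Hypothetical helper, NOT part of roxygen2: rewrite every non-ASCII
# character as a \U{xxxx} escape so the result is plain ASCII.
escape_non_ascii <- function(x) {
  x <- enc2utf8(x)
  vapply(x, function(s) {
    cp <- utf8ToInt(s)  # Unicode code points of one string
    out <- ifelse(cp > 127L,
                  sprintf("\\U{%04X}", cp),          # escape form
                  intToUtf8(cp, multiple = TRUE))    # keep ASCII as-is
    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

escaped <- escape_non_ascii(x)
cat(escaped, "\n")  # 7\U{00B0}C
```

Because the escaped form is pure ASCII, it reads the same in any locale, which is presumably why the manual edit of the Rd file in the reproduction above avoids the error.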

Contributor

jranke commented Jul 24, 2017

Just a quick note that I am also affected by this issue. Thanks for taking the time to document it.

jranke added a commit to jranke/roxygen that referenced this issue Jul 24, 2017
Contributor

jranke commented Jul 24, 2017

The problem is - on my Windows 7 box - that the .Rd file that is written out contains characters in latin1 encoding. On Linux everything works fine here.

I wrote a test to investigate the problem:

jranke@db1320b
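
The latin1 observation above can be checked at the byte level. The following is a rough sketch, independent of the linked test commit, that assumes R >= 3.3.0 (for `validUTF8()`). The function name `file_is_utf8()` and the temporary files are made up for illustration.

```r
# The same usage line, once as UTF-8 and once converted to latin1.
usage_utf8   <- enc2utf8("a(b = \"7\U{00B0}C\")")
usage_latin1 <- iconv(usage_utf8, from = "UTF-8", to = "latin1")

# The degree sign is two bytes in UTF-8 but a single byte in latin1.
charToRaw(enc2utf8("\U{00B0}"))                                  # c2 b0
charToRaw(iconv("\U{00B0}", from = "UTF-8", to = "latin1"))      # b0

# Hypothetical check: read the raw bytes of a file and test
# whether they form valid UTF-8.
file_is_utf8 <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  validUTF8(rawToChar(bytes))
}

rd_latin1 <- tempfile(fileext = ".Rd")
rd_utf8   <- tempfile(fileext = ".Rd")
writeBin(charToRaw(usage_latin1), rd_latin1)
writeBin(charToRaw(usage_utf8), rd_utf8)

file_is_utf8(rd_latin1)  # FALSE: a lone 0xb0 byte is not valid UTF-8
file_is_utf8(rd_utf8)    # TRUE
```

A lone `b0` byte is exactly the kind of input that would make a UTF-8-aware function such as `gsub()` report "input string 1 is invalid UTF-8" when the file is read back in.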

Contributor

jranke commented Jul 25, 2017

hadley added the bug label Aug 16, 2017
hadley added a commit that referenced this issue Aug 17, 2017
* Helpers read_lines and write_lines do the right thing
* readLines() and writeLines() throw errors to prevent accidental re-use in the future
* Warn if package encoding is not utf-8

Fixes #564. Fixes #592
hadley added the wip label Aug 17, 2017
hadley added a commit that referenced this issue Aug 18, 2017
hadley added a commit that referenced this issue Aug 23, 2017
hadley added a commit that referenced this issue Aug 23, 2017
* Helpers read_lines and write_lines do the right thing
* readLines() and writeLines() throw errors to prevent accidental re-use in the future
* Warn if package encoding is not utf-8

Fixes #564. Fixes #592. Thanks to @jimhester for actually figuring out how to make this work.
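
As an illustration of what UTF-8-safe helpers like the ones named in the commit message might look like, here is a sketch using only base R. The names `read_lines_utf8()` and `write_lines_utf8()` are hypothetical, and roxygen2's actual `read_lines`/`write_lines` may well be implemented differently. The idea is to always write UTF-8 bytes regardless of the session locale, and to mark strings as UTF-8 when reading them back.

```r
# Hypothetical stand-ins, NOT roxygen2's real helpers.

# Read a file and mark its lines as UTF-8, independent of the locale.
read_lines_utf8 <- function(path) {
  readLines(path, encoding = "UTF-8", warn = FALSE)
}

# Write lines as UTF-8 bytes. The binary connection avoids newline
# translation, and useBytes = TRUE stops writeLines() from re-encoding
# the text into the native encoding (e.g. Windows-1252).
write_lines_utf8 <- function(text, path) {
  con <- file(path, open = "wb")
  on.exit(close(con), add = TRUE)
  writeLines(enc2utf8(text), con, sep = "\n", useBytes = TRUE)
}

# Round trip with the usage line from this issue.
path  <- tempfile(fileext = ".Rd")
usage <- "a(b = \"7\U{00B0}C\")"
write_lines_utf8(usage, path)
back <- read_lines_utf8(path)

identical(back, usage)  # TRUE
validUTF8(back)         # TRUE
```

With helpers of this shape, the file on disk should contain the bytes `c2 b0` for the degree sign on every platform, so running the documentation step a second time reads back valid UTF-8.
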
Contributor

jranke commented Aug 24, 2017

@hadley Thanks, using unicode in default values for function arguments works nicely now!!!
