Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write formula fails when formula.tools package is loaded #312

Closed
klin333 opened this issue Jan 11, 2022 · 10 comments · Fixed by #315
Closed

write formula fails when formula.tools package is loaded #312

klin333 opened this issue Jan 11, 2022 · 10 comments · Fixed by #315
Labels

Comments

@klin333
Copy link

klin333 commented Jan 11, 2022

Describe the bug
When the formula.tools package is loaded in namespace, openxlsx writing of excel sheets that include excel formula fails. Excel complains of "we found a problem with some contents of ..."

To Reproduce

library(openxlsx)

make_wb <- function() {
  df <- data.frame(
    x = 1, y = 2,
    z = "A2 + B2",
    stringsAsFactors = FALSE
  )
  
  class(df$z) <- c(class(df$z), "formula")
  
  wb <- createWorkbook()
  addWorksheet(wb, "Sheet1")
  writeData(wb, sheet = "Sheet1", x = df)
  
  wb
}

wb <- make_wb()
openxlsx::saveWorkbook(wb, file = 'works.xlsx', overwrite = TRUE)
print(sessionInfo())
# R version 3.6.3 (2020-02-29)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 19044)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
# [5] LC_TIME=English_Australia.1252    
# 
# attached base packages:
#   [1] graphics  grDevices utils     methods   base     
# 
# other attached packages:
#   [1] openxlsx_4.2.5
# 
# loaded via a namespace (and not attached):
#   [1] BiocManager_1.30.10 compiler_3.6.3      tools_3.6.3         Rcpp_1.0.4.6        stats_3.6.3         stringi_1.4.6       zip_2.1.1          
# [8] packrat_0.5.0       renv_0.13.2   

loadNamespace('formula.tools')
wb <- make_wb()
openxlsx::saveWorkbook(wb, file = 'error.xlsx', overwrite = TRUE)
print(sessionInfo())
# R version 3.6.3 (2020-02-29)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 19044)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
# [5] LC_TIME=English_Australia.1252    
# 
# attached base packages:
#   [1] graphics  grDevices utils     methods   base     
# 
# other attached packages:
#   [1] openxlsx_4.2.5
# 
# loaded via a namespace (and not attached):
#   [1] BiocManager_1.30.10  compiler_3.6.3       tools_3.6.3          Rcpp_1.0.4.6         stats_3.6.3          stringi_1.4.6       
# [7] zip_2.1.1            packrat_0.5.0        operator.tools_1.6.3 renv_0.13.2          formula.tools_1.7.1 

works.xlsx
error.xlsx

Likely cause

formula.tools has a as.character.formula function which changes the behaviour of as.character(formula_variable). It's caused me bugs in other places so maybe it's more a problem of just bad formla.tools package. To improve openxlsx to be more robust, we could explicitly use as.character.default in brittle places. However, if authors don't see the value in debugging this, might be good to notify users somehow of the existance of this problem?

@JanMarvin
Copy link
Collaborator

I'd say it is not really worth digging into this. Would you want to create a pull request? Maybe print an error if the package namespace is loaded or try to make it bullet proof that whatever we expect to be a formula behaves as such? Guess we someplace use as.character() on a formula.

# with formula.tools
> paste(df$z)
[1] "A2 + B2"
# what we currently write (and libreoffice sees as formula in the sheet)
> as.character(df$z)
[1] "structure(\"A2 + B2\", class = \"formula\")"

@jmbarbone
Copy link
Contributor

I think we could try to have a openxlsx::as.character.formula() function which just returns the value. Namespace conflicts would then be obvious when loading but internal functions, I believe, would still work fine.

Might actually be more of a problem with this package. Would probably be better to use more explicit class names like "workbookFormula" than just "formula".

jmbarbone added a commit to jmbarbone/openxlsx that referenced this issue Jan 11, 2022
@JanMarvin
Copy link
Collaborator

Like I've said, I don't really see why we should deal with the shenanigans of some other package modifying established base classes, but if there's some simple way to handle this, I'm not vetoing changes and it looks like you're already on it 😃

@klin333
Copy link
Author

klin333 commented Jan 11, 2022

actually i would really recommend against adding as.character.formula() function to this package. because as JanMarvin said, it's not good to modify established base classes. formula is a very common base class. adding a as.character.formula function would cause many hard-to-find bugs in user land.

The fact that formula.tools package had this as.character.formula function caused me so many bugs, this issue included, that i stripped out dependence on formula.tools altogether. i would really not like having this same problem with openxlsx.

i'm happy for no action to be taken. ideally openxlsx shouldn't use 'formula' class as it clashes with the established formula class in base R. if we can't be bothered changing openxlsx to use let's say workbookFormula as jmbarbone suggested, the next best thing is a warning somewhere, else no action taken. Anything is better than adding in the trouble making as.character.formula

@jmbarbone
Copy link
Contributor

jmbarbone commented Jan 11, 2022

openxlsx::as.character.formula(x) simply returns x as.character.default(x). There's no other modification done that wouldn't also be done without the method, and as.character.formula(), as far as I can tell, doesn't exist anywhere else. By exporting it will make conflicts explicit and easier to debug.

unloadNamespace("openxlsx")
form <- y ~ x1 * x2

identical(
  as.character(form),
  openxlsx:::as.character.formula(form)
)
#> [1] TRUE

Created on 2022-01-11 by the reprex package (v2.0.1)

I agree that this isn't the best solution. Better would be to replace every check/use of "formula" with a better class name.

@JanMarvin
Copy link
Collaborator

JanMarvin commented Jan 11, 2022

Can we make it simply internal? Otherwise I fear that it's a lot of noise just for one of the many thousands of packages and even more, one I haven't heard of before today? @jmbarbone Doesn't this depend on the order of package loading? Appreciate your efforts as always!
Edit: Maybe someone with more experience with the package and its implications (looking at you @klin333) should talk to the upstream package, maybe they are unaware of the implications it brings for other packages. Having written an estimator myself, I don't think I've guarded myself against such implications.

@jmbarbone
Copy link
Contributor

@JanMarvin if it's internal it won't have the warn conflicts, I believe.

formula class are language and dealt with in as.character() in internal methods. I'm not sure #315 is correct with the comparisons.

The more I look I this the more I think the PR #315 should be abandoned and we should see if using workbookFormula is feasible (#317).

@klin333
Copy link
Author

klin333 commented Jan 11, 2022

@JanMarvin, I don't think the blame can be placed on formula.tools. It intentionally tries to alter the behaviour of the generic as.character() when applied to the usual base R 'formula' class (things like y ~ x1 + x2). This is probably a deficiency of the clunky object oriented model in R. Still, i didn't like it, so i simply stopped using formula.tools as there's just no way the package will change it, it's a crucial but bad 'feature' of that package.

So the problem with openxlsx is it uses the 'formula' class, which clashes with an extremely popular base R class. If formula.tools wasn't causing a generic clash, some other package out there could. It's the same as why you should never create a package that unneccesarily override base classes such as Date or Datetime.

Support quote from https://stat.ethz.ch/R-manual/R-devel/library/methods/html/className.html "Overriding a class definition in another package with the same name deliberately is usually a bad idea. Although R attempts to keep and use the two definitions (as of version 2.14.0), ambiguities are always possible. It is more sensible to define a new class that extends an existing class but has a different name."

This case is worse because openxlsx's 'formula' has nothing to do with the base R formula class.

@github-actions
Copy link

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 14, 2023
@github-actions
Copy link

This issue was closed because it has been stalled for 7 days with no activity.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 4, 2025
# openxlsx 4.2.7.1

* It's now possible to insert a hyperlinked image by passing a URL,
  relative or absolute file path, or mailto string to the new
  `address` parameter of `insertImage()`.

# openxlsx 4.2.7

* Fixed warning on `dataValidation(..., type = "list")`
  ([#342](ycphs/openxlsx#342))

* Added optional argument to `loadWorkbook` to decide if empty/blank
  cells should be converted to NA_character_ (the default) or left
  blank as is

* `saveWorkbook()` now succeeds when called after the user has set
  column widths for a range of columns (e.g. 1:2), saved the workbook,
  then set column widths for a new range that is inclusive of the
  previous one (e.g. 1:5)
  ([#493](ycphs/openxlsx#493)).

## Improvements

* Improve detectDates
  ([#288](ycphs/openxlsx#288))

* Preserve window size and position, also `getWindowSize()` and
  `setWindowSize()`
  ([466](ycphs/openxlsx#466))

# openxlsx 4.2.6

* Fix external links
  ([#410](ycphs/openxlsx#410))

* Do not add unneccessary sheetPr node
  ([#409](ycphs/openxlsx#409))

* Add support for `namedRegion`s having dots and other special
  characters ([#338](ycphs/openxlsx#338)).

* Add type blanks and not blanks to conditional formatting
  ([#311](ycphs/openxlsx#311))

# openxlsx 4.2.5

## Fixes

* `openxlsx_setOp()` now works with named list
  ([#215](ycphs/openxlsx#215))

* `loadWorkbook()` imports `inlineStr`. Values remain `inlineStr` when
  writing the workbook with `saveWorkbook()`. Similar `read.xlsx` and
  `readWorkbook` import `inlineStr`.

* `read.xlsx()` no longer changes random seed
  ([#183](ycphs/openxlsx#183))

* fixed a regression that caused fonts to be read in incorrectly
  ([#207](ycphs/openxlsx#207))

* add option to save as read only recommended
  ([#201](ycphs/openxlsx#201))

* fixed writing hyperlink formulas
  ([#200](ycphs/openxlsx#200))

* `write.xlsx()` now throws an error if it doesn't have write
  permissions ([#190](ycphs/openxlsx#190))

* `write.xlsx()` now again uses the default of `overwrite = TRUE` for
  saving files ([#249](ycphs/openxlsx#249))

* `as.character.formula()` exported to warn about potential conflicts
  with other packages
  ([#312](ycphs/openxlsx#312),
  [#315](ycphs/openxlsx#315))

## Improvements

* `options()` are more consistently set in functions (see:
  [#289](ycphs/openxlsx#262))

* `Workbook$show()` no longer fails when called in a 0 sheet
  workbook([#240](ycphs/openxlsx#240))

* `read.xlsx()` again accepts `.xlsm` files
([#205](ycphs/openxlsx#205),
[#209](ycphs/openxlsx#209))

* `makeHyperlinkString()` does no longer require a sheet argument
  ([#57](ycphs/openxlsx#57),
  [#58](ycphs/openxlsx#58))

* improvements in how `openxlsx` creates temporary directories (see
  [#262](ycphs/openxlsx#262))

* `writeData()` calls `force(x)` to evaluate the object before options
  are set ([#264](ycphs/openxlsx#264))

* `createComment()` now correctly handles `integers` in `width` and
  `height` ([#275](ycphs/openxlsx#275))

* `setStyles()` accepts `halign="justify"`
  ([#305](ycphs/openxlsx#305))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants