[R] write_parquet() has infinite recursion error when writing packageVersion() attributes #43748

Closed
tanho63 opened this issue Aug 19, 2024 · 1 comment

tanho63 commented Aug 19, 2024

Describe the bug, including details regarding any error messages, version, and platform.

Reprex:

x <- mtcars
attr(x, "arrow_version") <- packageVersion("arrow")
arrow::write_parquet(x, "x.parquet")
# Error: C stack usage  7974388 is too close to the limit
attr(x, "arrow_version") <- as.character(packageVersion("arrow"))
arrow::write_parquet(x, "x.parquet")
# works fine
Session Info:
R> sessioninfo::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       Ubuntu 22.04.4 LTS
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_CA.UTF-8
 ctype    en_CA.UTF-8
 tz       America/Toronto
 date     2024-08-19
 rstudio  2024.04.0+735 Chocolate Cosmos (desktop)
 pandoc   2.9.2.1 @ /usr/bin/pandoc

─ Packages ─────────────────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 arrow         17.0.0     2024-08-17 [1] RSPM
 assertthat    0.2.1      2019-03-21 [1] RSPM
 bit           4.0.5      2022-11-15 [1] RSPM
 bit64         4.0.5      2020-08-30 [1] RSPM
 cachem        1.1.0      2024-05-16 [1] RSPM
 callr         3.7.3      2022-11-02 [1] RSPM
 cli           3.6.3      2024-06-21 [1] RSPM
 crayon        1.5.2      2022-09-29 [1] RSPM
 devtools    * 2.4.5      2022-10-11 [1] RSPM
 digest        0.6.33     2023-07-07 [1] RSPM
 ellipsis      0.3.2      2021-04-29 [1] RSPM
 fastmap       1.2.0      2024-05-15 [1] RSPM
 fs            1.6.4      2024-04-25 [1] RSPM
 glue          1.7.0      2024-01-09 [1] RSPM
 htmltools     0.5.7.9000 2024-03-23 [1] Github (rstudio/htmltools@30d13a1)
 htmlwidgets   1.6.3      2023-11-22 [1] RSPM
 httpuv        1.6.12     2023-10-23 [1] RSPM
 later         1.3.1      2023-05-02 [1] RSPM
 lifecycle     1.0.4      2023-11-07 [1] RSPM
 magrittr      2.0.3      2022-03-30 [1] RSPM
 memoise       2.0.1      2021-11-26 [1] RSPM
 mime          0.12       2021-09-28 [1] RSPM
 miniUI        0.1.1.1    2018-05-18 [1] RSPM
 pkgbuild      1.4.2      2023-06-26 [1] RSPM
 pkgload       1.3.3      2023-09-22 [1] RSPM
 prettyunits   1.2.0      2023-09-24 [1] RSPM
 processx      3.8.2      2023-06-30 [1] RSPM
 profvis       0.3.8      2023-05-02 [1] RSPM
 promises      1.2.1      2023-08-10 [1] RSPM
 ps            1.7.5      2023-04-18 [1] RSPM
 purrr         1.0.2      2023-08-10 [1] RSPM
 R6            2.5.1      2021-08-19 [1] RSPM
 Rcpp          1.0.13     2024-07-17 [1] RSPM
 remotes       2.4.2.1    2023-07-18 [1] RSPM
 rlang         1.1.4      2024-06-04 [1] RSPM
 rstudioapi    0.15.0     2023-07-07 [1] RSPM
 sessioninfo   1.2.2      2021-12-06 [1] RSPM
 shiny         1.8.0      2023-11-17 [1] RSPM
 stringi       1.8.2      2023-11-23 [1] RSPM
 stringr       1.5.1      2023-11-14 [1] RSPM
 tidyselect    1.2.1      2024-03-11 [1] RSPM
 urlchecker    1.0.1      2021-11-30 [1] RSPM
 usethis     * 2.2.2      2023-07-06 [1] RSPM
 vctrs         0.6.5      2023-12-01 [1] RSPM
 xtable        1.8-4      2019-04-21 [1] RSPM

 [1] /home/tan/R/x86_64-pc-linux-gnu-library/4.3
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

A very strange error; I suspect it is related to #41969?

Component(s)

R
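
Until a fix is available, the working variant in the reprex above can be turned into a small round trip: store the version as a plain string and rebuild the version object after reading. This is only a sketch under assumptions: the attribute name `stats_version` is made up for illustration, `stats` stands in for any installed package, and the actual `arrow::write_parquet()` call is left out so the snippet runs without arrow.

```r
# Sketch of the workaround: keep the version as a character attribute.
# "stats_version" is a hypothetical attribute name, not something arrow defines.
x <- mtcars
attr(x, "stats_version") <- as.character(packageVersion("stats"))

# The attribute is now a plain character vector, not a classed list.
is.character(attr(x, "stats_version"))

# When the version object is needed again, rebuild it from the string.
restored <- package_version(attr(x, "stats_version"))
restored == packageVersion("stats")
```

Comparisons such as `restored >= "4.0.0"` should keep working on the rebuilt object, since `package_version()` returns the same class that `packageVersion()` does.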

@nealrichardson
Member

Thanks for the report. This is bizarre behavior of packageVersion: it seems to be an infinitely recursive object.

> x <- packageVersion("arrow")
> typeof(x)
[1] "list"
> lapply(x, typeof)
[[1]]
[1] "list"

> lapply(x[[1]], typeof)
[[1]]
[1] "list"

> lapply(x[[1]][[1]], typeof)
[[1]]
[1] "list"

> x[[1]]
[1] ‘17.0.0.1’
> x[[1]][[1]]
[1] ‘17.0.0.1’
> x[[1]][[1]][[1]]
[1] ‘17.0.0.1’
> x[[1]][[1]][[1]][[1]]
[1] ‘17.0.0.1’

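It's right there in the function definition: with a single index it basically does list(unclass(x)[[1]]) and then puts the original class back on, so [[ always returns another classed list and never the underlying integer vector.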

> `[[.numeric_version`
function (x, ..., exact = NA) 
{
    if (...length() < 2L) 
        structure(list(unclass(x)[[..., exact = exact]]), class = oldClass(x))
    else unclass(x)[[..1, exact = exact]][..2]
}
<bytecode: 0x1109cf1c8>
<environment: namespace:base>

I'll work around this, but this feels very wrong.
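
For comparison, here is a minimal sketch of how the same object looks once the class is dropped. It uses `package_version("17.0.0")` as a stand-in value (an assumption, so that no particular package needs to be installed) and only base R functions.

```r
# Stand-in version object; packageVersion("arrow") has the same class.
v <- package_version("17.0.0")

# Indexing through the S3 method hands back another classed list.
inner <- v[[1]]
is.list(inner)
inherits(inner, "numeric_version")

# Dropping the class exposes the plain list holding the integer components.
raw <- unclass(v)
is.integer(raw[[1]])
length(raw[[1]])
```

So the recursion only appears when code walks the object through its S3 methods; on the unclassed list, `[[` reaches an atomic vector after one step.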

@nealrichardson nealrichardson self-assigned this Aug 31, 2024
@nealrichardson nealrichardson added this to the 18.0.0 milestone Sep 12, 2024
nealrichardson added a commit that referenced this issue Sep 12, 2024
### Rationale for this change

See #43748. There is what appears to be a bug in R's
`[[.numeric_version` implementation that leads to infinite recursion.

Edit: after some digging in R source, this appears to be as designed.
And other list subclasses that have methods to make them behave like
atomic types, like `POSIXlt`, also have this.

### What changes are included in this PR?

When recursing into list objects, `unclass()` them first to get the raw
list behavior. Also apply the checking to the `attributes()` before
reapplying them.
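
The idea of unclassing before recursing can be illustrated roughly as follows. This is not the code from the PR: `count_leaves` is a hypothetical helper written for this illustration, and it only counts atomic elements rather than doing any of arrow's metadata checks.

```r
# Hypothetical recursive walker (not arrow's implementation).
# Without the unclass() call, recursing into a numeric_version would never
# terminate, because its elements are handed back as numeric_version lists.
count_leaves <- function(x) {
  if (is.list(x)) {
    x <- unclass(x)  # drop the S3 class to get raw list behavior
    return(sum(vapply(x, count_leaves, numeric(1))))
  }
  length(x)  # atomic vector: count its elements
}

count_leaves(package_version("1.2.3"))
count_leaves(list(a = 1:2, b = list(c = "x")))
```

The same pattern should apply to other list-based classes that mimic atomic types, such as `POSIXlt`, which the rationale above also mentions.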

### Are these changes tested?

yes

### Are there any user-facing changes?

Fewer bugs!

* GitHub Issue: #43748
khwilson pushed a commit to khwilson/arrow that referenced this issue Sep 14, 2024
…#43895)
