-
Running this code on Windows and on Linux, I get different hash values for list_b and list_c, whereas the hash value of vector_a stays the same. I also see that global functions yield identical hashes on both platforms. I'm trying to rely on targets to produce reproducible hash values across platforms. I know this is not a typical use case, but can I be sure that, apart from lists, I will get identical hashes across platforms and across versions of the targets package?
-
Also, the hashes of global strings that contain newline characters are platform-dependent.
-
For your first example, I see the same three hashes on Mac, Linux, and Windows:

# A tibble: 3 × 2
name data
<chr> <chr>
1 list_b 82bff5f15bd637fc
2 list_c 1d33db8f6f9ff6a4
3 vector_a da54c07c3a574407

Likewise, for a string containing a newline character, I see the same hash on all three platforms:

targets::tar_dir({
targets::tar_script({
string <- "abc\n123"
list()
})
targets::tar_make()
print(targets::tar_meta(fields = c("name", "data")))
})

# A tibble: 1 × 2
name data
<chr> <chr>
1 string ceca2dd4035334bf

So I am not sure what could be causing your issue. I suggest trying to replicate the issue with the

Session info of all platforms:

Mac:

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] igraph_1.3.5 knitr_1.40 magrittr_2.0.3
[4] tidyselect_1.1.2.9000 R6_2.5.1 rlang_1.0.6
[7] fansi_1.0.3 tools_4.2.1 targets_0.13.5
[10] data.table_1.14.2 xfun_0.33 utf8_1.2.2
[13] cli_3.4.1 withr_2.5.0 base64url_1.4
[16] yaml_2.3.5 digest_0.6.29 tibble_3.1.8
[19] lifecycle_1.0.2 processx_3.7.0 callr_3.7.2
[22] vctrs_0.4.1 ps_1.7.1 codetools_0.2-18
[25] glue_1.6.2 compiler_4.2.1 pillar_1.8.1
[28] backports_1.4.1 pkgconfig_2.0.3

Linux:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS/LAPACK: <CENSORED>
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] igraph_1.3.5 knitr_1.40 magrittr_2.0.3
[4] tidyselect_1.1.2 R6_2.5.1 rlang_1.0.6
[7] fansi_1.0.3 tools_4.1.2 targets_0.13.5
[10] data.table_1.14.2 xfun_0.33 utf8_1.2.2
[13] cli_3.4.1 withr_2.5.0 ellipsis_0.3.2
[16] yaml_2.3.5 base64url_1.4 digest_0.6.29
[19] tibble_3.1.8 lifecycle_1.0.2 processx_3.7.0.9000
[22] purrr_0.3.4 callr_3.7.2 vctrs_0.4.1
[25] ps_1.7.1 codetools_0.2-18 glue_1.6.2
[28] compiler_4.1.2 pillar_1.8.1 backports_1.4.1
[31] pkgconfig_2.0.3

Windows:

> sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] igraph_1.3.4 rstudioapi_0.13 knitr_1.39 magrittr_2.0.3 tidyselect_1.1.2
[6] R6_2.5.1 rlang_1.0.4 fansi_1.0.3 tools_4.2.1 targets_0.13.5
[11] data.table_1.14.2 xfun_0.32 utf8_1.2.2 cli_3.3.0 withr_2.5.0
[16] ellipsis_0.3.2 yaml_2.3.5 base64url_1.4 digest_0.6.29 tibble_3.1.8
[21] lifecycle_1.0.1 processx_3.7.0 purrr_0.3.4 callr_3.7.1 vctrs_0.4.1
[26] ps_1.7.1 codetools_0.2-18 glue_1.6.2 compiler_4.2.1 pillar_1.8.1
[31] backports_1.4.1 pkgconfig_2.0.3
-
My result on Linux matches what you posted.

My session info:
```
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
attached base packages:
loaded via a namespace (and not attached):
```
-
I've traced the problem down to the following minimal example:

digest::getVDigest(algo = "xxhash64")(list(list(
  aa = "aaval",
  bb = "bbval",
  cc = "ccval",
  dd = "ddval"
)), serialize = TRUE, serializeVersion = 3L)

This is extracted from Windows: If I remove the parameter

My Windows session info:
```
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
attached base packages:
loaded via a namespace (and not attached):
```
-
Thanks to @wlandau, I have now narrowed the difference down to the version 3 serialization format of the R base function base::serialize:

base::serialize(list(aa = "aaval", bb = "bbval", cc = "ccval", dd = "ddval"), connection = stdout(), version = 3L)

The output of this call differs between my Windows and Linux machines. I created a topic at https://community.rstudio.com/ as @wlandau suggested.
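For anyone who wants to compare platforms themselves, here is a minimal base R sketch that inspects the serialized bytes directly. It rests on an assumption about the binary header layout that nobody in this thread has confirmed: two format bytes and three 4-byte integers, followed in version 3 by the length and name of the session's native encoding. If that holds, the version 3 bytes (and any hash computed over them) would change whenever the native encoding differs between machines, which could explain a Windows/Linux mismatch. The variable names are illustrative only.

```r
# Serialize the same object with both format versions.
# connection = NULL makes serialize() return a raw vector instead of writing it out.
x <- list(aa = "aaval", bb = "bbval", cc = "ccval", dd = "ddval")
raw_v2 <- serialize(x, connection = NULL, version = 2L)
raw_v3 <- serialize(x, connection = NULL, version = 3L)

# ASSUMED binary (XDR) header layout: 2 format bytes, then three 4-byte
# big-endian integers (format version, writer R version, minimum reader version).
# Version 3 is ASSUMED to append a 4-byte length plus the native encoding name.
enc_len <- readBin(raw_v3[15:18], what = "integer", size = 4L, endian = "big")
native_enc <- rawToChar(raw_v3[18L + seq_len(enc_len)])

# Run this on each platform and compare the printed values.
print(native_enc)                       # encoding name stored in the header
print(length(raw_v3) - length(raw_v2))  # extra bytes written by version 3
```

If the printed encoding differs between the two machines, comparing the output of the same call with version = 2L on both would be a natural next check.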