Releases: tidyverse/readxl

readxl 1.4.5

07 Mar 17:23

This release contains no user-facing changes.
It eliminates a warning seen with gcc UBSAN.

readxl 1.4.4

28 Feb 20:09

This release contains no user-facing changes.

  • readxl embeds libxls v1.6.3, with release notes at
    https://github.com/libxls/libxls/releases/tag/v1.6.3.
    This version of libxls fixes several vulnerabilities.
  • Other small internal changes have been made to comply with CRAN requests,
    such as avoiding the use of Rf_StringTrue().

readxl 1.4.3

06 Jul 22:57

This release contains no user-facing changes.

readxl 1.4.2

09 Feb 16:30

This release contains no user-facing changes.

  • We embed a development version of libxls (https://github.com/libxls/libxls), which is based on the most recent released version, v1.6.2.
    The development version is embedded in order to ship the fix for a CVE (#679).

  • readxl no longer declares the use of C++11.

  • readxl should once again compile on Alpine Linux.

  • Other small readxl-specific patches have been made to the embedded libxls code to comply with CRAN requests, such as avoiding the use of sprintf().

readxl 1.4.1

17 Aug 15:58

Help files below man/ have been re-generated, so that they give rise to valid HTML5. (This is the impetus for this release, to keep the package safely on CRAN.)

readxl 1.4.0

28 Mar 19:36

This release is mostly about substantial internal changes that should not be
noticeable to most users (but that set the stage for future work):

  • Updating the embedded version of libxls (more below)
  • Switching from Rcpp to cpp11 (more below)
  • Refactoring to reduce duplication between the .xls and .xlsx branches

However, there are a few small features / bug fixes:

  • "Date or Not Date": The classification of number formats as being datetime-ish
    is more sophisticated and should no longer be so easily fooled by, e.g.,
    colours or currencies. This affects cell and column type guessing, hopefully
    for the better (#388, #559, @nacnudus, @reviewher).

  • Cell location is determined more robustly in .xlsx files, guarding against
    the idiosyncratic way in which certain 3rd party tools include (or, rather,
    do not include) cell location in individual cell nodes (#648, #671).

  • Warning messages for impossible dates are more specific.
    Unsupported dates prior to 1900 have their own message now, instead of being
    lumped in with dates on the non-existent day of February 29, 1900
    (#551, #554, @cderv).

Dependency and licensing changes

  • readxl is now licensed as MIT (#632).

  • readxl now states its support for R >= 3.4 explicitly.
    Why 3.4? Because the tidyverse policy
    is to support the current version, the devel version, and four previous
    versions of R.
    It was necessary to introduce a minimum R version, in order to state a minimum
    version for a package listed in LinkingTo.

  • readxl embeds libxls v1.6.2 (the previous release embedded v1.5.0).
    The libxls project is hosted at https://github.com/libxls/libxls and you can
    learn more about the cumulative changes in its release notes.

  • readxl has switched from Rcpp to cpp11 and now requires C++11 (#659,
    @sbearrows).

  • The minimum version of tibble has been bumped to 2.0.1 (released 2019-01-12),
    completing the transition to an approach to column name repair used across the
    tidyverse.

readxl 1.3.1

13 Mar 16:39

This is a pragmatic patch release that updates some tests in advance of v2.1.0 of the tibble package. That tibble release changes name repair: the standard suffix becomes ...j instead of ..j, a change partially motivated by user experience in readxl.

readxl 1.3.0

15 Feb 16:10

Dependency changes

readxl embeds libxls v1.5.0. This is the first official release of libxls in several years, although readxl has been tracking the development version in the interim. The libxls project is now officially hosted at https://github.com/libxls/libxls. In particular, libxls v1.5.0 addresses two CVEs.

readxl 1.2.0

20 Dec 02:45

Column name repair

readxl exposes the .name_repair argument that is coming to version 2.0.0 of the tibble package. The readxl default is .name_repair = "unique", in keeping with the readxl convention of ensuring that column names are neither missing nor duplicated.

  • Column Names is a new article about this feature.
  • readxl delegates name repair to tibble, therefore the installed tibble version determines how names are repaired.
  • If tibble >= v2.0.0, the full power of .name_repair is available, defaulting to .name_repair = "unique". Otherwise, the legacy function tibble::repair_names(prefix = "X", sep = "__") is used, replicating the behaviour of readxl v1.1.0.
    • Consider a spreadsheet with three columns: one unnamed and two named x.
    • Content of cells in Excel: "", x, x
    • New style column names: ..1, x..2, x..3
    • Legacy column names: X__1, x, x__1
  • Once per session, readxl emits a message stating that it works best with tibble >= v2.0.0. It is anticipated that this will become a hard minimum version requirement in a future version of readxl.
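
As a rough illustration of the .name_repair argument described above, the sketch below reads an example workbook bundled with readxl under several repair strategies. It assumes a current readxl with tibble >= 2.0.0 installed. The choice of datasets.xlsx and of toupper as a custom repair function is for demonstration only.

```r
library(readxl)

# Example workbook shipped with readxl; any .xls/.xlsx path would do here.
path <- readxl_example("datasets.xlsx")

# Default: names are made non-missing and unique.
names(read_excel(path, .name_repair = "unique"))

# "minimal" does essentially no repair; "universal" also makes names syntactic.
names(read_excel(path, .name_repair = "minimal"))
names(read_excel(path, .name_repair = "universal"))

# A function is applied to the vector of names (assumes tibble >= 2.0.0).
upper <- read_excel(path, .name_repair = toupper)
names(upper)
```

Because readxl delegates repair to tibble, the exact suffixes produced for missing or duplicated names depend on the installed tibble version.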

Other changes

  • read_excel() and friends gain a progress argument that controls a progress spinner (#243, #538).

  • read_xls() and read_xlsx() pass the trim_ws argument along (#514).

  • readxl has a new article on reading Excel files with multiple header rows (#486, #492 @apreshill).

  • xlsx files that do not have a "styles" part can now be read (#505, #506 @jt6).

  • All paths are passed through normalizePath() (#498, #499, new behaviour for xlsx but not xls) and enc2native() (#370).
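
The progress and trim_ws arguments mentioned above might be used as follows. This is a sketch against an example file bundled with readxl. Turning the spinner off is a choice that tends to suit non-interactive scripts, not a recommendation from these notes.

```r
library(readxl)

path <- readxl_example("datasets.xlsx")

# Suppress the progress spinner, e.g. in a batch job or rendered report.
quiet <- read_excel(path, progress = FALSE)

# read_xlsx() forwards trim_ws; FALSE keeps leading/trailing whitespace.
raw_ws <- read_xlsx(path, trim_ws = FALSE, progress = FALSE)
```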

Dependency changes

readxl is now tested back to R 3.1.

Embedded libxls has been updated, using the source in https://github.com/evanmiller/libxls. readxl's DESCRIPTION now records the SHA associated with the embedded libxls in a Note.

readxl 1.1.0

20 Apr 04:23
  • read_excel() and excel_sheets() associate a larger set of file extensions with xlsx and are better able to guess the format of a file with a nonstandard or missing extension. This is about deciding whether to treat a file as xls or xlsx. (#342, #411, #457)

    • excel_format() is the newly-exported format-guessing function.
    • format_from_ext() is a low-level helper, also exported, that only consults file extension. In addition to the obvious interpretation of .xls and .xlsx, the extensions .xlsm, .xltx, and .xltm are now associated with xlsx.
    • format_from_signature() is a low-level helper, also exported, that consults the file's signature (a.k.a. magic number). It's handy for files that lack an extension.
  • Embedded libxls has been updated to address security vulnerabilities identified in late 2017 (#441, #442).

  • xlsx files structured as a "minimal conformant SpreadsheetML package" can be read. The most obvious feature of such files is the lack of an xl/ directory in the unzipped form. (xlsx, #435, #437)

  • Reading an xls sheet with exactly 65,536 rows no longer enters an infinite loop. (xls, #373, #416, #432 @vkapartzianis)

  • Doubles, including datetimes, coerced to character from xls now have much higher precision, comparable to the xlsx behaviour. (xls, #430, #431)

  • Integer-y numbers larger than 2^31 are coerced properly to string. (xls, #346)

  • Shared strings are only compared to NA strings after lookup, never on the basis of their index. (xlsx, #401)

  • Better checks and messaging around nonexistent files. (#392)

  • Add $(C_VISIBILITY) to compiler flags to hide internal symbols from the dll. (#385 @jeroen)

  • Numeric data in a logical column now coerces properly to logical. (xlsx, #385 @nacnudus)
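
To make the format-guessing helpers from the first item of this release more concrete, here is a small sketch. It copies a bundled example workbook to a temporary file with no extension (an arbitrary setup chosen for illustration) and compares what each helper reports.

```r
library(readxl)

xlsx <- readxl_example("datasets.xlsx")

# Copy the workbook to a temporary file that has no extension.
no_ext <- tempfile()
file.copy(xlsx, no_ext)

format_from_ext(c(xlsx, no_ext))        # consults the extension only
format_from_signature(c(xlsx, no_ext))  # consults the file signature only
excel_format(c(xlsx, no_ext))           # high-level format guess

# read_excel() relies on the same guess, so the copy is still readable.
head(read_excel(no_ext))
```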