apparent character encoding issue when reading csv on windows #33

Closed
jhpoelen opened this issue Sep 18, 2019 · 3 comments

Comments

@jhpoelen
Contributor

To reproduce:

library(rglobi)

desired_fields <- c("source_taxon_name", "target_taxon_name", "target_specimen_frequency_of_occurrence_percent")
otherkeys <- list("limit" = 1000, "skip" = 0)
Box2_1 <- get_interactions_in_area(bbox = c(-88.422, 23.914, -80.591, 25.910), interactiontype = c("eats"), showfield = desired_fields, returnobservations = TRUE, otherkeys = otherkeys)

On Ubuntu / Linux, no warnings or errors are seen:

> Box2_1 <- rglobi::get_interactions_in_area(bbox = c(-88.422, 23.914, -80.591, 25.910), interactiontype = c("eats"), showfield = desired_fields, returnobservations = T, otherkeys = otherkeys)
200
200
> head(Box2_1)
  source_taxon_name target_taxon_name
1        Clupanodon          Annelida
2 Galeocerdo cuvier          Annelida
3 Galeocerdo cuvier       Cheloniidae
4 Galeocerdo cuvier    Chelonia mydas
5 Galeocerdo cuvier   Caretta caretta
6 Galeocerdo cuvier           Equidae
  target_specimen_frequency_of_occurrence_percent
1                                              NA
2                                              NA
3                                              NA
4                                              NA
5                                              NA
6                                              NA

However, on Windows, warnings are observed and no data is returned.

"Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  invalid input found on input connection 'C:\Users\XXXX'
2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  incomplete final line found by readTableHeader on 'C:\Users\XXXX"

This seems consistent with the encoding issues seen on Windows, as described in https://dss.iq.harvard.edu/blog/escaping-character-encoding-hell-r-windows and in related Stack Overflow posts such as https://stackoverflow.com/questions/16838613/cannot-read-unicode-csv-into-r .
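
For illustration only, here is a minimal sketch of declaring the encoding when reading a CSV with base R instead of relying on the platform default. The file, the column name `name`, and the value in it are made up for this example, and nothing in this thread establishes that this is how rglobi reads its data or that it would have resolved the warning above.

```r
# Hypothetical data: a one-column CSV whose value ends in "é" stored as
# UTF-8 (bytes c3 a9). Written as raw bytes so the content is the same
# on every platform.
csv_bytes <- c(charToRaw("name\nCaf"), as.raw(c(0xc3, 0xa9)), charToRaw("\n"))
tmp <- tempfile(fileext = ".csv")
writeBin(csv_bytes, tmp)

# encoding = "UTF-8" tells read.csv to treat the input strings as UTF-8
# rather than assuming the native encoding (often a Windows code page).
df <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)
print(df)
```

`read.csv` also accepts `fileEncoding`, which re-encodes the file while reading it; which of the two is appropriate depends on how the data reaches R.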

@jhpoelen
Contributor Author

Some suggest using the readr package (https://github.com/tidyverse/readr), which is said to handle these encoding issues.
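
As a rough sketch of what that suggestion could look like, the snippet below reads a small UTF-8 file with `readr::read_csv` and an explicit locale. The file contents and the column name `name` are hypothetical, the readr package must be installed for the read to run, and this is not confirmed to be the change that went into rglobi.

```r
# Hypothetical data: a one-column CSV whose value ends in "é" stored as UTF-8.
csv_bytes <- c(charToRaw("name\nCaf"), as.raw(c(0xc3, 0xa9)), charToRaw("\n"))
tmp <- tempfile(fileext = ".csv")
writeBin(csv_bytes, tmp)

interactions <- NULL
if (requireNamespace("readr", quietly = TRUE)) {
  # locale(encoding = ...) states the file encoding explicitly;
  # col_types is given so readr does not have to guess column types.
  interactions <- readr::read_csv(
    tmp,
    locale = readr::locale(encoding = "UTF-8"),
    col_types = readr::cols(.default = readr::col_character())
  )
  print(interactions)
}
```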

jhpoelen pushed a commit to globalbioticinteractions/globalbioticinteractions that referenced this issue Sep 18, 2019
@jhpoelen
Contributor Author

Please note that warning 2 (In read.table(file = file, header = header, sep = sep, quote = quote, : incomplete final line found by readTableHeader on 'C:\Users\XXXX") was caused by the GloBI API not ending the last line of data with a newline. This has been resolved in the GloBI API in globalbioticinteractions/globalbioticinteractions@3a089c9.
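
The missing-newline behaviour can be seen locally without the API. The sketch below writes the same hypothetical two-line CSV twice, once without and once with a trailing newline, and collects any warnings `read.csv` raises; the helper `read_collecting_warnings` is made up for this example.

```r
# Hypothetical helper: read a CSV and return the data plus any warning messages.
read_collecting_warnings <- function(path) {
  warnings <- character(0)
  value <- withCallingHandlers(
    read.csv(path, stringsAsFactors = FALSE),
    warning = function(w) {
      warnings <<- c(warnings, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = warnings)
}

no_newline <- tempfile(fileext = ".csv")
with_newline <- tempfile(fileext = ".csv")
cat("a,b\n1,2", file = no_newline)      # last line is not terminated
cat("a,b\n1,2\n", file = with_newline)  # last line ends with a newline

without_nl <- read_collecting_warnings(no_newline)
with_nl <- read_collecting_warnings(with_newline)

print(without_nl$warnings)  # should mention an incomplete final line
print(with_nl$warnings)     # should be empty
```

In both cases the data itself is read; the difference is only the warning, which is why it was separable from the encoding problem.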

This leaves warning 1: In read.table(file = file, header = header, sep = sep, quote = quote, : invalid input found on input connection 'C:\Users\XXXX'.

@jhpoelen
Contributor Author

Confirmed fix on Windows. Included in https://github.com/ropensci/rglobi/releases/tag/v0.2.20 .
