Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #751 and #882 #931

Merged
merged 2 commits into from
May 11, 2021
Merged

Fixes #751 and #882 #931

merged 2 commits into from
May 11, 2021

Conversation

lujop
Copy link
Contributor

@lujop lujop commented May 9, 2021

Thanks for contributing.

Description

Column type detection uses the same logic as before but all column cell types are used instead of only the first one.
If all are the same type it's used, and if not string is used to be able to accommodate all values.

This fixes #751 and with the combination of 74a54aa fixes also #882

Testing

Existing tests were extended to cover new behavior and use cases from #751 and #882

Column type detection uses all column cells to determine the columnType instead of only the first one.
@lujop
Copy link
Contributor Author

lujop commented May 9, 2021

The CI fails seems to be temporary ones due to network problems.
Can someone relaunch the suite?

@lwhite1
Copy link
Collaborator

lwhite1 commented May 9, 2021

i don't know anything about the Excel reader code, but why not reuse the value-based approach used for text files?

@lujop
Copy link
Contributor Author

lujop commented May 10, 2021

i don't know anything about the Excel reader code, but why not reuse the value-based approach used for text files?
I think that's not a good idea and that current implementation is better.
Excel contrary to other readers provides a type for each cell and current implementation uses it to determine column type.

IMO using the provided Excel type is the best option as is the best intent from the application and the user to define what type a cell is. That takes into account locale options of the user, formatting options, ...
And trying to obviate that info I think that will be confusing and very error-prone for the user.

There are also implementation concerns, in Excel, or at least in POI when you have a cell of type Number you can't get the cell as a String, your best intention is to try to format the number as a string using the format options that the user used
That is what it tries the appendValue function in line 338

private Column<?> appendValue(Column<?> column, Cell cell) {

But it will require a refactor and I think that won't be robust and that not in all cases we will get the same string value that the user sees.

@lwhite1
Copy link
Collaborator

lwhite1 commented May 10, 2021

i will bow to your expertise. I will merge after the conflicts are resolved.

@lwhite1 lwhite1 merged commit 4d0b9ab into jtablesaw:master May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Column type detection fails in XlsxReader
2 participants