Allow reading of malformed CSV #901

Merged: lwhite1 merged 2 commits into jtablesaw:master on May 6, 2021

Conversation

ChangzhenZhang
Contributor

@ChangzhenZhang ChangzhenZhang commented Apr 23, 2021

Thanks for contributing.

Description

Add an ignoreInvalidRows option to CsvReadOptions and ReadOptions to skip invalid CSV rows.

Testing

Added 2 unit tests.

@benmccann benmccann linked an issue Apr 26, 2021 that may be closed by this pull request
@benmccann benmccann changed the title from "Fix issue #396" to "Allow reading of malformed CSV" Apr 26, 2021
@ChangzhenZhang
Contributor Author

@benmccann, I just noticed that I didn't create a branch for this issue. What should I do now?

@@ -154,6 +157,10 @@ public boolean ignoreZeroDecimal() {
return ignoreZeroDecimal;
}

public boolean ignoreInvalidRows() {
Collaborator

Needs a method comment.

I'm not sure this is a good method name. A row could be invalid in many ways, but the behavior here is simply to skip rows whose length differs from the expected number of columns ("skip" is more descriptive than "ignore"). Perhaps skipRowsWithInvalidColumnCount()?

The method comment should describe how the 'valid' number of columns is determined. Is it taken from the header? What if there is no header (i.e., the noHeader() option is used)? Add a test that shows that it works as expected when there is no header.
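
For readers following the thread, the behavior under discussion can be sketched roughly as below. This is a standalone illustration, not the code in this PR or Tablesaw's implementation: the class name ColumnCountFilterSketch and the method keepRowsWithExpectedColumnCount are hypothetical, and the sketch assumes the expected column count comes from the first row (the header if there is one, otherwise the first data row), which is exactly the point the method comment is being asked to settle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical and this is not Tablesaw code.
public class ColumnCountFilterSketch {

  /**
   * Returns the rows whose cell count matches the expected column count.
   * Assumption: the expected count is taken from the first row, which is the
   * header when one is present and the first data row otherwise.
   */
  public static List<String[]> keepRowsWithExpectedColumnCount(List<String[]> rows) {
    List<String[]> kept = new ArrayList<>();
    if (rows.isEmpty()) {
      return kept;
    }
    int expected = rows.get(0).length;
    for (String[] row : rows) {
      // Rows that are too short or too long are silently skipped.
      if (row.length == expected) {
        kept.add(row);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Already-split rows stand in for parsed CSV lines.
    List<String[]> rows = Arrays.asList(
        new String[] {"name", "age"},         // header: 2 columns expected
        new String[] {"Ann", "31"},           // kept
        new String[] {"Bob"},                 // skipped: too few cells
        new String[] {"Cid", "27", "extra"},  // skipped: too many cells
        new String[] {"Dee", "45"});          // kept
    for (String[] row : keepRowsWithExpectedColumnCount(rows)) {
      System.out.println(String.join(",", row));
    }
  }
}
```

Under this assumption a headerless file behaves the same way, except that the first data row defines the expected width, so a malformed first row would cause every well-formed row after it to be skipped. That edge case is one reason a no-header test is worth having.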

@benmccann
Collaborator

I find that I didn't create branch for this issue. What should I do now?

That's okay. You can push additional changes to your master and they will show up here. When this PR is done, you can simply delete and recreate your fork as an easy way to reset your repo, and next time you can create a branch.

@ChangzhenZhang
Contributor Author

@lwhite1 @benmccann, thanks a lot! I changed the method name and added comments. I also added a test without a header. Could you please check whether anything else needs to be changed?

@lwhite1 lwhite1 merged commit 40a663c into jtablesaw:master May 6, 2021
@lwhite1 lwhite1 mentioned this pull request May 18, 2021
Successfully merging this pull request may close these issues.

Allow to read malformed CSV