Parsing CSV "wrong number of fields" #6

Closed
memon opened this issue Jan 11, 2022 · 11 comments · Fixed by multiprocessio/datastation#163


@memon

memon commented Jan 11, 2022

I'm trying this out with a CSV file that has empty rows, etc., and I'm getting the following error:

[image: screenshot of the error]

@eatonphil
Member

Could you share the CSV that causes this issue? Or enough of the CSV (if there are sensitive parts, delete them) so that I can reproduce it?

@memon
Author

memon commented Jan 12, 2022

@eatonphil sent via email.

@eatonphil
Member

Got it, thanks! I'll take a look.

@eatonphil
Member

Is it OK if I commit this CSV file to the repo for regression testing, or are there still sensitive values in it?

@eatonphil
Member

On looking into this further, I'm not sure that skipping blank lines makes sense, because in a single-column CSV file a blank line is actually a valid value: the empty string. See the discussion here.

I think this means there'll have to be a new checkbox in the UI that says "skip blank lines" to get what you're looking for.

P.S. This repo is for the CLI tool. https://github.com/multiprocessio/datastation is the repo for the UI you are showing in the screenshot. But it's OK, I'll keep tracking the fix in this issue.

@memon
Author

memon commented Jan 13, 2022

Agreed, skipping lines probably doesn’t make sense, plus it’s easy enough to do when needed with a code block. We would just need the data to be able to get as far as a code block. Re. the file, I’d prefer to send/commit a different one.

@eatonphil
Member

Ah OK, so this isn't about blank lines, this is about the number of columns (go figure, that's what the error says :D). There are 42 commas in the first line and 50 commas in the next line. Is it intentional that the number of columns doesn't line up?

@memon
Author

memon commented Jan 14, 2022

The CSV I was using came from a third party, so I don't have control over how they're generating it. I assume dsq is determining the number of columns based on the first row? I suppose it's a papaparse thing?

@eatonphil
Member

Sorry for the wait. This is fixed in the main branch of DataStation now, but it will be a few days before I make a release. When DataStation 0.7.0 comes out, this fix will be in!

@memon
Author

memon commented Feb 3, 2022

Thank you @eatonphil

@eatonphil
Member

This has been released. See the notes here: https://datastation.multiprocess.io/docs/0.7.0-release-notes.html
