RowsPostProcessingError in the viewer of several datasets #602

Closed
albertvillanova opened this issue Oct 10, 2022 · 4 comments

@albertvillanova
Member

We find a RowsPostProcessingError in the viewer of several datasets:

Server error while post-processing the split rows. Please report the issue.

Error code:   RowsPostProcessingError
@albertvillanova albertvillanova added the bug Something isn't working label Oct 10, 2022
@severo severo self-assigned this Oct 10, 2022
@severo
Collaborator

severo commented Oct 10, 2022

It seems like the issue is that we don't support a Sequence of dicts adequately:

A datasets.Sequence with an internal dictionary feature will be automatically converted into a dictionary of lists. This behavior is implemented to have a compatibility layer with the TensorFlow Datasets library but may be unwanted in some cases. If you don’t want this behavior, you can use a Python list instead of the datasets.Sequence.

https://huggingface.co/docs/datasets/v2.5.2/en/package_reference/main_classes#datasets.Features
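To make the quoted behavior concrete, here is a minimal plain-Python sketch of the two shapes involved: a list of dicts (what a reader might expect from a sequence of records) and a dict of lists (what the quoted documentation says a Sequence of dicts becomes). The helper names `list_of_dicts_to_dict_of_lists` and `dict_of_lists_to_list_of_dicts` and the sample `turns` data are hypothetical, made up for illustration; this is not the code used by the datasets library or by the viewer.

```python
from typing import Any, Dict, List


def list_of_dicts_to_dict_of_lists(items: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Transpose a list of records into one list per key (hypothetical helper)."""
    keys: List[str] = []
    for item in items:
        for key in item:
            if key not in keys:
                keys.append(key)  # keep first-seen key order
    # A key missing from a record becomes None in that position.
    return {key: [item.get(key) for item in items] for key in keys}


def dict_of_lists_to_list_of_dicts(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Inverse transposition, assuming all lists have the same length."""
    if not columns:
        return []
    length = len(next(iter(columns.values())))
    return [{key: values[i] for key, values in columns.items()} for i in range(length)]


# Illustrative data only: two dialogue turns stored as a list of dicts.
turns = [
    {"speaker": "user", "text": "hi"},
    {"speaker": "system", "text": "hello"},
]
as_columns = list_of_dicts_to_dict_of_lists(turns)
round_trip = dict_of_lists_to_list_of_dicts(as_columns)
```

Code that post-processes rows has to expect the second shape, one list per key, whenever a column is declared as a Sequence of dicts.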

@severo
Collaborator

severo commented Oct 10, 2022

#603 fixed only some of the datasets, e.g. https://huggingface.co/datasets/qasper

[Screenshot, 2022-10-10 at 22:02:41]

But not https://huggingface.co/datasets/multi_woz_v22

[Screenshot, 2022-10-10 at 22:02:58]

Working on it in #605

@severo severo reopened this Oct 10, 2022
@severo
Collaborator

severo commented Oct 10, 2022

OK, #605 fixed https://huggingface.co/datasets/multi_woz_v22 as well

[Screenshot, 2022-10-10 at 22:19:35]

29 datasets are still failing with RowsPostProcessingError. This seems to have a different cause: missing fields (columns), or None values, in some rows. Created #606 to work on it.
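As a rough illustration of the failure mode described above, the sketch below shows one defensive way to post-process rows that may lack a column or contain None. The helpers `normalize_row` and `apply_if_present`, the column names, and the sample rows are all hypothetical; this is not the change made in #606, only an example of the general technique.

```python
from typing import Any, Callable, Dict, List, Optional


def normalize_row(row: Optional[Dict[str, Any]], columns: List[str]) -> Dict[str, Any]:
    """Return a row with every expected column; absent columns become None."""
    row = row or {}
    return {column: row.get(column) for column in columns}


def apply_if_present(value: Any, transform: Callable[[Any], Any]) -> Any:
    """Apply a cell transformation only when the cell holds a value."""
    return None if value is None else transform(value)


# Illustrative data: the second row lacks "text", the third has an explicit None.
columns = ["id", "text"]
rows = [{"id": 1, "text": "a"}, {"id": 2}, {"id": 3, "text": None}]

processed = [
    {
        name: apply_if_present(cell, str)
        for name, cell in normalize_row(row, columns).items()
    }
    for row in rows
]
```

Filling in absent columns first means the per-cell step only has to handle one special case, None, rather than both a missing key and a None value.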

@severo
Collaborator

severo commented Oct 10, 2022

With #603, #605 and #606, we have fixed all the RowsPostProcessingError occurrences!

@severo severo closed this as completed Oct 10, 2022
@albertvillanova albertvillanova changed the title from "RowsPostProcessingError in the viewer of serveral datasets" to "RowsPostProcessingError in the viewer of several datasets" Oct 11, 2022