-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent handling of Tuple and NamedTuple #204
Comments
Somewhat intentional; i.e. we have explicit code in |
This is how things currently work for non-empty vectors of respective types:
|
Yeah, the difference is we have: schema(x::AbstractVector{NamedTuple{names, types}}) where {names, types} = Schema(names, types) defined for vector of namedtuples, but no similar definition for vector of tuple. In this case, Marking this as "help wanted" if someone wanted to take a stab at implementing this, since it's not too hard and could give someone a taste of Tables.jl internals. |
Yes - this is exactly this difference. Actually this is a more general thing as schema can be inferred from collection
|
Good point. So instead of just special-casing |
This is what I assumed would be best (hopefully someone will be willing to grab it as a hacktoberfest challenge - maybe tomorrow? - I will post on #data). (if this is not resolved by someone else till DataFrames.jl 0.22 is released I can try to propose a PR) |
I can leave a few bread crumbs for anyone who wants to take a stab at this:
|
Right - I will review that constructor and make a PR (as there might be some more logic to add below also). In particular I would change EDIT: it is not that simple - I will propose what I think is OK in the PR |
Fixed in a commit to JuliaData/DataFrames.jl#2464 |
Fixes #204. For empty row interator inputs, we were just returning an empty `NamedTuple`. This has the disadvantage of not preserving an input's schema in the case like `NamedTuple{(:a, :b), Tuple{Int64, Float64}}[]`. Sometimes the input may be empty, but have a queryable schema, so we should try to preserve that. For `Tuple` rows, we generate column names like `Column$i`. I think this is fine because we're in the fallback `buildcolumns` code where we just want to return a standard "table" object, so returning a NamedTuple instead of `Tuple` of vectors seems more appropriate; i.e. it's Tables.jl job here to "build the columns", so we have latitude and control over what we return.
Ok, I went ahead and did this: #206 |
…ts (#206) * For Tables.columns fallback, attempt to preserve schema on empty inputs Fixes #204. For empty row interator inputs, we were just returning an empty `NamedTuple`. This has the disadvantage of not preserving an input's schema in the case like `NamedTuple{(:a, :b), Tuple{Int64, Float64}}[]`. Sometimes the input may be empty, but have a queryable schema, so we should try to preserve that. For `Tuple` rows, we generate column names like `Column$i`. I think this is fine because we're in the fallback `buildcolumns` code where we just want to return a standard "table" object, so returning a NamedTuple instead of `Tuple` of vectors seems more appropriate; i.e. it's Tables.jl job here to "build the columns", so we have latitude and control over what we return.
Thank you! |
Currently we have this:
Is this intended?
(I know it is a corner case but I test against it in DataFrames.jl)
The text was updated successfully, but these errors were encountered: