Keeping names in *cat #74
Comments
In NamedArrays the name->index lookup happens through a dictionary, so names must be unique. Indeed, names are always reset in the concatenating direction. This would not always be necessary. Come to think of it, the resetting of names probably has an impact on type stability (I would not know right now whether that would be positive or negative). The major hurdle in generating better names along the concatenating direction is coming up with a unique naming scheme that would work for any key type. And then there is type stability: the resulting keytype should be computable by the compiler. Writing type-stable code is beyond my capabilities. But perhaps, for keys of type String and Symbol, we can make specialized versions that do try to combine the keys in a sensible way. |
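To make the String/Symbol idea above concrete, here is a minimal sketch that uses only Base Julia. The function name `combine_keys` and the `_2`, `_3` suffix scheme are assumptions made for illustration; neither is part of NamedArrays. Because both inputs share the key type `T` and the output is a `Vector{T}`, the resulting key type is known from the argument types alone.

```julia
# Hypothetical helper, not part of NamedArrays: combine two key vectors of the
# same String or Symbol type, adding a numeric suffix to keys that clash.
function combine_keys(a::AbstractVector{T}, b::AbstractVector{T}) where {T<:Union{String,Symbol}}
    seen = Set{T}()
    out = T[]
    for k in Iterators.flatten((a, b))
        candidate = k
        n = 1
        # Keep bumping the suffix until the key is unique.
        while candidate in seen
            n += 1
            candidate = T(string(k, "_", n))
        end
        push!(seen, candidate)
        push!(out, candidate)
    end
    return out
end

combine_keys(["a", "b"], ["b", "c"])  # ["a", "b", "b_2", "c"]
combine_keys([:x], [:x, :x])          # [:x, :x_2, :x_3]
```

This only covers the case where both arrays use the same key type; mixed or arbitrary key types would still need the general naming scheme discussed above.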
I could try to make some test-implementation focusing on the type-stability problem, but my concern was mainly the warning/error question. Following the package "standard" until now, it seems to me that the most consistent solution would be to silently drop names. |
DataFrames has a Regarding the type of the combined names, I'd just call |
@nalimilan I was thinking more about |
|
DataFrame keys are always Symbol, right? Then it is probably much easier than the general case. I didn't know of |
I think that the |
That wouldn't be applied by default, so it would have zero cost. |
Yes, I meant "when invoked", and mainly in terms of implementation time, but even if feasible |
I just ran into what I think is this issue while constructing an aggregated table like this:
[ ft[top_names, top_breeds] sum(ft[top_names, other_breeds], dims=2)
sum(ft[other_names, top_breeds], dims=1) sum(ft[other_names, other_breeds]) ]
This concatenation works beautifully but sadly the labels are lost. Now that I've typed this out I see that one problem here is that of making up a name for the final row and column, which got reduced out in all three cases. It would still be very nice if the known labels were preserved. Thanks for a fantastic package at any rate (and FreqTables); having names around has been an incredibly useful thing! Edit: I figured out that I can do this:
|
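As an illustration of the aggregation described in the comment above (not the commenter's own workaround, which did not survive in this thread), the following Base-only sketch builds the same block layout from a plain matrix and keeps the labels in separate vectors. The data, the label strings, and the made-up name "Other" for the collapsed row and column are all assumptions.

```julia
# Hypothetical data standing in for the frequency table `ft`.
counts = [ 1  2  3  4
           5  6  7  8
           9 10 11 12
          13 14 15 16]
row_labels = ["Max", "Bella", "Rex", "Luna"]
col_labels = ["Beagle", "Poodle", "Boxer", "Husky"]

top, other = 1:2, 3:4   # positions kept as-is vs. positions collapsed

# Same block structure as in the comment: top block, row sums, column sums, grand rest.
agg = [counts[top, top]                  sum(counts[top, other], dims=2)
       sum(counts[other, top], dims=1)   sum(counts[other, other])]

# Known labels are kept; the collapsed row/column gets an invented name.
agg_row_labels = vcat(row_labels[top], "Other")
agg_col_labels = vcat(col_labels[top], "Other")
```

With a NamedArray, the label vectors could then be attached to the result again; how best to do that is left open here.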
Hello, it took me a while before I realized you were using the space/newline array constructor operator, which I always find difficult to parse and matlabish somehow... Is it like you are trying to make
which keeps the dimensions and the names, that match the other marginals. The space/newline concatenation might work automatically in that case. Well, I just see that the |
Hello, I did run into a similar problem when writing the result of The solution I came up with was to overwrite CSV.write as follows:
I'd be willing to help, submit a PR or else, depending on what you would suggest, @davidavdav:
Let me know what you think... |
@arnaudmgh This sounds like a completely different problem, please file a separate issue. |
Currently *cat of NamedArrays drops the column/row names. It would be nice if the behavior was the same as in R, where names are merged in the dimension perpendicular to the bound and are kept in the other iff they overlap, dropped otherwise. This would raise the question of what to do if two combined NamedArrays contain the same name for a row/column: warn and drop the names, or throw an error? R ignores this problem and permits named vectors/matrices to have multiple rows/cols with the same name (when you select that name it returns only the first instance among the cases).
I have already implemented an alternative hcat function that would keep names, but it's important to define the behavior in case of conflicting names.
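The two options for conflicting names can be sketched as a small policy function in Base Julia. Everything here is hypothetical: `hcat_names`, its `on_duplicate` keyword, and the reading that for hcat the column names are concatenated while the row names survive only when both inputs agree are assumptions, not the NamedArrays API or the alternative hcat mentioned above.

```julia
# Hypothetical policy function: decide which names an hcat result would carry.
# `nothing` means "names dropped", i.e. the default names would be used instead.
function hcat_names(rows1, cols1, rows2, cols2; on_duplicate::Symbol = :drop)
    # Row names survive only if both inputs carry identical row names.
    rows = rows1 == rows2 ? rows1 : nothing
    # Column names are concatenated along the binding direction.
    cols = vcat(cols1, cols2)
    if !allunique(cols)
        if on_duplicate === :error
            throw(ArgumentError("duplicate column names in hcat"))
        end
        @warn "duplicate column names in hcat; dropping column names"
        cols = nothing
    end
    return (rows = rows, cols = cols)
end

hcat_names(["r1", "r2"], ["a"], ["r1", "r2"], ["b"])  # rows and cols both kept
hcat_names(["r1", "r2"], ["a"], ["r1", "r2"], ["a"])  # warns, cols dropped
```

Making the choice a keyword keeps the default close to the package's current habit of dropping names, while letting callers opt into a hard error.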