-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Juno Display #1121
Juno Display #1121
Conversation
const SIZE = 25 | ||
|
||
@render Inline frame::AbstractDataFrame begin | ||
width = min(size(frame, 2), SIZE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use 4 spaces for indentation for consistency with the rest of the repo
This is nice, but why does DataFrames need to depend on Juno for that? I'd rather put this into a separate package that Juno would use. It doesn't sound right to add dependencies on each GUI which will be developed in the future, especially since DataFrames is a dependency of a lot of packages. |
width = min(size(frame, 2), SIZE) | ||
height = min(size(frame, 1), SIZE) | ||
header = map(x->strong(string(x)), names(frame)[1:width]') | ||
body = hcat([frame[1:height,i] for i = 1:width]...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splatting vectors of arbitrary length is generally not a good idea as it will trigger compilation of specialized methods without a clear gain. I think you can just use Array(frame[1:height, 1:width])
here.
See for example the discussion over at Gadfly: GiovineItalia/Gadfly.jl#902. To a large extent this is a stop gap; we'll be able to remove these kinds of dependencies once conditional modules are working. Until then, the only realistic way for two large modules A and B to communicate, without one depending on the other, is to implement a small interface package that both depend on. As long as the package is a small pure-Julia dependency there's very little downside. We're also working on making these interfaces more general (see e.g. WebDisplay.jl which Juno will move towards) so that a single dependency can provide an interface to multiple frontends. It'll just take a couple of iterations to get to the ideal situation. |
Actually, there is an alternative, which is for DataFrames to provide the minimal interface rather than Juno. If there was a DataFramesCore.jl package which exported a minimal set of definitions ( |
I love the rich display! |
Actually that interface is precisely being developed right now at https://github.com/davidagold/AbstractTables.jl So I'd rather wait for it to be ready; anyway we won't make a new release without it (except maybe for bug fixes). |
That package seems pretty substantial to me. If it provides a bulk of functionality (i.e. NullableArrays and StructuredQueries), as opposed to just the minimal interface, it's not what I mean, though it may be the right direction. Aside from that, it doesn't seem like the DataFrames overhaul will be ready for some time. Can we figure something out for the existing release? If you have concrete concerns about the effects of this patch I can try to address those. |
If the dependency is lightweight and Julia only, then I support this PR. I know that in principle we should be looking to remove dependencies from DF rather than add them. However, I consider the following mitigating points in this case: (i) integration of Re: AbstractTables.jl. This package could be a solution for this and related interfacing issues, but I've been questioning lately whether or not |
@@ -579,3 +579,18 @@ function showcols(io::IO, df::AbstractDataFrame) # -> Void | |||
end | |||
|
|||
showcols(df::AbstractDataFrame) = showcols(STDOUT, df) # -> Void | |||
|
|||
using Juno | |||
import Juno: Inline, LazyTree, Table, Row, strong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need import
rather than using
here?
|
||
const SIZE = 25 | ||
|
||
@render Inline frame::AbstractDataFrame begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frame
is more commonly called df
.
using Juno | ||
import Juno: Inline, LazyTree, Table, Row, strong | ||
|
||
const SIZE = 25 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you want to allow users to tweak this somewhere in Juno?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, but we'd have to implement a Juno.jl API for that. I think a better approach would be to extend this view to be capable of showing the entire dataframe at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Though showing the full data frame could be quite problematic if it contains one million rows and one hundred columns. Better take a subset first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not if we lazy-load it as needed
OK, let's go with that then until we have optional dependencies. Though please add tests for the new feature. |
FWIW, I'm not jazzed about this, purely for the dependency reasons. I'd much rather create a separate package for now that can ship w/ a Juno bundle or Atom bundle by default that provides the functionality and requires DataFrames and Juno. That and I think we need to at least play the part of "squeaky wheel" in Base for conditional modules and/or help in any way we can to push that forward. Bloating one of the most popular Julia packages with a "requirement" on optional stuff just seems like a cop out. |
It's inherently somewhat tricky to test conditional dependencies (with language support or not), since by definition the functionality is not available unless both packages are loaded. For now, I've added a simple test that Juno will be able to call the method; if you want I can add a test dependency on Atom.jl and add more comprehensive testing. |
If we add this feature, we should bite the bullet and test it thoroughly even if that means adding more requirements. Test dependencies are not so much a concern as runtime dependencies anyway. |
FWIW, supporting the use cases for this feature is a great way to do that. Having packages in existence where the feature provides a clear immediate benefit (fewer deps) is far more compelling than a few people saying "wouldn't it be nice if...". It also helps for figuring out the implementation issues and design (explanation by extrication and all that). |
Ok, better test added. |
Thanks. The test failure is real, though. |
|
||
const SIZE = 25 | ||
|
||
function tomat(df::AbstractDataFrame) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is essentially convert(Matrix, ...)
, except that it doesn't throw when undefined entries are present. Better improve the existing method then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convert(Matrix, ...)
also throws on NA
s so this would involve a change of semantics – are you sure that's what you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, indeed. Let's keep it that way, then, but please remove tomat
to something more explicit (to_matrix
?), as this one reads too much like 🍅.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, will go with a less fruity name :)
Tests will require JuliaLang/METADATA.jl#6973 to pass |
@@ -6,3 +6,4 @@ SortingAlgorithms | |||
Reexport | |||
Compat 0.8.4 | |||
FileIO 0.1.2 | |||
Juno |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tests require a specific version of Juno to be able to pass, that should be set as the minimum required Juno version.
function tomat(df::AbstractDataFrame) | ||
res = Array{Any}(size(df)) | ||
for (j, col) in enumerate(columns(df)), i = 1:length(col) | ||
isassigned(col, i) && (res[i, j] = col[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong indent
@@ -579,3 +579,26 @@ function showcols(io::IO, df::AbstractDataFrame) # -> Void | |||
end | |||
|
|||
showcols(df::AbstractDataFrame) = showcols(STDOUT, df) # -> Void | |||
|
|||
using Juno |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed, redundant with following line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using
will import all of the symbols exported from Juno, but using Juno: stuff
gets specific, potentially unexported symbols, so it's not entirely redundant, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right. only other thing that should be needed that isn't listed yet is @render
, so adding that could replace the first line.
Mike, please communicate in sentences, single word responses (and commit messages) are not an effective way of justifying anything.
@@ -35,4 +38,7 @@ module TestShow | |||
show(io, A) | |||
A = DataFrames.RepeatedVector([1, 2, 3], 1, 5) | |||
show(io, A) | |||
|
|||
render(Juno.Editor(), df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should try to verify the result is correct instead of just that the method doesn't error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikeInnes Is there any way of doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of this is a JSON representation designed to be consumed by the frontend; in other words it's an implementation detail that's not part of the API. I'm assuming you don't want the DataFrames tests to break if/when Juno's implementation changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't have to compare the exact output, but anything that the code in this package should be adding to the representation should be tested for as a sanity check. otherwise it could easily be broken by accident and regress here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this sounds like another indication that this code shouldn't live in DataFrames.jl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see why. The important thing on the DataFrames side is that it's using Juno.jl's API correctly, and calling the function tests that. All of the actual functionality and logic that could regress is on the Juno side and can be tested there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I've added a test for the implementation. Given that there's no logic it's equivalent to checking the source code hasn't changed, but whatever, it's a test.
Any idea why tests failed only on 0.4? |
Ok, added a test guard for 0.4 – Juno only calls this method on 0.5 anyway. |
render(Juno.Editor(), df) | ||
if VERSION > v"0.5-" | ||
using Atom, Juno | ||
render(Juno.Editor(), df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong indent here
Tony had requested--and I agree--that more descriptive commit messages be used, but the two most recent messages are "four word commit message" and just ".". The commits should be squashed on merge, but in case someone mistakenly merges without squashing, it would be nice if the commit messages at least vaguely reflected the changes implemented. I've been guilty of this as well, but I've been trying to be better about it. Anyway, just something to consider. 😉 |
The commit log in its current form is not a good way to convey intent regardless of the messages, because there's little ordering and it's mixed with lots of noise from small fixups. If the DataFrames maintainers particularly value having a pretty commit log (perhaps they want to hang it on their wall or something?) I'm more than happy to rebase once the patch is ready to go. |
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum ornare, purus vel consequat consectetur, urna enim imperdiet sem, ut dapibus nibh nisi quis dolor. Pellentesque erat libero, ornare vitae volutpat vitae, viverra eget ante. Ut tortor nulla, suscipit eget facilisis vitae, fermentum non metus. Praesent sed porttitor enim, et commodo justo. Suspendisse vitae turpis in ipsum tempor placerat. Donec varius, lectus quis ultricies lobortis, purus orci rhoncus justo, id gravida mauris lacus porta sem. Pellentesque tincidunt felis porttitor volutpat luctus. Sed lorem enim, varius eget iaculis non, ultrices efficitur libero. Maecenas cursus risus metus, ut cursus purus iaculis vel. Praesent maximus viverra nunc vel tempor. Phasellus finibus cursus elementum. Vivamus non erat mattis lorem lacinia convallis. Nullam commodo, tortor et porttitor interdum, velit dolor feugiat magna, eu egestas magna lacus et lorem. Curabitur rhoncus tellus nec pharetra euismod. Maecenas ut lorem nisl. Phasellus elementum massa mi, id ullamcorper leo fringilla dapibus. Nullam sit amet massa at diam porta porttitor. Cras lacinia pellentesque mauris, non vestibulum quam tempus tincidunt. Integer convallis lectus non nisl tincidunt, et posuere tortor aliquet. Mauris sodales lacus at malesuada eleifend. Phasellus ut sodales massa. Vivamus aliquet ex eu risus pellentesque, nec sodales massa dictum. Mauris tincidunt ligula nulla, ut pharetra augue consequat ut. Sed faucibus venenatis sem vel pretium. Ut bibendum eleifend neque. Nunc risus risus, vestibulum ac consectetur sit amet, congue eu massa. Morbi volutpat eros arcu, id sodales erat maximus vitae. Nunc gravida et felis sit amet ultrices. Etiam congue tortor sapien, eu cursus tellus pretium quis. Donec finibus lorem mi, sed venenatis metus porta vel. Nulla non mi cursus, condimentum enim accumsan, dignissim ligula. In facilisis tortor eu pharetra tincidunt. Nam bibendum massa elementum enim rhoncus mollis. Nunc volutpat consequat eros, in mollis lorem vestibulum at. Praesent feugiat, sem in efficitur pharetra, dui massa luctus purus, id luctus nulla odio ut tellus. Pellentesque porttitor dictum consectetur. Aenean turpis nunc, bibendum nec enim eu, lobortis laoreet est. Duis felis mi, molestie ut malesuada in, gravida vel purus. Nullam vitae placerat enim. Nunc vitae tortor at massa finibus dapibus in sit amet elit. Ut mollis tellus vel metus laoreet, vitae lobortis ante tempor. Nam eget molestie mauris. Nunc sollicitudin tortor ut pellentesque dictum. Proin faucibus leo eu ex volutpat egestas. Phasellus convallis vitae nulla non interdum. Donec enim diam, auctor quis nunc et, gravida molestie felis. Ut facilisis ex in consectetur blandit. Mauris placerat bibendum diam, a tempor felis tempus in. In elementum, leo quis scelerisque tempus, tellus augue suscipit dui, vitae hendrerit ante est sit amet erat. Fusce sodales magna id elit sollicitudin, sit amet mattis quam tincidunt. Aliquam erat volutpat. Morbi nec leo imperdiet, tempor nulla non, rutrum turpis. Phasellus fermentum tristique nibh, sagittis iaculis sapien iaculis in. Ut tincidunt tristique ligula nec rutrum. Donec sem ex, sollicitudin non turpis sit amet, ullamcorper euismod justo. Nunc leo erat, consectetur tincidunt mi a, sodales tristique elit. Cras purus nulla, ultrices nec molestie sit amet, ultrices eget purus. Quisque elementum purus tortor, nec egestas quam aliquet vel. Sed efficitur at nulla vitae scelerisque. Mauris interdum hendrerit libero blandit dictum. Curabitur lacinia finibus dictum.
I would like to hang the julia commit log on my wall. :-) |
On reflection – if this integration is barely worth a few extra characters in require, then it's certainly not worth the hours of time I'm now sinking into it. I would have liked to use this to start prototyping things like spreadsheet views, but that's not even a feature I'd use much personally, so if it's all the same to you I'm happy to focus on more productive things and stick with plain text output here indefinitely. |
Hey, keep cool, looks like the PR is ready now. :-) Sorry if the lack of support for optional dependencies made it more painful than it should. @quinnj Are you OK with this living in DataFrames until we have a better solution? |
I already "cast my vote", so to speak in saying I'd rather not, but I also realize that it's just one vote. Happy to go along if more people want this in. |
Not many people have voted, and I'm myself torn about this. But well, I think I'd rather have this feature leave here for now rather than not exist at all. |
I vote for having it here since having it merged will keep it from bitrotting and its presence will prod people to figure out a better way to do it. The perfect is the enemy of the good and all that. |
Thanks @MikeInnes. Could you make the same PR against master so that we don't lose support in the next release? |
Will do, thanks @nalimilan and others for reviewing and pushing this through. |
function _render(df::AbstractDataFrame) | ||
width = min(size(df, 2), SIZE) | ||
height = min(size(df, 1), SIZE) | ||
header = map(x->strong(string(x)), names(df)[1:width]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
julia> [:foo, :bar]'
WARNING: the no-op `transpose` fallback is deprecated, and no more specific `transpose` method for Symbol exists. Consider `permutedims(x, [2, 1])` or writing a specific `transpose(x::Symbol)` method if appropriate.
1×2 Array{Symbol,2}:
:foo :bar
This seems like broken behaviour on Julia's part – I can only transpose arrays of transposable objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Transpose is recursive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transpose is a linear algebraic operation, and recursive on elements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where the error is coming from, but the ultimate behaviour – arrays of Xs are transposable but arrays of Ys are not – seems pretty arbitrary to me. What makes a number transposable, other than as a hack to get array transpose to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reshape(names(df)[1:width], 1, 2)
works, btw.
This patch adds an extremely lightweight dependency on Juno.jl, and implements rich display inside of Juno, like so:
In future I'd like to improve on this by having access to the entire dataframe rather than truncating it, but this is something for now.