Unable to use interval selection on all points of a line chart #1049

Closed
ZachEd96 opened this issue Jul 27, 2018 · 15 comments

@ZachEd96

I'm having an issue creating a parallel coordinates chart that uses an interval selection brush to highlight lines on the chart. The brush only works correctly when highlighting the points for the first element on the x-axis; it doesn't highlight lines when any other point on the line is selected.

[Image 1: full_selection_interval_line_chart_example]
[Image 2: first_point_selection_interval_line_chart_example]
[Image 3: second_point_selection_interval_line_chart_example]

I'm also struggling to make the text elements 'belong' to the line, so that when a line is highlighted, all text elements that belong to it remain visible. I've layered two charts in this case; ideally I would use a text encoding on the line chart, but I can't get that to work.

    brush = alt.selection_interval()
    
    pCoord = alt.Chart(mergedDF[mergedDF['Procedure_Step'] == mergedDF['Variable']]).mark_line().encode(
        x=alt.X('Procedure_Step:O', sort=alt.SortField(op='sum',field='Sort_Order',order='ascending'), axis=alt.Axis(grid=True)),
        y=alt.Y('Normalised_Val:Q', axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False), scale=alt.Scale(domain=[0,1])),
        detail='Identifier:N',
        opacity = alt.value(0.3),
        color=alt.condition(brush, if_true=alt.value('#005EB8'), if_false=alt.value('lightgray'))
    ).properties(
        height=750,width=1000
    ).add_selection(
        brush
    )
    
    axisText = alt.Chart(mergedDF[mergedDF['Procedure_Step'] == mergedDF['Variable']].drop_duplicates(['Variable','Procedure_Code'])).mark_text().encode(
        x=alt.X('Procedure_Step:O',sort=procSortList, axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False, title='')),
        y=alt.Y('Normalised_Val:Q'),
        detail = 'Identifier:N',
        text=alt.condition(brush,alt.value('ex.'),alt.value(''))
    ).properties(
        height=750,width=1000
    )
    
    display(alt.layer(pCoord,axisText).resolve_scale(x='independent'))
@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

Can you provide data that lets me reproduce your chart?

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

Even if it's not "real" data, a toy dataset that illustrates the problem would make it much easier for me to help you.

@ZachEd96
Author

ZachEd96 commented Jul 27, 2018

    import pandas as pd
    import numpy as np
    import altair as alt

    rows = {}
    normalisedValues = {}
    procedureSteps = ['A','B','C']
    stepOrder = pd.DataFrame({'Step':procedureSteps, 'Order':[1,2,3]})
    # 200 dummy IDs, each with a random procedure ID for every step
    for x in range(1,201):
        rows[x] = [np.random.randint(1,15),np.random.randint(1,10),np.random.randint(1,6)]

    df = pd.DataFrame(rows).T
    df.columns = procedureSteps

    # Scale each step's procedure IDs into (0, 1) so all steps share one y-axis
    for step in procedureSteps:
        size = df[step].max()
        for ind, row in df[step].items():
            normalisedValues[(step, ind)] = row/(size + 1)

    # Long-form frame: one row per (Step, ID) with its normalised and raw value
    procDF = pd.DataFrame(pd.Series(normalisedValues).reset_index())
    procDF.columns = ['Step', 'ID', 'Normalised_Val']
    procDF = procDF.merge(stepOrder, how='left', on='Step')
    df = df.reset_index().rename(columns={'index':'ID'})
    meltedDF = df.melt(id_vars=['ID'],var_name='Step')
    procDF = procDF.merge(meltedDF, how='left', on=['ID','Step'])

    brush = alt.selection_interval()

    pCoord = alt.Chart(procDF).mark_line().encode(
        x=alt.X('Step:O', axis=alt.Axis(grid=True)),
        y=alt.Y('Normalised_Val:Q', axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False), scale=alt.Scale(domain=[0,1])),
        detail='ID:N',
        opacity = alt.value(0.3),
        color=alt.condition(brush, if_true=alt.value('#005EB8'), if_false=alt.value('lightgray'))
    ).properties(
        height=750,width=1000
    ).add_selection(
        brush
    )

    axisText = alt.Chart(procDF.drop_duplicates(['Step','value'])).mark_text().encode(
        x=alt.X('Step:O', axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False, title='')),
        y=alt.Y('Normalised_Val:Q'),
        detail = 'ID:N',
        text=alt.condition(brush,'value:N',alt.value(''))
    ).properties(
        height=750,width=1000
    )

    alt.layer(pCoord,axisText).resolve_scale(x='independent')

That will generate a similar output with dummy data created in a similar sort of way to how I transformed the data for the original chart.

I've moved to another PC now, and in this example the brush isn't highlighting the line when selecting any of the points (but I'm not sure why).

EDIT: It is working, but backwards (C->B->A)

If this helps your understanding at all - the visualisation is looking at patient flows, and in what order they have procedures. 'ID' in this example would be patients, 'Steps' is procedure order and the 'value' column is the procedure ID. I've enumerated and normalised the procedure IDs so they can all fit on one y-axis.

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

What is the desired behavior? You want lines to be included in the selection when the brush touches part of them?

@ZachEd96
Author

Any selected point (procedure) on a step's y-axis should show all lines connected to that procedure. So highlighting a procedure in step B should show all the lines that are connected to that procedure between step B and A, and step B and C.

For the text elements, the procedure's text on any lines that have been highlighted should show up regardless of the step they were selected from.

So in the 2nd picture in my initial post, all the lines highlighted would also have their text items shown. And in the 3rd picture, all the lines attached to the selected point would be highlighted (and have their text shown).

The desired line selection behaviour seems like a feature that should exist and be working with the current code, but the text elements definitely need a different set-up to display the desired behaviour.
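
The behaviour described in this comment amounts to a two-step lookup: first find the IDs whose point on the brushed step falls inside the brush, then highlight every row that belongs to those IDs. A minimal plain-Python sketch of that logic follows. The records and the helper names `ids_in_brush` and `rows_for_ids` are made up for illustration and are not part of Altair or Vega-Lite.

```python
# Toy long-form records mirroring procDF's columns: (ID, Step, value).
# The values are invented purely for illustration.
records = [
    (1, "A", 3), (1, "B", 7), (1, "C", 2),
    (2, "A", 3), (2, "B", 4), (2, "C", 5),
    (3, "A", 9), (3, "B", 7), (3, "C", 5),
]


def ids_in_brush(records, step, lo, hi):
    """Return IDs whose point on `step` lies inside the brushed range [lo, hi]."""
    return {rid for rid, s, value in records if s == step and lo <= value <= hi}


def rows_for_ids(records, selected_ids):
    """Return every record (on any step) that belongs to a selected ID."""
    return [rec for rec in records if rec[0] in selected_ids]


# Brushing around procedure 7 on step B picks up IDs 1 and 3 ...
selected = ids_in_brush(records, "B", 6, 8)
# ... and the highlight should then cover their points on steps A, B and C.
highlighted = rows_for_ids(records, selected)
```

The second step is the part the brush in the charts above does not do: it tests each row against the brushed region directly rather than against the set of selected IDs.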

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

Ah, I see.

You cannot do that as written. In particular, because you are manipulating the data in Pandas before passing it to your chart, Altair treats them as two distinct datasets, and so a signal that triggers on one dataset cannot be propagated to the other dataset.

If you want signals to pass between these two views of your data, you'll have to use Altair's built-in data transformations on a single data source, or provide a lookup transform to tell Altair how to map between two data sources (I'd recommend the single data source route). Once that's working, we can talk about propagating selections between the different data views.

@ZachEd96
Author

The two charts can use a single data source without manipulation by removing the 'drop_duplicates' call. The only issue is that the text gets darker the more data points share the same value. Removing duplicates was just a way to ensure a consistent appearance of the text marks, but if there's another way to do that I'm all ears.
I've done that and added a transform_filter that uses the selection as the filter, and it's still not producing the desired output. I've also removed 'resolve_scale' from the layered chart to let the filter work properly, but I can't guarantee that this will work on my original data source, as it won't sort properly due to issue #820.

    brush = alt.selection_interval()

    pCoord = alt.Chart(procDF).mark_line().encode(
        x=alt.X('Step:O', axis=alt.Axis(grid=True)),
        y=alt.Y('Normalised_Val:Q', axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False), scale=alt.Scale(domain=[0,1])),
        detail='ID:N',
        opacity = alt.value(0.3),
        color=alt.condition(brush, if_true=alt.value('#005EB8'), if_false=alt.value('lightgray'))
    ).properties(
        height=750,width=1000
    ).add_selection(
        brush
    )

    axisText = alt.Chart(procDF).mark_text().encode(
        x=alt.X('Step:O'),
        y=alt.Y('Normalised_Val:Q'),
        detail = 'ID:N',
        text='value:N'
    ).properties(
        height=750,width=1000
    ).transform_filter(
        brush
    )

    alt.layer(pCoord,axisText)

[Image: altair select interval ex]

I still can't work out why the line has to be selected from what Altair determines is the 'point of origin' for the data (step C in this instance), and not from any point along the line.

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

Do you want to be able to select the text and have it highlight everything with the same ID?

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

Or do you want to be able to select the lines?

@ZachEd96
Author

Preferably select lines as opposed to the text itself, sort of like this: http://mbostock.github.io/d3/talk/20111116/iris-parallel.html but with no need for multi-selection.

@jakevdp
Collaborator

jakevdp commented Jul 27, 2018

I don't think you can generally select lines using an interval selection, but you could do something based on hover, like this:

    hover = alt.selection_single(fields=['ID'], on='mouseover', empty='none')

    pCoord = alt.Chart(procDF).mark_line().encode(
        x=alt.X('Step:O', axis=alt.Axis(grid=True)),
        y=alt.Y('Normalised_Val:Q', axis=alt.Axis(grid=False, labels=False, domain=False, ticks=False), scale=alt.Scale(domain=[0,1])),
        detail='ID:N',
        opacity = alt.value(0.3),
        color=alt.condition(hover, if_true=alt.value('#005EB8'), if_false=alt.value('lightgray')),
        size=alt.condition(hover, alt.value(5), alt.value(3))
    ).properties(
        height=750,width=1000
    ).add_selection(
        hover
    )

    axisText = alt.Chart(procDF).mark_text().encode(
        x=alt.X('Step:O'),
        y=alt.Y('Normalised_Val:Q'),
        text='min(value):N'
    ).properties(
        height=750,width=1000
    ).transform_filter(
        hover
    )

    alt.layer(pCoord,axisText)

@ZachEd96
Author

I don't think that will quite provide the functionality I wanted, though: you wouldn't be able to select a group of similar procedures with just a mouseover, as it's limited to one line or set of lines from one point.

I don't need the interval selection to work on the lines themselves, just to be able to select a point or set of points for any given step and highlight all lines attached to the point(s). If it works like that for the start point of the line but not for the other points along the line, then that must be a deficiency of Vega-Lite, right?

@jakevdp
Collaborator

jakevdp commented Jul 30, 2018

Yeah, I think what you want is for the interval selection to propagate on the "ID" field, and I'm not sure if that's possible in Vega-Lite.

That is, you would want to have something like

    brush = alt.selection_interval(fields=['ID'])

but that leads to an error in the renderer.

Maybe @kanitw or @domoritz would have ideas about how to proceed?

@domoritz
Member

@arvind is the selections guru.

@amitkaps

amitkaps commented Aug 7, 2018

Interval selections can be projected over encodings, not fields. You will need to use layering to achieve what you are trying to do.

https://vega.github.io/vega-lite/docs/project.html#current-limitations
