Add VegaFusion data transformer with mime renderer, save, and to_dict/to_json integration #3094

Merged
merged 16 commits into from
Jul 8, 2023

Conversation

jonmmease
Contributor

Overview

Following on from the discussion in #3054, this PR introduces a new data transformer named "vegafusion". When activated, we use VegaFusion to pre-evaluate data transformations before displaying charts with mime renderers, and before saving charts with Chart.save.

The Data Transformer

The "vegafusion" data transformer replaces DataFrames in Vega-Lite specs with URLs of the form "table://{uuid}". These DataFrames are stashed in a local WeakValueDictionary, with the UUID as the key. The table:// URL is a VegaFusion convention used to mean that the associated data will be provided using the inline_datasets argument to the pre_transform_spec function.
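As a rough illustration of the stashing idea described above, here is a minimal sketch using only the standard library. The names `Table`, `stash_table`, and `lookup_table` are hypothetical and are not taken from the PR; `Table` merely stands in for a DataFrame.

```python
import uuid
import weakref

URL_PREFIX = "table://"  # URL convention described above

# Entries vanish automatically once a table has no other strong references.
_stash = weakref.WeakValueDictionary()


class Table:
    """Stand-in for a DataFrame; plain dicts and lists cannot be weakly referenced."""

    def __init__(self, rows):
        self.rows = rows


def stash_table(table):
    """Store the table under a fresh UUID and return a data reference for the spec."""
    key = str(uuid.uuid4())
    _stash[key] = table
    return {"url": URL_PREFIX + key}


def lookup_table(url):
    """Return the stashed table for a table:// URL, or None if it is gone."""
    return _stash.get(url[len(URL_PREFIX):])
```

Because the dictionary only holds weak references, stashing a table does not keep it alive: once the chart and the user's own variable are gone, the entry disappears as well.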

Mimerenderer and Save

A function is defined in _vegafusion_data.py called compile_with_vegafusion that inputs a Vega-Lite spec containing these table:// URLs, compiles it to a Vega spec, then invokes VegaFusion's pre_transform_spec function. The stashed DataFrames that correspond to the table:// URLs are extracted from the WeakValueDictionary and passed to pre_transform_spec as the inline_datasets argument. compile_with_vegafusion then returns the pre-transformed Vega spec as a dictionary.
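The lookup step described above can be sketched roughly as follows. This is not the code from _vegafusion_data.py; `collect_inline_datasets` is a hypothetical helper, and the `stash` argument is assumed to be any mapping from dataset name to table. The resulting dictionary is the kind of value that would be passed as inline_datasets.

```python
def collect_inline_datasets(spec, stash, prefix="table://"):
    """Walk a spec and gather the stashed tables referenced by prefixed URLs."""
    inline_datasets = {}

    def walk(node):
        if isinstance(node, dict):
            url = node.get("url")
            if isinstance(url, str) and url.startswith(prefix):
                name = url[len(prefix):]
                if name in stash:
                    inline_datasets[name] = stash[name]
            # Keep descending: datasets may be nested inside layers or groups.
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(spec)
    return inline_datasets
```

Ordinary URLs (for example a path to a JSON file) are left untouched, so only datasets that were stashed by the transformer end up in the returned mapping.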

When the "vegafusion" data transformer is enabled, the default_renderer_base() function calls compile_with_vegafusion() before returning a Vega mime bundle. Similarly, the spec_to_mimebundle function has been updated to call compile_with_vegafusion() when the "vegafusion" data transformer is enabled. This way, saved charts have their transforms pre-evaluated before saving.

Transformed Data

This PR also updates the chart.transformed_data functionality added in #3081 and #3084 to rely on the new "vegafusion" data transformer rather than the "vegafusion-inline" transformer that is provided by the vegafusion Python package. ("vegafusion-inline" will be deprecated in vegafusion once this is released.)

MaxRowsError

VegaFusion still has a notion of a maximum number of rows. But unlike the regular row limit, VegaFusion's row limit is applied after all supported data transformations have been applied. So you may hit it in the case of a large scatter chart. I decided to reuse the existing infrastructure for setting and disabling max_rows (e.g. alt.data_transformers.disable_max_rows()). When this post-transformed limit is exceeded, we raise the same MaxRowsError exception, but with a message that explains that the limit is applied after data transformations.

This is open for discussion, but I set the "vegafusion" data transformer's row limit to 100k. This is partly justified by the fact that VegaFusion will remove unused data columns. But also, practically speaking, I haven't run into issues at this scale. One case where a high default limit is helpful is for creating fine-grained heatmaps (e.g. 300x300 gets you to 90k).
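A minimal sketch of the warning-to-exception conversion described in this section. The `MaxRowsError` class below is a local stand-in for Altair's exception, `raise_on_row_limit` is a hypothetical name, and the message text is illustrative. The "RowLimitExceeded" warning type is the one checked in the diff under review.

```python
class MaxRowsError(Exception):
    """Local stand-in for Altair's exception of the same name."""


def raise_on_row_limit(warnings, max_rows):
    """Raise MaxRowsError if any pre-transform warning reports a row-limit overflow."""
    for warning in warnings:
        if warning.get("type") == "RowLimitExceeded":
            raise MaxRowsError(
                f"The number of rows after data transformations exceeds "
                f"the limit of {max_rows}."
            )


# A 300x300 heatmap yields 90,000 rows after binning, just under a 100k limit.
assert 300 * 300 < 100_000
```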

Testing

I added a test for the spec_to_mimebundle updates, but still need to find a place to test the renderer path.

Follow-up

After this PR, I want to work on how to document this. We can talk more about it, but I'm thinking of adding a section to the top of the "Large Datasets" section that recommends enabling this data transformer as the first step. And perhaps also updating the default MaxRowsError with instructions on how to enable it.

Contributor

@mattijn mattijn left a comment

Thanks @jonmmease! Without testing, one quick inline comment.

# Special URL prefix that VegaFusion uses to denote that a
# dataset in a Vega spec corresponds to an entry in the `inline_datasets`
# kwarg of vf.runtime.pre_transform_spec().
VEGAFUSION_PREFIX: Final = "table://"

Contributor

Maybe rename this to vf_source://, since you also prefix a DataFrame as table_<x> here? This would leave room for supporting arrays/tensors/matrices once they are introduced.

Contributor Author

This prefix is built into VegaFusion, but there is an alternative supported prefix that I can switch to: vegafusion+dataset://.

Contributor Author

renamed in 3a5517b

@mattijn
Contributor

mattijn commented Jul 2, 2023

Thanks @jonmmease! I've read the code-diff a few times and it looks solid to me.

I've tested a bit and only have a few observations, which may be interesting for documentation purposes. I don't think these are bugs.

  1. I can update vegafusion-python-embed independent from vegafusion (during installation). Is this intended? I've the feeling these versions should be in sync with each other, but not sure.
  2. A roundtrip with a pandas dataframe as input including a column with datetime objects (without timezones connected) will have timezones connected upon doing chart.transformed_data().
  3. As far as I can see, this does not happen if the input is a pyarrow table. Not sure if (my installed version of) pyarrow supports timezones.
  4. chart.transformed_data() also works without alt.data_transformers.enable("vegafusion") activated.
  5. If vegafusion is activated and I click 'Open in Vega Editor' I see the data inlined. But if I do chart.to_dict() I see it as such "url": "vegafusion+dataset://table_72f60cdf_3ede_47cf_9eb6_53f65059a9f5".
  6. The generated JSON code in the vega-editor is Vega JSON and not Vega-Lite. The order of attributes is not fixed upon re-rendering.
  7. Can I force inline the vegafusion-generated data in python? Return "values": [] instead of "url": "vegafusion+dataset://"?

Tests I did were similar to:

from datetime import datetime
import pandas as pd
import pyarrow as pa
import altair as alt

alt.data_transformers.enable("vegafusion")
#alt.data_transformers.enable("default")

data = {    
    'date': [datetime(2004, 8, 30, 23, 15), datetime(2004, 9, 1, 12, 10)],
    'value': [102, 129]
 }
pd_df = pd.DataFrame(data)

pa_table = pa.table(data)
# pandas dataframe
c = alt.Chart(pd_df).mark_point().encode(x='date:T', y='value:Q')
c.transformed_data()
date value
2004-08-30 23:15:00+02:00 102
2004-09-01 12:10:00+02:00 129
c.to_dict()
{'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}},
 'data': {'url': 'vegafusion+dataset://table_27a58b61_c93c_44ae_8476_31212b82f513'},
 'mark': {'type': 'point'},
 'encoding': {'x': {'field': 'date', 'type': 'temporal'},
  'y': {'field': 'value', 'type': 'quantitative'}},
 '$schema': 'https://vega.github.io/schema/vega-lite/v5.8.0.json'}
c.to_dict(format='vega')
{'$schema': 'https://vega.github.io/schema/vega/v5.json',
 'background': 'white',
 'padding': 5,
 'width': 300,
 'height': 300,
 'style': 'cell',
 'data': [{'name': 'source_0',
   'url': 'vegafusion+dataset://table_9d518c4b_c318_4e1f_b3fa_82d57b90275c',
   'format': {'type': 'json', 'parse': {'date': 'date'}},
   'transform': [{'type': 'filter',
     'expr': '(isDate(datum["date"]) || (isValid(datum["date"]) && isFinite(+datum["date"]))) && isValid(datum["value"]) && isFinite(+datum["value"])'}]}],
 'marks': [{'name': 'marks',
   'type': 'symbol',
   'style': ['point'],
   'from': {'data': 'source_0'},
   'encode': {'update': {'opacity': {'value': 0.7},
     'fill': {'value': 'transparent'},
     'stroke': {'value': '#4c78a8'},
     'ariaRoleDescription': {'value': 'point'},
     'description': {'signal': '"date: " + (timeFormat(datum["date"], \'%b %d, %Y\')) + "; value: " + (format(datum["value"], ""))'},
     'x': {'scale': 'x', 'field': 'date'},
     'y': {'scale': 'y', 'field': 'value'}}}}],
 'scales': [{'name': 'x',
   'type': 'time',
   'domain': {'data': 'source_0', 'field': 'date'},
   'range': [0, {'signal': 'width'}]},
  {'name': 'y',
   'type': 'linear',
   'domain': {'data': 'source_0', 'field': 'value'},
   'range': [{'signal': 'height'}, 0],
   'nice': True,
   'zero': True}],
 'axes': [{'scale': 'x',
   'orient': 'bottom',
   'gridScale': 'y',
   'grid': True,
   'tickCount': {'signal': 'ceil(width/40)'},
   'domain': False,
   'labels': False,
   'aria': False,
   'maxExtent': 0,
   'minExtent': 0,
   'ticks': False,
   'zindex': 0},
  {'scale': 'y',
   'orient': 'left',
   'gridScale': 'x',
   'grid': True,
   'tickCount': {'signal': 'ceil(height/40)'},
   'domain': False,
   'labels': False,
   'aria': False,
   'maxExtent': 0,
   'minExtent': 0,
   'ticks': False,
   'zindex': 0},
  {'scale': 'x',
   'orient': 'bottom',
   'grid': False,
   'title': 'date',
   'labelFlush': True,
   'labelOverlap': True,
   'tickCount': {'signal': 'ceil(width/40)'},
   'zindex': 0},
  {'scale': 'y',
   'orient': 'left',
   'grid': False,
   'title': 'value',
   'labelOverlap': True,
   'tickCount': {'signal': 'ceil(height/40)'},
   'zindex': 0}]}
c.save('c.png')  # success
print(c.to_json())
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.8.0.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 300
    }
  },
  "data": {
    "url": "vegafusion+dataset://table_6ca2f4b0_90aa_4f17_b792_d7943bbae44b"
  },
  "encoding": {
    "x": {
      "field": "date",
      "type": "temporal"
    },
    "y": {
      "field": "value",
      "type": "quantitative"
    }
  },
  "mark": {
    "type": "point"
  }
}
# pyarrow table
d = alt.Chart(pa_table).mark_point(tooltip=True).encode(x='date:T', y='value:Q')
d.transformed_data()
pyarrow.Table
date: timestamp[ms]
value: int64
----
date: [[2004-08-30 21:15:00.000,2004-09-01 10:10:00.000]]
value: [[102,129]]
d.to_dict()
{'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}},
 'data': {'url': 'vegafusion+dataset://table_5d35f0b9_8f6c_4c68_a5dd_8a5d0cf5213d'},
 'mark': {'type': 'point', 'tooltip': True},
 'encoding': {'x': {'field': 'date', 'type': 'temporal'},
  'y': {'field': 'value', 'type': 'quantitative'}},
 '$schema': 'https://vega.github.io/schema/vega-lite/v5.8.0.json'}
```python
d.save('d.png')  # success
print(d.to_json())  # print(d.to_json(format='vega'))
```
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.8.0.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 300
    }
  },
  "data": {
    "url": "vegafusion+dataset://table_2412a125_3419_42ff_a469_9644fcb3a537"
  },
  "encoding": {
    "x": {
      "field": "date",
      "type": "temporal"
    },
    "y": {
      "field": "value",
      "type": "quantitative"
    }
  },
  "mark": {
    "tooltip": true,
    "type": "point"
  }
}
# large table
alt.data_transformers.enable("vegafusion")
#alt.data_transformers.enable("default")

flights = pd.read_parquet(
    "https://vegafusion-datasets.s3.amazonaws.com/vega/flights_1m.parquet"
)

f = alt.Chart(flights).mark_bar().encode(
    alt.X("delay", bin=alt.Bin(maxbins=30)),
    alt.Y("count()")
)
f.save('f.png')  # success
# print(f.to_json())
# f
f.transformed_data().head()
bin_maxbins_30_delay bin_maxbins_30_delay_end __count
0.0 20.0 392700
-20.0 0.0 419400
20.0 40.0 92700
140.0 160.0 2000
40.0 60.0 38400

@jonmmease
Contributor Author

Thanks for taking the time to go through this @mattijn! I'll respond in more detail tomorrow, but wanted to mention one thing.

I debated whether to change the behavior of to_dict and to_json. One possibility I considered was to apply the VegaFusion pre-transform logic when format="vega", but not when format="vega-lite" (the default). But this seemed like it might end up being confusing. Let me know if you have any thoughts.

Also, I want to wait until @binste is back and has a chance to look at this before merging.

@jonmmease
Contributor Author

jonmmease commented Jul 3, 2023

  1. I can update vegafusion-python-embed independent from vegafusion (during installation). Is this intended? I've the feeling these versions should be in sync with each other, but not sure.

Yeah, I'm not sure of the best way to handle this. vegafusion-python-embed is technically an optional dependency because it's possible to use a subset of the functionality of the vegafusion package by connecting it to an instance of VegaFusion server over gRPC. This mode isn't full featured, or documented well, but the idea is to make it possible to use VegaFusion on architectures that aren't otherwise supported, by connecting to an instance of VegaFusion server running somewhere else.

When you install with vegafusion[embed], the versions are pinned exactly to match. But I'm not sure if there's a way to get pip to constrain an optional dependency's version. Let me know if you have any ideas!

  2. A roundtrip with a pandas dataframe as input including a column with datetime objects (without timezones connected) will have timezones connected upon doing chart.transformed_data().

This is expected (and needs to be documented). Vega-Lite and VegaFusion treat everything as UTC internally, so on output we pretty much need to assign some timezone. I chose to have the output timezone match the local timezone that's used for timeunit calculations (this defaults to the kernel's local timezone but can be overridden with vf.set_local_tz()).
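The round trip can be illustrated with the standard library alone. This sketch is not VegaFusion's implementation; it only reproduces arithmetic that is consistent with the outputs posted earlier in this thread, assuming a fixed +02:00 local offset (the offset visible in the pandas output) and assuming that naive input values are interpreted in that local timezone.

```python
from datetime import datetime, timedelta, timezone

# Assumed local timezone: the +02:00 offset seen in the pandas output above.
local_tz = timezone(timedelta(hours=2))

# Timezone-naive input value, as in the test DataFrame.
naive = datetime(2004, 8, 30, 23, 15)

# Internal representation: the same instant expressed in UTC.
as_utc = naive.replace(tzinfo=local_tz).astimezone(timezone.utc)

# Output representation: converted back to the local timezone, now tz-aware.
round_trip = as_utc.astimezone(local_tz)

print(as_utc.isoformat())      # 2004-08-30T21:15:00+00:00
print(round_trip.isoformat())  # 2004-08-30T23:15:00+02:00
```

The UTC value matches what the pyarrow output shows (21:15), while the converted value matches the pandas output (23:15+02:00).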

  3. As far as I can see, this does not happen if the input is a pyarrow table. Not sure if (my installed version of) pyarrow supports timezones.

This conversion to the local timezone works with pandas and polars, but I didn't see a way to perform the conversion with pyarrow. There might be a way that I didn't find.

  4. chart.transformed_data() also works without alt.data_transformers.enable("vegafusion") activated.

This is expected

  5. If vegafusion is activated and I click 'Open in Vega Editor' I see the data inlined. But if I do chart.to_dict() I see it as such "url": "vegafusion+dataset://table_72f60cdf_3ede_47cf_9eb6_53f65059a9f5".

See note above. Because VegaFusion works at the Vega level, it's not possible to inline data into a Vega-Lite spec. So there are 3 options when it comes to to_dict and to_json.

  1. Do nothing, which passes through the "vegafusion+dataset://" urls
  2. Do nothing when format="vega-lite" (default), but inline the data when format="vega"
  3. Always inline the data and return a Vega spec, ignoring the format argument.

I wasn't really happy with (2) or (3), but I agree that (1) is also not great. Happy to consider other alternatives as well.

  6. The generated JSON code in the vega-editor is Vega JSON and not Vega-Lite. The order of attributes is not fixed upon re-rendering.

Expected to be Vega. I haven't looked into the ordering of attributes. I'm not certain where this comes from, but there's probably somewhere we could sort the keys.

  7. Can I force inline the vegafusion-generated data in python? Return "values": [] instead of "url": "vegafusion+dataset://"?

Right now you can accomplish this with:

from altair.utils._vegafusion_data import compile_with_vegafusion
compile_with_vegafusion(chart.to_dict())

But it would probably be nice to have a simpler way to do this.

Contributor

@binste binste left a comment

I really like the general approach of integrating and enabling VegaFusion via a data transformer and the implementation in this PR! 🥳 I added some comments inline.

Regarding the to_dict/to_json discussion, I also think this should produce something usable but I see that it's not straightforward. What do you think about, if the vegafusion transformer is called, to:

  • raising an exception if to_dict/to_json are called without format specified (default vega-lite) or with format="vega-lite"
  • mentioning in the error that only format="vega" is supported for this transformer

This would make it clear to a user that with the vegafusion transformer a vega-lite spec is not possible and they can explicitly opt in to receive a vega spec in case they want to send the spec to e.g. a frontend in a web application.

I'd be ok if this is implemented in a follow-up PR and we could also continue the discussion there or in the existing discussion thread on the VegaFusion integration.

Your ideas on the documentation sound good to me.

# Check from row limit warning and convert to MaxRowsError
for warning in warnings:
    if warning.get("type") == "RowLimitExceeded":
        raise MaxRowsError(
Contributor

@binste binste Jul 4, 2023

I also never experienced any issues with larger datasets for single charts. However, if you do some exploratory data analysis with many charts in a Jupyter notebook, that notebook can get rather slow after a while if you plot many larger charts. I still like the idea though of increasing it and 100k sounds as good as any as it would be difficult to benchmark and figure out a good compromise. Just wanted to mention this.

@jonmmease
Contributor Author

raising an exception if to_dict/to_json are called without format specified (default vega-lite) or with format="vega-lite"

This makes sense to me for the public API, but a complication is that I need the to_dict(format="vega-lite") functionality (where the resulting Vega-Lite spec has vegafusion+dataset:// urls) internally.

I haven't dug into what the context argument to to_dict() is for. Are you familiar with this? I wonder if it would make sense to add an option to context to suppress the VegaFusion pre-transform step, which would suppress the format="vega-lite" error. Then we could use this option internally where needed.

jonmmease added 2 commits July 6, 2023 08:23
Raise a ValueError when the "vegafusion" transformer is enabled and format="vega-lite".

Use context={"pre_transform": False} to disable pre_transforming when "vegafusion" is enabled, for internal usage.
@jonmmease
Contributor Author

@binste, I updated to_dict/to_json to apply pre-transformation when "vegafusion" is enabled and format="vega" and to raise when format="vega-lite". I also used the context argument to to_dict to disable for internal usage. See what you think!
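The resulting control flow can be sketched as below. This is a simplified stand-in, not Altair's to_dict: `serialize_sketch`, `using_vegafusion`, and `compile_fn` are hypothetical names used only for illustration, and the non-VegaFusion path is reduced to returning the spec unchanged.

```python
def serialize_sketch(vegalite_spec, format="vega-lite", context=None,
                     using_vegafusion=True, compile_fn=None):
    """Toy version of the to_dict branching when "vegafusion" is enabled."""
    context = context or {}
    # Hypothetical compiler stand-in; the real code calls into VegaFusion.
    compile_fn = compile_fn or (lambda spec: {"compiled_from": spec})

    if using_vegafusion and context.get("pre_transform", True):
        if format == "vega-lite":
            raise ValueError(
                'When the "vegafusion" data transformer is enabled, '
                'to_dict() and to_json() must be called with format="vega".'
            )
        return compile_fn(vegalite_spec)

    # Internal callers pass context={"pre_transform": False} to get the raw
    # Vega-Lite spec with its dataset URLs left in place.
    return vegalite_spec
```

The context flag keeps the public API strict while still letting internal code obtain the untransformed Vega-Lite spec.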

@jonmmease jonmmease changed the title from "Add VegaFusion data transformer with mime renderer and save integration" to "Add VegaFusion data transformer with mime renderer, save, and to_dict/to_json integration" Jul 6, 2023
Contributor

@binste binste left a comment

I like the implementation with the ValueError and using context! I think it needs to be extended on line 903 in api.py (can't comment there) as it now raises this error for a layered chart:

import altair as alt
from vega_datasets import data

alt.data_transformers.enable("vegafusion")

source = data.wheat()

bar = alt.Chart(source).mark_bar().encode(
    x='year:O',
    y='wheat:Q'
)

rule = alt.Chart(source).mark_rule(color='red').encode(
    y='mean(wheat):Q'
)

layer_chart = (bar + rule).properties(width=600)
layer_chart.to_json(format="vega")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 layer_chart.to_dict(format="vega")

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/vegalite/v5/api.py:903, in TopLevelMixin.to_dict(self, validate, format, ignore, context)
    898 context["top_level"] = False
    900 # TopLevelMixin instance does not necessarily have to_dict defined
    901 # but due to how Altair is set up this should hold.
    902 # Too complex to type hint right now
--> 903 vegalite_spec = super(TopLevelMixin, copy).to_dict(  # type: ignore[misc]
    904     validate=validate, ignore=ignore, context=context
    905 )
    907 # TODO: following entries are added after validation. Should they be validated?
    908 if is_top_level:
    909     # since this is top-level we add $schema if it's missing

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:807, in SchemaBase.to_dict(self, validate, ignore, context)
    805     if "mark" in kwds and isinstance(kwds["mark"], str):
    806         kwds["mark"] = {"type": kwds["mark"]}
--> 807     result = _todict(
    808         kwds,
    809         context=context,
    810     )
    811 else:
    812     raise ValueError(
    813         "{} instance has both a value and properties : "
    814         "cannot serialize to dict".format(self.__class__)
    815     )

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:340, in _todict(obj, context)
    338     return [_todict(v, context) for v in obj]
    339 elif isinstance(obj, dict):
--> 340     return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
    341 elif hasattr(obj, "to_dict"):
    342     return obj.to_dict()

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:340, in <dictcomp>(.0)
    338     return [_todict(v, context) for v in obj]
    339 elif isinstance(obj, dict):
--> 340     return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
    341 elif hasattr(obj, "to_dict"):
    342     return obj.to_dict()

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:338, in _todict(obj, context)
    336     return obj.to_dict(validate=False, context=context)
    337 elif isinstance(obj, (list, tuple, np.ndarray)):
--> 338     return [_todict(v, context) for v in obj]
    339 elif isinstance(obj, dict):
    340     return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:338, in <listcomp>(.0)
    336     return obj.to_dict(validate=False, context=context)
    337 elif isinstance(obj, (list, tuple, np.ndarray)):
--> 338     return [_todict(v, context) for v in obj]
    339 elif isinstance(obj, dict):
    340     return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/utils/schemapi.py:336, in _todict(obj, context)
    334 """Convert an object to a dict representation."""
    335 if isinstance(obj, SchemaBase):
--> 336     return obj.to_dict(validate=False, context=context)
    337 elif isinstance(obj, (list, tuple, np.ndarray)):
    338     return [_todict(v, context) for v in obj]

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/vegalite/v5/api.py:2677, in Chart.to_dict(self, validate, format, ignore, context)
   2673     copy.data = core.InlineData(values=[{}])
   2674     return super(Chart, copy).to_dict(
   2675         validate=validate, format=format, ignore=ignore, context=context
   2676     )
-> 2677 return super().to_dict(
   2678     validate=validate, format=format, ignore=ignore, context=context
   2679 )

File ~/Library/CloudStorage/Dropbox/Programming/altair/altair/vegalite/v5/api.py:926, in TopLevelMixin.to_dict(self, validate, format, ignore, context)
    924 if context.get("pre_transform", True) and _using_vegafusion():
    925     if format == "vega-lite":
--> 926         raise ValueError(
    927             'When the "vegafusion" data transformer is enabled, the \n'
    928             "to_dict() and to_json() chart methods must be called with "
    929             'format="vega". \n'
    930             "For example: \n"
    931             '    >>> chart.to_dict(format="vega")\n'
    932             '    >>> chart.to_json(format="vega")'
    933         )
    934     else:
    935         return _compile_with_vegafusion(vegalite_spec)

ValueError: When the "vegafusion" data transformer is enabled, the 
to_dict() and to_json() chart methods must be called with format="vega". 
For example: 
    >>> chart.to_dict(format="vega")
    >>> chart.to_json(format="vega")

@jonmmease
Contributor Author

Great catch @binste, done in 7485b87
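The commit itself is not shown in this thread, so the following is only a toy model of the failure mode in the traceback above and one plausible shape of a fix. `ToyChart` and its `disable_nested_pre_transform` flag are hypothetical. The point is that nested charts are serialized with the default format, so they hit the format check unless the parent disables pre-transformation in the context it passes down.

```python
class ToyChart:
    """Toy stand-in for a (possibly layered) chart."""

    def __init__(self, layers=(), disable_nested_pre_transform=True):
        self.layers = list(layers)
        self.disable_nested_pre_transform = disable_nested_pre_transform

    def to_dict(self, format="vega-lite", context=None):
        context = dict(context or {})

        # Nested charts must be serialized as plain Vega-Lite fragments.
        child_context = dict(context)
        if self.disable_nested_pre_transform:
            child_context["pre_transform"] = False

        spec = {"layer": [c.to_dict(context=child_context) for c in self.layers]}

        if context.get("pre_transform", True):
            if format == "vega-lite":
                raise ValueError('format must be "vega" when pre-transforming')
            return {"vega": spec}  # placeholder for the compiled Vega spec
        return spec
```

With the flag set to False, `ToyChart([ToyChart()], disable_nested_pre_transform=False).to_dict(format="vega")` raises the ValueError from inside the nested call, which mirrors the layered-chart traceback.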

Contributor

@binste binste left a comment

Works great! Looking forward to the next Altair release where we can include this. 🥳 Whereas originally I wasn't sure how to integrate VegaFusion without making it confusing for users, I think this solution now is a big usability improvement and ties in well with the existing concepts of Altair (data_transformers, to_dict, ...).

@jonmmease
Contributor Author

Thanks for the reviews @mattijn and @binste! Merging this now, and I'll work on updating the documentation next week.

@jonmmease jonmmease merged commit ae8d57b into master Jul 8, 2023