-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add slice_id to query object #25192
Conversation
thank you so much, @hwmarkcheng! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hwmarkcheng thanks for the PR! If form_data
is passed along in the query context, slice_id
should be available via that (if you search for slice_id
under superset/common
you should find how it's being used). But I kind of agree that slice_id
should be a top level property, and I've been considering adding it, too, so maybe this is a good way to be more explicit about the relevant chart.
To avoid duplicated logic related to slice_id
, can you harmonize there to only be one way of passing it in? Also, I didn't notice any frontend changes - we should probably also add automatic population of slice_id
to the frontend.
Hey @villebro! Thank you for your review! Yes, We traced back to where Ah we haven't dived too much into the frontend portion of the code. Would the ask be to add something like a #slice_id into the chart editor UI so that users can easily get the ID of the chart? |
@hwmarkcheng Thanks for your contribution, but I have different potin for this feature, --- how to identify a dashboard or a chart in I don't think this PR is a good pattern in current Query Object because the "slice_id" isn't a part of a Query, we don't need slice_id to generate a SQL. In our fork-superset(Burda Group), we set lots of "tags" in Superset. e.g.: In fact, this is a simple implement, the slice_id or any ID(dashboard_id, or others ids) passed from url parameter. def SQL_QUERY_MUTATOR(sql: str, **kwargs) -> str:
import json
import logging
from enum import Enum
logger = logging.getLogger(__name__)
from flask import request
from superset.security import SupersetSecurityManager
from superset.models.dashboard import Dashboard
from superset.models.slice import Slice as Chart
from superset.datasource.dao import DatasourceDAO
from superset import db
class QueryType(str, Enum):
DASHBOARD = "dashboard"
CHART = "chart"
REPORT = "report"
SQL_LAB = "sql_lab"
EXPLORE = "explore"
NATIVE_FILTER = "native_filter"
form_data = request.args.get("form_data", "")
try:
form_data = json.loads(form_data)
except json.JSONDecodeError:
form_data = {}
slice_id = form_data.get("slice_id")
dashboard_id = request.args.get("dashboard_id")
query_info_dict = {}
if dashboard_id is not None:
# 1) Query from Dashboard
dashboard = Dashboard.get(str(dashboard_id))
query_info_dict["query_type"] = QueryType.DASHBOARD.value
query_info_dict["dashboard_id"] = dashboard_id
query_info_dict["dashboard_title"] = getattr(dashboard, "dashboard_title", None)
if slice_id is not None:
# 2) Query from Chart
chart = Chart.get(slice_id)
query_info_dict["query_type"] = QueryType.CHART.value
query_info_dict["chart_id"] = slice_id
query_info_dict["chart_title"] = getattr(chart, "slice_name", None)
if slice_id is None and dashboard_id is None and "sql_json" in request.path:
# 3) Query from SQL Lab
query_info_dict["query_type"] = QueryType.SQL_LAB.value
if (
slice_id is None
and dashboard_id is None
and ({"/api/v1/chart/data", "/superset/explore_json/"}).intersection({request.path})
):
# 4) Query from Explore(unsaved chart)
query_info_dict["query_type"] = QueryType.EXPLORE.value
if request.args.get("tag") == "report" and request.args.get("execution_id"):
# 5) Query from Report
query_info_dict["query_type"] = QueryType.REPORT.value
query_info_dict["report_id"] = request.args.get("report_id")
query_info_dict["execution_id"] = request.args.get("execution_id")
query_info_dict["report_name"] = request.args.get("report_name")
if request.args.get("tag") == "native_filter" and request.args.get("filter_name"):
# 6) a query from Dashboard Native Filter
query_info_dict["query_type"] = QueryType.NATIVE_FILTER.value
query_info_dict["filter_name"] = request.args.get("filter_name")
....
.... |
yeah, to me it feels a little strange that we're taking the slice_id from the form_data, and then fetching the slice from it and then getting the id off the slice, when we could just pull it from the form_data that's passed into the query_context. Which is prob another reason @villebro why you suggested to create only way to pass in the chart.. makes sense. Maybe to make this simpler for now, I'd suggest can we move forward with just using the form_data from the query_context instead of explicitly passing in the chart/slice_id into the query_obj? It also feels to me at least that the chart/slice_id has no relationship to the query object, but it does have a relationship to the form_data which is more closely related to a chart. |
I agree with @zhaoyongjie on this point as well. I think a middle ground could be what @villebro had alluded to earlier that I commented on, which is that the slice_id already exists in the form data. LMK if you think that approach works @zhaoyongjie. |
Thank you everyone for chiming in here! We agree that it doesn't make much sense to add
|
Thanks @hwmarkcheng. Yeah if everyone else is ok with # 1, I think it sounds like a good plan. I still think it's a good idea to pass form_data to the mutator as taking it from the url could be brittle, and there will be use cases of api requests rather than a browser request that could trigger a query. |
Closing PR as I'm looking to just revert my changes (accidentally merged upstream on fork). Will look to start a clean fork and implement proposed changes, apologies for the confusion here! |
Hi folks! @eschutho @villebro Thank you for your feedback on this thread! I have been looking into the suggested approach you mentioned with @hwmarkcheng, about pulling the While it makes sense that currently The problem is however, that the places in which the final query string is constructed ( This is the reason why we injected the value into the object as opposed to just pulling from Should we be looking to make those changes and pass them around anyways? The other approach of modifying the SQL_QUERY_MUTATOR to use something like request args seems to brittle... If you have any further guidance on what approach would be best, we would love to hear your input. Thank you so much!! |
Thanks for that context @MDeadman, and that's a good assessment. If you'd like to reopen the PR and rebase with master, I'm happy to take another look at the changes proposed and try to help get this merged for you. |
It looks like the linter is having a problem. Hopefully running |
Thanks! I've ran the |
Still noticing some Errors in the linting CI step after running the pre-commit command. slice_id isn't really being directly used, but just passed through.
|
I have some ideas around this. There have been a few other cases where we've needed to pass some data around for the purpose of logging, and maybe I can tackle all of those with a new global pattern. I'll try to get a PR up for review in the next few days. |
SUMMARY
Hi team,
We're looking to make a few changes that would allow folks to populate the
sql_query_mutator
with more useful params. Ultimately we're hoping to get chart_id and dataset_id into query comments, to automate our data masking with Satori. Open to discussion on the best practices to do so.These are the three items we're looking to change:
slice_id
toquery_obj
: As addition context forquery_obj
. Having slice_id is versatile and can open doors for powerful integrations/automations. We originally wanted to add in the whole slice object for more options, but some integration tests failed since the object cannot be converted to a dictionary. Happy to discuss the best way to do this going forward.chart_id
anddataset_id
as default parameters for mutate_query_from_configExample Usecase, Automated Data Access Requests via Satori:
Context: We use Satori for our data access management.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Changed query_obj tests to account for additional parameters:
ADDITIONAL INFORMATION