-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(native-filters): Fetch only the required dataset fields #23303
chore(native-filters): Fetch only the required dataset fields #23303
Conversation
Codecov Report
@@ Coverage Diff @@
## master #23303 +/- ##
==========================================
+ Coverage 56.27% 65.93% +9.66%
==========================================
Files 1907 1907
Lines 73495 73590 +95
Branches 7977 7982 +5
==========================================
+ Hits 41356 48524 +7168
+ Misses 30091 23018 -7073
Partials 2048 2048
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 341 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
31a91aa
to
1b53e5d
Compare
Closing (for now) in favor of #23314. |
419f484
to
cce1165
Compare
b602228
to
01ffa3b
Compare
01ffa3b
to
8a64476
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @john-bodley. I left some comments.
@@ -85,7 +86,9 @@ export function ColumnSelect({ | |||
} | |||
if (datasetId != null) { | |||
cachedSupersetGet({ | |||
endpoint: `/api/v1/dataset/${datasetId}`, | |||
endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The columns
state is later used to get currentColumn
which is then passed to the filterValues
function. If you check filterValues
references, you will see that is_dttm
and type_generic
attributes are also used.
@@ -654,7 +655,20 @@ const FiltersConfigForm = ( | |||
useEffect(() => { | |||
if (datasetId) { | |||
cachedSupersetGet({ | |||
endpoint: `/api/v1/dataset/${datasetId}`, | |||
endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({ | |||
columns: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found other attributes that need to be queried when analyzing the following files:
Line 138 in da3791a
const { |
database.id
datasource_name
schema
is_sqllab_view
Line 121 in da3791a
const isColumnBoolean = |
columns.type
superset/superset-frontend/packages/superset-ui-chart-controls/src/utils/getTemporalColumns.ts
Line 43 in da3791a
if (isDataset(datasource)) { |
columns.is_dttm
columns.name
main_dttm_col
return !!datasource && 'results' in datasource && 'sql' in datasource; |
results
sql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michael-s-molina it seem like results
isn't an attribute returned per here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@john-bodley You're right. Given that results
is not listed in show_select_columns
, I'm assuming that it's populated elsewhere and you projections won't affect this behavior. Tagging @zhaoyongjie and @villebro in case they know something different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I think this is needed in the Explore control panel only, hence shouldn't affect dashboards/native filters.
8a64476
to
61dd061
Compare
...ashboard/components/nativeFilters/FiltersConfigModal/FiltersConfigForm/FiltersConfigForm.tsx
Outdated
Show resolved
Hide resolved
@villebro @zhaoyongjie Can you double check if there are no missing attributes or invalid projection names? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested to work as expected and LGTM after @michael-s-molina 's database_name
comment has been addressed.
'columns.is_dttm', | ||
'columns.type_generic', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100 % sure on this (more like 90 %), but I feel like is_dttm
is no longer necessary, and type_generic === GenericDataType.TEMPORAL
should always be true for any col that has is_dttm === true
, and is the preferred way of checking for temporal columns in the frontend. So if we want to reduce payload size, this is kind of a low hanging fruit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro thanks for the review. If your hypothesis is correct then it seems like a follow up PR would be to deprecate the is_dttm
column everywhere.
@@ -654,7 +655,20 @@ const FiltersConfigForm = ( | |||
useEffect(() => { | |||
if (datasetId) { | |||
cachedSupersetGet({ | |||
endpoint: `/api/v1/dataset/${datasetId}`, | |||
endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({ | |||
columns: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I think this is needed in the Explore control panel only, hence shouldn't affect dashboards/native filters.
…rsConfigModal/FiltersConfigForm/FiltersConfigForm.tsx Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com>
@@ -664,7 +664,7 @@ const FiltersConfigForm = ( | |||
'columns.type', | |||
'columns.verbose_name', | |||
'database.id', | |||
'database_name', | |||
'database.database_name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's somewhat weird how FAB doesn't barf when you give it invalid columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
SUMMARY
This PR partially addresses recommendation #1 in #14383 by requesting—via the FAB API—only the required fields (columns) from the
/api/v1/dataset/{pk}
RESTful GET endpoint.Fetching the entirety of dataset (especially a very large datasets) can be rather expensive which. #22413 made significant performance improvements (reducing the response time from > 300 seconds to ~ 5 seconds) however later #23113 was introduced to address some performance tradeoffs (in other contexts) by disabling eager loading (increasing the response time from ~ 5 seconds to ~ 30 seconds). The bulk of this increased cost was from the repeated lazy loading of the back referenced table i.e.,
self.table
, for every dataset column (even though this is always the same table; possibly a SQLAlchemy quirk) for a subset of theTableColumn
fields including the advanced data type.This PR updates—in the context of the native filters modal—the
/api/v1/dataset/{pk}
requests to fetch only the subset of columns required in the response. This helps to ensure the we're:For the dataset in question said change reduced the response time from ~ 30 seconds to ~ 3 seconds which significantly improves the UX when interfacing with the modal.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
CI.
ADDITIONAL INFORMATION