fix: Ensure table uniqueness on update #15909
Conversation
74a6352 to 915c28c
b8a102c to 3735c81
@@ -254,13 +245,31 @@ def get_slice(
        return slc

    @staticmethod
    def get_table_by_name(name: str) -> SqlaTable:
        return get_table_by_name(name)
    def get_table(
Cleanup to ensure consistency.
class TestDatasource(SupersetTestCase):
    def setUp(self):
        self.original_attrs = {}
I really have no idea why this logic is here, and the before_update method exposed its fragility. Rather than using instance variables to save and restore state, using a sub-transaction and rolling back seems considerably cleaner.
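A minimal sketch of that pattern, assuming the Flask-SQLAlchemy session is exposed as db.session and the usual Superset test base class (import paths here are assumptions):

```python
from superset import db  # assumed import path for the Flask-SQLAlchemy handle
from tests.integration_tests.base_tests import SupersetTestCase  # assumed path


class TestDatasource(SupersetTestCase):
    def setUp(self):
        # Open a SAVEPOINT so anything the test changes can be discarded
        # wholesale, instead of saving and restoring attributes by hand.
        db.session.begin_nested()

    def tearDown(self):
        # Roll back to the SAVEPOINT, undoing every in-test mutation.
        db.session.rollback()
```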
@@ -168,25 +158,21 @@ def test_save(self):
        else:
            self.assertEqual(resp[k], datasource_post[k])

    def save_datasource_from_dict(self, datasource_dict):
Hmmm. The local variable was never used; the global variable was used instead. Seems like a mistake.
        data = dict(data=json.dumps(datasource_post))
        resp = self.get_json_resp("/datasource/save/", data)
        return resp
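The fix, presumably, is to serialize the argument that was passed in rather than the module-level dict, roughly (assuming json is already imported in the test module):

```python
    def save_datasource_from_dict(self, datasource_dict):
        # Serialize the dict passed by the caller, not the module-level
        # datasource_post that the old code referenced by accident.
        data = dict(data=json.dumps(datasource_dict))
        resp = self.get_json_resp("/datasource/save/", data)
        return resp
```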
@pytest.mark.usefixtures("load_birth_names_dashboard_with_slices") |
I have no idea how this test worked without first loading the example data.
        datasource_post = get_datasource_post()
        datasource_post["id"] = tbl.id
Yikes! This was updating the global variable.
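One way to avoid mutating shared module state, and roughly what the new get_datasource_post() call implies, is a factory that hands each caller a fresh copy (the template name below is hypothetical; the real fixture payload is larger):

```python
import copy

# Hypothetical module-level template standing in for the real fixture data.
_DATASOURCE_POST_TEMPLATE = {"table_name": "birth_names", "columns": [], "metrics": []}


def get_datasource_post() -> dict:
    # Return a deep copy so a test can set e.g. "id" without leaking that
    # change into other tests through shared module state.
    return copy.deepcopy(_DATASOURCE_POST_TEMPLATE)
```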
@@ -42,11 +42,6 @@
from tests.integration_tests.test_app import app


def get_table_by_name(name: str) -> SqlaTable:
This doesn't need to be redefined. Adhering to the DRY principle.
0a7a81e to 9175b6d
@@ -351,13 +345,6 @@ def test_import_2_slices_for_same_table(self):
        self.assert_slice_equals(slc_2, imported_slc_2)
        self.assertEqual(imported_slc_2.datasource.perm, imported_slc_2.perm)

    def test_import_slices_for_non_existent_table(self):
I'm not sure what the purpose of this test was. If the table doesn't exist, the get_table(...) precursor method will throw.
9a8e25c to f76f2b5
Codecov Report
@@ Coverage Diff @@
## master #15909 +/- ##
==========================================
+ Coverage 76.69% 76.88% +0.19%
==========================================
Files 995 995
Lines 52774 52797 +23
Branches 6691 6695 +4
==========================================
+ Hits 40475 40594 +119
+ Misses 12074 11978 -96
Partials 225 225
Flags with carried forward coverage won't be shown.
Continue to review full report at Codecov.
f76f2b5 to cade3cb
superset/connectors/sqla/models.py
# not exist in the migrations, but is required by `import_from_dict` to ensure the
# correct filters are applied in order to identify uniqueness.
#
# The reason it does not physically exist is MySQL, PostgreSQL have a different
grammar nit: ...is MySQL and PostgreSQL
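For context, a uniqueness constraint that is declared on the model but deliberately left out of the migrations looks roughly like this (heavily simplified, illustrative columns only; not the full Superset model):

```python
from sqlalchemy import Column, Integer, String, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class SqlaTable(Base):  # simplified stand-in for the real Superset model
    __tablename__ = "tables"

    id = Column(Integer, primary_key=True)
    database_id = Column(Integer, nullable=False)
    schema = Column(String(255))
    table_name = Column(String(250), nullable=False)

    # Declared so `import_from_dict` filters on these columns when looking up
    # an existing row, but omitted from the Alembic migrations, so no physical
    # constraint is ever created in MySQL or PostgreSQL.
    __table_args__ = (UniqueConstraint("database_id", "schema", "table_name"),)
```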
lgtm with one nit. thanks for cleaning up this edge case
@@ -53,9 +53,9 @@
logger = logging.getLogger(__name__)


class Slice(
class Slice( # pylint: disable=too-many-instance-attributes,too-many-public-methods
I'm not sure why pylint was complaining on this file as it was unchanged.
Yes, I also faced this issue.
Same here. :-P
* fix: Ensure table uniqueness on update * Update models.py * Update slice.py * Update datasource_tests.py Co-authored-by: John Bodley <john.bodley@airbnb.com>
SUMMARY
The DAO model which @dpgaspar, @villebro, et al. added goes a long way to making #5449 obsolete, i.e., it has the necessary existence checks when creating or updating a table.
There are, however, at least two issues:
1. The Datasource Editor still uses the legacy /datasource/save/ endpoint (per here) rather than the DAO model, so the existence logic is not invoked on update.
2. The legacy CRUD Editor does not use the DAO model.
This means it is currently possible to rename a dataset and end up with multiple datasets sharing the same schema and table name.
Though these issues should be resolved once we fully deprecate the CRUD editor and migrate over to the DAO model, I felt it was prudent to add an interim check which enforces uniqueness. Given the various vectors, I thought the simplest and safest path was to leverage the approach in #5449 and add a SQLAlchemy before_update listener (a rough sketch follows below). Note that although the UX may not be ideal, my sense is it will suffice as an interim measure.
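For illustration, a minimal sketch of what such a listener can look like (simplified column set and error handling; the actual implementation in superset/connectors/sqla/models.py may differ):

```python
import sqlalchemy as sa
from sqlalchemy import event


@event.listens_for(SqlaTable, "before_update")
def ensure_table_uniqueness(mapper, connection, target):  # pylint: disable=unused-argument
    """Reject an update that would duplicate (database, schema, table name)."""
    table = SqlaTable.__table__
    # Query through the low-level connection so only persisted rows are seen,
    # excluding the row that is currently being updated.
    duplicate = connection.execute(
        sa.select([table.c.id]).where(
            sa.and_(
                table.c.database_id == target.database_id,
                table.c.schema == target.schema,
                table.c.table_name == target.table_name,
                table.c.id != target.id,
            )
        )
    ).first()
    if duplicate:
        # The real code presumably raises a Superset-specific exception type.
        raise Exception(
            f"Dataset {target.table_name} already exists for this database and schema"
        )
```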
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Editing/Updating via the Datasource Editor
Editing/Updating via the legacy CRUD Editor
Though this UX is not ideal (i.e., it routes you to Explore), the dataset is not updated and there is messaging indicating that the dataset already exists.
TESTING INSTRUCTIONS
CI and manual testing.
ADDITIONAL INFORMATION