Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Repeated boilerplate code between upload to database forms #16756

Merged
merged 18 commits into from
Oct 25, 2021

Conversation

exemplary-citizen
Copy link
Contributor

SUMMARY

class CsvToDatabaseForm(DynamicForm):
    # pylint: disable=E0211
    def csv_allowed_dbs() -> List[Database]:  # type: ignore
        csv_enabled_dbs = (
            db.session.query(Database).filter_by(allow_csv_upload=True).all()
        )
        return [
            csv_enabled_db
            for csv_enabled_db in csv_enabled_dbs
            if CsvToDatabaseForm.at_least_one_schema_is_allowed(csv_enabled_db)
        ]

There's unnecessary duplication here between CSV and other forms.

Instead we can abstract this into a class UploadToDatabaseForm(DynamicForm) with all the repeated boilerplate and then extend those into class CsvToDatabaseForm(UploadToDatabaseForm), class ExcelToDatabaseForm(UploadToDatabaseForm) etc.

TESTING INSTRUCTIONS

Watch CI and any suggestions from reviewers

ADDITIONAL INFORMATION

Fixes #16046

  • Has associated issue: Repeated boilerplate code between upload to database forms #16046
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

cc: @villebro

@exemplary-citizen exemplary-citizen requested a review from a team as a code owner September 21, 2021 01:40
@superset-github-bot superset-github-bot bot added the dropbox Dropbox related label Sep 21, 2021
@github-actions
Copy link
Contributor

⚠️ @exemplary-citizen Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

1 similar comment
@github-actions
Copy link
Contributor

⚠️ @exemplary-citizen Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

@betodealmeida betodealmeida self-requested a review September 24, 2021 04:11
@codecov
Copy link

codecov bot commented Sep 24, 2021

Codecov Report

Merging #16756 (227f8ee) into master (37944e1) will decrease coverage by 0.21%.
The diff coverage is 78.37%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16756      +/-   ##
==========================================
- Coverage   76.89%   76.68%   -0.22%     
==========================================
  Files        1038     1038              
  Lines       55515    55494      -21     
  Branches     7564     7564              
==========================================
- Hits        42690    42556     -134     
- Misses      12575    12688     +113     
  Partials      250      250              
Flag Coverage Δ
hive ?
javascript 70.90% <37.50%> (ø)
mysql 81.91% <89.65%> (+0.01%) ⬆️
postgres 81.92% <89.65%> (+0.01%) ⬆️
presto ?
python 82.00% <89.65%> (-0.41%) ⬇️
sqlite 81.59% <89.65%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...c/views/CRUD/data/database/DatabaseModal/index.tsx 43.57% <0.00%> (ø)
...set-frontend/src/views/CRUD/data/database/types.ts 100.00% <ø> (ø)
superset/config.py 91.47% <ø> (ø)
superset/databases/api.py 93.00% <ø> (ø)
superset/views/database/mixins.py 81.03% <ø> (-1.73%) ⬇️
superset/databases/commands/export.py 90.47% <66.66%> (ø)
superset/views/database/forms.py 94.44% <85.71%> (+7.34%) ⬆️
superset/databases/schemas.py 98.40% <87.50%> (ø)
...tend/src/views/CRUD/data/database/DatabaseList.tsx 78.07% <100.00%> (ø)
.../CRUD/data/database/DatabaseModal/ExtraOptions.tsx 93.18% <100.00%> (ø)
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 37944e1...227f8ee. Read the comment docs.

@exemplary-citizen
Copy link
Contributor Author

@villebro do you mind taking a look when you get a chance?

@github-actions
Copy link
Contributor

⚠️ @exemplary-citizen Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question on the db migration (also, this needs a rebase). I think it may be safer to leave the column name in dbs as-is and only change the rest of the codebase so that we can avoid a database migration.

Comment on lines 34 to 44
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.add_column(
sa.Column(
"allow_file_upload",
sa.Boolean(),
nullable=False,
server_default=expression.true(),
)
)
batch_op.drop_column("allow_csv_upload")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this remove all existing allow_csv_upload values and set the new allow_file_upload to TRUE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding and dropping the columns, I think we can go with this renaming pattern instead:

Suggested change
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.add_column(
sa.Column(
"allow_file_upload",
sa.Boolean(),
nullable=False,
server_default=expression.true(),
)
)
batch_op.drop_column("allow_csv_upload")
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.alter_column("allow_csv_upload", new_column_name="allow_file_upload")

class ExcelToDatabaseForm(DynamicForm):
# pylint: disable=E0211
def excel_allowed_dbs() -> List[Database]: # type: ignore
# TODO: change allow_csv_upload to allow_file_upload
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a longstanding TODO so I do think we should go ahead with the db migration.

Comment on lines 34 to 44
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.add_column(
sa.Column(
"allow_file_upload",
sa.Boolean(),
nullable=False,
server_default=expression.true(),
)
)
batch_op.drop_column("allow_csv_upload")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding and dropping the columns, I think we can go with this renaming pattern instead:

Suggested change
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.add_column(
sa.Column(
"allow_file_upload",
sa.Boolean(),
nullable=False,
server_default=expression.true(),
)
)
batch_op.drop_column("allow_csv_upload")
def upgrade():
with op.batch_alter_table("dbs") as batch_op:
batch_op.alter_column("allow_csv_upload", new_column_name="allow_file_upload")

@github-actions
Copy link
Contributor

⚠️ @exemplary-citizen Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

@exemplary-citizen
Copy link
Contributor Author

@villebro can you take another look?

@villebro
Copy link
Member

Thanks @exemplary-citizen, sorry for the delay, I'll give this a new review + test as soon as I have some spare time (hoping to do it this week).

@villebro villebro self-requested a review October 21, 2021 11:57
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work @exemplary-citizen ! LGTM and tested to work well!

@villebro villebro merged commit ef3afbd into apache:master Oct 25, 2021
@betodealmeida
Copy link
Member

betodealmeida commented Oct 28, 2021

This migration is failing on MySQL for me, for some reason:

sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (3959, "Check constraint 'dbs_chk_9' uses column 'allow_csv_upload', hence column cannot be dropped or renamed.")

Looks like it happens in MySQL 8: sqlalchemy/alembic#699

@john-bodley
Copy link
Member

@betodealmeida @exemplary-citizen @villebro it seems like there should be an entry in UPDATING.md for this as the DDL operation may require downtime.

exemplary-citizen added a commit to exemplary-citizen/superset that referenced this pull request Oct 29, 2021
@john-bodley
Copy link
Member

Also @betodealmeida @exemplary-citizen @villebro the current migration is not suffice. The dbs.extra JSON encoded field needs to be updated given that the schemas_allowed_for_csv_upload field was renamed to schemas_allowed_for_file_upload.

@exemplary-citizen
Copy link
Contributor Author

@john-bodley just opened #17294. feel free to assign to me. I'll start working on a fix

@john-bodley
Copy link
Member

Thanks @exemplary-citizen. I've assigned the issue to you. Feel free to add me as a reviewer and/or ping me in Slack—given that most of my GitHub mentions seem to get lost in the aether.

nytai pushed a commit that referenced this pull request Oct 30, 2021
Co-authored-by: John Bodley <4567245+john-bodley@users.noreply.github.com>
@betodealmeida
Copy link
Member

Another problem with this PR that I missed: it changed a versioned schema (ImportV1DatabaseExtraSchema), breaking import/export.

(Not complaining or assigning blame, just calling it out so we catch these things in the future.)

@junlincc
Copy link
Member

Sorry for being late the the discussion. @betodealmeida @john-bodley thanks for calling out.
Looks like this PR involves DB migration? We should definitely come up with a guideline for reviewing PRs that requires DB migration as they have higher risk. At least add the risk label while reviewing the PR.
@villebro

AAfghahi pushed a commit that referenced this pull request Jan 10, 2022
Co-authored-by: John Bodley <4567245+john-bodley@users.noreply.github.com>
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.5.0 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels dropbox Dropbox related size/L 🚢 1.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Repeated boilerplate code between upload to database forms
6 participants