Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add schema level access control on csv upload #5787

Merged

Conversation

youngyjd
Copy link
Contributor

@youngyjd youngyjd commented Aug 31, 2018

This PR is to support schema-level access control for csv upload. There are several changes here:

  1. When add/edit a database configuration, admin users can set schemas allowed to be accessed by csv uploads in Extra field.
  2. schema field is changed to a dropdown menu instead of a text input, if schema-level access control has been provided for this database.
  3. If allow_csv_upload is checked for a database but nothing is set for schemas_allowed_for_csv_upload, all schemas can be accessed, so it will not break backward compatibility.

demo

@youngyjd youngyjd force-pushed the csv-upload-schema-level-access-control branch from 5c80077 to 3f19778 Compare August 31, 2018 20:26
@codecov-io
Copy link

codecov-io commented Sep 5, 2018

Codecov Report

Merging #5787 into master will increase coverage by 0.03%.
The diff coverage is 75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5787      +/-   ##
==========================================
+ Coverage   48.19%   48.23%   +0.03%     
==========================================
  Files         393      393              
  Lines       23600    23645      +45     
  Branches     2630     2630              
==========================================
+ Hits        11375    11406      +31     
- Misses      12225    12239      +14
Impacted Files Coverage Δ
superset/utils.py 89.28% <100%> (+0.01%) ⬆️
superset/security.py 75.55% <50%> (ø) ⬆️
superset/forms.py 92.3% <75%> (-5.81%) ⬇️
superset/models/core.py 85% <75%> (-0.1%) ⬇️
superset/views/core.py 73.84% <75.86%> (-0.11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 71f014e...1c72ee4. Read the comment docs.

@youngyjd
Copy link
Contributor Author

youngyjd commented Sep 5, 2018

PTAL
@mistercrunch Per discussed, I put the field in the Extra of database configuration.
@betodealmeida @hughhhh

@betodealmeida
Copy link
Member

@youngyjd, taking a look at this, sorry for the delay.

@timifasubaa
Copy link
Contributor

Looks good, but can you use the security manager to control access to uploading Schemas? This seems like a functionality different deployments will want to override.

@youngyjd youngyjd force-pushed the csv-upload-schema-level-access-control branch 2 times, most recently from 0b58521 to 041063e Compare September 13, 2018 21:04
@youngyjd youngyjd force-pushed the csv-upload-schema-level-access-control branch from 041063e to e87cf4d Compare September 13, 2018 21:28
@youngyjd
Copy link
Contributor Author

@mistercrunch @betodealmeida @timifasubaa Introduced one more layer of access control by bringing in security_manager. PTAL. Thanks a lot.

Copy link
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youngyjd, thanks for the thorough work! I have just a few comments, but this looks awesome!

description=_('Specify a schema (if database flavour supports this).'),
validators=[Optional()],
widget=BS3TextFieldWidget(),
filters=[lambda x: x or None])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the or None here? Is it a way of casting falsy values (0, [], '') to None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not notice that. I just copy-pasted to move these lines of code for a better order showing on the UI. Will take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick look. I think the original author wrote this because that if no schema is provided on the frontend, the backend will receive an empty string which is '', but it should be None otherwise the database will throw some kind of error like schema does not exist. So yes, I think it is just casting falsy values, specifically '' to None.

if (security_manager.database_access(database) or
security_manager.all_datasource_access()):
return True
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else here is not needed, so you might want to remove it and dedent the block below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. thanks

get_pk=lambda a: a.id, get_label=lambda a: a.database_name)
schema = StringField(
_('Schema'),
description=_('Specify a schema (if database flavour supports this).'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "flavour" is British English, it would be better to have US English ("flavor") as the default language.

allow_ctas = Column(Boolean, default=False)
allow_dml = Column(Boolean, default=False)
force_ctas_schema = Column(String(250))
allow_multi_schema_metadata_fetch = Column(Boolean, default=True)
extra = Column(Text, default=textwrap.dedent("""\
{
"metadata_params": {},
"engine_params": {}
"engine_params": {},
"schemas_allowed_for_csv_upload": []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: leave a trailing comma on the last line, it makes it easier to sort lines and for better git blame.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to remove the trailing comma. Other wise it will show error Expecting property name enclosed in double quotes: line 4 column 1 (char 57) when I try to save a db. I guess it is related to json formatting issue?

screen shot 2018-09-19 at 1 50 04 pm

message = _('Database "{0}" Schema "{1}" is not allowed for csv uploads. '
'Please contact Superset Admin'.format(database.database_name,
schema_name))
flash(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be written in a single language:

flash(message, 'danger')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops. I must have copy-paste from somewhere and then made some changes but did not aware of the styling issue. thanks

This situation is impossible and user is unable to upload csv
b) if database supports schema
user is able to upload to schema in schemas_allowed_for_csv_upload
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@youngyjd youngyjd force-pushed the csv-upload-schema-level-access-control branch from d3a35ef to 72b3122 Compare September 19, 2018 22:05
@youngyjd youngyjd force-pushed the csv-upload-schema-level-access-control branch from 72b3122 to 1c72ee4 Compare September 19, 2018 22:31
@betodealmeida betodealmeida merged commit b6d7d57 into apache:master Sep 20, 2018
betodealmeida pushed a commit to lyft/incubator-superset that referenced this pull request Oct 12, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Oct 17, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Nov 2, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
youngyjd added a commit to lyft/incubator-superset that referenced this pull request Nov 2, 2018
* Add schema level access control on csv upload

* add db migrate merge point

* fix flake 8

* fix test

* remove unnecessary db migration

* fix flake

* nit

* fix test for test_schemas_access_for_csv_upload_endpoint

* fix test_csv_import test

* use security_manager to check whether schema is allowed to be accessed

* bring security manager to the party

* flake8 & repush to retrigger test

* address comments

* remove trailing comma

(cherry picked from commit b6d7d57)
@@ -844,6 +844,7 @@ def get_or_create_main_db():
dbobj.set_sqlalchemy_uri(conf.get('SQLALCHEMY_DATABASE_URI'))
dbobj.expose_in_sqllab = True
dbobj.allow_run_sync = True
dbobj.allow_csv_upload = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youngyjd why is this explicitly set? Note in general one would not want to have users upload to the main database. Secondly if you disable this setting in the UI then every time the app is restarted the database record is updated.

@mistercrunch personally I don't think we should be setting any of these fields if the dbobj exists, i.e., I feel the logic should be:

if not dbobj:
    dbobj = models.Database(database_name='main')
    dbobj.set_sqlalchemy_uri(conf.get('SQLALCHEMY_DATABASE_URI'))
    dbobj.expose_in_sqllab = False
    dbobj.allow_run_sync = False
    dbobj.allow_csv_upload = False

i.e., all these settings should be initialized as False and never updated.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.28.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.28.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants