Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(reports): execute as other than selenium user #21931

Merged
merged 11 commits into from
Oct 31, 2022

Conversation

villebro
Copy link
Member

@villebro villebro commented Oct 25, 2022

SUMMARY

Currently Alerts & Reports only supports executing the report and alert queries as the THUMBNAIL_SELENIUM_USER. This can be problematic in a scenario where user impersonation is enabled and there isn't a service account that has full access to all datasources or Superset objects.

To remedy this, this PR adds a config option ALERT_REPORTS_EXECUTE_AS making it possible to execute reports as

  1. the creator if contained in the owners list
  2. the creator
  3. the last modifier if contained in the owners list
  4. the last modifier
  5. an owner (giving priority to the last modifier, then the creator if either is contained within the owners list, finally picking the first owner in the list)
  6. the THUMBNAIL_SELENIUM_USER

The config is a list of values, meaning that if the first option isn't found (e.g. in the case of alerts/reports created programmatically where the created_by, changed_by or owners is undefined), it will move to the next option. By default it will be set to use the selenium user so as to not introduce a breaking change.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Oct 25, 2022

Codecov Report

Merging #21931 (adeaeb2) into master (66f166b) will increase coverage by 0.05%.
The diff coverage is 97.29%.

@@            Coverage Diff             @@
##           master   #21931      +/-   ##
==========================================
+ Coverage   66.92%   66.98%   +0.05%     
==========================================
  Files        1807     1808       +1     
  Lines       69238    69289      +51     
  Branches     7407     7407              
==========================================
+ Hits        46338    46412      +74     
+ Misses      20991    20968      -23     
  Partials     1909     1909              
Flag Coverage Δ
hive 52.89% <35.13%> (-0.03%) ⬇️
mysql 78.35% <75.67%> (-0.02%) ⬇️
postgres 78.41% <75.67%> (?)
presto 52.80% <35.13%> (-0.03%) ⬇️
python 81.51% <97.29%> (+0.09%) ⬆️
sqlite 76.90% <75.67%> (-0.02%) ⬇️
unit 51.16% <75.67%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/reports/commands/execute.py 92.60% <93.75%> (+0.32%) ⬆️
superset/reports/utils.py 97.43% <97.43%> (ø)
superset/config.py 91.77% <100.00%> (+0.05%) ⬆️
superset/reports/commands/exceptions.py 98.26% <100.00%> (ø)
superset/reports/types.py 100.00% <100.00%> (ø)
superset/utils/machine_auth.py 94.23% <100.00%> (+0.11%) ⬆️
superset/utils/screenshots.py 48.69% <100.00%> (+0.45%) ⬆️
superset/utils/webdriver.py 79.77% <100.00%> (+0.22%) ⬆️
superset/dashboards/api.py 92.52% <0.00%> (ø)
superset/databases/api.py 95.29% <0.00%> (+0.31%) ⬆️
... and 8 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@villebro villebro force-pushed the villebro/report-execute-as branch 2 times, most recently from 69ed115 to 0af896d Compare October 25, 2022 18:35
if user_type == "selenium":
return app.config["THUMBNAIL_SELENIUM_USER"]
if user_type == "creator":
if user := report_schedule.created_by:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@villebro the walrus operator isn't supported in Python 3.8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dashboard_state = self._report_schedule.extra.get("dashboard")
if dashboard_state:
permalink_key = CreateDashboardPermalinkCommand(
dashboard_id=self._report_schedule.dashboard_id,
dashboard_id=str(self._report_schedule.dashboard_id),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Mypy not working as expected? Also why should the ID be a string?

Copy link
Member Author

@villebro villebro Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mypy actually picked this up. This is due to slugs and ids being used interchangeably in the codebase (I remember being frustrated about this when working on the permalink feature). In older dashboard and chart code this is usually referred to as id_or_slug which is a str, but in the permalink code we decided to just go with (dashboard|chart)_id.

@villebro villebro force-pushed the villebro/report-execute-as branch 10 times, most recently from 1dfe1ca to e833ea3 Compare October 26, 2022 13:47
@villebro villebro changed the title [WIP] feat(reports): execute as other than selenium user feat(reports): execute as other than selenium user Oct 26, 2022
@@ -15,6 +15,8 @@
# specific language governing permissions and limitations
# under the License.

from __future__ import annotations
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor bycatches

Comment on lines -253 to +254
class ReportScheduleSelleniumUserNotFoundError(CommandException):
message = _("Report Schedule sellenium user not found")
class ReportScheduleUserNotFoundError(CommandException):
message = _("Report Schedule user not found")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a typo here + we should make this more generic now as it's not really bound to a specific Selenium user

Copy link
Member

@zhaoyongjie zhaoyongjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested in my local, and work as expected. leave a non-blocked suggestions.

Could output the Report/Alert's sender in the logging? Because it's hard to determine who send the notification from the current logging.

[2022-10-27 15:38:00,050: INFO/ForkPoolWorker-8] Scheduling alert Test Report eta: 2022-10-27 07:38:00
Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
[2022-10-27 15:38:00,059: INFO/ForkPoolWorker-8] Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
[2022-10-27 15:38:00,060: INFO/ForkPoolWorker-8] session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
[2022-10-27 15:38:00,135: INFO/ForkPoolWorker-8] Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
Would send notification for alert Test Report, to {"target": "foo@foo.com"}
[2022-10-27 15:38:00,398: INFO/ForkPoolWorker-8] Would send notification for alert Test Report, to {"target": "foo@foo.com"}

@villebro
Copy link
Member Author

Tested in my local, and work as expected. leave a non-blocked suggestions.

Could output the Report/Alert's sender in the logging? Because it's hard to determine who send the notification from the current logging.

[2022-10-27 15:38:00,050: INFO/ForkPoolWorker-8] Scheduling alert Test Report eta: 2022-10-27 07:38:00
Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
[2022-10-27 15:38:00,059: INFO/ForkPoolWorker-8] Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
[2022-10-27 15:38:00,060: INFO/ForkPoolWorker-8] session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
[2022-10-27 15:38:00,135: INFO/ForkPoolWorker-8] Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
Would send notification for alert Test Report, to {"target": "foo@foo.com"}
[2022-10-27 15:38:00,398: INFO/ForkPoolWorker-8] Would send notification for alert Test Report, to {"target": "foo@foo.com"}

@zhaoyongjie that's an excellent idea! 👏 I will add it

Copy link
Member

@zhaoyongjie zhaoyongjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I love the new logging because it makes debugging so easy.

superset/config.py Outdated Show resolved Hide resolved
superset/reports/commands/execute.py Outdated Show resolved Hide resolved
superset/utils/machine_auth.py Show resolved Hide resolved
@dpgaspar
Copy link
Member

Tested in my local, and work as expected. leave a non-blocked suggestions.
Could output the Report/Alert's sender in the logging? Because it's hard to determine who send the notification from the current logging.

[2022-10-27 15:38:00,050: INFO/ForkPoolWorker-8] Scheduling alert Test Report eta: 2022-10-27 07:38:00
Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
[2022-10-27 15:38:00,059: INFO/ForkPoolWorker-8] Executing alert/report, task id: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79, scheduled_dttm: 2022-10-27T07:38:00
session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
[2022-10-27 15:38:00,060: INFO/ForkPoolWorker-8] session is validated: id 4, executionid: 0aa521ec-670d-4c0f-aec7-78c13fb7ba79
Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
[2022-10-27 15:38:00,135: INFO/ForkPoolWorker-8] Getting chart from http://127.0.0.1:8088/api/v1/chart/316/data/?format=json&type=post_processed&force=true
Would send notification for alert Test Report, to {"target": "foo@foo.com"}
[2022-10-27 15:38:00,398: INFO/ForkPoolWorker-8] Would send notification for alert Test Report, to {"target": "foo@foo.com"}

@zhaoyongjie that's an excellent idea! 👏 I will add it

Let's avoid logging user emails please

@villebro villebro force-pushed the villebro/report-execute-as branch 2 times, most recently from 98eb58e to 6575aa7 Compare October 31, 2022 07:28
@villebro villebro merged commit a02a778 into apache:master Oct 31, 2022
@villebro villebro deleted the villebro/report-execute-as branch October 31, 2022 12:34
@Always-prog
Copy link
Contributor

Hello @villebro!
I have installed lastest Superset (including this PR), and tried to use my old config for Alerts and Reports. And not works :(.
Every report just waits endlessly.
But I use ALERT_REPORTS_NOTIFICATION_DRY_RUN it says that task is finished successfully.
Did you tested email reports without ALERT_REPORTS_NOTIFICATION_DRY_RUN?

@villebro
Copy link
Member Author

villebro commented Nov 7, 2022

@Always-prog yes, I'm actually using this locally and it works as expected. Endless execution usually implies that the report wasn't able to render successfully.

To double check that there's a regression, are you able to revert this PR and make it work again? Also, did you try to execute as the Selenium user with this PR as is the default? Doing it like this should work similarly as before - if not there may very well be a regression that I'm not aware of.

@Always-prog
Copy link
Contributor

Always-prog commented Nov 8, 2022

Thank you for answer. I'll test it again.
But I see it not working in docker-compose-non-dev also.

@Always-prog
Copy link
Contributor

Always-prog commented Nov 8, 2022

Sorry, my bad!
I used wrong config.
This PR works fine.

@villebro
Copy link
Member Author

villebro commented Nov 8, 2022

@Always-prog thanks for confirming 👍

@jileeon
Copy link
Contributor

jileeon commented Mar 13, 2023

@villebro, Thanks for your commitment :)
I look forward to using this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants