Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for external Hive Metastores including AWS Glue #400

Merged
merged 19 commits into from
Oct 28, 2023

Conversation

FastLee
Copy link
Contributor

@FastLee FastLee commented Oct 6, 2023

Closes #337

@FastLee FastLee requested review from a team October 6, 2023 21:11
@codecov
Copy link

codecov bot commented Oct 6, 2023

Codecov Report

Merging #400 (403a376) into main (ebf66ad) will decrease coverage by 0.29%.
The diff coverage is 62.96%.

@@            Coverage Diff             @@
##             main     #400      +/-   ##
==========================================
- Coverage   80.61%   80.32%   -0.29%     
==========================================
  Files          31       31              
  Lines        3240     3294      +54     
  Branches      625      643      +18     
==========================================
+ Hits         2612     2646      +34     
- Misses        479      493      +14     
- Partials      149      155       +6     
Files Coverage Δ
src/databricks/labs/ucx/config.py 86.92% <100.00%> (+0.20%) ⬆️
src/databricks/labs/ucx/install.py 81.56% <61.53%> (-2.33%) ⬇️

src/databricks/labs/ucx/config.py Show resolved Hide resolved
src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
@FastLee FastLee requested a review from nfx October 9, 2023 16:25
@nfx nfx added feat/crawler migrate/groups Corresponds to Migrate Groups Step of go/uc/upgrade migrate/external go/uc/upgrade SYNC EXTERNAL TABLES step step/assign metastore go/uc/upgrade Assign Metastore migrate/redash go/uc/upgrade Upgrade DBSQL Warehouses migrate/clusters go/uc/upgrade Upgrade Interactive Clusters feat/installer install/upgrade the app labels Oct 19, 2023
@gitguardian
Copy link

gitguardian bot commented Oct 25, 2023

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
src/databricks/labs/ucx/install.py Show resolved Hide resolved
src/databricks/labs/ucx/install.py Outdated Show resolved Hide resolved
src/databricks/labs/ucx/install.py Show resolved Hide resolved
@FastLee FastLee requested a review from nfx October 27, 2023 17:03
FastLee and others added 7 commits October 27, 2023 13:05
Any installations that are done from unreleased versions will get a
different format. For example, the latest released version is `0.24.0`
and there were 15 commits to `main` branch since the last release.
`./install.sh` will tell about `Installing v0.24.1+1520231006230102`.
Copy link
Collaborator

@nfx nfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure integration tests pass and then you can merge

@nfx nfx changed the title Feature/glue support Added support for external Hive Metastores including AWS Glue Oct 27, 2023
@FastLee FastLee temporarily deployed to account-admin October 27, 2023 20:07 — with GitHub Actions Inactive
@nfx nfx merged commit d68dacc into main Oct 28, 2023
@nfx nfx deleted the feature/glue_support branch October 28, 2023 15:08
nfx added a commit that referenced this pull request Nov 17, 2023
**Breaking changes** (existing installations need to reinstall UCX and re-run assessment jobs)

 * Switched local group migration component to rename groups instead of creating backup groups ([#450](#450)).
 * Mitigate permissions loss in Table ACLs by folding grants belonging to the same principal, object id and object type together ([#512](#512)).

**New features**

 * Added support for the experimental Databricks CLI launcher ([#517](#517)).
 * Added support for external Hive Metastores including AWS Glue ([#400](#400)).
 * Added more views to assessment dashboard ([#474](#474)).
 * Added rate limit for creating backup group to increase stability ([#500](#500)).
 * Added deduplication for mount point list ([#569](#569)).
 * Added documentation to describe interaction with external Hive Metastores ([#473](#473)).
 * Added failure injection for job failure message propagation  ([#591](#591)).
 * Added uniqueness in the new warehouse name to avoid conflicts on installation ([#542](#542)).
 * Added a global init script to collect Hive Metastore lineage ([#513](#513)).
 * Added retry set/update permissions when possible and assess the changes in the workspace ([#519](#519)).
 * Use `~/.ucx/state.json` to store the state of both dashboards and jobs ([#561](#561)).

**Bug fixes**

 * Fixed handling for `OWN` table permissions ([#571](#571)).
 * Fixed handling of keys with and without values. ([#514](#514)).
 * Fixed integration test failures related to concurrent group delete ([#584](#584)).
 * Fixed issue with workspace listing process on None type `object_type` ([#481](#481)).
 * Fixed missing group entitlement migration bug ([#583](#583)).
 * Fixed entitlement application for account-level groups ([#529](#529)).
 * Fixed assessment throwing an error when the owner of an object is empty ([#485](#485)).
 * Fixed installer to migrate between different configuration file versions ([#596](#596)).
 * Fixed cluster policy crawler to be aware of deleted policies ([#486](#486)).
 * Improved error message for not null constraints violated ([#532](#532)).
 * Improved integration test resiliency ([#597](#597), [#594](#594), [#586](#586)).
 * Introduced Safer access to workspace objects' properties. ([#530](#530)).
 * Mitigated permissions loss in Table ACLs by running appliers with single thread ([#518](#518)).
 * Running apply permission task before assessment should display message ([#487](#487)).
 * Split integration tests from blocking the merge queue ([#496](#496)).
 * Support more than one dashboard per step ([#472](#472)).
 * Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0 ([#505](#505)).
 * Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0 ([#575](#575)).
@nfx nfx mentioned this pull request Nov 17, 2023
nfx added a commit that referenced this pull request Nov 17, 2023
**Breaking changes** (existing installations need to reinstall UCX and
re-run assessment jobs)

* Switched local group migration component to rename groups instead of
creating backup groups
([#450](#450)).
* Mitigate permissions loss in Table ACLs by folding grants belonging to
the same principal, object id and object type together
([#512](#512)).

**New features**

* Added support for the experimental Databricks CLI launcher
([#517](#517)).
* Added support for external Hive Metastores including AWS Glue
([#400](#400)).
* Added more views to assessment dashboard
([#474](#474)).
* Added rate limit for creating backup group to increase stability
([#500](#500)).
* Added deduplication for mount point list
([#569](#569)).
* Added documentation to describe interaction with external Hive
Metastores ([#473](#473)).
* Added failure injection for job failure message propagation
([#591](#591)).
* Added uniqueness in the new warehouse name to avoid conflicts on
installation ([#542](#542)).
* Added a global init script to collect Hive Metastore lineage
([#513](#513)).
* Added retry set/update permissions when possible and assess the
changes in the workspace
([#519](#519)).
* Use `~/.ucx/state.json` to store the state of both dashboards and jobs
([#561](#561)).

**Bug fixes**

* Fixed handling for `OWN` table permissions
([#571](#571)).
* Fixed handling of keys with and without values.
([#514](#514)).
* Fixed integration test failures related to concurrent group delete
([#584](#584)).
* Fixed issue with workspace listing process on None type `object_type`
([#481](#481)).
* Fixed missing group entitlement migration bug
([#583](#583)).
* Fixed entitlement application for account-level groups
([#529](#529)).
* Fixed assessment throwing an error when the owner of an object is
empty ([#485](#485)).
* Fixed installer to migrate between different configuration file
versions ([#596](#596)).
* Fixed cluster policy crawler to be aware of deleted policies
([#486](#486)).
* Improved error message for not null constraints violated
([#532](#532)).
* Improved integration test resiliency
([#597](#597),
[#594](#594),
[#586](#586)).
* Introduced Safer access to workspace objects' properties.
([#530](#530)).
* Mitigated permissions loss in Table ACLs by running appliers with
single thread ([#518](#518)).
* Running apply permission task before assessment should display message
([#487](#487)).
* Split integration tests from blocking the merge queue
([#496](#496)).
* Support more than one dashboard per step
([#472](#472)).
* Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0
([#505](#505)).
* Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0
([#575](#575)).
pritishpai pushed a commit that referenced this pull request Nov 21, 2023
**Breaking changes** (existing installations need to reinstall UCX and
re-run assessment jobs)

* Switched local group migration component to rename groups instead of
creating backup groups
([#450](#450)).
* Mitigate permissions loss in Table ACLs by folding grants belonging to
the same principal, object id and object type together
([#512](#512)).

**New features**

* Added support for the experimental Databricks CLI launcher
([#517](#517)).
* Added support for external Hive Metastores including AWS Glue
([#400](#400)).
* Added more views to assessment dashboard
([#474](#474)).
* Added rate limit for creating backup group to increase stability
([#500](#500)).
* Added deduplication for mount point list
([#569](#569)).
* Added documentation to describe interaction with external Hive
Metastores ([#473](#473)).
* Added failure injection for job failure message propagation
([#591](#591)).
* Added uniqueness in the new warehouse name to avoid conflicts on
installation ([#542](#542)).
* Added a global init script to collect Hive Metastore lineage
([#513](#513)).
* Added retry set/update permissions when possible and assess the
changes in the workspace
([#519](#519)).
* Use `~/.ucx/state.json` to store the state of both dashboards and jobs
([#561](#561)).

**Bug fixes**

* Fixed handling for `OWN` table permissions
([#571](#571)).
* Fixed handling of keys with and without values.
([#514](#514)).
* Fixed integration test failures related to concurrent group delete
([#584](#584)).
* Fixed issue with workspace listing process on None type `object_type`
([#481](#481)).
* Fixed missing group entitlement migration bug
([#583](#583)).
* Fixed entitlement application for account-level groups
([#529](#529)).
* Fixed assessment throwing an error when the owner of an object is
empty ([#485](#485)).
* Fixed installer to migrate between different configuration file
versions ([#596](#596)).
* Fixed cluster policy crawler to be aware of deleted policies
([#486](#486)).
* Improved error message for not null constraints violated
([#532](#532)).
* Improved integration test resiliency
([#597](#597),
[#594](#594),
[#586](#586)).
* Introduced Safer access to workspace objects' properties.
([#530](#530)).
* Mitigated permissions loss in Table ACLs by running appliers with
single thread ([#518](#518)).
* Running apply permission task before assessment should display message
([#487](#487)).
* Split integration tests from blocking the merge queue
([#496](#496)).
* Support more than one dashboard per step
([#472](#472)).
* Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0
([#505](#505)).
* Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0
([#575](#575)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat/crawler feat/installer install/upgrade the app migrate/clusters go/uc/upgrade Upgrade Interactive Clusters migrate/external go/uc/upgrade SYNC EXTERNAL TABLES step migrate/groups Corresponds to Migrate Groups Step of go/uc/upgrade migrate/redash go/uc/upgrade Upgrade DBSQL Warehouses step/assign metastore go/uc/upgrade Assign Metastore
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate EXTERNAL tables from AWS Glue or other external HMS
2 participants