V0.8.43 (#84)
* feat(ingest): working with multiple bigquery projects (datahub-project#5240)

* fix(build): missing libs (datahub-project#5254)

* fix(build): use correct creds (datahub-project#5261)

* feat(ingest): Option to define path spec for Redshift lineage generation (datahub-project#5256)

* fix(ui): Enable previews properly when browsing for DataJob (datahub-project#5250)

* fix(docs): Fix acronym on mxe docs (datahub-project#5249)

* fix(ui): Support deleting references to glossary terms / nodes, users, assertions, and groups (datahub-project#5248)

* Adding referential integrity to deletes API

* Updating comments

* Fix build

* fix checkstyle

* Fixing Delete Entity utils Test

* feat(docs) add links in quickstart for adding users (datahub-project#5267)

* fix(siblings) Display sibling assertions in Validations tab (datahub-project#5268)

* fix(siblings) Display sibling assertions in Validations tab

* query changes

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-192.lan>

* feat(domain) Add ability to edit a Domain name from the UI (datahub-project#5266)

* feat(ingest): delta-lake: adding support for delta lake (datahub-project#5259)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(siblings) Update the names of siblings utils args for readability (datahub-project#5269)

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-193.lan>

* docs(adopters): add showroomprive and n26 as DataHub adopters (datahub-project#5271)

* feat(glossary) Add Source section to sidebar for Glossary Terms (datahub-project#5262)

* fix(ingest): delta-lake - fix dependency issue for snowflake due to s3_util (datahub-project#5274)

* fix(ingest): s3 - Remove unneeded methods from s3_util (datahub-project#5276)

* feat(ui): Selector recommendations in Owner, Tag and Domain Modal (datahub-project#5197)

* fix(security) Sanitize rich text before sending to backend or rendering on frontend (datahub-project#5278)

* feat(GraphQL): Support for Deleting Domains, Tags via GraphQL API (datahub-project#5272)

* feat(build): reduce build time for ingestion image (datahub-project#5225)

* fix(ingestion): profiling - Fixing partitioned table profiling in BQ (datahub-project#5283)

* fix(ingest) redshift: Adding missing dependencies and relaxing sqlalchemy dependency (datahub-project#5284)

Relaxing sqlalchemy deps to make our plugins work with Airflow 2.3

* fix(ingestion): Reverting sqlalchemy upgrade because it caused issues with mssql and redshift-usage (datahub-project#5289)

* fix(Siblings): Have sibling hook use entity client (datahub-project#5279)

* fixing dbt platform issues

* have sibling hook use entity client over entity service

* switching search service as well

* lint

* more lint

* more specific exceptions

* refactor(ui): Show message when related glossary terms are empty. (datahub-project#5285)

* docs(adopter): add Digital Turbine as DataHub adopter (datahub-project#5290)

* docs(docker): Update schema-registry  docker.env (datahub-project#5231)

* feat(siblings): index sibling aspects for historical dbt metadata (datahub-project#5291)

* fixing dbt platform issues

* starting sibling restore index job work

* finish restore indices

* migrating to list urns

* rename constant

* disaster recovery

* feat(ui) Adding support for deleting Tags and Domains via the UI (datahub-project#5280)

* Adding support for deleting tags and domains via the UI

* Fixing tests

* fix(test): add cleanup in tests, make urls configurable (datahub-project#5287)

* fix(docs,quickstart): release related changes for 0.8.40 (datahub-project#5299)

* fix(doc): config typo on confluent cloud doc (datahub-project#5293)

* fix(cli): suppress secrets in stacktraces (datahub-project#5302)

* Minor UI bug fix (datahub-project#5292)

* fix(cli): timeline - category should be owner not ownership (datahub-project#5304)

* perf(ui): reduce data fetched by siblings in lineage (datahub-project#5308)

* fix(ingest): bigquery - Fix for bigquery error when there was no bigquery catalog specified (datahub-project#5303)

* fix(ui) Fix entity profile sidebar width issues (datahub-project#5305)

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2.lan>

* perf(search): Improve search default performance  (datahub-project#5311)

* perf(ui): Performance improvements and misc refactorings in the UI (datahub-project#5310)

* feat(ui): Modified the drop down of Menu Items (datahub-project#5301)

* fix(validation) Fail validation error silently instead of crashing (datahub-project#5314)

* feat(docs) Add documentation on authorization & authentication (datahub-project#5265)

* fix(ui) Make profile icon clickable to expand header menu (datahub-project#5317)

* refactor(ui): Extract searchable page into its own component (perf + ux)  (datahub-project#5318)

* fix(gms) Remove auto-creating status aspects if not present when ingesting metadata (datahub-project#5315)

* fix(ui): Add missing SearchRoutes component (datahub-project#5321)

* feat(ingest): looker - ingest dashboard create/update/delete timestamps (datahub-project#5312)

* fix(ui): Fix pipeline tasks list loading (datahub-project#5332)

* feat(ingest): lookml - adding support for only emitting reachable views from explores (datahub-project#5333)

* fix(ingest): tableau - omit schema fields when name is absent (datahub-project#5275)

* fix(siblings) Combine siblings data but remove duplicate data (datahub-project#5337)

* fix(docs): Fix typo in metadata-ingestion.md (datahub-project#5338)

* fix(me) Cache the me query for performance reasons (datahub-project#5316)

* fix(tokens) Adds non-admin tests for access tokens (datahub-project#5174)

* feat(bigquery): support size, rowcount, lastmodified based table selection for profiling (datahub-project#5329)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* chore: Refactor Python Codebase (datahub-project#5113)

* docs(bigquery): profiling report enhancement (datahub-project#5342)

* feat(ingest): update CSV source to support description and ownership type (datahub-project#5346)

* fix(ui): fixed the ui issue (datahub-project#5341)

* feat(ingest): salesforce - add connector (datahub-project#5104)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Vincent Koc <koconder@users.noreply.github.com>

* feat(bootstrap): create abstract class UpgradeStep to abstract away upgrade logic (datahub-project#5349)

* fix(ingest): bigquery-usage - dataset name for sharded tables (datahub-project#5347)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* docs(features): update grammar (datahub-project#5350)

* fix(ci): fix mysql test and attempt kafka-connect ingestion (datahub-project#5352)

* feat(ui): add copy function for stats table sample value (datahub-project#5331)

* fix(ui) Correct show/hide tabs in Settings based on privileges (datahub-project#5355)

Co-authored-by: Chris Collins <chriscollins@Chriss-MacBook-Pro-2.local>

* fix(siblings): add useMutationUrn to domain section (datahub-project#5270)

* fixing dbt platform issues

* useMutationUrn for domains modal

* feat(schema) Show last observed timestamp in the schema tab (datahub-project#5348)

* fix(glossary) Fixes a bug for yaml ingested terms without source_url (datahub-project#5356)

* feat(lineage) Add Lineage tab to Chart and Dashboard entity profiles (datahub-project#5357)

* fix(cassandra): fix Cassandra queries used by IngestDataPlatformInstancesStep (datahub-project#5199)

* refactor(ui): Use createTag mutation for creating new tags from the UI (datahub-project#5359)

* feat(ui): Added recommendation on group modal (datahub-project#5362)

* refactor(ui): Remove unnecessary fields in GraphQL (datahub-project#5358)

* feat(ingest) - add audit actor urn to auditStamp (datahub-project#5264)

* feat(ingest): improve domain ingestion usability (datahub-project#5366)

* fix(config): fixes config key in DataHubAuthorizerFactory (datahub-project#5371)

* fix(ingest): domains - check whether urn based domain exists during resolution (datahub-project#5373)

* feat(quickstart): Adding env variables and cli options for customizing mapped ports in  quickstart (datahub-project#5353)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(build): tweak ingestion build (datahub-project#5374)

* feat(sdk): python - add get_aspects_for_entity (datahub-project#5255)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(airflow): fix for failing serialisation when Param was specified + support for external task sensor (datahub-project#5368)

fixes datahub-project#4546

* fix(users): fix to not get invite token unless the invite token modal is visible (datahub-project#5380)

* fix(gms) Propagate cache exception upstream (datahub-project#5381)

* fix(bootstrap): skip ingesting data platforms that already exist (datahub-project#5382)

* fix(cli): respect server telemetry settings correctly (datahub-project#5384)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(ingest): bigquery - Graceful bq partition id date parsing failure (datahub-project#5386)

* feat(airflow): Circuit breaker and python api for Assertion and Operation (datahub-project#5196)

* feat(kafka-setup): add options for sasl_plaintext (datahub-project#5385)

allow sasl_plaintext options using environment variables

* fix(bigquery): multi-project GCP setup run query through correct project (datahub-project#5393)

* fix(bigquery): add storage project name (datahub-project#5395)

* Add Changes to support smoke test on Datahub deployed on kubernetes Cluster (datahub-project#5334)

Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>

* fix(PlayCookie) PLAY_TOKEN cookie rejected because userprofile exceeds 4096 chars (datahub-project#5114)

* feat(dashboards): add datasets field to DashboardInfo aspect (datahub-project#5188)

Co-authored-by: John Joyce <john@acryl.io>

* feat(siblings): allow viewing siblings separately (datahub-project#5390)

* allow pulling back curtain for siblings

* sibling pullback working for lineage + property merge

* propagating provenance to ui

* fixups from merge & some renames

* fix styling & add tooltip

* adding cypress tests

* fix lint

* updating mocks

* updating smoke test

* fixing domains smoke test

* responding to comments

* refactor(ui): Added Cursor pointer to tags (datahub-project#5389)

* feat(GMS): Adding Dashboard Usage Models (datahub-project#5399)

* fix(quickstart): use platform agnostic way to get folder (datahub-project#5400)

* Adds support for Domains in CSV source (datahub-project#5372)

* feat(ingestion) Build out UI form for Snowflake Managed Ingestion (datahub-project#5391)

* fix(kafka): add missing configs (datahub-project#5394)

* feat(model): dashboard usage model, is_null condition added (datahub-project#5397)

* fix(datahub-client): Fix kafka config issue (datahub-project#5403)

* build: improve comprehensiveness of gradle clean (datahub-project#5003)

* fix(gms): Change MessageDigest to be thread safe (datahub-project#5405)

* fix(metadata-ingestion) Fix broken csv enricher test (datahub-project#5406)

* fix(tests): Removes duplicate policies tests & makes DataHub user configurable (datahub-project#5365)

* feat(quickstart,docs): updates for v0.8.41 (datahub-project#5409)

* fix(ingest): ensure upgrade checks run async (datahub-project#5383)

* fix(ingest): looker - pass transport options to all api calls (datahub-project#5417)

* feat(quickstart): moving to official confluent images for m1 (datahub-project#5416)

* fix(documentation) Fix erratic cursor in documentation editor bug (datahub-project#5411)

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-280.lan>

* feat(ui): Supporting enriched search preview + misc improvements  (datahub-project#5419)

* chore: remove unnecessary modules from codebase (datahub-project#5420)

* fix(ingest): looker - extract usage for dashboards allowed by pattern (datahub-project#5424)

* fix(docker): fix kafka-setup command to support same capabilities as previous (datahub-project#5428)

* fix(protobuf) Set undeprecated ownership type & fix case sensitive urn corpGroup (datahub-project#5425)

* fix(ui): add dataset qualifiedName parameter to lineage query (datahub-project#5427)

* fix(glossary) Fix dropdown where disabled buttons are still clickable (datahub-project#5430)

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2.lan>

* docs(bigquery): add changelog and unittest for profiling limits (datahub-project#5407)

* fix(siblings): fixing lineage fetching for siblings & sources (datahub-project#5415)

* fix(ui): Fixing unreleased search preview bugs  (datahub-project#5432)

* feat(ui): Adding Statistics Summary to Dataset + Dashboard Profiles  (datahub-project#5440)

* feat(ingest): add test source connection feature, structured report file (datahub-project#5442)

* fix(ingest/glue): handle error when generating s3 tags for virtual view tables (datahub-project#5398)

Co-authored-by: Tim Costa <timcosta@amazon.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* feat(ingest): model - adding a small extension to support communicating structured responses (datahub-project#5429)

* fix(ingest): bigquery-usage - fix dataset name for sharded table (datahub-project#5412)

* feat(ingestion) Add new endpoint to test an ingestion connection (datahub-project#5438)

* feat(cli,build): remove deprecated variables GMS_HOST/_PORT (datahub-project#5451)

* fix(search): make filters by default an empty list if null (datahub-project#5454)

* fix(ingest): hive - add column comment as a column description (datahub-project#5449)

* feat(groups): add native groups concept to DataHub (datahub-project#5443)

* fix(ingest): fix serialization of report to handle nesting (datahub-project#5455)

* fix(ingest): tableau - fix tableau db error, add more logs (datahub-project#5423)

* build(deps): bump terser from 5.9.0 to 5.14.2 in /docs-website (datahub-project#5448)

Bumps [terser](https://github.com/terser/terser) from 5.9.0 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: spark-lineage - configuration details for Amazon EMR (datahub-project#5459)

* feat(app): schema-history - remove blame language for the schema history feature (datahub-project#5457)

* Worked on the alignment of menu icon in search header (datahub-project#5458)

* build(deps): bump terser from 4.8.0 to 4.8.1 in /datahub-web-react (datahub-project#5446)

Bumps [terser](https://github.com/terser/terser) from 4.8.0 to 4.8.1.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat(ingest): snowflake - basic test connection capability (datahub-project#5464)

* fix(ingest/trino): Avoid exception if $properties table empty or not readable (datahub-project#5447)

Under some configurations of access rules in Trino, the user may not have
read access to the content of the table, which will result in an exception
(`fetchone()` returns `None`).

This commit ensures no exceptions are raised and the ingestion can proceed.
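
The failure mode described here is generic to Python DB-API cursors: `fetchone()` returns `None` when a query yields no row, so indexing the result directly raises `TypeError`. A minimal sketch of the guard, using the standard-library `sqlite3` driver as a stand-in for Trino. The table name `properties` and the helper `read_first_property` are hypothetical and are not taken from the DataHub source:

```python
import sqlite3
from typing import Optional


def read_first_property(conn: sqlite3.Connection) -> Optional[str]:
    """Return the first value in the (hypothetical) properties table, or None."""
    cursor = conn.execute("SELECT value FROM properties LIMIT 1")
    row = cursor.fetchone()  # DB-API: None when the result set is empty
    if row is None:
        # Nothing readable: skip instead of raising, so the caller can proceed.
        return None
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE properties (value TEXT)")
empty_result = read_first_property(conn)  # table exists but has no rows

conn.execute("INSERT INTO properties VALUES ('comment=demo')")
filled_result = read_first_property(conn)
```

Checking for `None` before indexing keeps the empty-table case on the normal control path instead of relying on exception handling.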

* feat(ingest): preflight - Add way to check/upgrade brew package version in preflight if needed (datahub-project#5435)

* fix(build): add base image with gradle wrapper cached (datahub-project#5467)

* doc(bigquery): groups grants by requirements (datahub-project#5468)

* fix(docs,build): remove base image not needed, cleanup docs (datahub-project#5469)

* feat(ui): Partial support for Chart usage (datahub-project#5473)

* fix(ingest): bigquery: multiproject profiling fix (datahub-project#5474)

* fix(ingest): kafka - revert deps back to < 1.9.0 (datahub-project#5476)

* feat(ci): datahub-upgrade - support multiplatform image (datahub-project#5477)

* feat(cli): quickstart - experimental support for backup restore (datahub-project#5418)

* feat(ingest): dbt - updating source lineage logic (datahub-project#5414)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* Ingestion: Added form in Big Query type to edit the queries. (datahub-project#5431)

* docs(reindex): fix docsearch config (datahub-project#5479)

* refactor(ui): Adding checkbox option to select multiple results at once. (datahub-project#5422)

* feat(cli): delete - hard delete deletes soft deleted entities (datahub-project#5478)

* fix(docs): add missing closing marker for note section (datahub-project#5480)

* fix(ci): intermittent failure in github actions (datahub-project#5452)

* feat(model, ingest): add user email in dashboard user usage counts (datahub-project#5471)

* feat(ingest): snowflake - test_connection add support for capability report (datahub-project#5472)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* feat(build): automatically mark issues as stale to close inactive issues (datahub-project#5482)

* fix(ingest): loosen confluent-kafka dep requirement (datahub-project#5489)

* refactor(ingest): cleanup importlib.import_module calls (datahub-project#5490)

* build(ingest): make gradle build less chatty (datahub-project#5491)

* fix(ingest): dbt - add support for trino datatypes (datahub-project#5379)

* refactor(ci): use custom action for checking codegen status (datahub-project#5493)

* feat(spark-lineage, java-emitter): Support ssl cert disable verification functionality (datahub-project#5488)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* docs(auth): fix link to point to new doc (datahub-project#5501)

* docs(updating-datahub): add note for breaking change in looker usage ingestion (datahub-project#5499)

* fix(ingest): cleanup unused flake8 noqa statements (datahub-project#5492)

* fix(ingest): cleanup unused flake8 noqa statements

In the future, we can discover these using `flake8-noqa`.

* add back c901

* refactor(ci): refactor Docker build-and-push workflows (datahub-project#5494)

* docs(slack): update to Slack guidelines (datahub-project#5504)

* feat(cli): delete - add --only-soft-deleted option, perf improvements (datahub-project#5485)

* fix(ingest): use temp dir for file generated during test (datahub-project#5505)

* feat(ui) Show Glossary and Domains header links to everyone (datahub-project#5506)

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2.lan>

* fix(ui): Fix Flickering Issue on search input field (datahub-project#5503)

* fix(ingest): respect rest emitter timeout setting (datahub-project#5508)

* fix(ui): Flickering Issue on search input field (datahub-project#5515)

* feat(ui): Added form to Looker and Tableau (datahub-project#5487)

* feat(identity): update azure and okta connectors to emit Origin aspects (datahub-project#5495)

* feat(ui): Adding Search Select feature(frontend only)  (datahub-project#5507)

* test(ingest): limit GMS retries in test (datahub-project#5509)

* fix(ingest): airflow: update subdag check for compatibility with older Airflow versions (datahub-project#5523)

* use getattr to default None if no subdag

* add None check

* add other None check

* Apply suggestions from code review- double quotes

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* minor tweak to fix lint

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
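
The `getattr` default mentioned in the sub-commits above is a common way to read an attribute that exists only on some versions of a class. A minimal sketch with hypothetical stand-in classes (`OldStyleDag`, `NewStyleDag`) and a hypothetical attribute name `parent_id`; the actual Airflow attributes touched by the fix are not shown in this commit message:

```python
class OldStyleDag:
    """Hypothetical stand-in for an object from an older version (no parent_id)."""

    dag_id = "legacy"


class NewStyleDag:
    """Hypothetical stand-in for an object that carries the newer attribute."""

    dag_id = "current"
    parent_id = "parent"


def describe(dag) -> str:
    # getattr with a default returns None instead of raising AttributeError.
    parent = getattr(dag, "parent_id", None)
    if parent is None:
        # Explicit None check: treat the object as a top-level DAG.
        return dag.dag_id
    return f"{parent}.{dag.dag_id}"


old_description = describe(OldStyleDag())
new_description = describe(NewStyleDag())
```

The same function then works against both shapes of object without a version check.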

* fix(ingest): fix unbound variable bug in cli ingest list-runs (datahub-project#5527)

* fix(ui) Display Term Group name properly in Recently Viewed (datahub-project#5528)

* feat(ingestion) Add frontend connection test for Snowflake (datahub-project#5520)

* fix(glossary) Fix Glossary success messages and sort Glossary (datahub-project#5533)

* show error and success messages in glossary properly

* sort glossary nodes and terms alphabetically

Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2.lan>

* feat(apache-ranger): Apache Ranger Authorizer support in datahub-gms (datahub-project#4999)

* feat(ingest): add deprecation warning for Python 3.6 (datahub-project#5519)

* docs(townhall) add past townhall agendas (datahub-project#5536)

* feat(ingestion): add groups to ldap users (datahub-project#5470)

* chore(issues): reduce time for issues to be marked stale and then closed (datahub-project#5537)

* fix(ingestion) Set pipeline_name on UI recipes with forms (datahub-project#5535)

* Fixing OIDC logout issues (datahub-project#5538)

* fix(analytics-tab) - fix analytics tab config variable for gms (datahub-project#5529)

* feat(ui): Support batch adding / remove tags from search lists. (Batch Actions part 2/7)  (datahub-project#5534)

* fix(ingestionSource): improve error experience when ingestion source is in an inconsistent state (datahub-project#5522)

* fix(docs): Fixed typo in schema history markdown! (datahub-project#5545)

* fix(docker): Fixing dev docker and quickstart  (datahub-project#5550)

* feat(ui): Support Batch adding and removing Glossary Terms (Batch Actions 3/7) (datahub-project#5544)

* feat(ci): test quickstart works (datahub-project#5518)

* feat(ci): test quickstart works

* do not fail fast

* remove macos

* add some debug information

* tweak triggers

* fix workflow file

* remove running on every PR

* Update .github/workflows/check-quickstart.yml

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* Update .github/workflows/check-quickstart.yml

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* test(ingest): mark trino/hana tests as xfail due to flakes (datahub-project#5549)

* feat(ingestion): superset - add display_uri to config (datahub-project#5408)

* fix(quickstart): failure on a path not being present (datahub-project#5554)

* fix(dbt): fix issue of assertion error when stateful ingestion is used with dbt tests (datahub-project#5540)

* fix(dbt): fix issue of dbt stateful ingestion with tests

Co-authored-by: MugdhaHardikar-GSLab <mugdha.hardikar@gslab.com>
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>

* feat(ui): Batch add & remove Owners to assets via the UI (datahub-project#5552)

* feat(ingestion) Update managed ingestion scheduler to be easier to use (datahub-project#5559)

* fix(ingestion): correct trino datatype handling (datahub-project#5541)

Co-authored-by: Ravindra Lanka <rlanka@acryl.io>

* feat(ingest) Allow ingestion of Elasticsearch index template (datahub-project#5444)


Co-authored-by: Ravindra Lanka <rlanka@acryl.io>

* fix(ingest): fix some typos and logging issues (datahub-project#5564)

* feat(transformers): Add domain transformer for dataset (datahub-project#5456)

Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>

* chore(0.8.42): update breaking changes doc (datahub-project#5563)

* fix(ingest): activate mypy support for ParamSpec typing annotation (datahub-project#5551)

* (chore): upgrading ingestion to 0.8.42 (datahub-project#5562)

* fix(gms): ensure directory is present (datahub-project#5568)

* fix(ci): flaky smoke test fix (datahub-project#5569)

* fix(gms): missing directory for gms (datahub-project#5570)

* chore(build): tweak stale issue timing (datahub-project#5571)

* feat(ui): Batch set & unset Domain for assets via the UI (datahub-project#5560)

* extending assertion std model (datahub-project#5575)

* feat(ui): Support batch deprecation from the UI (Batch actions part 6/7) (datahub-project#5572)

* feat(graphql): add MutableTypeBatchResolver (datahub-project#4976)

* feat(ingestion) Implement secrets in new managed ingestion form (datahub-project#5574)

* fix(ui): Fixing batch set domains bug (datahub-project#5580)

* chore(gradle): update node version for docs site (datahub-project#5581)

* feat(test): add read-only smoke tests (datahub-project#5558)

* feat(ingestion) Add Save & Run button to managed ingestion builder (datahub-project#5579)

* fix(ingest): handle when current server version is unavailable (datahub-project#5547)

* feat(ingest): dbt - control over emitting test_results, test_definitions, etc. (datahub-project#5328)

Co-authored-by: Piotr Sierkin <piotr.sierkin@getindata.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* feat(datahub-client): add java file emitter (datahub-project#5578)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* feat(ingest): infer aspectName from aspect type in MCP (datahub-project#5566)

* fix(ingest): sql-common - db2, snowflake bug fixes to extract table descriptions (datahub-project#5526)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(ingest): moving delta-lake connector to be 3.7+ only (datahub-project#5584)

* feat(ingest): delta-lake - extract table history into operation aspect (datahub-project#5277)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix apache ranger plugin readme file rendering (datahub-project#5585)

* feat(ui): make container description searchable and have description show up in results (datahub-project#5586)

* fix(groups): fix user, search, and preview group membership to be fetched for both external and native group memberships (datahub-project#5587)

* feat(ingest): power-bi - make ownership ingestion optional (datahub-project#5335)


Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* Expose catalog_name in athena.py (datahub-project#5548)

* expose catalog_name to the sql alchemy uri that is passed into pyathena

Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* Fix profiling when using {table}. (datahub-project#5531)

* profiling fix for when using {table}

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>

* feat(ui): Support batch deleting from ui (datahub-project#5582)

* feat(ingest): clickhouse - add metadata modification time and data size (datahub-project#5330)

Co-authored-by: Ravindra Lanka <rlanka@acryl.io>

* feat(ui): Add rich UI ingestion run summary (datahub-project#5577)

* fix(ci): smoke test less flaky, add src, dev dep in smoke image (datahub-project#5594)

* updated mock custom to pass the test suite

* added env for mysql-setup for smoketest to pass

* added env for mysql-setup for smoketest to pass

* added env for mysql-setup for smoketest to pass

* push to heruko repo instead of linkedin

Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Michael A. Schlosser <mikeschlosser16@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Pedro Silva <pedro.cls93@gmail.com>
Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-192.lan>
Co-authored-by: Mugdha Hardikar <mugdha.hardikar@gslab.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-193.lan>
Co-authored-by: Maggie Hays <maggiem.hays@gmail.com>
Co-authored-by: Ankit keshari <86347578+Ankit-Keshari-Vituity@users.noreply.github.com>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: liyuhui666 <71497399+liyuhui666@users.noreply.github.com>
Co-authored-by: Tengis Batsaikhan <tengee0411@gmail.com>
Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2.lan>
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: dougpm <60357516+dougpm@users.noreply.github.com>
Co-authored-by: Vincent Koc <koconder@users.noreply.github.com>
Co-authored-by: Aditya Radhakrishnan <aditya@acryl.io>
Co-authored-by: Amanda Ng <10681923+ngamanda@users.noreply.github.com>
Co-authored-by: Chris Collins <chriscollins@Chriss-MacBook-Pro-2.local>
Co-authored-by: Justin Marozas <justin.marozas@ext.gresearch.co.uk>
Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com>
Co-authored-by: Navin Sharma <103643430+NavinSharma13@users.noreply.github.com>
Co-authored-by: Aezo <45879156+aezomz@users.noreply.github.com>
Co-authored-by: abiwill <abhi13101993@gmail.com>
Co-authored-by: Felix Lüdin <13187726+Masterchen09@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Chris Collins <chriscollins@Chriss-MBP-2-280.lan>
Co-authored-by: leifker <leifker@users.noreply.github.com>
Co-authored-by: Alexey Kravtsov <Havok.08@mail.ru>
Co-authored-by: Tim Costa <tim@timcosta.io>
Co-authored-by: Tim Costa <timcosta@amazon.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Guillaume Gardey <glinmac@gmail.com>
Co-authored-by: Vishal Shah <vshah@etsy.com>
Co-authored-by: mohdsiddique <mohdsiddiquebagwan@gmail.com>
Co-authored-by: Salih Can <salih.can@udemy.com>
Co-authored-by: RyanHolstien <RyanHolstien@users.noreply.github.com>
Co-authored-by: Skyler Sinclair <skyler.r.sinclair@gmail.com>
Co-authored-by: Dan Andreescu <dan.andreescu@gmail.com>
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Marcin Szymański <ms32035@gmail.com>
Co-authored-by: xiphl <50935738+xiphl@users.noreply.github.com>
Co-authored-by: NoahFournier <63198198+NoahFournier@users.noreply.github.com>
Co-authored-by: Piotr Sierkin <psierkin@gmail.com>
Co-authored-by: Piotr Sierkin <piotr.sierkin@getindata.com>
Co-authored-by: Jordan Wolinsky <jordan@zephyrai.bio>
Showing 1,180 changed files with 131,816 additions and 65,795 deletions.
92 changes: 92 additions & 0 deletions .github/actions/docker-custom-build-and-push/action.yml
@@ -0,0 +1,92 @@
name: Custom Docker build and push
description: "Build and push a Docker image to Docker Hub"

inputs:
  username:
    description: "Docker Hub username"
  password:
    description: "Docker Hub password"
  publish:
    description: "Set to true to actually publish the image to Docker Hub"

  context:
    description: "Same as docker/build-push-action"
    required: false
  file:
    description: "Same as docker/build-push-action"
    required: false
  platforms:
    description: "Same as docker/build-push-action"
    required: false

  images:
    # e.g. linkedin/datahub-gms
    description: "List of Docker images to use as base name for tags"
    required: true
  tags:
    # e.g. latest,head,sha12345
    description: "List of tags to use for the Docker image"
    required: true
outputs:
  image_tag:
    description: "Docker image tags"
    value: ${{ steps.docker_meta.outputs.tags }}
    # image_name: ${{ env.DATAHUB_GMS_IMAGE }}

runs:
  using: "composite"

  steps:
    - name: Docker meta
      id: docker_meta
      uses: crazy-max/ghaction-docker-meta@v1
      with:
        # list of Docker images to use as base name for tags
        images: ${{ inputs.images }}
        # add git short SHA as Docker tag
        tag-custom: ${{ inputs.tags }}
        tag-custom-only: true

    # Code for testing the build when not pushing to Docker Hub.
    - name: Build and Load image for testing (if not publishing)
      uses: docker/build-push-action@v2
      if: ${{ inputs.publish != 'true' }}
      with:
        context: ${{ inputs.context }}
        file: ${{ inputs.file }}
        # TODO this only does single-platform builds in testing?
        # leaving it for now since it matches the previous behavior
        platforms: linux/amd64
        tags: ${{ steps.docker_meta.outputs.tags }}
        load: true
        push: false
    - name: Upload image locally for testing (if not publishing)
      uses: ishworkh/docker-image-artifact-upload@v1
      if: ${{ inputs.publish != 'true' }}
      with:
        image: ${{ steps.docker_meta.outputs.tags }}

    # Code for building multi-platform images and pushing to Docker Hub.
    - name: Set up QEMU
      uses: docker/setup-qemu-action@v1
      if: ${{ inputs.publish == 'true' }}
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v1
      if: ${{ inputs.publish == 'true' }}
    - name: Login to DockerHub
      uses: docker/login-action@v1
      if: ${{ inputs.publish == 'true' }}
      with:
        username: ${{ inputs.username }}
        password: ${{ inputs.password }}
    - name: Build and Push Multi-Platform image
      uses: docker/build-push-action@v2
      if: ${{ inputs.publish == 'true' }}
      with:
        context: ${{ inputs.context }}
        file: ${{ inputs.file }}
        platforms: ${{ inputs.platforms }}
        tags: ${{ steps.docker_meta.outputs.tags }}
        push: true

    # TODO add code for vuln scanning?
16 changes: 16 additions & 0 deletions .github/actions/ensure-codegen-updated/action.yml
@@ -0,0 +1,16 @@
name: 'Ensure codegen is updated'
description: 'Will check the local filesystem against git, and abort if there are uncommitted changes.'

runs:
  using: "composite"
  steps:
    - shell: bash
      run: |
        if output=$(git status --porcelain) && [ ! -z "$output" ]; then
          # See https://unix.stackexchange.com/a/155077/378179.
          echo 'There are uncommitted changes:'
          echo $output
          exit 1
        else
          echo 'All good!'
        fi
15 changes: 5 additions & 10 deletions .github/workflows/build-and-test.yml
@@ -10,12 +10,15 @@ on:
    branches:
      - master
    paths-ignore:
      - "docker/**"
      - "docs/**"
      - "**.md"
  release:
    types: [published, edited]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
@@ -45,15 +48,7 @@ jobs:
            **/build/test-results/test/**
            **/junit.*.xml
      - name: Ensure codegen is updated
        run: |
          if output=$(git status --porcelain) && [ ! -z "$output" ]; then
            # See https://unix.stackexchange.com/a/155077/378179.
            echo 'There are uncommitted changes:'
            echo $output
            exit 1
          else
            echo 'All good!'
          fi
        uses: ./.github/actions/ensure-codegen-updated
      - name: Slack failure notification
        if: failure() && github.event_name == 'push'
        uses: kpritam/slack-job-status-action@v1
5 changes: 5 additions & 0 deletions .github/workflows/check-datahub-jars.yml
@@ -5,6 +5,7 @@ on:
    branches:
      - master
    paths-ignore:
      - "docker/**"
      - "docs/**"
      - "**.md"
  pull_request:
@@ -17,6 +18,10 @@ on:
  release:
    types: [published, edited]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:

  check_jars:
47 changes: 47 additions & 0 deletions .github/workflows/check-quickstart.yml
@@ -0,0 +1,47 @@
name: check quickstart
on:
  push:
    branches:
      - master
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true


jobs:
  test-quickstart:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-20.04]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/setup-python@v2
        with:
          python-version: "3.9.9"
      - name: Install acryl-datahub
        run: |
          pip install --upgrade acryl-datahub
          datahub version
          python -c "import platform; print(platform.platform())"
      - name: Run quickstart
        run: |
          datahub docker quickstart
      - name: Ingest sample data
        run: |
          datahub docker ingest-sample-data
      - name: See status
        run: |
          docker ps -a && datahub docker check
      - name: store logs
        if: failure()
        run: |
          docker logs datahub-gms >& quickstart-gms.log
      - name: Upload logs
        uses: actions/upload-artifact@v2
        if: failure()
        with:
          name: docker-quickstart-logs-${{ matrix.os }}
          path: "*.log"
22 changes: 22 additions & 0 deletions .github/workflows/close-stale-issues.yml
@@ -0,0 +1,22 @@
name: Close inactive issues
on:
  schedule:
    - cron: "30 1 * * *"

jobs:
  close-issues:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v5
        with:
          days-before-issue-stale: 30
          days-before-issue-close: 30
          stale-issue-label: "stale"
          stale-issue-message: "This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io"
          close-issue-message: "This issue was closed because it has been inactive for 30 days since being marked as stale."
          days-before-pr-stale: -1
          days-before-pr-close: -1
          repo-token: ${{ secrets.GITHUB_TOKEN }}
13 changes: 7 additions & 6 deletions .github/workflows/docker-feast-source.yml
@@ -3,21 +3,22 @@ on:
  push:
    branches:
      - master
    paths-ignore:
      - 'docs/**'
      - '**.md'
    paths:
      - 'metadata-ingestion/src/datahub/ingestion/source/feast_image/**'
      - '.github/workflows/docker-feast-source.yml'
  pull_request:
    branches:
      - master
    paths:
      - 'metadata-ingestion/src/datahub/ingestion/source/feast_image/**'
      - '.github/workflows/docker-feast-source.yml'
    paths_ignore:
      - '**.md'
      - '**.env'
  release:
    types: [published, edited]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  setup:
    runs-on: ubuntu-latest
43 changes: 43 additions & 0 deletions .github/workflows/docker-ingestion-base.yml
@@ -0,0 +1,43 @@
name: ingestion base
on:
  release:
    types: [published, edited]
  push:
    branches:
      - master
    paths:
      - "docker/datahub-ingestion/**"
      - "gradle*"
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:

  build-base:
    name: Build and Push Docker Image to Docker Hub
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.ACRYL_DOCKER_USERNAME }}
          password: ${{ secrets.ACRYL_DOCKER_PASSWORD }}
      - name: Build and Push image
        uses: docker/build-push-action@v2
        with:
          context: ./docker/datahub-ingestion
          file: ./docker/datahub-ingestion/base.Dockerfile
          platforms: linux/amd64,linux/arm64
          tags: acryldata/datahub-ingestion-base:latest
          push: true
42 changes: 42 additions & 0 deletions .github/workflows/docker-ingestion-smoke.yml
@@ -0,0 +1,42 @@
name: ingestion smoke
on:
  release:
    types: [published, edited]
  push:
    branches:
      - master
    paths:
      - "docker/datahub-ingestion/**"
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:

  build-smoke:
    name: Build and Push Docker Image to Docker Hub
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.ACRYL_DOCKER_USERNAME }}
          password: ${{ secrets.ACRYL_DOCKER_PASSWORD }}
      - name: Build and Push image
        uses: docker/build-push-action@v2
        with:
          context: .
          file: ./docker/datahub-ingestion/smoke.Dockerfile
          platforms: linux/amd64,linux/arm64
          tags: acryldata/datahub-ingestion-base:smoke
          push: true
13 changes: 8 additions & 5 deletions .github/workflows/docker-ingestion.yml
@@ -10,14 +10,17 @@ on:
    branches:
      - master
    paths:
      - "docker/**"
      - "metadata-ingestion/**"
      - "metadata-models/**"
      - "docker/datahub-ingestion/**"
      - ".github/workflows/docker-ingestion.yml"
    paths_ignore:
      - "**.md"
      - "**.env"
  release:
    types: [published, edited]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  setup:
    runs-on: ubuntu-latest
@@ -66,7 +69,7 @@ jobs:
        with:
          # list of Docker images to use as base name for tags
          images: |
            linkedin/datahub-ingestion
            heruko/datahub-ingestion
          # add git short SHA as Docker tag
          tag-custom: ${{ needs.setup.outputs.tag }}
          tag-custom-only: true

0 comments on commit 3e03da5
