feat: Add programmatic descriptions parser for [AtlasProxy] #152

Merged

Conversation

mgorsk1
Contributor

@mgorsk1 mgorsk1 commented Jul 10, 2020

Summary of Changes

This PR adds parsing of programmatic descriptions from the parameters field of the Table entity in the Atlas proxy. The parameters field is a map of String -> Any, which seems like a good fit for this use case.

Moreover, this property can be set programmatically, for example on a hive_table entity with Spark SQL:

ALTER [TABLE|VIEW] table_name SET TBLPROPERTIES (key1=val1, key2=val2, ...)

The Atlas Hive hook propagates such a change from the Hive Metastore to Atlas metadata. Each of those keys (key1, key2, etc.) then becomes a programmatic description entry.

The idea is also to have a filter that removes unwanted properties (such as the technical Spark properties that appear in parameters after a table is created with Spark). Such a filter could also be useful for other proxies.
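To illustrate the approach described in this summary, here is a minimal sketch of turning a parameters map into description entries while filtering out unwanted keys. It is not the code from this PR: the `Description` class, the `parse_descriptions` function, and the `EXCLUDE_PATTERNS` list are hypothetical names chosen for illustration, and the exclusion pattern is only an assumed example of a Spark technical prefix.

```python
import re
from typing import Any, Dict, List, NamedTuple


class Description(NamedTuple):
    # Hypothetical container, not the model class used by the metadata service.
    source: str
    text: str


# Hypothetical exclusion patterns; assumes Spark technical keys share this prefix.
EXCLUDE_PATTERNS = [r"^spark\.sql\."]


def parse_descriptions(parameters: Dict[str, Any],
                       exclude_patterns: List[str]) -> List[Description]:
    """Convert a String -> Any map into description entries, skipping excluded keys."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    result = []
    for key, value in parameters.items():
        # Drop any key that matches at least one exclusion pattern.
        if any(pattern.search(key) for pattern in compiled):
            continue
        # Values are of type Any, so they are coerced to text here.
        result.append(Description(source=key, text=str(value)))
    # Sort by source so the output order is deterministic.
    return sorted(result, key=lambda d: d.source)


params = {
    "spark.sql.sources.provider": "parquet",
    "dataQuality": "checked daily",
    "owner_note": 42,
}
descriptions = parse_descriptions(params, EXCLUDE_PATTERNS)
for entry in descriptions:
    print(entry.source, "->", entry.text)
```

Keeping the exclusion patterns as plain configuration data, rather than hard-coding them in the parser, is what would let other proxies reuse the same filter.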

Tests

Tested for presence and filtering of programmatic descriptions.

Documentation

What documentation did you add or modify and why? Add any relevant links then remove this line

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
  • PR passes make test

@mgorsk1
Contributor Author

mgorsk1 commented Jul 10, 2020

cc @verdan

@codecov-commenter

codecov-commenter commented Jul 10, 2020

Codecov Report

Merging #152 into master will increase coverage by 0.34%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #152      +/-   ##
==========================================
+ Coverage   72.91%   73.25%   +0.34%     
==========================================
  Files          26       26              
  Lines        1233     1249      +16     
  Branches      128      132       +4     
==========================================
+ Hits          899      915      +16     
  Misses        307      307              
  Partials       27       27              
Impacted Files                          Coverage Δ
metadata_service/config.py              100.00% <100.00%> (ø)
metadata_service/proxy/atlas_proxy.py   84.49% <100.00%> (+0.95%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5a8c117...73c81c9.

@feng-tao
Member

cc @verdan

@mgorsk1 mgorsk1 force-pushed the feature/programmatic_description_parser branch from 4143cea to 40cafc9 July 13, 2020 05:57
Member

@verdan verdan left a comment

a few comments

metadata_service/proxy/atlas_proxy.py (outdated, resolved)
metadata_service/proxy/atlas_proxy.py (resolved)
@mgorsk1 mgorsk1 force-pushed the feature/programmatic_description_parser branch from 2c226cc to 723841a July 14, 2020 06:25
@mgorsk1 mgorsk1 force-pushed the feature/programmatic_description_parser branch from 039e887 to 73c81c9 July 14, 2020 06:29
Member

@verdan verdan left a comment

👍 LGTM

@feng-tao feng-tao merged commit e8b46a4 into amundsen-io:master Jul 14, 2020
jerryzhu2007 pushed a commit to kylg/amundsenmetadatalibrary that referenced this pull request Aug 20, 2020
* commit '369685cc715e95af82dfa4dc14d0c58af8bb1ac9':
  chore: replace references to Lyft -> Amundsen (amundsen-io#174)
  feat: Data Owner Implementation of Atlas Proxy (amundsen-io#156)
  chore: fix docker push action (amundsen-io#172)
  chore: add docker publish action and remove travis (amundsen-io#171)
  chore: add pypi publish action (amundsen-io#170)
  fix: removing OidcConfig file and making statsd configurable through envrionment variable (amundsen-io#157)
  ci: add dependabot config (amundsen-io#169)
  Update repo name in travis file (amundsen-io#163)
  feat: Populate is_view property in AtlasProxy (amundsen-io#155)
  fix: Overlapping table name issue in Readers [AtlasProxy]
  feat: Add resource_reports field in Table API ( Atlas proxy) (amundsen-io#149)
  chore: apply license headers to all the source files (amundsen-io#153)
  feat: Add programmatic descriptions parser for [AtlasProxy] (amundsen-io#152)
  feat: Add Frequent Users feature in [AtlasProxy] (amundsen-io#147)
  feat: Implement configurable minimum number of readers for popular tables (amundsen-io#146)
  chore: update the email for the project (amundsen-io#148)

# Conflicts:
#	README.md
#	docs/configurations.md
#	docs/structure.md
#	metadata_service/config.py
#	metadata_service/oidc_config.py
#	metadata_service/proxy/neo4j_proxy.py
#	requirements.txt
#	setup.py
@pPanda-beta

@mgorsk1
I see there is a part that breaks camel-case field names into space-separated words. This looks like UI logic. Is there a particular reason to make it mandatory in the metadata service instead of the frontend?

Source:

source = re.sub("([a-z])([A-Z])", "\g<1> \g<2>", source).lower()

My previous comment on test case:

e8b46a4#r42358611
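To make the behavior under discussion concrete, the quoted substitution can be tried in isolation. The sketch below wraps it in a function named `split_camel_case`, which is a hypothetical name for illustration only, and writes the patterns as raw strings so that `\g` is not treated as a string escape. The sample inputs are invented.

```python
import re


def split_camel_case(source: str) -> str:
    # Insert a space at each lowercase-to-uppercase boundary, then lowercase everything.
    return re.sub(r"([a-z])([A-Z])", r"\g<1> \g<2>", source).lower()


examples = ["dataQuality", "lastUpdatedBy", "owner", "HTTPServer"]
converted = {name: split_camel_case(name) for name in examples}
for name, value in converted.items():
    print(name, "->", value)
```

Note that the pattern only matches a lowercase letter followed by an uppercase one, so a run of capitals such as `HTTPServer` is lowercased without any spaces being inserted. The original casing of the key is also lost, which is relevant to the question of whether this belongs in the service or in the frontend.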
