Add source code facet for operators #1537
Codecov Report
Base: 94.01% // Head: 94.00% // Decreases project coverage.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1537      +/-   ##
==========================================
- Coverage   94.01%   94.00%   -0.02%
==========================================
  Files          89       89
  Lines        4347     4372      +25
  Branches      428      432       +4
==========================================
+ Hits         4087     4110      +23
  Misses        178      178
- Partials       82       84       +2
Please add tests for the changes. DeepSource is also failing.
Added the comments
We need the following before merging this PR:
- Screenshots of the DAG with lineage values populating on Marquez for operators for the SQLite, Google BigQuery, and Snowflake databases.
- The documentation changes with details on OPENLINEAGE_AIRFLOW_DISABLE_SOURCE_CODE.
- An issue link on OpenLineage if the sourceCodeFacet is not visible in the Marquez UI.
- Screenshots of the DAG with lineage values populating on astro-cloud for operators for the SQLite, Google BigQuery, and Snowflake databases.
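For the documentation item, a minimal sketch of how a flag like OPENLINEAGE_AIRFLOW_DISABLE_SOURCE_CODE could be read may help. The helper name `source_code_disabled` and the set of accepted truthy values are assumptions made for illustration; they are not taken from this PR or confirmed against the OpenLineage integration.

```python
import os
from typing import Mapping, Optional

# Assumption: these strings (case-insensitive) mean "disable source code".
_TRUTHY = {"1", "t", "true"}


def source_code_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the environment asks to omit source code from events.

    Hypothetical helper; `env` defaults to the process environment and is
    injectable so the behaviour can be checked without mutating os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("OPENLINEAGE_AIRFLOW_DISABLE_SOURCE_CODE", "")
    return value.strip().lower() in _TRUTHY
```

With this sketch, an unset variable leaves source code collection enabled, which would need to be checked against the actual default before it is documented.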
Opened an issue in OpenLineage for
LGTM
get_source_code could be extracted to some common module to avoid code duplication. Other than that, LGTM 👍
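As a rough illustration of that suggestion, a shared helper might look like the sketch below. Only the name `get_source_code` comes from the review comment; the signature, the fallback to `None`, and the idea that each operator would import it from one module are assumptions, not the code in this PR.

```python
import inspect
from typing import Callable, Optional


def get_source_code(python_callable: Callable) -> Optional[str]:
    """Return the source text of a callable, or None when it is unavailable.

    Hypothetical shared helper: operators would call this instead of each
    keeping its own copy of the same logic.
    """
    try:
        return inspect.getsource(python_callable)
    except (TypeError, OSError):
        # TypeError: built-ins and other objects without Python source.
        # OSError: the source file cannot be located or read.
        return None
```

Returning `None` instead of raising keeps lineage extraction from failing when the source is missing; whether that is the desired behaviour is a design choice for the maintainers.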
Merging it. Please let us know if you have any further improvement suggestions; we will take care of them in a separate PR.
**Please describe the feature you'd like to see**

closes: #1467

The python transform should also expose its content with the sourceCode facet, similar to the sql facet. Example: [link](https://github.com/OpenLineage/OpenLineage/blob/3090ced24604c95716dacd667c2cff52bf438aba/integration/airflow/openlineage/airflow/extractors/python_extractor.py#L31) [Action item: Create an issue on astro-sdk]

Include the transformation python code that the transformations were running in the OpenLineage events so that they showed up in the Info tab.

For demo purposes I hard-coded both of these in a custom openlineage-airflow fork like:

```
code = inspect.getsource(task.python_callable)
job_facet = {
    "sql": SqlJobFacet(query=code),
    "sourceCodeLocation": SourceCodeLocationJobFacet(
        "git",
        "https://github.com/astronomer/astro-days-chicago/blob/9cca4e166d73106e903f3d9f32af334d7b5560a3/dags/airflow_ecosystem.py",
    ),
}
```

^ the code would probably be cleaner as {"sourceCode": SourceCodeJobFacet("python", code)}. I stuffed it in sql only due to the lack of astronomer/astro#2150, which is a temporary limitation.

**Describe the solution you'd like**

- Add source code facet for operators for OpenLineage integrations

Currently, the source code facet is implemented for base decorator operators.

Co-authored-by: Pankaj <pankaj.singh@astronomer.io>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
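The "cleaner" form mentioned in the description, keyed by sourceCode instead of sql, could be sketched as follows. The dataclass here is a local stand-in so the example runs without openlineage-python installed; its field names `language` and `source`, and the helper `source_code_facets`, are assumptions for illustration and not the SDK's implementation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SourceCodeJobFacet:
    """Local stand-in for the OpenLineage facet class (assumed field names)."""

    language: str
    source: str


def source_code_facets(code: str) -> Dict[str, SourceCodeJobFacet]:
    """Build job facets that carry Python source under the sourceCode key.

    Hypothetical helper: `code` would typically come from
    inspect.getsource(task.python_callable), as in the demo snippet.
    """
    return {"sourceCode": SourceCodeJobFacet("python", code)}


facets = source_code_facets("def double(x):\n    return x * 2\n")
```

Keeping Python source out of the sql key avoids presenting it to consumers as a SQL query.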