feat(ingest): add lineage_client_project_id field to the BigQuery config #4138
e0121f6 to 2e08675
@@ -336,7 +337,11 @@ def _compute_bigquery_lineage_via_gcp_logging(self) -> None:
    def _compute_bigquery_lineage_via_exported_bigquery_audit_metadata(self) -> None:
        logger.info("Populating lineage info via exported GCP audit logs")
        try:
            _client: BigQueryClient = BigQueryClient(project=self.config.project_id)
            if self.config.lineage_client_project_id is None:
Is this only for exported_bigquery_audit_metadata?
For our use case, we only need it for the exported_bigquery_audit_metadata, but I can also add it to the code that creates the client for _compute_bigquery_lineage_via_gcp_logging().
@@ -336,7 +337,11 @@ def _compute_bigquery_lineage_via_gcp_logging(self) -> None:
    def _compute_bigquery_lineage_via_exported_bigquery_audit_metadata(self) -> None:
        logger.info("Populating lineage info via exported GCP audit logs")
        try:
            _client: BigQueryClient = BigQueryClient(project=self.config.project_id)
            if self.config.lineage_client_project_id is None:
                self.config.lineage_client_project_id = self.config.project_id
I would avoid overwriting the config value if possible and store the result in a local variable instead, just to be on the safe side, like:
    project_id: str = self.config.lineage_client_project_id if self.config.lineage_client_project_id else self.config.project_id
    _client: BigQueryClient = BigQueryClient(
        project=project_id
    )
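The suggestion above can be illustrated with a minimal, self-contained sketch. The class `BigQueryConfigSketch` and the function `resolve_lineage_project_id` are hypothetical names made up for this example and are not part of the PR. The sketch assumes both fields may be unset, so it uses `Optional[str]`, and it deliberately leaves out the real BigQuery client so that it runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BigQueryConfigSketch:
    """Hypothetical stand-in for the real config; only the two fields discussed here."""

    project_id: Optional[str] = None
    lineage_client_project_id: Optional[str] = None


def resolve_lineage_project_id(config: BigQueryConfigSketch) -> Optional[str]:
    """Pick the project for the lineage client without mutating the config."""
    if config.lineage_client_project_id is not None:
        return config.lineage_client_project_id
    # Fall back to the default project when no override was given.
    return config.project_id


config = BigQueryConfigSketch(project_id="data-project")
project_id = resolve_lineage_project_id(config)
# project_id would then be handed to the client constructor,
# e.g. BigQueryClient(project=project_id) in the code under review.
```

Because the fallback lives in a local variable, `config.lineage_client_project_id` stays `None` after the call, so any later code that inspects the config still sees what the user actually wrote.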
        if project_id is not None:
            return [GCPLoggingClient(**client_options, project=project_id)]
        else:
            return [GCPLoggingClient(**client_options)]

    def _choose_lineage_client_project_id(self) -> Optional[str]:
Minor: I'd make this _get for consistency and clarity (since this method doesn't actually accept any inputs).
Maybe this should be a property so the value is only computed once instead of on each function call?
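One way to read the "computed once" suggestion is a cached property. This is only a sketch: `LineageSourceSketch`, its attributes, and the `compute_count` counter are hypothetical and exist purely to make the caching visible. Note that a plain `@property` would still re-run its body on every access; `functools.cached_property` (Python 3.8+) is what stores the result after the first access.

```python
from functools import cached_property
from typing import Optional


class LineageSourceSketch:
    """Hypothetical stand-in for the ingestion source discussed in the review."""

    def __init__(
        self, project_id: Optional[str], lineage_client_project_id: Optional[str] = None
    ) -> None:
        self.project_id = project_id
        self.lineage_client_project_id = lineage_client_project_id
        self.compute_count = 0  # illustration only: counts how often the body runs

    @cached_property
    def resolved_lineage_project_id(self) -> Optional[str]:
        """Resolve the lineage client project; the body runs once per instance."""
        self.compute_count += 1
        if self.lineage_client_project_id is not None:
            return self.lineage_client_project_id
        return self.project_id


source = LineageSourceSketch(project_id="data-project")
first = source.resolved_lineage_project_id
second = source.resolved_lineage_project_id  # served from the cache
```

The trade-off is that a cached value will not reflect later changes to the underlying fields, which is acceptable only if the config is treated as fixed once the source is constructed.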
LGTM!
62b2c31 to 2e4f24d
LGTM
…fig (datahub-project#4138)
* feat(ingest): add lineage_client_project_id field to the bigquery config
* fix linting issues
* add type annotation for arguments
This allows users to specify which project to use when creating the BigQuery client, for cases where the default project_id is not the project used for querying.