Consider making ParentColumnFinder public? #28
Comments
I think we can do two things. On one hand, add a nice method to ColumnLineageExtractor:
    public Set<ColumnLineage> extractColumnLevelLineage(
        ResolvedQueryStmt statement, String outputTable) {
      List<ResolvedOutputColumn> outputColumns = statement.getOutputColumnList();
      return extractColumnLevelLineageForOutputColumns(outputTable, outputColumns, statement);
    }
On the other hand, I would still make ParentColumnFinder public. I'll add these changes in the next couple of days (unless you'd like to submit a PR yourself). Then we'll release it on |
I have opened pull request #29 to resolve this issue. Thanks for recommending an extractColumnLevelLineage overload on ColumnLineageExtractor; I have added that to the PR as well. |
Thank you very much! I'll review later today. |
Merged on #29 |
* Update ZetaSQL to version 2023.10.1
* Rewrite queries to fully quote all name paths before analysis
* Avoid no longer necessary nesting of tables in catalogs
* Update query rewriting to only re-quote name paths that refer to resources (i.e. tables, functions, etc.)
* Avoid no longer necessary nesting of functions in catalogs
* Avoid no longer necessary nesting of TVFs in catalogs
* Remove unnecessary slf4j dependency
  Closes #24
* Reduce code duplication in BigQueryCatalog
* Reduce the amount of nesting for procedures in catalogs
* Avoid duplicate code when creating different types of resources in catalogs
* Use the DOUBLE type kind for BigQuery FLOAT64 columns
  Closes #25
* Added extractColumnLevelLineage for ResolvedQueryStmt
  Closes #28
* Changed the access modifier of ParentColumnFinder to public
  Closes #28
* Make ColumnLineageExtractor accept only concrete statement types
  Previously, the API for ColumnLineageExtractor used the method ::extractColumnLevelLineage(ResolvedStatement). ResolvedQueryStmts need an output table to be specified separately, so keeping the generic ResolvedStatement API after adding support for them would have made it confusing: the method would have had to accept an optional output table.
  With this change, teams building lineage applications need to explicitly determine the statements they support and call the corresponding ::extractColumnLevelLineage() method, such as ::extractColumnLevelLineage(ResolvedInsertStmt) or ::extractColumnLevelLineage(ResolvedQueryStmt, String).
* Bump development version to 0.5.0-SNAPSHOT
* vuln-fix: Use HTTPS instead of HTTP to resolve deps CVE-2021-26291 (#30)
  This fixes a security vulnerability in this project where the `pom.xml` files were configuring Maven to resolve dependencies over HTTP instead of HTTPS.
  Weakness: CWE-829: Inclusion of Functionality from Untrusted Control Sphere
  Severity: High
  CVSS: 8.1
  Detection: CodeQL (https://codeql.github.com/codeql-query-help/java/java-maven-non-https-url/) & OpenRewrite (https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories)
  Reported-by: Jonathan Leitschuh <Jonathan.Leitschuh@gmail.com>
  Bug-tracker: JLLeitschuh/security-research#8
  Use this link to re-run the recipe: https://app.moderne.io/recipes/org.openrewrite.maven.security.UseHttpsForRepositories?organizationId=R29vZ2xl
  Co-authored-by: Moderne <team@moderne.io>
* Make the type parser case insensitive (#35)
  The type parser was previously case sensitive, while SQL types are case insensitive. This went unnoticed for a while because types are usually written in upper case, but it is fundamentally incorrect.
  Fixes #32
* Add reflection-based patching of GRPC's default max nesting depth (#36)
  ZetaSQL's Java API uses a GRPC service to call into the actual C++ implementation of ZetaSQL. By default, the serialization logic of that communication allows for a nesting depth in protobuf messages of up to 100. However, long queries can exceed that level of nesting and as a result cannot be analyzed by default.
  This implements a reflection-based patch that allows users to raise that limit. This is brittle by design and should be used with caution.
  Fixes #31
* Upgrade to zetasql-2024-03-01 and bump deps (#33)
  * Upgrade to zetasql-2024-03-01 and bump deps
  * Enable all features
  * Rollback Mockito to version 4.11.0
  * Remove some v1.4 language options not supported by BigQuery
  ---------
  Co-authored-by: Pablo Paglilla <ppaglilla@google.com>
* Add missing @return in catalog Javadocs
* Update version to v0.5.0
---------
Co-authored-by: Dion Ricky Saputra <dionrickysptr@gmail.com>
Co-authored-by: Jonathan Leitschuh <jonathan.leitschuh@gmail.com>
Co-authored-by: Moderne <team@moderne.io>
Co-authored-by: Erlend Hamnaberg <erlend@hamnaberg.net>
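The commit message above says callers now have to decide which concrete statement types they support and call the matching overload. A minimal sketch of that dispatch pattern might look like the following. It deliberately does not use the real ZetaSQL or zetasql-toolkit classes: `Statement`, `QueryStmt`, `InsertStmt`, `UpdateStmt`, `lineageTargets` and the `Set<String>` return type are hypothetical stand-ins chosen so the example is self-contained, and the real extractor returns `Set<ColumnLineage>` rather than column names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class LineageDispatchSketch {

  // Hypothetical stand-ins for resolved statement types (not the ZetaSQL classes).
  interface Statement {}

  static final class QueryStmt implements Statement {
    final List<String> outputColumns;

    QueryStmt(List<String> outputColumns) {
      this.outputColumns = outputColumns;
    }
  }

  static final class InsertStmt implements Statement {
    final String targetTable;
    final List<String> insertedColumns;

    InsertStmt(String targetTable, List<String> insertedColumns) {
      this.targetTable = targetTable;
      this.insertedColumns = insertedColumns;
    }
  }

  // A statement type this sketch intentionally does not support.
  static final class UpdateStmt implements Statement {}

  // A plain SELECT has no target of its own, so the output table is a parameter.
  static Set<String> extractColumnLevelLineage(QueryStmt stmt, String outputTable) {
    return qualify(outputTable, stmt.outputColumns);
  }

  // An INSERT already names its target table, so no extra argument is needed.
  static Set<String> extractColumnLevelLineage(InsertStmt stmt) {
    return qualify(stmt.targetTable, stmt.insertedColumns);
  }

  // The caller, not the extractor, decides which statement types are supported.
  static Set<String> lineageTargets(Statement stmt, String destinationTable) {
    if (stmt instanceof QueryStmt) {
      return extractColumnLevelLineage((QueryStmt) stmt, destinationTable);
    }
    if (stmt instanceof InsertStmt) {
      return extractColumnLevelLineage((InsertStmt) stmt);
    }
    throw new IllegalArgumentException(
        "Unsupported statement type: " + stmt.getClass().getSimpleName());
  }

  // Prefixes each column with its table, preserving column order.
  private static Set<String> qualify(String table, List<String> columns) {
    Set<String> qualified = new LinkedHashSet<>();
    for (String column : columns) {
      qualified.add(table + "." + column);
    }
    return qualified;
  }

  public static void main(String[] args) {
    Statement select = new QueryStmt(Arrays.asList("id", "total"));
    System.out.println(lineageTargets(select, "project.dataset.orders"));
  }
}
```

In this sketch an unsupported statement fails loudly instead of silently producing empty lineage, which is one way to read the intent of dropping the generic entry point.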
Hi,
My company's ETL pipeline makes heavy use of BigQuery's write disposition and destination table features, so the majority of our queries are plain SELECT statements. However, the ColumnLineageExtractor class provided by zetasql-toolkit-core doesn't cover our use case.
I'm currently writing a custom column lineage extractor to support SELECT statements, and I realized that the ParentColumnFinder class is not public. I need exactly the functionality this class provides. If it was made package-private by design, could you suggest what I should do instead?
Thanks