Import missing source descriptions in documentation (table and columns) from catalog #128

Closed
fabrice-etanchaud opened this issue Aug 19, 2020 · 8 comments
Labels
good_first_issue Good for newcomers Stale

Comments

@fabrice-etanchaud

Describe the feature

In the generated documentation, populate source table and column descriptions from the catalog.

Additional context

From what I see, these comments are already in the catalog.json file.

Who will this benefit?

All users with well-documented source data.

Are you interested in contributing this feature?

I have not tried to pin down the affected code. My Python is weak, but my Jinja is strong, so why not?

@jtcohen6
Contributor

Hey @fabrice-etanchaud, interesting idea! When you say descriptions in the catalog, do you mean database comments? I'm assuming those are comments left by users or tools external to dbt.

There's a good question here of whether dbt should be in the habit of pulling or pushing information about sources. This proposal is in direct opposition to dbt-labs/dbt-core#2540, which proposes a run-operation for dbt to propagate its source descriptions as database comments on source tables.

@fabrice-etanchaud
Author

fabrice-etanchaud commented Aug 20, 2020

Hello @jtcohen6, glad to read that you find this interesting!

Yes, these are database comments on relations and their columns, created by the source application.
In case dbt's source documentation does not contain a given table or column description, this could be set to the comment found in the source db.

Documenting a source by hand is indeed tedious, and since dbt already populates its internal catalog with table and column comments, why not use them as default values?

Yes, I understand that it seems to contradict dbt-labs/dbt-core#2540, but IMHO these are complementary features!

Another solution could be to create an operation to export source db comments as a source description yaml file. But I honestly think that a simple "COALESCE(yaml_description, db_description)" could be of great help.
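
A minimal sketch of that COALESCE idea in Python, assuming the YAML descriptions and the database comments have already been loaded into plain dictionaries. The function names and the `description` / `comment` keys are hypothetical, chosen for illustration; they are not taken from dbt or dbt-docs code. As with SQL's COALESCE, only a missing (None) YAML description falls back to the database comment.

```python
from typing import Optional


def coalesce_description(yaml_description: Optional[str],
                         db_comment: Optional[str]) -> str:
    """Mimic COALESCE(yaml_description, db_description).

    Only None counts as "missing", as in SQL; an empty string is a value.
    """
    if yaml_description is not None:
        return yaml_description
    if db_comment is not None:
        return db_comment
    return ""


def fill_column_descriptions(yaml_columns: dict, catalog_columns: dict) -> dict:
    """Return {column_name: description}, preferring YAML over database comments.

    Both inputs are hypothetical plain dicts (assumed shapes):
      yaml_columns:    {"id": {"description": "Primary key"}}
      catalog_columns: {"id": {"comment": "pk from the source app"}}
    """
    merged = {}
    for name, catalog_entry in catalog_columns.items():
        # A column absent from the YAML has no description at all.
        yaml_entry = yaml_columns.get(name, {})
        merged[name] = coalesce_description(
            yaml_entry.get("description"),
            catalog_entry.get("comment"),
        )
    return merged
```

Whether an empty YAML description should also count as missing is a design choice; this sketch follows SQL semantics and treats it as a value.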

By the way, I would like to thank you for providing us with such a powerful tool. As a software engineer, I really love the way dbt brings safety, simplicity and expressiveness to the data world. Here in France, there are still only a handful of companies using dbt, and I am waiting for the day dbt proves itself to be the right way to go. In my current position, we are stuck on Oracle, and I am contributing to the dbt-oracle project.

@jtcohen6
Contributor

jtcohen6 commented Aug 24, 2020

Okay, I'm sold! You make some compelling points. I see how this would be a complement to persisting dbt source descriptions, yielding a gradient of control:

  • No worries: Let the ingestion tool document my sources in-database, and pull those descriptions into the dbt-docs site
  • Partial control: Mask the ingestion tool's descriptions for some tables/columns in the dbt-docs site
  • Total control: Override the ingestion tool's in-database descriptions for those tables/columns with a run-operation

The question on my mind: Is there a common instance in which a dbt developer would not want ingestion-tool-defined descriptions to show up in dbt-docs? Is there a risk it could contain PII, or misleading information? The remedy would be to override all table + column descriptions, e.g. by setting them to ''. (We've not yet implemented show: false for sources.) I can't think of any obvious cases right now, though, that would be cause for concern.
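
For the `''` remedy to work as a mask, the fallback has to tell "no description given" apart from "explicitly empty". A small Python sketch of the difference, with hypothetical function names; it illustrates the design choice only and does not describe how dbt-docs actually behaves.

```python
from typing import Optional


def fallback_on_falsy(yaml_description: Optional[str],
                      db_comment: Optional[str]) -> str:
    """Treat both None and '' as missing: an empty override cannot mask."""
    return yaml_description or db_comment or ""


def fallback_on_none(yaml_description: Optional[str],
                     db_comment: Optional[str]) -> str:
    """Treat only None as missing: '' hides the database comment."""
    if yaml_description is not None:
        return yaml_description
    return db_comment or ""


# Hypothetical database comment that a developer wants to hide.
db_comment = "raw email of the account owner"

leaked = fallback_on_falsy("", db_comment)  # the database comment shows up
masked = fallback_on_none("", db_comment)   # the empty override wins
```

With the truthiness check, the empty override is ignored and the database comment is displayed anyway; with the None check, the empty string wins and nothing is shown.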

So, this feels worthwhile! As the database descriptions are already in the catalog, I'm going to transfer the issue to the dbt-docs repo, since that's where the requisite code change will need to take place.

> Here in France, there are still only a handful of companies using dbt, and I am waiting for the day dbt proves itself to be the right way to go.

Oh, really? I'm planning to move to France soon (as soon as possible)

> In my current position, we are stuck on Oracle, and I am contributing to the dbt-oracle project.

You've reminded me that I need to review the PR to add dbt-oracle to docs.getdbt.com. On it!

@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-core Aug 24, 2020
@jtcohen6 jtcohen6 added the good_first_issue Good for newcomers label Aug 26, 2020
@fabrice-etanchaud
Author

By the way, @jtcohen6, were you kidding when you said you were planning to move to France? I truly think dbt has great potential in Europe over the next few years!

@jtcohen6
Contributor

Not kidding!

@fabrice-etanchaud
Author

Great! Do not hesitate to contact me once the decision is made (though I imagine you already have acquaintances in France if you are considering such a move)! By the way, dbt has potential all around the world ;-) so you have plenty of choice!

halvorlu added a commit to halvorlu/dbt-docs that referenced this issue Apr 11, 2022
…sing

This will display the column description fetched from the database if there is no column description in the DBT config.
Partially fixes dbt-labs#128, partially because the table description is still missing.
Adding the table description would require more substantial changes, since the table description (as far as I could tell) is not found in catalog.json at the moment.
@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
