Import missing source descriptions in documentation (table and columns) from catalog #128
Comments
Hey @fabrice-etanchaud, interesting idea! When you say descriptions in the catalog, do you mean database comments? I'm assuming those are comments left by users or tools external to dbt. There's a good question here of whether dbt should be in the habit of pulling or pushing information about sources. This proposal is in direct opposition to dbt-labs/dbt-core#2540, which proposes a run-operation for dbt to propagate its source descriptions as database comments on source tables.
Hello @jtcohen6, glad to read that you find this interesting! Yes, these are database comments on relations and their columns, created by the source application. Manually documenting a source is indeed tedious, and since dbt already populates its internal catalog with table and column comments, why not use them as default values? I understand this seems to contradict dbt-labs/dbt-core#2540, but IMHO these are complementary features! Another solution could be an operation that exports source database comments as a source description YAML file. But I honestly think that a simple "COALESCE(yaml_description, db_description)" could be of great help. By the way, I would like to thank you for providing us with such a powerful tool. As a software engineer, I really love the way dbt brings safety, simplicity and expressiveness to the data world. Here in France, only a handful of companies use dbt so far, and I am looking forward to the day dbt proves itself as the right way to go. In my current position, we are stuck on Oracle, and I am contributing to the dbt-oracle project.
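The fallback suggested above can be sketched in Python as follows. This is only an illustration, not code from dbt or dbt-docs: the function name `coalesce_description` is hypothetical, and the sketch assumes a missing description is represented as `None`, mirroring SQL `COALESCE`, which skips only NULL values.

```python
from typing import Optional


def coalesce_description(
    yaml_description: Optional[str],
    db_description: Optional[str],
) -> Optional[str]:
    """Return the YAML description if set, otherwise the database comment.

    Hypothetical helper: as with SQL COALESCE, only None counts as
    missing, so an empty string in YAML still takes precedence.
    """
    if yaml_description is not None:
        return yaml_description
    return db_description


if __name__ == "__main__":
    # No YAML description: the database comment is used as the default.
    print(coalesce_description(None, "Customer id from the source app"))
    # A YAML description is present: it wins over the database comment.
    print(coalesce_description("Primary key", "Customer id from the source app"))
```

Whether a blank string should also count as missing is a separate design choice that this sketch does not settle.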
Okay, I'm sold! You make some compelling points. I see how this would be a complement to persisting dbt source descriptions, yielding a gradient of control:
The question on my mind: Is there a common instance in which a dbt developer would not want ingestion-tool-defined descriptions to show up in dbt-docs? Is there a risk it could contain PII, or misleading information? The remedy would be to override all table + column descriptions, e.g. by setting them to So, this feels worthwhile! As the database descriptions are already in the catalog, I'm going to transfer the issue to the dbt-docs repo, since that's where the requisite code change will need to take place.
Oh, really? I'm planning to move to France soon (as soon as possible).
You've reminded me that I need to review the PR to add dbt-oracle to docs.getdbt.com. On it!
By the way, @jtcohen6, were you kidding when you said you were planning to move to France? I truly think dbt has great potential in Europe in the coming years!
Not kidding!
Great! Do not hesitate to contact me when the decision is made (though I expect you already have acquaintances in France if you are considering such a change)! By the way, dbt has potential all around the world ;-) so you have the choice!
…sing This will display the column description fetched from the database if there is no column description in the DBT config. Partially fixes dbt-labs#128, partially because the table description is still missing. Adding the table description would require more substantial changes, since the table description (as far as I could tell) is not found in catalog.json at the moment.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
Describe the feature
In the generated documentation, populate source table and column descriptions from the catalog.
Additional context
From what I see, these comments are already in the catalog.json file.
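As a rough illustration of where such comments might be read from, here is a minimal Python sketch. The layout it relies on (a top-level `sources` mapping keyed by source id, each entry holding a `columns` mapping whose values carry a `comment` field) is an assumption about catalog.json, and the sample data and the function name `column_comments` are invented for the example.

```python
import json
from typing import Dict

# Hand-written sample, shaped the way this sketch ASSUMES catalog.json looks.
SAMPLE_CATALOG = json.loads("""
{
  "sources": {
    "source.my_project.erp.customers": {
      "columns": {
        "ID": {"name": "ID", "comment": "Customer identifier"},
        "NAME": {"name": "NAME", "comment": null}
      }
    }
  }
}
""")


def column_comments(catalog: dict, source_id: str) -> Dict[str, str]:
    """Map column name to database comment, skipping columns without one."""
    columns = catalog.get("sources", {}).get(source_id, {}).get("columns", {})
    return {
        name: meta["comment"]
        for name, meta in columns.items()
        if meta.get("comment")  # drops both null and empty comments
    }


if __name__ == "__main__":
    print(column_comments(SAMPLE_CATALOG, "source.my_project.erp.customers"))
```

Columns with no comment are dropped here, so a caller could keep whatever description it already has for them.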
Who will this benefit?
All users with well documented source data.
Are you interested in contributing this feature?
I have not tried to pin down the code that would be affected. My Python is too poor, but my Jinja is rich, so why not?