Parseable JSON and text output in quiet mode for dbt show and dbt compile #9958

Merged
dbeatty10 merged 25 commits into main from dbeatty/9840-quiet-show
Nov 19, 2024

@dbeatty10 dbeatty10 commented Apr 16, 2024

resolves #9840 (along with dbt-labs/dbt-common#216)

Problem

As described in #9840:

I would like the output of dbt show to be valid JSON, containing just the data and no logs.

dbt list uses the --quiet flag to isolate the desired output from the log output, allowing the results to be piped or redirected.

But currently, dbt show and dbt compile do not work similarly. Rather, the --quiet flag suppresses all output.

Solution

The latest solution relies on dbt-labs/dbt-common#216 so that CompiledNode and ShowNode keep the same event names in the JSON logs, while also allowing them to be emitted without timestamps even when --quiet is set.

When --quiet is set, it also skips extraneous output such as:

  • Previewing node 'my_model':
  • Previewing inline node:
  • Compiled node 'my_model' is:
  • Compiled inline node is:

🎩

$ dbt show --inline "select 1 as id" -q --output json | jq .   

{
  "show": [
    {
      "id": 1
    }
  ]
}
$ dbt compile --inline "select 1 as id" -q --output json | jq .

{
  "compiled": "select 1 as id"
}
$ dbt show --select my_model --log-format json --quiet
$ cat logs/dbt.log

...
{"data": {"is_inline": false, "node_name": "my_model", "output_format": "text", "preview": "| id |\n| -- |\n|  1 |\n", "unique_id": "model.my_project.my_model"}, "info": {"category": "", "code": "Q041", "extra": {}, "invocation_id": "52649b96-c807-4ccc-b3dc-516882c15104", "level": "info", "msg": "Previewing node 'my_model':\n| id |\n| -- |\n|  1 |\n", "name": "ShowNode", "pid": 19995, "thread": "MainThread", "ts": "2024-11-19T01:45:49.800578Z"}}
...
$ dbt compile --select my_model --log-format json --quiet
$ cat logs/dbt.log

...
{"data": {"compiled": "select 1 as id", "is_inline": false, "node_name": "my_model", "output_format": "text", "unique_id": "model.my_project.my_model"}, "info": {"category": "", "code": "Q042", "extra": {}, "invocation_id": "d46ba24d-5968-4384-9277-e460600570ec", "level": "info", "msg": "Compiled node 'my_model' is:\nselect 1 as id", "name": "CompiledNode", "pid": 21075, "thread": "MainThread", "ts": "2024-11-19T01:47:07.203861Z"}}
...
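
For anyone consuming this from a script, here is a minimal Python sketch of reading the quiet JSON output. It assumes that stdout contains only the JSON document shown in the examples above, that dbt is on PATH, and that the command runs inside a dbt project. The helper names (`parse_show_output`, `parse_compile_output`, `run_dbt_show_inline`) are hypothetical and not part of dbt.

```python
import json
import subprocess


def parse_show_output(stdout: str) -> list[dict]:
    """Return the preview rows from `dbt show -q --output json` stdout."""
    payload = json.loads(stdout)
    # The "show" key matches the example output in this PR description.
    return payload["show"]


def parse_compile_output(stdout: str) -> str:
    """Return the compiled SQL from `dbt compile -q --output json` stdout."""
    payload = json.loads(stdout)
    # The "compiled" key matches the example output in this PR description.
    return payload["compiled"]


def run_dbt_show_inline(sql: str) -> list[dict]:
    """Run `dbt show --inline` quietly and parse its stdout.

    Assumption: dbt is installed and the current directory is a dbt project.
    """
    result = subprocess.run(
        ["dbt", "show", "--inline", sql, "-q", "--output", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(result.stdout)
```

The parsing helpers are kept separate from the subprocess call so they can be exercised against captured output without a dbt installation.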

Initial Solution (not adopted)

The initial solution took exactly the same approach as dbt list, and was essentially copy-pasted from it.

Similar to how #10131 stopped using ListCmdOut in favor of PrintEvent, this PR stopped using CompiledNode and ShowNode in favor of PrintEvent.

👉 As a result, any consumers relying on CompiledNode or ShowNode appearing in JSON logs would no longer see those events and would see only PrintEvent instead. I switched to the latest solution to avoid unintentionally breaking anyone who creates and parses JSON logs for dbt show / dbt compile.

Before vs. after

Scenarios:

  • show vs. compile
  • --select vs. --inline
  • --quiet vs. --no-quiet
  • --output text vs. json
  • --log-format text vs. json
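
The five dimensions above multiply out to 32 invocations. A small Python sketch for enumerating them when comparing before and after behavior by hand; the model name `my_model` and the inline query are taken from the examples in this description, and the function name `scenario_commands` is hypothetical:

```python
from itertools import product

COMMANDS = ["show", "compile"]
TARGETS = [["--select", "my_model"], ["--inline", "select 1 as id"]]
QUIET_FLAGS = ["--quiet", "--no-quiet"]
OUTPUT_FORMATS = ["text", "json"]
LOG_FORMATS = ["text", "json"]


def scenario_commands() -> list[list[str]]:
    """Build one argv list per combination of the scenario dimensions."""
    commands = []
    for command, target, quiet, output, log_format in product(
        COMMANDS, TARGETS, QUIET_FLAGS, OUTPUT_FORMATS, LOG_FORMATS
    ):
        commands.append(
            ["dbt", command, *target, quiet,
             "--output", output, "--log-format", log_format]
        )
    return commands


if __name__ == "__main__":
    for argv in scenario_commands():
        print(" ".join(argv))
```

The printed lines are for reading only; the inline query contains spaces, so pass the argv lists to a process runner rather than pasting the joined strings into a shell.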

Example for initial solution

dbt show --select my_model --log-format json --quiet

logs/dbt.log before:

{"data": {"is_inline": false, "node_name": "my_model", "output_format": "text", "preview": "| event_id |   date_day |\n| -------- | ---------- |\n|        1 | 2002-02-02 |\n", "unique_id": "model.my_project.my_model"}, "info": {"category": "", "code": "Q041", "extra": {}, "invocation_id": "b295d589-99ba-450f-b261-0d60c5bcc195", "level": "info", "msg": "Previewing node 'my_model':\n| event_id |   date_day |\n| -------- | ---------- |\n|        1 | 2002-02-02 |\n", "name": "ShowNode", "pid": 96009, "thread": "MainThread", "ts": "2024-11-04T19:40:52.452943Z"}}

logs/dbt.log after:

{"data": {"msg": "| event_id |   date_day |\n| -------- | ---------- |\n|        1 | 2002-02-02 |\n"}, "info": {"category": "", "code": "Z052", "extra": {}, "invocation_id": "d0857f0c-b58b-4374-893c-c454aba291b5", "level": "info", "msg": "| event_id |   date_day |\n| -------- | ---------- |\n|        1 | 2002-02-02 |\n", "name": "PrintEvent", "pid": 96415, "thread": "MainThread", "ts": "2024-11-04T19:41:43.897976Z"}}

This shows a difference in the JSON messages in the logs if CompiledNode and ShowNode events are converted to PrintEvent events.

In contrast, the solution we decided upon keeps the CompiledNode and ShowNode events, so the logs are unchanged.
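
To make the compatibility concern concrete, here is a rough Python sketch of the kind of log consumer that would have been affected. It assumes one JSON object per line in logs/dbt.log, with the event name under `info.name` as in the log lines above; the function name `events_named` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def events_named(log_lines: Iterable[str], names: set[str]) -> Iterator[dict]:
    """Yield parsed JSON log events whose info.name is in `names`.

    Blank lines and lines that are not JSON objects are skipped.
    """
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        info = event.get("info")
        if isinstance(info, dict) and info.get("name") in names:
            yield event


def read_previews(log_path: str) -> list[str]:
    """Collect the data.preview text of every ShowNode event in a log file."""
    with open(log_path, encoding="utf-8") as handle:
        return [
            event["data"]["preview"]
            for event in events_named(handle, {"ShowNode"})
        ]
```

A consumer like `read_previews` would have returned nothing under the initial solution, because the same content would have arrived as a PrintEvent with the text under `data.msg` instead of `data.preview`.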

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests
  • This PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@cla-bot cla-bot bot added the cla:yes label Apr 16, 2024

codecov bot commented Jul 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.08%. Comparing base (945539e) to head (5175efd).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9958      +/-   ##
==========================================
- Coverage   89.13%   89.08%   -0.06%     
==========================================
  Files         183      183              
  Lines       23638    23644       +6     
==========================================
- Hits        21070    21063       -7     
- Misses       2568     2581      +13     
Flag Coverage Δ
integration 86.40% <100.00%> (-0.12%) ⬇️
unit 62.77% <62.50%> (-0.01%) ⬇️

Components Coverage Δ
Unit Tests 62.77% <62.50%> (-0.01%) ⬇️
Integration Tests 86.40% <100.00%> (-0.12%) ⬇️

@dbeatty10 dbeatty10 marked this pull request as ready for review November 5, 2024 17:35
@dbeatty10 dbeatty10 requested a review from a team as a code owner November 5, 2024 17:35
@dbeatty10

@b-per FYI this and dbt-labs/dbt-common#216 are meant to solve your request in #9840

@aranke aranke added the "proto update" label (update proto definitions in CI) Nov 8, 2024

@aranke aranke left a comment

LGTM, large diff is because of change in proto.

@dbeatty10 dbeatty10 changed the title from "Allow dbt show and dbt compile to output JSON without extra logs" to "Parseable JSON and text output in quiet mode for dbt show and dbt compile" Nov 19, 2024
@dbeatty10 dbeatty10 merged commit 2a75dd4 into main Nov 19, 2024
54 of 56 checks passed
@dbeatty10 dbeatty10 deleted the dbeatty/9840-quiet-show branch November 19, 2024 04:37
Labels
cla:yes proto update update proto definitions in CI
Development

Successfully merging this pull request may close these issues.

[Feature] Allow dbt show to output json data without extra logs (by updating --quiet)
3 participants