[Bug]: metadata freshness checks do not work properly #229

Closed
1 task done
KingLommel opened this issue May 23, 2024 · 1 comment · Fixed by #234
Labels
bug Something isn't working

Comments

@KingLommel

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When I run dbt source freshness without using the loaded_at_field keyword (see https://docs.getdbt.com/reference/resource-properties/freshness), then according to the documentation dbt should calculate freshness via warehouse metadata tables for supported adapters.

Although I checked the metadata tables, the freshness checks failed.

Using dbt's debug flag, I saw the following relevant lines:

...
dremio adapter: On source.<path>.<to>.<mytable>: select committed_at as last_modified,
                (SELECT CURRENT_TIMESTAMP()) as snapshotted_at
          from TABLE( table_snapshot(<path>.<to>.<mytable>) )
...

The main point is that the query selects every committed_at value rather than only the maximum, so it returns one row per snapshot instead of a single row with the latest commit. Therefore the freshness checks fail.
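
To make the difference in result shape concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `snapshots` table and its timestamps are hypothetical stand-ins for the output of table_snapshot(...); this is not Dremio itself, only an illustration of how the two select statements differ.

```python
import sqlite3

# Hypothetical stand-in for the rows returned by table_snapshot(...):
# one row per snapshot, each with its own committed_at timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("create table snapshots (committed_at text)")
conn.executemany(
    "insert into snapshots values (?)",
    [
        ("2024-05-21 08:00:00",),
        ("2024-05-22 08:00:00",),
        ("2024-05-23 08:00:00",),
    ],
)

# Shape of the original query: one result row per snapshot.
all_rows = conn.execute(
    "select committed_at as last_modified from snapshots"
).fetchall()

# Shape of the query with max(): a single row holding the latest commit.
latest_rows = conn.execute(
    "select max(committed_at) as last_modified from snapshots"
).fetchall()

print(len(all_rows))   # one entry per snapshot
print(latest_rows)     # a single row with the most recent committed_at
```

With three snapshots, the first query yields three rows while the second yields exactly one, which is the single value a freshness check needs.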


Suggested Solution:

Looking at dbt-labs/dbt-core#8307, I was able to find out which macro is responsible for this behaviour: dremio__get_relation_last_modified.

The macro is defined at https://github.com/dremio/dbt-dremio/blob/main/dbt/include/dremio/macros/adapters/metadata.sql:

{% macro dremio__get_relation_last_modified(information_schema, relations) -%}
  {% set relation = relations[0] %}
  {%- if relation.type != 'view' -%}

    {%- call statement('last_modified', fetch_result=True) -%}
          select committed_at as last_modified,
                {{ current_timestamp() }} as snapshotted_at
          from TABLE( table_snapshot('{{relation}}') )
    {%- endcall -%}
  {%- else -%}

  {%- endif -%}

  {{ return(load_result('last_modified')) }}

{% endmacro %}

Changing the select statement to use max solved all my problems:

{% macro dremio__get_relation_last_modified(information_schema, relations) -%}
  {% set relation = relations[0] %}
  {%- if relation.type != 'view' -%}

    {%- call statement('last_modified', fetch_result=True) -%}
          select max(committed_at) as last_modified,
                {{ current_timestamp() }} as snapshotted_at
          from TABLE( table_snapshot('{{relation}}') )
    {%- endcall -%}
  {%- else -%}

  {%- endif -%}

  {{ return(load_result('last_modified')) }}

{% endmacro %}
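
Why the max matters can be sketched in a few lines of Python. This is a simplified illustration, not dbt's actual implementation: the `freshness_status` helper, the thresholds, and the timestamps are all hypothetical, and the sketch assumes the consumer of the query result reads only a single row, so that an older snapshot may be picked up when several rows are returned.

```python
from datetime import datetime, timedelta


def freshness_status(last_modified, snapshotted_at, warn_after, error_after):
    """Classify a source by its age (hypothetical helper, not dbt code)."""
    age = snapshotted_at - last_modified
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


# Hypothetical committed_at values, one per snapshot, oldest first.
commits = [
    datetime(2024, 5, 1, 8, 0),
    datetime(2024, 5, 22, 8, 0),
    datetime(2024, 5, 23, 8, 0),
]
snapshotted_at = datetime(2024, 5, 23, 12, 0)
warn_after = timedelta(hours=12)    # assumed threshold
error_after = timedelta(hours=24)   # assumed threshold

# Without max(): an arbitrary (here: the oldest) snapshot is evaluated.
stale = freshness_status(commits[0], snapshotted_at, warn_after, error_after)

# With max(): only the most recent commit is evaluated.
fresh = freshness_status(max(commits), snapshotted_at, warn_after, error_after)

print(stale, fresh)
```

Under these assumptions the same table is classified as stale when an old snapshot row is read, and as fresh once only the latest committed_at is returned.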

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS: Ubuntu 22.04
- dbt-dremio: 1.7.0
- Dremio Software: 25.0.0
- Dremio Cloud: N/A

Relevant log output

No response

KingLommel added the bug label May 23, 2024
@ravjotbrar
Contributor

Thanks for bringing this to our attention @KingLommel. I'll take a look at implementing your fix.

@ravjotbrar ravjotbrar linked a pull request Jun 25, 2024 that will close this issue
ravjotbrar added a commit that referenced this issue Jun 26, 2024
### Summary

Metadata freshness checks were not working properly if `dbt source
freshness` was run without using the `loaded_at_field` keyword. This is
because we were not reducing the number of results to the max
committed_at value in our metadata select statement.

### Description

Added the max operator as suggested by @KingLommel. Also fixed the
relevant test to include more than one snapshot.

### Changelog

-   [x] Added a summary of what this PR accomplishes to CHANGELOG.md

### Related Issue

#229