Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add zero length option to exclusive range test. #307

Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 29 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -326,6 +326,15 @@ models:
upper_bound_column: ended_at
partition_by: customer_id
gaps: required

# test that each customer can have subscriptions that start and end on the same date
- name: subscriptions
tests:
- dbt_utils.mutually_exclusive_ranges:
lower_bound_column: started_at
upper_bound_column: ended_at
partition_by: customer_id
zero_length: allowed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of this instead?

Suggested change
zero_length: allowed
zero_length_range_allowed: true

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that, but I thought it would be best to stick with the required/allowed/not_allowed convention used in gaps parameter. Happy to change if you feel strongly.

Copy link
Contributor

@clrcrl clrcrl Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a true/false is nicer here — had to use an enum for the above since there were three different options 😓

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All set :)

One other thing:
I feel like the lack of negative test coverage is concerning for this project. I would like to be able to make test cases where the test (specifically, the test that is meant to be reused in other code) fails. As it stands we only have tests where the tested model passes the test.

One kluge that we could use to solve this quickly is to put tags on tests that should fail, exclude them from the standard test run, and run the tests that should fail with the expectation that it will return a failure code.

To me a better way would be to add an optional flag expect: pass/warn/fail to the test.config and then invert the results appropriately here:
https://github.com/fishtown-analytics/dbt/blob/dev/kiyoshi-kuromiya/core/dbt/task/test.py#L95

I would love to hear your thoughts, so I can decide where to make my PR, Thanks!

```
**Args:**
* `lower_bound_column` (required): The name of the column that represents the
Expand All @@ -337,6 +346,8 @@ upper value of the range. Must be not null.
argument to indicate which column to partition by. `default=none`
* `gaps` (optional): Whether there can be gaps are allowed between ranges.
`default='allowed', one_of=['not_allowed', 'allowed', 'required']`
* `zero_length` (optional): Whether ranges can start and end on the same date.
`default='not_allowed', one_of=['not_allowed', 'allowed']`

**Note:** Both `lower_bound_column` and `upper_bound_column` should be not null.
If this is not the case in your data source, consider passing a coalesce function
Expand Down Expand Up @@ -383,6 +394,24 @@ the lower bound of the next record (common for date ranges).
| 2 | 3 |
| 4 | 5 |

**Understanding the `zero_length` parameter:**
Here are a number of examples for each allowed `zero_length` parameter.
* `zero_length:not_allowed`: The upper bound of each record must be greater than its lower bound.

| lower_bound | upper_bound |
|-------------|-------------|
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |

* `zero_length:allowed`: The upper bound of each record can be greater than or equal to its lower bound.

| lower_bound | upper_bound |
|-------------|-------------|
| 0 | 1 |
| 2 | 2 |
| 3 | 4 |

#### unique_combination_of_columns ([source](macros/schema_tests/unique_combination_of_columns.sql))
This test confirms that the combination of columns is unique. For example, the
combination of month and product is unique, however neither column is unique
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
subscription_id,valid_from,valid_to
3,2020-05-06,2020-05-07
3,2020-05-08,2020-05-08
3,2020-05-09,2020-05-10
4,2020-06-06,2020-06-07
4,2020-06-08,2020-06-08
4,2020-06-09,2020-06-10
8 changes: 8 additions & 0 deletions integration_tests/models/schema_tests/schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,14 @@ models:
partition_by: subscription_id
gaps: required

- name: data_test_mutually_exclusive_ranges_with_gaps_zero_length
tests:
- dbt_utils.mutually_exclusive_ranges:
lower_bound_column: valid_from
upper_bound_column: valid_to
partition_by: subscription_id
zero_length: allowed

- name: data_unique_combination_of_columns
tests:
- dbt_utils.unique_combination_of_columns:
Expand Down
20 changes: 15 additions & 5 deletions macros/schema_tests/mutually_exclusive_ranges.sql
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
{% macro test_mutually_exclusive_ranges(model, lower_bound_column, upper_bound_column, partition_by=None, gaps='allowed') %}
{% macro test_mutually_exclusive_ranges(model, lower_bound_column, upper_bound_column, partition_by=None, gaps='allowed', zero_length='not_allowed') %}

{% if gaps == 'not_allowed' %}
{% set allow_gaps_operator='=' %}
Expand All @@ -13,7 +13,17 @@
{{ exceptions.raise_compiler_error(
"`gaps` argument for mutually_exclusive_ranges test must be one of ['not_allowed', 'allowed', 'required'] Got: '" ~ gaps ~"'.'"
) }}

{% endif %}
{% if zero_length == 'not_allowed' %}
{% set allow_zero_length_operator='<' %}
{% set allow_zero_length_operator_in_words='less_than' %}
{% elif zero_length == 'allowed' %}
{% set allow_zero_length_operator='<=' %}
{% set allow_zero_length_operator_in_words='less_than_or_equal_to' %}
{% else %}
{{ exceptions.raise_compiler_error(
"`zero_length` argument for mutually_exclusive_ranges test must be one of ['not_allowed', 'allowed'] Got: '" ~ zero_length ~"'.'"
) }}
{% endif %}

{% set partition_clause="partition by " ~ partition_by if partition_by else '' %}
Expand Down Expand Up @@ -51,9 +61,9 @@ calc as (
-- Coalesce it to return an error on the null case (implicit assumption
-- these columns are not_null)
coalesce(
lower_bound < upper_bound,
lower_bound {{ allow_zero_length_operator }} upper_bound,
false
) as lower_bound_less_than_upper_bound,
) as lower_bound_{{ allow_zero_length_operator_in_words }}_upper_bound,

-- For each record: upper_bound {{ allow_gaps_operator }} the next lower_bound.
-- Coalesce it to handle null cases for the last record.
Expand All @@ -75,7 +85,7 @@ validation_errors as (

where not(
-- THE FOLLOWING SHOULD BE TRUE --
lower_bound_less_than_upper_bound
lower_bound_{{ allow_zero_length_operator_in_words }}_upper_bound
and upper_bound_{{ allow_gaps_operator_in_words }}_next_lower_bound
)
)
Expand Down