
Commit

Updated list of default data observability checks in the getting started article.
piotrczarnas committed Mar 4, 2024
1 parent fdd4a85 commit 8d97030
Showing 1 changed file with 41 additions and 33 deletions.
74 changes: 41 additions & 33 deletions docs/getting-started/review-results-and-run-monitoring-checks.md
@@ -9,39 +9,47 @@ we describe how to review the initial results from the basic statistics and prof
Once new tables are imported, DQOps automatically activates the following profiling and monitoring checks.
To learn more about each check, click on the links below.

**Profiling checks type**

| Target | Check name | Description |
|--------|-------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| table | [profile row count](../checks/table/volume/row-count.md) | Counts the number of rows in a table. |
| table | [profile column count](../checks/table/schema/column-count.md) | Retrieves the metadata of the monitored table from the data source, counts the number of columns and compares it to an expected number of columns. |
| column | [profile nulls count](../checks/column/nulls/nulls-count.md) | Ensures that there are no more than a set number of null values in the monitored column. |
| column | [profile nulls percent](../checks/column/nulls/nulls-percent.md) | Ensures that there are no more than a set percentage of null values in the monitored column. |
| column | [profile_not_nulls_count](../checks/column/nulls/not-nulls-count.md) | Ensures that there are no more than a set number of null values in the monitored column. |

**Daily monitoring checks type**

| Target | Check name | Description |
|--------|--------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| table | [daily row count](../checks/table/volume/row-count.md) | Counts the number of rows in a table. |
| table | [daily row count anomaly](../checks/table/volume/row-count-anomaly.md) | Ensures that the row count is within a two-tailed percentile from measurements made during the last 90 days. |
| table | [daily row count change](../checks/table/volume/row-count-change.md) | Ensures that the row count changed by a fixed rate since the last readout. |
| table | [daily table availability](../checks/table/availability/table-availability.md) | Verifies that a table exists, can be accessed, and queried without errors. |
| table | [daily column count](../checks/table/schema/column-count.md) | Retrieves the metadata of the monitored table from the data source, counts the number of columns and compares it to an expected number of columns. |
| table | [daily column count changed](../checks/table/schema/column-count-changed.md) | Detects whether the number of columns in a table has changed since the last time the check (checkpoint) was run. |
| table | [daily column list changed](../checks/table/schema/column-list-changed.md) | Detects if the list of columns has changed since the last time the check was run. |
| table | [daily column list or order changed](../checks/table/schema/column-list-or-order-changed.md) | Detects whether the list of columns and the order of columns have changed since the last time the check was run. |
| table | [daily column types changed](../checks/table/schema/column-types-changed.md) | Detects if the column names or column types have changed since the last time the check was run. |
| column | [daily nulls count](../checks/column/nulls/nulls-count.md) | Ensures that there are no more than a set number of null values in the monitored column. |
| column | [daily nulls percent](../checks/column/nulls/nulls-percent.md) | Ensures that there are no more than a set percentage of null values in the monitored column. |
| column | [daily not nulls count](../checks/column/nulls/not-nulls-count.md) | Ensures that there are no more than a set number of null values in the monitored column. |
| column | [daily not nulls percent](../checks/column/nulls/not-nulls-percent.md) | Ensures that there are no more than a set percentage of not null values in the monitored column. |
| column | [daily nulls percent anomaly](../checks/column/nulls/nulls-percent-anomaly.md) | Ensures that the null percent value in a monitored column is within a two-tailed percentile from measurements made during the last 90 days. |
| column | [daily nulls percent change 1 day](../checks/column/nulls/nulls-percent-change-1-day.md) | Ensures that the null percent in a monitored column has changed by a fixed rate since the last readout from yesterday. |
| column | [daily_distinct_count_anomaly](../checks/column/uniqueness/distinct-count-anomaly.md) | Ensures that the distinct count in a monitored column is within a two-tailed percentile from measurements made during the last 90 days |
| column | [daily detected datatype in text changed](../checks/column/datatype/detected-datatype-in-text-changed.md) | Scans all values in a string column and detects the data type of all values in a column. |
| column | [daily column exists](../checks/column/schema/column-exists.md) | Reads the metadata of the monitored table and verifies that the column still exists in the data source. |
| column | [daily column type changed](../checks/column/schema/column-type-changed.md) | Detects if the data type of the column has changed since the last time it was retrieved. |
### Default data profiling checks
DQOps activates the following [default data quality checks](../dqo-concepts/data-observability.md) to
perform [initial data profiling](../dqo-concepts/definition-of-data-quality-checks/data-profiling-checks.md#initial-data-quality-kpi-score).

| Target | Check name | Description |
|--------|------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| table | [profile row count](../checks/table/volume/row-count.md) | Captures the row count of the table and identifies empty tables. |
| table | [profile data freshness](../checks/table/timeliness/data-freshness.md) | Measures data freshness of the table. |
| table | [profile column count](../checks/table/schema/column-count.md) | Retrieves the metadata of the monitored table from the data source and captures the column count. |
| column | [profile nulls count](../checks/column/nulls/nulls-count.md) | Counts null values in every column and detects incomplete columns that contain null values. |
| column | [profile nulls percent](../checks/column/nulls/nulls-percent.md) | Measures the percentage of null values in every column. |
| column | [profile not nulls count](../checks/column/nulls/not-nulls-count.md) | Counts not null values in every column and detects empty columns that contain only null values. |

### Default daily monitoring checks
DQOps activates the following [daily monitoring checks](../dqo-concepts/definition-of-data-quality-checks/data-observability-monitoring-checks.md)
on every table and column to apply [data observability](../dqo-concepts/data-observability.md) to the data source.

| Target | Check name | Description |
|--------|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| table | [daily row count](../checks/table/volume/row-count.md) | Captures the row count of the table every day and identifies empty tables. |
| table | [daily row count anomaly](../checks/table/volume/row-count-anomaly.md) | Detects day-to-day anomalies in the table volume. Raises a data quality issue when the increase or decrease in the row count is in the top 1% of most significant changes. |
| table | [daily row count change](../checks/table/volume/row-count-change.md) | Detects significant day-to-day changes in the table volume. Raises a data quality issue when the increase or decrease in the row count is greater than 10%. |
| table | [daily data freshness](../checks/table/timeliness/data-freshness.md) | Measures data freshness of the table. Raises a data quality issue when the data is outdated by 2 days. |
| table | [daily data staleness](../checks/table/timeliness/data-staleness.md) | Measures data staleness (the time since the last data loading) of the table. Raises a data quality issue when the table was not updated for 2 or more days. |
| table | [daily table availability](../checks/table/availability/table-availability.md) | Verifies that a table exists, can be accessed, and queried without errors. Detects corrupted tables and expired credentials to data sources. |
| table | [daily column count](../checks/table/schema/column-count.md) | Retrieves the metadata of the monitored table from the data source and counts the number of columns. |
| table | [daily column count changed](../checks/table/schema/column-count-changed.md) | Detects whether the number of columns in a table has changed since the last time the check (checkpoint) was run. |
| table | [daily column list changed](../checks/table/schema/column-list-changed.md) | Detects if the list of columns has changed since the last time the check was run. |
| table | [daily column list or order changed](../checks/table/schema/column-list-or-order-changed.md) | Detects whether the list of columns and the order of columns have changed since the last time the check was run. |
| table | [daily column types changed](../checks/table/schema/column-types-changed.md) | Detects if the column names or column types have changed since the last time the check was run. |
| column | [daily nulls count](../checks/column/nulls/nulls-count.md) | Counts null values in every column without raising any data quality issues. |
| column | [daily nulls percent](../checks/column/nulls/nulls-percent.md) | Measures the percentage of null values in every column without raising any data quality issues. |
| column | [daily nulls percent anomaly](../checks/column/nulls/nulls-percent-anomaly.md)                           | Measures the percentage of null values in every column and detects anomalous changes in the percentage of null values. Raises a data quality issue for the top 1% biggest day-to-day changes.                                             |
| column | [daily not nulls percent](../checks/column/nulls/not-nulls-percent.md)                                   | Measures the percentage of not null values in every column without raising any data quality issues.                                                                                                                                        |
| column | [daily nulls percent change](../checks/column/nulls/nulls-percent-change.md) | Detects significant day-to-day changes in the percentage of null values in every column. Raises a data quality issue when the increase or decrease in the percentage of null values is greater than 10%. |
| column | [daily distinct count anomaly](../checks/column/uniqueness/distinct-count-anomaly.md)                    | Counts distinct values in every column and detects anomalous changes in the count of distinct values. Raises a data quality issue for the top 1% biggest day-to-day changes of the count of distinct values.                              |
| column | [daily detected datatype in text changed](../checks/column/datatype/detected-datatype-in-text-changed.md) | Scans all values in a text column and detects the data type of all values in a column. Raises a data quality issue when the type of texts changes. For example, when a column always contained numeric values, but a text value was found. |
| column | [daily sum anomaly](../checks/column/anomaly/sum-anomaly.md) | Sums values in all numeric columns. Detects day-to-day anomalies in the sum of numeric values. Raises a data quality issue for the top 1% biggest day-to-day changes. |
| column | [daily mean anomaly](../checks/column/anomaly/mean-anomaly.md) | Calculates a mean (average) value in all numeric columns. Detects day-to-day anomalies of the mean of numeric values. Raises a data quality issue for the top 1% biggest day-to-day changes. |
| column | [daily column exists](../checks/column/schema/column-exists.md) | Reads the metadata of the monitored table and verifies that the column still exists in the data source. |
| column | [daily column type changed](../checks/column/schema/column-type-changed.md) | Detects if the data type of the column has changed since the last time it was retrieved. |

All checks are scheduled to run daily at 12:00 a.m.

