Commit
* fix * fix: index * fix: requirements * fix: link
1 parent
31e429a
commit a396839
Showing
8 changed files
with
237 additions
and
14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
Advanced Usage | ||
============== | ||
|
||
Project Health Configuration | ||
---------------------------- | ||
|
||
You can configure the project health settings by providing a configuration file. The configuration file is a YAML file that contains the following fields: | ||
|
||
.. code-block:: yaml | ||
|
||
    version: v1 | ||
    # Insights to disable | ||
    disabled_insights: | ||
      - source_staging_model_integrity | ||
      - downstream_source_dependence | ||
      - Duplicate_Sources | ||
      - hard_coded_references | ||
      - rejoining_upstream_concepts | ||
      - model_fanout | ||
      - multiple_sources_joined | ||
    # Define patterns to identify different types of models | ||
    model_type_patterns: | ||
      staging: "^stg_.*" # Regex for staging models | ||
      mart: "^(mrt_|mart_|fct_|dim_).*" # Regex for mart models | ||
      intermediate: "^int_.*" # Regex for intermediate models | ||
      base: "^base_.*" # Regex for base models | ||
    # Configure insights | ||
    insights: | ||
      # Set minimum test coverage percent and severity for 'Low Test Coverage in DBT Models' | ||
      dbt_low_test_coverage: | ||
        min_test_coverage_percent: 30 | ||
        severity: WARNING | ||
      # Configure maximum fanout for 'Model Fanout Analysis' | ||
      model_fanout.max_fanout: 10 | ||
      # Configure maximum fanout for 'Source Fanout Analysis' | ||
      source_fanout.max_fanout: 10 | ||
      # Define model types considered as downstream for 'Staging Models Dependency Check' | ||
      staging_models_dependency.downstream_model_types: | ||
        - mart | ||
|
||
Key Sections of the config file | ||
------------------------------- | ||
|
||
- ``disabled_insights``: Insights that you want to disable. | ||
- ``model_type_patterns``: Regex patterns that identify model types such as staging, mart, intermediate, and base. | ||
- ``insights``: Custom configuration for individual insights. For each insight, you can set specific thresholds, severity levels, or other parameters. | ||
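
To make the ``model_type_patterns`` entry more concrete, here is a minimal Python sketch of how such regexes could map model names to types. The patterns are copied from the sample config; the ``classify_model`` helper and the sample model names are hypothetical and are not part of DataPilot, whose internal matching logic may differ.

```python
import re

# Patterns copied from the sample configuration file.
MODEL_TYPE_PATTERNS = {
    "staging": r"^stg_.*",
    "mart": r"^(mrt_|mart_|fct_|dim_).*",
    "intermediate": r"^int_.*",
    "base": r"^base_.*",
}


def classify_model(name, patterns=MODEL_TYPE_PATTERNS):
    """Return the first model type whose regex matches ``name``, else None.

    Hypothetical helper for illustration only.
    """
    for model_type, pattern in patterns.items():
        if re.match(pattern, name):
            return model_type
    return None


# Made-up model names, used only to exercise the patterns.
for name in ["stg_orders", "fct_revenue", "int_orders_joined", "orders"]:
    print(name, "->", classify_model(name))
```

A name that matches none of the patterns, such as ``orders`` here, is left unclassified in this sketch, so it is worth checking that your own naming convention is covered before relying on type-specific insights.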
|
||
Severity can take one of three values: ``INFO``, ``WARNING``, or ``ERROR``. | ||
|
||
Overriding default configs for the insights | ||
------------------------------------------- | ||
|
||
To change the severity level or set a threshold, modify the corresponding insight under the insights section. For example: | ||
|
||
.. code-block:: yaml | ||
|
||
    insights: | ||
      dbt_low_test_coverage: | ||
        severity: WARNING | ||
|
||
For insights with more complex configurations (such as fanout thresholds or model types), specify the insight name and the parameter together as a dotted key, ``<insight>.<parameter>``, under ``insights``. For example: | ||
|
||
.. code-block:: yaml | ||
|
||
    insights: | ||
      model_fanout.max_fanout: 10 |
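
The dotted-key layout can be read as "insight name, then parameter name". The Python sketch below only illustrates that naming convention by expanding dotted keys into a nested mapping. Both helper functions are hypothetical, and whether DataPilot processes the keys this way internally is an assumption.

```python
def split_insight_key(key):
    """Split 'insight.parameter' into (insight, parameter).

    The parameter is None when the key has no dot.
    """
    insight, _, parameter = key.partition(".")
    return insight, (parameter or None)


def normalize_insights(insights):
    """Expand dotted keys into a nested {insight: {parameter: value}} mapping."""
    nested = {}
    for key, value in insights.items():
        insight, parameter = split_insight_key(key)
        if parameter is None:
            # Key is already nested, e.g. dbt_low_test_coverage: {severity: WARNING}
            nested.setdefault(insight, {}).update(value)
        else:
            nested.setdefault(insight, {})[parameter] = value
    return nested


# Values taken from the sample configuration file.
example = {
    "dbt_low_test_coverage": {"min_test_coverage_percent": 30, "severity": "WARNING"},
    "model_fanout.max_fanout": 10,
    "staging_models_dependency.downstream_model_types": ["mart"],
}
print(normalize_insights(example))
```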
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
dbt | ||
=== | ||
|
||
project-health | ||
-------------- | ||
|
||
The ``project-health`` feature in DataPilot is a comprehensive tool designed to analyze and report on various aspects of your dbt project. | ||
|
||
How to Use | ||
^^^^^^^^^^ | ||
|
||
To use the ``project-health`` feature, follow these steps in your dbt project directory: | ||
|
||
Step 1: Generate a manifest file for your dbt project. | ||
|
||
.. code-block:: shell | ||
|
||
    dbt compile | ||
|
||
This command generates a manifest file for your dbt project under the configured ``target`` directory. By default, the file is written to ``target/manifest.json``. | ||
|
||
Step 2: Generate a catalog file for your dbt project. | ||
|
||
.. code-block:: shell | ||
|
||
    dbt docs generate | ||
|
||
This command generates a catalog file for your dbt project under the configured ``target`` directory. By default, the file is written to ``target/catalog.json``. | ||
|
||
Step 3: Run the ``project-health`` command. | ||
|
||
.. code-block:: shell | ||
|
||
    datapilot dbt project-health --manifest-path ./target/manifest.json --catalog-path ./target/catalog.json | ||
|
||
This command assesses your dbt project and provides insights on several key areas. |
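
The three steps can also be collected in a small script. The Python sketch below only builds the command lines, using the commands and flags shown in the steps; the ``project_health_commands`` helper and its ``target_dir`` parameter are hypothetical names introduced for illustration.

```python
def project_health_commands(target_dir="./target"):
    """Return the three commands from the steps as argument lists."""
    manifest = f"{target_dir}/manifest.json"
    catalog = f"{target_dir}/catalog.json"
    return [
        ["dbt", "compile"],            # Step 1: writes the manifest file
        ["dbt", "docs", "generate"],   # Step 2: writes the catalog file
        ["datapilot", "dbt", "project-health",
         "--manifest-path", manifest,
         "--catalog-path", catalog],   # Step 3: runs the health check
    ]


for command in project_health_commands():
    # Printed rather than executed; to run for real, pass each list to
    # subprocess.run(command, check=True) from the dbt project directory.
    print(" ".join(command))
```

Keeping the commands as argument lists avoids shell quoting problems if the target directory contains spaces.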
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,36 @@ | ||
======================== | ||
Installation | ||
======================== | ||
|
||
Prerequisites | ||
============= | ||
|
||
Before installing DataPilot, ensure you have the following prerequisites met: | ||
|
||
- Python 3.7 or higher installed on your machine. | ||
- Access to a command-line interface (CLI) to execute pip commands. | ||
- An existing dbt project to analyze with DataPilot. | ||
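
The Python version prerequisite can be checked with a few lines of Python. This is a minimal sketch: the ``python_is_supported`` helper is a hypothetical name, and the minimum version is the one stated in the list above.

```python
import sys

# Minimum version taken from the prerequisites list.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the major.minor version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version is supported")
else:
    print("Python 3.7 or higher is required")
```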
|
||
Installation | ||
============ | ||
|
||
To install DataPilot, open your CLI and run the following command: | ||
|
||
.. code-block:: shell | ||
|
||
    pip install altimate-datapilot | ||
|
||
This command downloads and installs the latest version of DataPilot along with its dependencies. | ||
|
||
QuickStart | ||
========== | ||
|
||
Once DataPilot is installed, you can set it up to work with your dbt project. | ||
|
||
Execute the following command to perform a health check on your dbt project: | ||
|
||
.. code-block:: shell | ||
|
||
    datapilot dbt project-health --manifest-path /path/to/manifest.json --catalog-path /path/to/catalog.json | ||
|
||
After running the command, DataPilot reports insights into your dbt project's health. Review the insights and make any necessary adjustments to your project. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
========================= | ||
Introduction to DataPilot | ||
========================= | ||
|
||
What is DataPilot? | ||
================== | ||
|
||
DataPilot is an AI-powered assistant for data engineers and analysts who work with SQL and dbt (data build tool). It integrates into the development environment and provides real-time insights and suggestions that help uphold best practices and improve the quality of data projects. | ||
|
||
With DataPilot, teams can automate the review process for their SQL queries and dbt models, ensuring that their data transformations are efficient, well-documented, and maintainable. It also facilitates organization-wide consistency by enforcing project standards through integration with version control systems and continuous integration/continuous deployment (CI/CD) pipelines. | ||
|
||
Key Features | ||
============= | ||
|
||
DataPilot comes with a host of features aimed at improving data project management: | ||
|
||
- **Insightful Analysis:** DataPilot performs in-depth analysis of SQL code and dbt projects, highlighting areas of concern such as model fanouts, hard-coded references, and potential duplications. | ||
|
||
- **Seamless Integration:** It can be easily integrated into local development environments as well as Git workflows and CI/CD pipelines, making it a versatile tool for teams of all sizes. | ||
|
||
- **Early Detection:** By identifying potential issues early in the development cycle, DataPilot helps prevent costly and time-consuming fixes down the line. | ||
|
||
- **Best Practice Enforcement:** DataPilot encourages the adoption of best practices in SQL and dbt project development, aiding in the maintenance of high-quality data models. | ||
|
||
- **Automated Checks:** The tool includes a range of automated checks for detecting unused sources, ensuring dependency integrity, and encouraging comprehensive testing and documentation. | ||
|
||
How DataPilot Works | ||
==================== | ||
|
||
DataPilot operates by scanning your SQL and dbt project files, identifying patterns and structures that indicate potential problems or deviations from best practices. Once an issue is detected, it provides feedback and recommendations on how to address it. | ||
|
||
For dbt projects, DataPilot makes use of the manifest and catalog files generated by dbt to perform its analysis. This ensures that the insights provided are based on the most up-to-date view of your project's state. |
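
As an illustration of the kind of information a manifest holds, the Python sketch below computes a simple test coverage percentage from the ``nodes`` section of a manifest. The ``model_test_coverage_percent`` helper and the sample data are hypothetical, and DataPilot's own definition of test coverage may differ from this one.

```python
def model_test_coverage_percent(manifest):
    """Percentage of models that at least one test node depends on."""
    nodes = manifest.get("nodes", {})
    models = {
        uid for uid, node in nodes.items()
        if node.get("resource_type") == "model"
    }
    tested = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            parents = node.get("depends_on", {}).get("nodes", [])
            tested.update(uid for uid in parents if uid in models)
    if not models:
        return 0.0
    return 100.0 * len(tested) / len(models)


# Tiny hand-written stand-in for target/manifest.json (not real project data).
sample_manifest = {
    "nodes": {
        "model.demo.stg_orders": {"resource_type": "model"},
        "model.demo.fct_revenue": {"resource_type": "model"},
        "test.demo.not_null_stg_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.demo.stg_orders"]},
        },
    }
}

coverage = model_test_coverage_percent(sample_manifest)
print(f"coverage: {coverage:.0f}%")
```

For a real project, the dictionary would come from ``json.load`` on the manifest file produced by ``dbt compile``.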
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
sphinx>=1.3 | ||
furo | ||
sphinx_rtd_theme |