content: update heading tags hierarchy

surveilr · Oct 10, 2024 · c95a86e
1 parent 1e755ba
commit c95a86e

Showing 5 changed files with 81 additions and 69 deletions.
diff --git a/src/content/blog/en/power-of-sql-and-sql-views.md b/src/content/blog/en/power-of-sql-and-sql-views.md
@@ -15,13 +15,13 @@ tags: ["SQL","Views" , "RSSD" ]
 
 In the world of data integration and processing, flexibility and extensibility are paramount. The ability to easily prepare, integrate, and analyze data from multiple sources is critical. `surveilr`, with its stateful, local-first, and edge-based architecture, stands out as a powerful solution for these challenges. One of the key strengths of `surveilr` is its SQL-centric nature, making it both flexible and extendable. And when it comes to maximizing this power, there’s no better tool in the SQL toolbox than SQL views.
 
-### The Foundation: SQL in ``surveilr``
+## The Foundation: SQL in ``surveilr``
 
 At the core of ``surveilr`` is its **SQL-centric approach**. Every piece of data it processes is queryable using SQL, allowing users to manipulate and organize information in a way that best fits their workflow. This SQL-centric design makes it easy to set up data pipelines, perform complex transformations, and create relationships between disparate data sources. Whether you're working with clinical operations data, auditing evidence collection and reporting, pharmacy records, billing information, or any other type of clinical or non-clinical data, SQL provides a robust foundation to access and modify that data seamlessly and effortlessly.
 
 However, while basic SQL queries can deliver tremendous value, `surveilr`’s real potential can be unleashed with the use of **SQL views**.
 
-### What are SQL Views?
+## What are SQL Views?
 
 An **SQL view** is a virtual table defined by a query. It does not store data itself but acts as a window through which you can view and interact with data stored in underlying tables. Essentially, a view abstracts away the complexity of a query, letting users interact with data as though it were a single unified table.
 
@@ -31,7 +31,7 @@ Here’s why views are so powerful in the context of surveilr:
 -  **Data Abstraction**: Views provide a layer of abstraction, allowing you to hide certain complexities or fields from users who may not need access to all the underlying data. For example, you could create a view that shows only anonymized or deidentified data for specific use cases, ensuring HIPAA compliance without sacrificing usability.
 - **Data Consistency**: By defining a view, you ensure that everyone accessing the data sees the same results based on a consistent underlying query. This reduces errors and ensures that reports or analyses built on top of those views are based on uniform data.
 
-### Extending `surveilr`’s Power with SQL Views
+## Extending `surveilr`’s Power with SQL Views
 
 `surveilr` already excels at integrating data from multiple sources—be it clinical records, billing data, or operational logs. But the real power of `surveilr` comes when you use SQL views to extend its capabilities. Let’s explore how SQL views make `surveilr` even more powerful:
 
@@ -48,7 +48,7 @@ SQL views enable the creation of custom datasets that can be fed directly into B
 Views provide a way to control what data different users or systems can see. By setting up views that show only the fields or records that a user needs, `surveilr` users can maintain security and regulatory compliance. For example, you could create views that display anonymized patient data for non-clinical staff while allowing full access for medical personnel.
 
 
-### Real-World Examples of Extending `surveilr` with SQL Views
+## Real-World Examples of Extending `surveilr` with SQL Views
 
 - **Customized Reporting**: Create views to aggregate data from multiple sources, generating customized reports.
  - **Data Validation**: Use views to validate data against specific criteria, ensuring data quality and integrity.
@@ -59,7 +59,7 @@ Views provide a way to control what data different users or systems can see. By
 -  **Improved Security**: Views enable fine-grained access control, ensuring sensitive data is only accessible to authorized users.
 
 
-### Real-World Example: Using SQL Views for Healthcare Data Integration
+## Real-World Example: Using SQL Views for Healthcare Data Integration
 
 Imagine a healthcare provider using ``surveilr`` to integrate data from different departments—clinical records, pharmacy, and billing. The provider wants to track patient progress and costs without exposing sensitive information unnecessarily.
 
@@ -72,7 +72,7 @@ Using SQL views, the provider can create:
 These views can be reused across the organization, ensuring that each department gets exactly what it needs while maintaining security and consistency across all datasets.
 
 
-### Conclusion
+## Conclusion
 
 `surveilr`'s SQL-centric architecture is already a game-changer for integrating and analyzing data from multiple systems. However, its potential truly shines when extended using SQL views. Views allow you to simplify complex queries, combine data in powerful ways, and enhance both security and compliance. They enable you to preprocess and transform data effortlessly, all while keeping the underlying system flexible and scalable.
 

diff --git a/src/content/blog/en/rssd-excel-portability-sql-power.md b/src/content/blog/en/rssd-excel-portability-sql-power.md
@@ -36,22 +36,22 @@ Surveillance State Database (RSSD)** as something similar to **Microsoft
 Excel**—a tool most of us have used at some point. Here’s a breakdown of how the
 RSSD works, using an Excel analogy:
 
-### **RSSD is Like an Excel Workbook**
+### RSSD is Like an Excel Workbook
 
 Just like an Excel workbook is a **single file** that contains all of your data,
 an RSSD is also a **single file** that contains all of the data surveilr is
 managing. This single file can store everything from simple numbers to complex
 records that need to be tracked and queried.
 
-### **Tables are Like Excel Worksheets**
+### Tables are Like Excel Worksheets
 
 In Excel, you organize data into **worksheets**. Similarly, in the RSSD, the
 data is organized into **tables**. Just as a worksheet holds rows and columns of
 data, a table in the RSSD holds **rows of records and columns of fields**. For
 example, you might have one table for customer data, another for transactions,
 and yet another for logs.
 
-### **SQL is Like Excel Formulas**
+### SQL is Like Excel Formulas
 
 In Excel, you use **formulas** to manipulate your data. These formulas allow you
 to perform calculations, look up values, or summarize data across your
@@ -63,15 +63,15 @@ Just like in Excel, where you can create simple to complex formulas depending on
 your needs, the RSSD allows you to extract insights from your data using
 flexible SQL queries that work across different tables of information.
 
-### **Flexibility and Power**
+### Flexibility and Power
 
 Just as Excel gives you the ability to manipulate, organize, and analyze your
 data in many different ways, the RSSD allows you to do all of this too—only it
 uses SQL, which is more powerful when working with large datasets. For example,
 while Excel might slow down with very large workbooks, the RSSD, thanks to
 SQLite, can handle **millions of records** without breaking a sweat.
 
-### **Portable and Self-Contained**
+### Portable and Self-Contained
 
 In the same way that you can take an Excel file and send it to someone else (and
 they’ll have access to all the worksheets and data), the RSSD is a
@@ -80,7 +80,7 @@ with all its tables and data, simply by copying the RSSD file to another
 location. There’s no need for a complex setup or configuration—just open it and
 start working with the data.
 
-### How RSSD Works as a SQLite Database
+## How RSSD Works as a SQLite Database
 
 The **Resource Surveillance State Database (RSSD)** leverages **SQLite**, a
 fully-featured relational database that is known for being:
@@ -103,7 +103,7 @@ fully-featured relational database that is known for being:
   can move the entire state of your data from one environment to another by
   copying a single file.
 
-### Why SQLite?
+## Why SQLite?
 
 1. **Local-First Processing**: SQLite's small footprint and self-contained
    nature make it an ideal choice for **local-first** and **edge-based** data
@@ -122,38 +122,38 @@ flexibility and performance in a portable, easy-to-manage format.
 
 ## Why RSSD makes Data Integration easier for those without IT departments
 
-### **No Server Setup Required**
+### No Server Setup Required
 
 One of the biggest advantages of using **SQLite** for the RSSD is that there’s
 no need for a dedicated database server. Everything happens locally within a
 single file. This simplifies setup, reduces costs, and minimizes dependencies on
 external infrastructure, which is especially beneficial for smaller
 organizations that may not have large IT departments.
 
-### **Fast and Lightweight**
+### Fast and Lightweight
 
 Because the RSSD is built on **SQLite**, it’s designed to be **fast and
 lightweight**. This is critical for local-first operations, where data needs to
 be processed efficiently on edge devices or local machines before being
 synchronized with a central system. Despite being lightweight, the RSSD can
 handle a high volume of data with **excellent performance**.
 
-### **SQL for All Data Operations**
+### SQL for All Data Operations
 
 By standardizing all data operations with **SQL**, the RSSD makes it easy for
 non-technical users who are familiar with SQL (or even just comfortable with
 Excel formulas) to work with the data. SQL is a widely-known language that
 allows users to run **queries**, **generate reports**, and **analyze data**
 without needing to learn a new, proprietary system.
 
-### **Reliability and Durability**
+### Reliability and Durability
 
 The RSSD ensures data consistency through its **ACID-compliant** transactions,
 meaning you can trust that your data is safe, even during system failures. Every
 change made to the RSSD is guaranteed to be completed fully or not at all, so
 you never end up with incomplete or corrupted data.
 
-### **Portable and Easy to Backup**
+### Portable and Easy to Backup
 
 Because the entire database is stored as a single file, **backing up** and
 **restoring** data is as simple as copying the RSSD file. This simplicity makes
@@ -171,4 +171,4 @@ For non-technical users, it’s as easy to understand as working with an Excel
 workbook. For technical users, the flexibility and power of SQL make it a robust
 solution for handling complex data operations. The RSSD delivers the
 performance, reliability, and simplicity that modern organizations need to
-thrive in today’s data-centric landscape.
+thrive in today’s data-centric landscape.
diff --git a/src/content/blog/en/sql-based-etl-elt.md b/src/content/blog/en/sql-based-etl-elt.md
@@ -15,13 +15,13 @@ If you're an SQL engineer trying to learn the ropes of data engineering, you mig
 
 Our specific use case will involve aggregating patient remote monitoring data from various devices into a single unified view for Continuous Glucose Monitoring (CGM) tracings. We'll break down each step in a way that's approachable and practical, giving you the tools to work with real data while keeping the infrastructure lightweight.
 
-### **Background on ELT and Why We Use It**
+## Background on ELT and Why We Use It
 
 The classic ETL strategy involves transforming data before loading it into your storage system, which typically requires more complex workflows, external tools, and a lot of up-front work. In contrast, ELT lets you extract the data as-is, load it into your database, and then transform it *in place*, often with SQL views, which makes it great for exploratory work or environments with less infrastructure.
 
 SQLite is a great fit here because it's lightweight, widely supported, and doesn't require complex setup—perfect for small to medium datasets or rapid prototyping.
 
-### **Setting the Scene: Ingesting the Data**
+## Setting the Scene: Ingesting the Data
 
 Imagine we have data from multiple devices—perhaps CGMs, smartwatches, and other monitoring devices—that all capture remote patient monitoring data. After ingesting these data sources, we end up with tables like `table_1`, `table_2`, `table_3`, and `table_4` in our SQLite database. Each table represents a different device and has different columns, even though they all describe patient data for similar remote monitoring purposes.
 
@@ -32,7 +32,7 @@ For example:
 - **table_3** has columns like `pat_id`, `recorded_at`, `glucose_reading`, `sensor`
 - **table_4** has columns like `identifier`, `time_taken`, `patient_ref`, `sugar_level`
 
-### **The Challenge: Creating a Unified View**
+## The Challenge: Creating a Unified View
 
 We need to create a single unified view called `patient_rpm_mode_cgm` that gives us all CGM tracings in a common format. Since ELT focuses on transforming data in place, we will write SQL to transform and union the data from these disparate tables. Our ultimate goal is to create a view that presents common column names—let's standardize them to:
 
@@ -42,7 +42,7 @@ We need to create a single unified view called `patient_rpm_mode_cgm` that gives
 - `device_type` (a new column that does not exist in the physical tables)
 - `source` (a new column to indicate the origin table)
 
-### **Step 1: Understanding the Source Tables**
+### Step 1: Understanding the Source Tables
 
 The first step in transforming this data is to understand how each source table maps to our target columns. To standardize the columns:
 
@@ -53,7 +53,7 @@ The first step in transforming this data is to understand how each source table
 | table_3      | `pat_id`, `recorded_at`, `glucose_reading` | `patient_id`, `timestamp`, `glucose_level`, 'CGM' AS `device_type`, 'table_3' AS `source` |
 | table_4      | `patient_ref`, `time_taken`, `sugar_level` | `patient_id`, `timestamp`, `glucose_level`, 'CGM' AS `device_type`, 'table_4' AS `source` |
 
-### **Step 2: Writing the Transformation Queries**
+### Step 2: Writing the Transformation Queries
 
 We need to write queries that extract the relevant fields from each table, aliasing the columns to standardize their names, and adding new columns as needed.
 
@@ -83,7 +83,7 @@ FROM
     table_3;
 ```
 
-### **Step 3: Combining the Queries with UNION**
+### Step 3: Combining the Queries with UNION
 
 Next, we need to combine these transformed queries using `UNION ALL`. Using `UNION ALL` is appropriate here because it ensures we retain all records, even if they have duplicate values (which may be necessary for auditing or detailed analysis).
 
@@ -143,11 +143,11 @@ SELECT
     'synthetic' AS source;
 ```
 
-### **Step 4: Adding More Transformations with Views**
+### Step 4: Adding More Transformations with Views
 
 One of the key benefits of ELT using views is the ability to easily add more transformations without altering the raw data or writing complex ETL pipelines. Here are some additional common transformations that are better handled through views:
 
-#### **1. Standardizing Data Formats**
+#### 1. Standardizing Data Formats
 
 In many cases, different tables may store data in different formats. For example, timestamps might be stored in different formats or time zones. Using a view, you can standardize these formats:
 
@@ -165,7 +165,7 @@ FROM
 
 This view ensures that all timestamps are in the same format, making downstream analysis much easier.
 
-#### **2. Filtering and Cleaning Data**
+#### 2. Filtering and Cleaning Data
 
 You may want to exclude certain rows from analysis, such as rows with missing or invalid data. Views are a great way to create a “clean” dataset:
 
@@ -185,7 +185,7 @@ WHERE
 
 This view filters out any rows where `glucose_level` is 0 or negative, which may represent invalid data.
 
-#### **3. Aggregating Data**
+#### 3. Aggregating Data
 
 You can also use views to create aggregate data that can be used for reporting or analysis. For example, creating a view that provides daily average glucose levels for each patient:
 
@@ -203,7 +203,7 @@ GROUP BY
 
 This aggregated view makes it easy to analyze trends over time without needing to write aggregation queries repeatedly.
 
-#### **4. Creating Derived Metrics**
+#### 4. Creating Derived Metrics
 
 If you need to create new metrics based on existing columns, views are a great way to handle this. For example, you might want to create a derived metric called `glucose_category` to categorize glucose levels:
 
@@ -227,7 +227,7 @@ FROM
 
 This view adds a new column that categorizes glucose levels into 'Low', 'Normal', or 'High'.
 
-### **Step 5: Validating the Unified View**
+### Step 5: Validating the Unified View
 
 After creating the view, it’s always good practice to validate the results. You can use a `SELECT` query to make sure everything looks right:
 
@@ -237,7 +237,7 @@ SELECT * FROM patient_rpm_mode_cgm LIMIT 10;
 
 Review the data to ensure that the column names are standardized and that the values align as expected. Pay particular attention to the `timestamp` column to ensure formats are consistent.
 
-### **Step 6: Leveraging the View for Downstream Analysis**
+### Step 6: Leveraging the View for Downstream Analysis
 
 With the `patient_rpm_mode_cgm` view in place, downstream processes can now treat this data as a consistent and unified source. Analysts can run queries like:
 
@@ -253,20 +253,20 @@ GROUP BY
 
 This allows for seamless analysis without needing to worry about device-specific table structures.
 
-### **Why ELT with SQLite?**
+## Why ELT with SQLite?
 
 You might wonder why ELT is a good fit for this scenario. Here are some reasons:
 
 - **Flexibility**: ELT allows you to load data as-is and apply transformations later when you have a better understanding of the data.
 - **Simplicity**: SQLite is simple to set up, and using views means that transformations are written declaratively with SQL, which is easy for teams to understand and modify.
 - **Lightweight**: No heavyweight ETL tools are required, which makes this approach perfect for small datasets or prototyping.
 
-### **Conclusion**
+## Conclusion
 
 By using SQLite and SQL views, we've demonstrated a lightweight and modern approach to ETL (or, more precisely, ELT) that helps simplify the process of integrating data from multiple sources. This approach allows for greater flexibility, and by leveraging SQL views, we can keep the transformation logic declarative and transparent.
 
 Whether you're prototyping a new data pipeline, working with smaller datasets, or need a low-maintenance integration solution, this ELT strategy with SQLite is an excellent option. We hope this guide helps you get started on your journey towards modern data engineering!
 
-### **Next Steps**
+## Next Steps
 
 To deepen your understanding, try adding more transformations or aggregations to the `patient_rpm_mode_cgm` view. You could, for example, normalize the `timestamp` formats or add additional metadata to the view to help with analysis. Feel free to experiment and explore how SQLite's capabilities can further simplify your data engineering workflow.