A system for processing and monitoring data from 300,000 IoT sensors, built on Google Cloud Platform.
- 中文: README-zh.md
- 日本語: README-ja.md
- Near real-time data processing from 300,000 IoT sensors
- Sensor monitoring and offline alerts
- Batch data processing with configurable settings
- Error handling with Dead Letter Queue
- Automated infrastructure management with Terraform
- CI/CD with GitHub Actions
graph LR
subgraph "IoT Devices"
subgraph "Device 1"
S1[Temperature Sensor]
S2[Humidity Sensor]
S3[Voltage Sensor]
end
D2[Device 2]
D3[Device 3]
D4[Device N]
end
subgraph "Google Cloud Platform"
PS[Cloud Pub/Sub]
DLQ[Dead Letter Queue]
W[BigQuery Worker]
BQ[(BigQuery)]
end
S1 & S2 & S3 -->|Sensor Data| PS
D2 & D3 & D4 -->|Sensor Data| PS
PS -->|Messages| W
W -->|Failed Messages| DLQ
W -->|Batch Insert| BQ
The project creates and uses the following GCP resources:
- BigQuery Dataset & Table
  - Dataset: sensor_data
  - Table: sensor_logs (partitioned by day, clustered by device_id and sensor_id), which stores processed sensor data
- Cloud Pub/Sub
  - Topic: sensor-logs-topic
  - Subscription: sensor-logs-sub-01
  - Dead Letter Queue: sensor-logs-dlq
  - Receives real-time sensor data
- Service Account
  - Used for client applications
  - Includes necessary IAM permissions for Pub/Sub and BigQuery
- Infrastructure (terraform/)
  - GCP resources managed with Terraform
  - BigQuery, Pub/Sub, and IAM configurations
- IoT Client (apps/iot-client/)
  - Simulates multiple IoT devices
  - Configurable device count and transmission frequency
- BigQuery Worker (apps/bigquery-worker/)
  - Processes messages from Pub/Sub
  - Batch inserts into BigQuery
  - Handles errors with Dead Letter Queue
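
The worker's batch-and-dead-letter flow can be illustrated with a minimal sketch. This is not the code in apps/bigquery-worker/: `parseMessage`, `createBatchBuffer`, and the `insert` and `deadLetter` callbacks are hypothetical names, and the row shape is assumed from the sensor_logs columns. How the real worker routes failures to sensor-logs-dlq is not shown in this README.

```typescript
// Row shape assumed from the sensor_logs table columns.
interface SensorLogRow {
  device_id: string;
  sensor_id: string;
  timestamp: string; // ISO 8601 string
  temperature?: number;
  humidity?: number;
  voltage?: number;
  error_code?: string;
  status?: string;
}

type ParseResult =
  | { kind: "row"; row: SensorLogRow }
  | { kind: "error"; reason: string };

// Parse one raw message payload. Malformed input is reported, never thrown.
function parseMessage(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { kind: "error", reason: "invalid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { kind: "error", reason: "payload is not an object" };
  }
  const d = data as Record<string, unknown>;
  if (typeof d.device_id !== "string" || typeof d.sensor_id !== "string") {
    return { kind: "error", reason: "missing device_id or sensor_id" };
  }
  if (typeof d.timestamp !== "string" || Number.isNaN(Date.parse(d.timestamp))) {
    return { kind: "error", reason: "missing or invalid timestamp" };
  }
  return { kind: "row", row: d as unknown as SensorLogRow };
}

// Hypothetical helper: buffers valid rows and calls `insert` once per full
// batch; invalid payloads go to `deadLetter` instead of blocking the batch.
function createBatchBuffer(
  batchSize: number,
  insert: (rows: SensorLogRow[]) => void,
  deadLetter: (raw: string, reason: string) => void,
) {
  let rows: SensorLogRow[] = [];

  function flush(): void {
    if (rows.length === 0) return;
    const batch = rows;
    rows = []; // reset before inserting so a slow insert never sees new rows
    insert(batch);
  }

  function handle(raw: string): void {
    const result = parseMessage(raw);
    if (result.kind === "error") {
      deadLetter(raw, result.reason);
      return;
    }
    rows.push(result.row);
    if (rows.length >= batchSize) flush();
  }

  return { handle, flush };
}
```

In a real worker, `insert` would wrap the BigQuery client call and `flush` would probably also run on a timer, so that a partially filled batch is not held indefinitely when traffic is low.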
- Terraform >= 1.0
- Google Cloud SDK
- Bun >= 1.2.2
- TypeScript >= 5.0.0
- Install prerequisites
- Clone and enter the project:

      git clone https://github.com/ThaddeusJiang/sensor_logs.git
      cd sensor_logs

- Configure GCP authentication:

      gcloud auth application-default login

- Initialize and deploy:

      cd terraform
      terraform init
      terraform plan
      terraform apply

- Initialize sensors (from the repository root):

      bun run apps/bigquery-worker/src/scripts/init-sensors.ts

- Run IoT Client (see details in apps/iot-client)
- Run BigQuery Worker (see details in apps/bigquery-worker)
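
For orientation before running the IoT Client, the kind of data a simulated device might emit can be sketched as follows. This is an illustration under assumptions, not the code in apps/iot-client: `simulateTick`, the ID format (`device-1`, `device-1-temperature`), the value ranges, and the choice of one reading per sensor are all hypothetical. Only the field names come from the sensor_logs schema.

```typescript
type SensorKind = "temperature" | "humidity" | "voltage";

// Field names follow the sensor_logs columns; everything else is assumed.
interface Reading {
  device_id: string;
  sensor_id: string;
  timestamp: string; // ISO 8601 string
  temperature?: number;
  humidity?: number;
  voltage?: number;
}

// Hypothetical value ranges per sensor kind: [min, max].
const SENSOR_RANGES: Record<SensorKind, [number, number]> = {
  temperature: [-10, 40],
  humidity: [0, 100],
  voltage: [3.0, 5.0],
};

// Produce one reading per sensor for each simulated device.
// `random` is injectable so that runs can be made deterministic.
function simulateTick(
  deviceCount: number,
  now: Date,
  random: () => number = Math.random,
): Reading[] {
  const readings: Reading[] = [];
  for (let d = 1; d <= deviceCount; d++) {
    const deviceId = `device-${d}`;
    for (const kind of Object.keys(SENSOR_RANGES) as SensorKind[]) {
      const [min, max] = SENSOR_RANGES[kind];
      const reading: Reading = {
        device_id: deviceId,
        sensor_id: `${deviceId}-${kind}`,
        timestamp: now.toISOString(),
      };
      // Only the column matching this sensor's kind is populated.
      reading[kind] = min + random() * (max - min);
      readings.push(reading);
    }
  }
  return readings;
}
```

A caller could invoke `simulateTick` on a timer to model the configurable transmission frequency and publish each reading as a JSON message to sensor-logs-topic.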
-- Processed sensor readings, partitioned by day and clustered by device and sensor
CREATE TABLE `sensor_data.sensor_logs` (
  `device_id` STRING,
  `sensor_id` STRING,
  `timestamp` TIMESTAMP,
  `temperature` FLOAT64,
  `humidity` FLOAT64,
  `voltage` FLOAT64,
  `error_code` STRING,
  `status` STRING
)
PARTITION BY DATE(`timestamp`)
CLUSTER BY `device_id`, `sensor_id`;

-- One row per sensor, with its current status
CREATE TABLE `sensor_data.sensors` (
  `device_id` STRING,
  `sensor_id` STRING,
  `created_at` TIMESTAMP,
  `updated_at` TIMESTAMP,
  `status` STRING
);
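
One way the two tables could support offline alerts is to compare each registered sensor's most recent log timestamp against a threshold. The sketch below assumes exactly that rule. `findOfflineSensors`, the `SensorLastSeen` shape, and the threshold are hypothetical, and the project's actual monitoring logic may differ.

```typescript
// Hypothetical shape: a sensor from the sensors table joined with the
// newest matching timestamp in sensor_logs (null if it never reported).
interface SensorLastSeen {
  device_id: string;
  sensor_id: string;
  lastSeen: Date | null;
}

// Return sensors that never reported, or whose last report is older
// than `thresholdMs` milliseconds relative to `now`.
function findOfflineSensors(
  sensors: SensorLastSeen[],
  now: Date,
  thresholdMs: number,
): SensorLastSeen[] {
  return sensors.filter(
    (s) => s.lastSeen === null || now.getTime() - s.lastSeen.getTime() > thresholdMs,
  );
}
```

Passing `now` as a parameter instead of reading the clock inside the function keeps the check deterministic and easy to test.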
To delete all created resources:
cd terraform
terraform destroy
Pull Requests are welcome! Please ensure:
- Code follows project standards
- Documentation is updated
- Tests are added as needed
MIT