Home
Ronan Stokes edited this page Oct 21, 2022 · 17 revisions
Welcome to the data-generator wiki!
The Databricks data generator (dbldatagen) is available as a PyPI package at https://pypi.org/project/dbldatagen/.
Release steps:
- soft release (with docs hosted on GitHub Pages)
- package release (with docs hosted on GitHub Pages and the data generator available as a package)
Current release feature set:
- Data generation with support for generation of data conforming to statistical distributions
- Faker integration via plugin mechanism
- Support for generation of streaming data
- Support for generation of multi-table data with consistency between primary and foreign keys
- Support for generation of CDC-style data
- Support for generation of IoT-style data
- Support for generation of streaming data both in the Databricks classic notebook environment and in Delta Live Tables pipelines
For full details, see the Databricks Data Generator online documentation.
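As a rough illustration of the basic data generation feature listed above, the following is a minimal sketch of a data specification. It assumes `dbldatagen` and `pyspark` are installed and that a Spark session is available; the function name `build_sample_spec`, the dataset name, and the column names and values are hypothetical and chosen only for this example.

```python
def build_sample_spec(spark, rows=1000):
    """Return a dbldatagen data specification for a small synthetic table.

    Assumptions (not from the wiki): the dataset name, the column names,
    and the value ranges are illustrative only.
    """
    # Imported inside the function so the module loads even where
    # dbldatagen / pyspark are not installed.
    import dbldatagen as dg
    from pyspark.sql.types import IntegerType, StringType

    spec = (
        dg.DataGenerator(spark, name="sample_devices", rows=rows, partitions=4)
        # Include the generated seed column ("id") in the output.
        .withIdOutput()
        # Integer column drawn at random from the range 1..100.
        .withColumn("device_id", IntegerType(), minValue=1, maxValue=100, random=True)
        # String column picked at random from a fixed set of values.
        .withColumn("status", StringType(), values=["active", "inactive"], random=True)
    )
    return spec
```

In a Databricks notebook, where `spark` is already defined, `build_sample_spec(spark).build()` should return a Spark DataFrame with the columns described above. The online documentation remains the authoritative reference for the available options.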