DataUpsampler is a Python module that uses Generative Adversarial Networks (GANs) to synthesize new data samples for augmentation and upsampling. It is designed to help data scientists and machine learning practitioners address data scarcity and class imbalance in their datasets.
Key Features
- Versatile Data Handling
  - Supports both NumPy arrays and Pandas DataFrames
  - Handles numerical data of any dimensionality
  - Automatic data scaling and normalization
  - Preserves data distributions and relationships
- Flexible Architecture
  - Customizable generator and discriminator architectures
  - Configurable network depth and width
  - Adjustable latent space dimension
  - Multiple scaling options (StandardScaler or MinMaxScaler)
- Easy-to-Use Interface
  - Simple fit/generate API similar to scikit-learn
  - Intuitive parameter configuration
  - Progress monitoring during training
  - Comprehensive error handling and validation
- Production-Ready Features
  - Reproducible results with random state control
  - Training history tracking
  - Memory-efficient processing
  - Scalable to large datasets
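
The scaling options listed above (StandardScaler or MinMaxScaler) can be illustrated with scikit-learn directly. The following is a minimal sketch of the scale, generate, inverse-scale round trip that a GAN-based upsampler typically relies on. It does not use DataUpsampler itself: the `make_scaler` helper, the option names `"standard"` and `"minmax"`, the column names, and the uniform random stand-in for generator output are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def make_scaler(name):
    """Hypothetical helper: map an option name to a scikit-learn scaler."""
    if name == "standard":
        return StandardScaler()
    if name == "minmax":
        return MinMaxScaler()
    raise ValueError(f"unknown scaler: {name!r}")


# Seeded generator so the example is reproducible.
rng = np.random.default_rng(42)

# Toy numerical dataset (illustrative column names).
df = pd.DataFrame({
    "height_cm": rng.normal(170.0, 10.0, size=200),
    "weight_kg": rng.normal(70.0, 15.0, size=200),
})

# Scale the real data before it would be passed to a GAN.
scaler = make_scaler("minmax")
scaled = scaler.fit_transform(df.to_numpy())  # each column now lies in [0, 1]

# Stand-in for generator output: NOT a GAN, just random values in scaled space.
fake_scaled = rng.uniform(0.0, 1.0, size=(50, df.shape[1]))

# Map the synthetic samples back to the original units and column names.
synthetic = pd.DataFrame(scaler.inverse_transform(fake_scaled), columns=df.columns)
```

With MinMaxScaler, synthetic values mapped back this way stay within the observed range of each column; with StandardScaler they are not bounded, which may matter when a feature has hard physical limits.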