Feature: Add Email Generator (a new type of sdgx.data_processor) #184

Merged on Jun 21, 2024 (16 commits)

MooooCat (Contributor) commented Jun 5, 2024

Description

EmailGenerator is a subclass of PIIGenerator, a class designed to handle the conversion and reversal of personally identifiable information (PII) in a DataFrame.

The EmailGenerator class has three important methods: fit, convert, and reverse_convert:

  • The fit method fits the generator to the metadata, identifying which columns in the DataFrame contain email addresses.
  • The convert method removes the email columns from the DataFrame.
  • The reverse_convert method adds new email columns to the DataFrame.
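
The fit / convert / reverse_convert flow described above can be sketched roughly as follows. This is a dependency-free illustration, not the code from this PR: `SketchEmailGenerator` is a hypothetical name, the metadata is assumed to be a plain dict mapping column names to type tags, a dict of column lists stands in for the DataFrame, and the replacement addresses use the reserved `example.com` domain.

```python
import random
import string


class SketchEmailGenerator:
    """Hypothetical stand-in for EmailGenerator; not the sdgx API."""

    def __init__(self, seed=None):
        self.email_columns = []
        self._rng = random.Random(seed)

    def fit(self, metadata):
        # Assumption: metadata maps column name -> type tag, e.g. "email".
        self.email_columns = [c for c, t in metadata.items() if t == "email"]

    def convert(self, table):
        # Drop the email columns; returns a new table, input is untouched.
        return {c: v for c, v in table.items() if c not in self.email_columns}

    def reverse_convert(self, table):
        # Add one freshly generated address per row for each email column.
        n_rows = len(next(iter(table.values()), []))
        out = dict(table)
        for col in self.email_columns:
            out[col] = [self._fake_email() for _ in range(n_rows)]
        return out

    def _fake_email(self):
        user = "".join(self._rng.choices(string.ascii_lowercase, k=8))
        return f"{user}@example.com"


# Usage: a dict of column lists stands in for a DataFrame here.
metadata = {"name": "text", "email": "email"}
table = {"name": ["Ann", "Bob"], "email": ["ann@corp.test", "bob@corp.test"]}

gen = SketchEmailGenerator(seed=0)
gen.fit(metadata)
stripped = gen.convert(table)              # email column removed
restored = gen.reverse_convert(stripped)   # email column re-added, new values
```

In this sketch the original addresses never reach whatever runs between convert and reverse_convert, and the restored column has the same number of rows as the stripped table.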

Motivation and Context

This PR provides a dedicated way to handle email address columns in a DataFrame.

This is particularly useful for datasets that contain sensitive information, such as email addresses, and that need to be anonymized or de-identified.

How has this been tested?

EmailGenerator is covered by a set of test cases.

These tests check whether the fit method correctly identifies the email columns in the DataFrame, and whether the convert and reverse_convert methods handle those columns correctly.

Types of changes

  • Maintenance (no change in code, maintain the project's CI, docs, etc.)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@MooooCat MooooCat merged commit 14ad5e8 into main Jun 21, 2024
12 checks passed
@MooooCat MooooCat deleted the feature-email-generator branch June 21, 2024 09:21