Feature/presidio-structured #1192
changelog Static analysis docstrings, types preliminary tests engine static analysis isort Minor refactorings Update README.md Fix late binding issues and example removal of old samples Refactoring, adding example pre-clean-break-commit broken commit, fixing TabularConfigBuilder Rename TabularConfig pre-breaking replace commit removal of some old experimental files rename tabular to structured restructuring presidio tabular - pre del commit Add project TODOs testing dump presidio tabular
ccb469a to 8c6be26
Thanks @Jakob-98!
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Opening up for review. Items left from my side:
…8/presidio into feature/presidio-tabular
Hi, I did an initial review and added some comments.
optional AnalyzerEngine parameter
…8/presidio into feature/presidio-tabular
presidio-structured/presidio_structured/data/data_processors.py
Left some comments, looks great!
presidio-structured/presidio_structured/config/structured_analysis.py
presidio-structured/presidio_structured/data/data_processors.py
Agree with Sharon, this looks great! Added a few comments, all minor.
Happy 2024, and thanks for the review! Will address once some dev time frees up on my end :)
Thanks! Amazing work :)
Change Description
The proposed approach is to build a library (presidio-structured) that reuses logic from the existing Presidio components to anonymize (semi-)structured data. A priority is to keep the user experience and interface recognizable to users of the existing library components. This has been a much-requested feature; see for instance:
Supporting structured / semi-structured data with Presidio · microsoft/presidio · Discussion #714 (github.com)
The sample folder contains a notebook showcasing the logic to be supported in V1 of presidio-structured.
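To make the idea concrete for reviewers, here is a minimal, library-free sketch of column-level anonymization: each column is mapped to an entity type, and each entity type is mapped to an operator. Every name in it (`COLUMN_ENTITIES`, `OPERATORS`, `anonymize_rows`) is hypothetical and chosen for illustration only; it is not the presidio-structured API, and the notebook remains the reference for the actual interface.

```python
from typing import Any, Callable, Dict, List

# Hypothetical mapping from column name to PII entity type. In a real
# pipeline this could be inferred by an analyzer instead of hand-written.
COLUMN_ENTITIES: Dict[str, str] = {
    "name": "PERSON",
    "email": "EMAIL_ADDRESS",
}

# Hypothetical per-entity operators: each one replaces the value with a tag.
OPERATORS: Dict[str, Callable[[str], str]] = {
    "PERSON": lambda value: "<PERSON>",
    "EMAIL_ADDRESS": lambda value: "<EMAIL_ADDRESS>",
}


def anonymize_rows(
    rows: List[Dict[str, Any]],
    column_entities: Dict[str, str],
    operators: Dict[str, Callable[[str], str]],
) -> List[Dict[str, Any]]:
    """Return copies of the rows with mapped columns passed through their operator."""
    anonymized = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column, entity in column_entities.items():
            if column in new_row and entity in operators:
                new_row[column] = operators[entity](str(new_row[column]))
        anonymized.append(new_row)
    return anonymized


rows = [{"id": 1, "name": "Jane Doe", "email": "jane@example.com"}]
result = anonymize_rows(rows, COLUMN_ENTITIES, OPERATORS)
print(result)
```

Columns without an entity mapping (here `id`) pass through unchanged, which is one plausible default for structured data where only some fields carry PII.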
Issue reference
This PR fixes issue #714
Checklist