PUBLIC VERSION: Testing solution for BQ GDPR anonymization use case.
IMPORTANT: This is a public version of the project. Feature files and SQL templates were anonymized. Also, an API connection to BigQuery is not possible. The rest of the codebase is intact.
This project implements a testing solution that uses the python-behave framework to test whether ID fields in the tables of BQ datasets were anonymized successfully.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
What you need to install the software, and how to install it.
- Python 3.6+ with these external packages:
- behave
- allure-behave
- pandas
- openpyxl
- tqdm
- pyhamcrest
- google-cloud-bigquery
- protobuf
- linux (Ubuntu)/Win10 OS
- allure reporting tool
- on Win10 install using scoop
- on Ubuntu/linux install using linuxbrew
- access to tested BQ data project
- access to the BQ API, set up and with the proper roles assigned
- access to this repository
The Google and protobuf packages had to be listed in the setup.py file to ensure that the BQ API library package works properly.
- Install Python (refer to the documentation for how to do that on your OS)
- open your command line tool of choice and go to the directory where you want to clone the project from GitHub
- clone this repo
- run "python3 setup.py install" on Ubuntu, or "py setup.py install" on Win10. On Win10, the "pandas" package will not be installed automatically, so you have to install it manually. See the comment in the setup.py file for the link. Download the package and run the command pip install [path to package]/packagefile
- In the console, go to the root folder of the project
- run the command "behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results ./test/features" on Ubuntu, or "behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results .\test\features" on Win10
- wait until the tests are finished
- failed tests have their BQ data saved in an XLSX file with a timestamped name in the ./reports folder.
- you can also display an interactive HTML report. To do this, run the "allure serve" command in your console and the report will open in your default browser. It should be Firefox or Chrome.
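The behave command above differs between operating systems only in the path separator. As a minimal sketch, the same command could be assembled in Python so that the features path is correct on both Ubuntu and Win10. The helper names `build_behave_command` and `run_behave` are hypothetical and are not taken from this repository.

```python
import os
import subprocess


def build_behave_command(features_dir=os.path.join("test", "features"),
                         results_dir="allure-results"):
    """Return the behave command from the steps above as an argument list.

    Hypothetical helper for illustration only. os.path.join inserts the
    path separator that matches the current operating system.
    """
    return [
        "behave",
        "-f", "allure_behave.formatter:AllureFormatter",
        "-f", "pretty",
        "-o", results_dir,
        os.path.join(".", features_dir),
    ]


def run_behave():
    """Run behave from the project root; requires behave and allure-behave."""
    # check=False: failing tests give a non-zero exit code, which is expected.
    return subprocess.run(build_behave_command(), check=False).returncode
```

Calling `run_behave()` from the root folder of the project would then start the same test run as the console command.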
All datasets are divided into 5 feature files, with a few exceptions. You can run them all as described above or, if needed, let the project pick one feature file pseudo-randomly.
To do that, run the "python3 manage.py -r" command in the console (use "py" instead of "python3" on Windows).
This will pick one of the tags stored in the list in the "functions.py" file and then run the behave test framework as usual, but only the feature file marked with this tag will actually be run.
This process can be repeated as long as there are tags that have not been picked yet. Once all tags are "exhausted", a ValueError exception is caught, and you have to manually clear the "config.json" file.
To do that, use the utility "py manage.py -c".
You can also run the utility with both parameters at once, so that next time the pseudo-random function will be able to choose from the full set of tags again. In this case, run the command like this: "py manage.py -r -c".
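The selection logic described above can be sketched as follows. This is only an illustration under stated assumptions: the tag names, the `used_tags` key in the JSON file, and the function names are hypothetical, and the real implementation in "functions.py" and "manage.py" may differ.

```python
import json
import random
from pathlib import Path

# Hypothetical tag list; the real one is stored in "functions.py".
TAGS = ["@feature1", "@feature2", "@feature3", "@feature4", "@feature5"]


def load_used_tags(config_path):
    """Return the tags already picked, or [] if the file is missing or empty."""
    path = Path(config_path)
    if not path.exists() or not path.read_text().strip():
        return []
    # Assumed layout: {"used_tags": [...]}
    return json.loads(path.read_text()).get("used_tags", [])


def pick_tag(config_path, tags=TAGS):
    """Pick a tag that was not used yet and record it in the config file.

    Raises ValueError when every tag has already been picked.
    """
    used = load_used_tags(config_path)
    remaining = [tag for tag in tags if tag not in used]
    if not remaining:
        raise ValueError("all tags exhausted; clear the config file first")
    tag = random.choice(remaining)
    Path(config_path).write_text(json.dumps({"used_tags": used + [tag]}))
    return tag


def clear_used_tags(config_path):
    """Reset the config file so the full set of tags is available again."""
    Path(config_path).write_text(json.dumps({"used_tags": []}))
```

In this sketch, `pick_tag("config.json")` plays the role of the "-r" option and `clear_used_tags("config.json")` the role of the "-c" option.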
To make working with behave and the allure reporting tool easier and faster, since the console command can be quite long, you can use the manage.py utility to cover these scenarios:
- py manage.py -r will run one randomly picked feature file from all tagged feature files. This feature file will not be run again until config.json is cleared.
- py manage.py -c will clear the config.json file, which stores the tags of feature files that were already run randomly.
- py manage.py -b will run all feature files, just like the command "behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results .\test\features" would do.
- py manage.py -t "@tag1" -t "@tag2" etc. will run all feature files, or just some of their scenarios, that are tagged with the provided tags. Take care to wrap each tag in " "!
- py manage.py -h is always available by default and will display all available commands with short descriptions.
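A command-line interface with the options listed above could be declared with the standard `argparse` module, roughly as sketched below. Whether the real manage.py uses `argparse` is an assumption, and the `dest` names and help texts are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the manage.py options described above."""
    parser = argparse.ArgumentParser(
        prog="manage.py",
        description="Helper for running behave with allure reporting.",
    )
    # argparse adds -h/--help automatically.
    parser.add_argument("-r", action="store_true", dest="random_run",
                        help="run one randomly picked tagged feature file")
    parser.add_argument("-c", action="store_true", dest="clear",
                        help="clear the config.json file")
    parser.add_argument("-b", action="store_true", dest="run_all",
                        help="run all feature files")
    # action="append" lets -t be repeated to collect several tags.
    parser.add_argument("-t", action="append", dest="tags", default=[],
                        metavar="TAG",
                        help="run only features or scenarios with this tag")
    return parser
```

For example, `build_parser().parse_args(["-t", "@tag1", "-t", "@tag2"])` collects both tags in `args.tags`, and `parse_args(["-r", "-c"])` sets both the random-run and clear flags.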
- @bednaJedna - Idea & work