This project repository contains testing datasets and tools to compare WAF efficacy in the two most important categories:
- Security Coverage (True Positive Rate) - measures the WAF's ability to correctly identify and block malicious requests, which is crucial in today's threat landscape. A WAF must preemptively block zero-day attacks as well as effectively handle the known attack techniques used by attackers.
- Precision (False Positive Rate) – measures the WAF's ability to correctly allow legitimate requests. Any hindrance to these valid requests could lead to significant business disruption and an increased workload for administrators.
This project aims to measure the efficacy of each WAF against a variety of legitimate and malicious HTTP requests, taken from real-world scenarios.
The project is described in detail in this blog.
Follow the steps below to set up and run the tool:
Install the necessary Python requirements:
pip install -r requirements.txt
Copy the configuration template file by running the following command in the project's root directory:
cp config_template.py config.py
This command creates a copy of the config_template.py file and renames it to config.py.
Once your WAF environments are properly set up, it's time to configure the testing tool.
Open the config.py file in a text editor. Here, you'll find placeholders for WAF names and their corresponding URLs. Replace these with your specific details.
Once you've input all necessary data, save and close the config.py file. Your tool is now customized for your WAF systems and ready for testing.
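As a rough illustration, a filled-in mapping of WAF names to URLs might look like the sketch below. The variable name `WAFS`, the WAF names, the URLs, and the `validate_wafs` helper are all hypothetical and exist only for this example; use whatever names config_template.py actually defines.

```python
# Hypothetical example only: the real variable names live in config_template.py.
WAFS = {
    "ExampleWAF-A": "http://waf-a.example.com",
    "ExampleWAF-B": "https://waf-b.example.com",
}


def validate_wafs(wafs):
    """Return a list of human-readable problems found in a name -> URL mapping."""
    problems = []
    for name, url in wafs.items():
        if not name.strip():
            problems.append("empty WAF name")
        if not url.startswith(("http://", "https://")):
            problems.append(f"{name}: URL must start with http:// or https://")
    return problems


if __name__ == "__main__":
    # An empty list means every entry looks plausible.
    print(validate_wafs(WAFS))
```

A quick sanity check like this can catch a leftover placeholder before a long test run starts.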
This nginx configuration works both when nginx is an upstream application before third-party WAFs like Cloudflare and when WAFs are integrated directly on top of nginx, such as Check Point CloudGuard WAF or F5 App Protect.
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    client_max_body_size 1G;
    server_name _;
    location / {
        proxy_set_header Host $host;
        proxy_pass http://localhost:8080;
    }
}
server {
    listen 8080 default_server;
    client_max_body_size 1G;
    location / {
        add_header Content-Type text/plain;
        return 200 "OK";
    }
}
Execute the main runner file by running the following command:
python3 runner.py
This command starts the tool, which sends the HTTP requests to each configured WAF, logs the responses, and calculates the performance metrics.
If you are running the script in a Linux environment, you can use tmux to keep the tool running even after detaching from the console.
Follow the steps below:
Enter the tmux terminal by running the command:
tmux
Run the tool within the tmux session:
python3 runner.py
To detach from the tmux console, press and hold Ctrl + b, release the keys, and then type d.
CTRL+b d
This detaches your current tmux session, leaving the tool running in the background.
To re-enter the tmux terminal later, first list the running sessions:
tmux ls
Select the relevant terminal number from the list and run:
tmux attach-session -t <TERMINAL_NUMBER>
This command attaches you back to the tmux session where the tool is running.
Each WAF solution is tested against two data sets: legitimate and malicious. We then use a formula, described below in detail, to produce a single balanced score.
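To illustrate how two rates can be combined into one balanced score, the sketch below assumes the score is balanced accuracy, that is, the mean of the true positive rate and the true negative rate. This is an assumption made for illustration and may not match the project's exact formula; the function name and the counts in the usage line are made up.

```python
def balanced_accuracy(malicious_blocked, malicious_total, legit_allowed, legit_total):
    """Mean of true positive rate and true negative rate (assumed scoring formula)."""
    if malicious_total <= 0 or legit_total <= 0:
        raise ValueError("both data sets must contain at least one request")
    true_positive_rate = malicious_blocked / malicious_total  # security coverage
    true_negative_rate = legit_allowed / legit_total          # 1 - false positive rate
    return (true_positive_rate + true_negative_rate) / 2


if __name__ == "__main__":
    # Made-up counts: 90 of 100 attacks blocked, 80 of 100 legitimate requests allowed.
    print(round(balanced_accuracy(90, 100, 80, 100), 4))
```

Averaging the two rates keeps a WAF from scoring well by blocking everything or by allowing everything, even though the two data sets differ greatly in size.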
The Legitimate Requests Dataset is carefully designed to test WAF behavior in real-world scenarios. To achieve this, it includes 973,964 different HTTP requests from 185 real web sites in 12 categories. Each dataset was recorded by browsing real-world web sites and performing various operations on the site (for example, signing up, selecting products, and placing them in a cart), ensuring that 100% of the requests are legitimate.
The dataset can be found in the folder Data/Legitimate
The Malicious Requests Dataset includes 73,924 malicious payloads from a broad spectrum of commonly experienced attack vectors:
- SQL Injection
- Cross-Site Scripting (XSS)
- XML External Entity (XXE)
- Path Traversal
- Command Execution
- Log4Shell
- Shellshock
The malicious payloads were sourced from the WAF Payload Collection GitHub page that was assembled by mgm security partners GmbH from Germany. This repository serves as a valuable resource, providing payloads specifically created for testing Web Application Firewall rules.
The dataset is available here
To run the data sets through the different devices under test, we developed a simple test tool in Python. The test tool ingests the data sets as input and sends each request to each of the WAFs being tested. It reads the data files from the data sets and uses the requests module in a multi-threaded manner to send the data to each WAF.
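The multi-threaded sending described above could be sketched roughly as follows. This is not the code in runner.py: the request record layout (a dict with `method`, `path`, and optional `headers` and `body`), the function names, and the worker count are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def send_with_requests(waf_url, request):
    """Send one request through a WAF and return the HTTP status code."""
    import requests  # third-party; installed via requirements.txt

    response = requests.request(
        method=request["method"],
        url=waf_url.rstrip("/") + request["path"],
        headers=request.get("headers"),
        data=request.get("body"),
        timeout=10,
    )
    return response.status_code


def run_dataset(wafs, dataset, send=send_with_requests, max_workers=8):
    """Send every request in `dataset` to every WAF and return one row per send."""
    jobs = [(name, url, req) for name, url in wafs.items() for req in dataset]

    def work(job):
        name, url, req = job
        try:
            status = send(url, req)
        except Exception:
            status = None  # connection error or timeout; recorded, not fatal
        return {"waf": name, "path": req["path"], "status": status}

    # pool.map preserves the order of `jobs` in the returned results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(work, jobs))
```

Passing the sender in as a parameter keeps the threading logic testable without network access, for example `run_dataset(wafs, dataset, send=lambda url, req: 200)`.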
During the initial phase, the tool conducts a dual-layer health check for each WAF. This process first validates connectivity to each WAF, ensuring system communication. It then checks that each WAF is set to prevention mode, confirming its ability to actively block malicious requests.
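A dual-layer check of this kind might look like the sketch below. It assumes that a healthy backend answers a benign request with status 200 and that a WAF in prevention mode answers an obviously malicious request with 403; real WAFs may signal a block differently, so `blocked_statuses` is a parameter. The function name, the probe payload, and the result strings are hypothetical.

```python
from urllib.parse import quote


def health_check(waf_url, fetch, blocked_statuses=(403,)):
    """Run a two-step check; `fetch(url)` must return an HTTP status code."""
    base = waf_url.rstrip("/")

    # Layer 1: connectivity. A benign request should reach the backend.
    try:
        benign_status = fetch(base + "/")
    except Exception:
        return "unreachable"
    if benign_status != 200:
        return "unexpected_status"

    # Layer 2: prevention mode. A classic SQL injection probe should be blocked.
    probe = base + "/?id=" + quote("' OR 1=1--")
    if fetch(probe) in blocked_statuses:
        return "ok"
    return "not_blocking"
```

With the requests module, `fetch` could be as small as `lambda url: requests.get(url, timeout=10).status_code`.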
The response to each request sent by the test tool to the WAFs is systematically logged in a dedicated database for further analysis. The database we used is an AWS RDS instance running PostgreSQL (the database is not included in this repo). You can configure the tool to work with any SQL database of your preference by adjusting the settings in the config.py file.
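To show the general idea of logging responses for later analysis, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name, columns, and helper functions are assumptions made for this example and do not reflect the schema the tool actually uses.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open (or create) a results database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               waf     TEXT    NOT NULL,
               dataset TEXT    NOT NULL,  -- 'legitimate' or 'malicious'
               status  INTEGER,           -- HTTP status code, NULL on error
               blocked INTEGER NOT NULL   -- 1 if the WAF blocked the request
           )"""
    )
    return conn


def log_result(conn, waf, dataset, status, blocked):
    """Store the outcome of one request."""
    conn.execute(
        "INSERT INTO results (waf, dataset, status, blocked) VALUES (?, ?, ?, ?)",
        (waf, dataset, status, int(blocked)),
    )
    conn.commit()


def blocked_rate(conn, waf, dataset):
    """Fraction of logged requests that were blocked, or None if none were logged."""
    row = conn.execute(
        "SELECT AVG(blocked) FROM results WHERE waf = ? AND dataset = ?",
        (waf, dataset),
    ).fetchone()
    return row[0]
```

On the malicious data set this rate corresponds to the true positive rate; on the legitimate data set it corresponds to the false positive rate.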
The main file for running the tests is runner.py. This script will send the HTTP requests, log the responses, and calculate the performance metrics for each WAF.
Note: You may need to adjust the settings in the config.py file to suit your specific testing environment.
The Legitimate Requests Dataset and the Tooling are available under the Apache 2.0 license. The Malicious Requests Dataset is a collection of datasets assembled by MGM with different copyrights, mostly under MIT.
The data sets used for this project are available via GitHub and will be updated annually.
For an in-depth discussion and analysis of the results, see our WAF Comparison Blog Post.
For any questions or concerns, please open an issue in this GitHub repository.