CWAC is a project originally designed and developed by the Web Standards team within the Digital Public Service branch of Te Tari Taiwhenua | Department of Internal Affairs, New Zealand Government.
Note: "CWAC" is pronounced "quack", like a duck.
CWAC is a tool that can scan hundreds of websites for accessibility issues, automatically.
CWAC can be used as a mechanism to monitor the New Zealand Government's implementation of minimum accessibility standards and guidelines on its websites. The primary standard is the NZ Government Web Accessibility Standard, which includes Web Content Accessibility Guidelines (WCAG) 2.1 Level AA conformance. CWAC enables the partial fulfillment of Article 9 of the United Nations Convention on the Rights of Persons with Disabilities (CRPD).
Provided a list of URLs, CWAC automatically crawls a specified number of pages per URL, and checks each page for automatically-identifiable accessibility issues. It then stores results in an easy-to-read CSV file.
CWAC can also crawl an explicitly-defined set of URLs without a crawler, which is useful for re-running tests to see changes in accessibility conformance over time.
CWAC is designed to be extensible, so new forms of web testing can be added over time. For instance, CWAC could also run The Nu Html Checker on web pages. Or, it could theoretically check other website requirements, such as website data usage and performance, or the existence of a privacy or copyright statement.
CWAC combines the following technologies:
- Python 3 (the primary scripting language CWAC is written in)
- Selenium (used to control web browsers)
- axe-core (the accessibility rules engine)
- Chrome for Testing (main browser CWAC uses)
Ensure you have Python 3.12 installed on your system.
A Python virtual environment ensures that CWAC's dependencies aren't installed on your system's version of Python.
- In a terminal, cd to the cwac directory.
- Double-check you're using the right Python environment: type python --version and ensure it's Python 3.12.x
- Run python -m venv .venv
- Run source .venv/bin/activate
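The two checks above (Python version and active virtual environment) can also be made from Python itself. The following is a minimal sketch, not part of CWAC; the helper names are hypothetical. It relies on the standard behaviour that sys.prefix differs from sys.base_prefix inside a virtual environment.

```python
import sys


def is_expected_python(version_info=sys.version_info, major=3, minor=12):
    """Return True if the interpreter is Python 3.12.x."""
    return (version_info[0], version_info[1]) == (major, minor)


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True if running inside a venv (sys.prefix != sys.base_prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python 3.12.x:", is_expected_python())
    print("Inside a virtual environment:", in_virtual_environment())
```

If the second line prints False after activation, the shell is probably still using the system interpreter.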
- In a terminal, cd to the cwac directory.
- Run pip install -r requirements.txt to install all required Python libraries.
To set up CWAC for development, you must first install all required pre-commit hooks. This isn't necessary if you just want to run CWAC.
- Open a shell
- Run pre-commit autoupdate
- Run pre-commit install
A series of linters, security checks, and formatters will then run at every git commit.
To run the pre-commit hooks at any time, run:
pre-commit run --all-files
This is useful for debugging why a pre-commit hook failed.
CWAC has three dependencies managed by NPM. These are:
- axe-core
- Readability
- Chrome for Testing
To get NPM, go here: https://nodejs.org/en/download/
- In a shell, cd to the cwac root directory
- Run npm install
Chrome for Testing is a specific version of Chrome used for testing purposes.
An instance of Chrome for Testing should be in the cwac/chrome/ directory.
Depending on your OS/architecture, Chrome for Testing will have different folder names and executable paths.
For CWAC to work, you must specify the correct path to Chrome for Testing within CWAC's configuration files.
To do this:
- Look inside the cwac/chrome/ directory. Note the folder name for the version of Chrome for Testing that was downloaded, e.g. mac_arm-114.0.5735.90
- Open cwac/config/. For every config file in this directory, e.g. config_default.json, modify the value of chrome_binary_location so the correct binary path is specified. For example:
  - the value of chrome_binary_location for an ARM-based Mac could be: ./chrome/mac_arm-114.0.5735.90/chrome-mac-arm64/Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing
  - for Linux x64, the chrome_binary_location could be: ./chrome/linux-114.0.5735.90/chrome-linux64/chrome
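Because every config file needs a valid chrome_binary_location, a quick check can catch a stale path before a scan starts. The following is a minimal sketch, not part of CWAC; the function name is hypothetical, and it assumes relative paths are resolved against the cwac directory.

```python
import json
from pathlib import Path


def find_bad_chrome_paths(config_dir, cwac_root):
    """Return {config file name: chrome_binary_location} for paths that are not files.

    Relative paths are resolved against cwac_root (an assumption about how
    CWAC interprets them).
    """
    problems = {}
    for config_file in sorted(Path(config_dir).glob("*.json")):
        config = json.loads(config_file.read_text(encoding="utf-8"))
        location = config.get("chrome_binary_location")
        if location is None:
            continue  # this file does not set the field
        if not (Path(cwac_root) / location).is_file():
            problems[config_file.name] = location
    return problems


if __name__ == "__main__":
    # Intended to be run from the cwac directory.
    for name, location in find_bad_chrome_paths("config", ".").items():
        print(f"{name}: no file at {location}")
```

An empty result means every config file that sets chrome_binary_location points at an existing file.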
From time to time, it might make sense to update the version of Chrome for Testing that CWAC uses.
To do this:
- Visit Chrome for Testing - GitHub
- Open the API endpoint last-known-good-versions-with-downloads.json in a JSON viewer (Firefox has one built-in)
- Find the entry for the latest stable version of Chrome for Testing
- Download the chromedriver that matches the version of Chrome for Testing you want to use
- Place the chromedriver executable into the /drivers/ folder in cwac, with a unique filename
- Open package.json and change the Chrome for Testing version number, ensuring it matches the chromedriver version
- Run npm install to install the newly specified version of Chrome for Testing in package.json
- Modify the CWAC configuration file, e.g. /config/config_default.json, and ensure it specifies the correct chrome_binary_location and chrome_driver_location
- Note: the chromedriver executable may need to have chmod +x run on it in order to give it execution permissions
- macOS might come up with an error stating "“chromedriver_mac_arm64” can’t be opened because Apple cannot check it for malicious software.". This is fixed by running xattr -d com.apple.quarantine <name-of-executable>
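Instead of reading last-known-good-versions-with-downloads.json by eye, the stable chromedriver entry can be picked out with a few lines of Python. The following is a minimal sketch, not part of CWAC. The sample payload is trimmed and illustrative, its URLs are placeholders, and the shape it assumes (channels, then Stable, then downloads, then chromedriver) should be checked against the live JSON.

```python
import json

# Trimmed, illustrative payload. The URLs are placeholders, and the shape
# (channels -> Stable -> downloads -> chromedriver) is an assumption about
# the endpoint's format that should be checked against the live JSON.
SAMPLE = """
{
  "channels": {
    "Stable": {
      "version": "114.0.5735.90",
      "downloads": {
        "chromedriver": [
          {"platform": "linux64", "url": "https://example.invalid/chromedriver-linux64.zip"},
          {"platform": "mac-arm64", "url": "https://example.invalid/chromedriver-mac-arm64.zip"}
        ]
      }
    }
  }
}
"""


def stable_chromedriver(payload, platform):
    """Return (version, download URL) for the Stable channel on one platform."""
    stable = payload["channels"]["Stable"]
    for entry in stable["downloads"]["chromedriver"]:
        if entry["platform"] == platform:
            return stable["version"], entry["url"]
    raise LookupError(f"no chromedriver download listed for {platform!r}")


if __name__ == "__main__":
    version, url = stable_chromedriver(json.loads(SAMPLE), "linux64")
    print(version, url)
```

The returned version is the one that would need to match the Chrome for Testing version in package.json.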
On macOS, urllib may fail when attempting to fetch data for nltk, with certificate errors. To fix this, run /Applications/Python 3.x.x/Install Certificates.command, which should install the necessary certificates and resolve the error.
If you run CWAC and get the error "“chromedriver_mac_arm64” can’t be opened because Apple cannot check it for malicious software.":
Run: xattr -d com.apple.quarantine <name-of-executable>
CWAC uses JSON files within the ./config directory for its configuration. Before running a test, ensure that these files contain the settings that you want to use.
This file contains the configuration settings for new tests run using CWAC. The values in this file can be manually modified to change the way CWAC runs its tests.
Field descriptions:
audit_name
- a name for the test, which is used as a folder name to store results inside of /results
headless
- a boolean that specifies whether the browsers run headless (when true, browser windows are invisible)
max_links_per_domain
- the maximum number of pages that will be tested for each URL specified in base_urls_crawl_path
thread_count
- the number of browsers, and threads CWAC will use
- a number equal to the number of CPU cores is most efficient
browser
- specifies what web browser is used for tests
- can be either "chrome" or "firefox"
chrome_binary_location
- a valid path to a Chrome for Testing executable
chrome_driver_location
- a valid path to a chromedriver executable (version must match the version of Chrome for Testing)
user_agent
- the user agent string CWAC will use for all network requests
user_agent_product_token
- the product token used for robots.txt matching (should match the one in user_agent)
follow_robots_txt
- a bool, determines if robots.txt directives should be followed by CWAC
script_timeout
- the number of seconds before JavaScript execution will timeout
page_load_timeout
- the number of seconds before a page load will timeout
delay_between_page_loads
- delay amount (in seconds) that is used between visiting pages for each browser instance
- if a Chrome instance hangs on a webpage, try increasing this value
delay_between_viewports
- if multiple viewport resolutions are specified in viewport_sizes, this is a delay (in seconds) between testing each viewport. It's useful to ensure the browser has re-rendered the content at the new viewport size
delay_after_page_load
- once the browser has loaded the page, this is a delay (in seconds) before running tests. This allows any animations or JavaScript-based page changes to finish before tests run, since they may otherwise cause false positives, e.g. fade-in animations can cause colour contrast false positives.
only_allow_https
- a boolean value that determines if CWAC will restrict to HTTPS-only URLs (true: restrict to HTTPS)
perform_header_check
- a boolean value that determines if CWAC will send a header-only request to each URL before loading the URL in a browser. This can prevent Chrome from loading garbage URLs, but it also slows down the crawler, and increases network requests.
nocrawl_mode
- a boolean value that can be used to switch off the crawling function
- this causes CWAC to only scan explicitly-defined URLs within the CSV files located at base_urls_nocrawl_path
shuffle_base_urls
- before CWAC starts scanning websites, it will randomly shuffle the order of URLs it will scan
base_urls_crawl_path
- Defines which URLs will be scanned when nocrawl_mode is false
  - a path to a folder that contains CSV files (as many as you like). The CSV files must have the headers: organisation,url,sector.
  - The entries in base_urls_crawl_path are extremely important, as these files are used to associate URLs with other information like their organisation and sector
base_urls_nocrawl_path
- Defines which URLs will be scanned when nocrawl_mode is true
  - a path to a folder that contains CSV files (as many as you like). The CSV files must have only one header: url.
  - CWAC will take the URL and look it up within the base_urls_crawl_path CSVs to determine the URL's organisation and sector automatically; otherwise 'Unknown' will be specified and a warning is put in the scan's log.
filter_to_organisations
- a list of strings that can be used to restrict a CWAC scan to particular organisations. The organisations are specified in CSVs within the base_urls folder
  - e.g. ["Ministry of Social Development", "Department of Internal Affairs"]
  - partial string matches are included, e.g. "Internal" would match "Department of Internal Affairs"
filter_to_urls
- a list of strings of specific URLs to restrict a crawl to (these URLs must be specified within a CSV inside of base_urls_crawl_path or base_urls_nocrawl_path)
  - e.g. ["https://msd.govt.nz/", "https://dia.govt.nz"]
  - partial string matches are included, e.g. "dia.govt" matches "https://dia.govt.nz"
viewport_sizes
- a JSON object specifying all viewport sizes CWAC will test each page with
- Example:
{"small": {"width": 320, "height": 450}, "medium": {"width": 1280, "height": 450}}
audit_plugins
- a dictionary of plugin configurations.
- the key of each plugin entry must be a snake case file name that exists within ./src/audit_plugins/
- each plugin entry requires a camel case class_name as a value (equal to the class name in ./src/audit_plugins/[plugin_file].py)
- each plugin entry has an enabled boolean, which can be used to switch the plugin on and off
- each plugin entry may have an optional viewport_to_test value, which can be used to restrict the plugin to run only at one viewport size that has been specified in viewport_sizes, e.g. viewport_to_test: "small"
- each plugin may have one or more custom properties that are passed to the plugin
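Rather than editing fields by hand, a variant configuration can be derived from an existing one. The following is a minimal sketch, not part of CWAC; derive_config and the output file name config_custom.json are hypothetical. It uses only fields described above and sets thread_count to the CPU core count, as suggested for that field.

```python
import copy
import json
import os
from pathlib import Path


def derive_config(base, audit_name, organisations):
    """Return a modified copy of a CWAC config dictionary; base is left untouched."""
    config = copy.deepcopy(base)
    config["audit_name"] = audit_name
    # One browser per CPU core; os.cpu_count() can return None, so fall back to 1.
    config["thread_count"] = os.cpu_count() or 1
    config["filter_to_organisations"] = list(organisations)
    return config


if __name__ == "__main__":
    default_path = Path("config") / "config_default.json"
    if default_path.is_file():  # only does anything when run from the cwac directory
        base = json.loads(default_path.read_text(encoding="utf-8"))
        custom = derive_config(base, "dia_only", ["Department of Internal Affairs"])
        custom_path = Path("config") / "config_custom.json"
        custom_path.write_text(json.dumps(custom, indent=4), encoding="utf-8")
```

Working on a deep copy keeps the default configuration unchanged, so several variants can be generated from the same base.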
First, ensure you're in the cwac directory and using the Python virtual environment created earlier:
- cd to the cwac folder
- ensure the terminal prompt starts with (.venv)
- if it doesn't, run the command source .venv/bin/activate
Once the configuration files are set up, CWAC can be run by executing:
python cwac.py
CWAC will execute using ./config/config_default.json as its configuration source, which by default works for macOS.
CWAC also supports specifying a configuration file name as a single command line argument. Configuration files must be located in the ./config/ directory.
To specify a config file for CWAC:
python cwac.py config_custom.json
This will cause CWAC to load ./config/config_custom.json instead of config_default.json.
If you want to run CWAC on Linux, this can be done easily by using the pre-built config for Linux:
python cwac.py config_linux.json
This feature can be useful if you want to chain, or concurrently run instances of CWAC with different configurations e.g. different viewports, different tests, different organisations, etc.
To chain two instances of CWAC where one test will run after the other in sequence, use:
python cwac.py config_a.json && python cwac.py config_b.json
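The same chaining can be scripted in Python, which may be convenient when there are many configurations. The following is a minimal sketch, not part of CWAC; the function names are hypothetical. Like &&, it stops at the first command that exits with a non-zero status.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command in order, stopping at the first failure (like &&).

    Returns the exit code of the last command that ran.
    """
    returncode = 0
    for command in commands:
        returncode = subprocess.run(command, check=False).returncode
        if returncode != 0:
            break
    return returncode


def cwac_commands(config_names):
    """Build one 'python cwac.py <config>' command per config file name."""
    return [[sys.executable, "cwac.py", name] for name in config_names]
```

Calling run_in_sequence(cwac_commands(["config_a.json", "config_b.json"])) from the cwac directory would mirror the shell command above. Using sys.executable keeps the runs inside the currently active virtual environment.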
The raw test results are stored within the ./results/ folder.
You can either use the raw data stored in the ./results/ folder directly, or you can use a data exporting feature which auto-generates leaderboards and runs an algorithm which attempts to de-duplicate axe-core issues.
The CWAC data exporter is in the file export_report_data.py, and its configuration is in export_report_data_config.json.
To use the CWAC data exporter, first modify export_report_data_config.json to specify where it should import data from within the ./results/ folder. Set input_results_folder_name to a valid folder name found within ./results/. Then, set output_report_name to specify the name of the output folder that will be generated within ./reports/.
You can then run export_report_data.py. It will generate various leaderboard CSVs and other outputs, which are placed within ./reports/{output_report_name}/.
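The two exporter settings can also be updated from a script, with a check that the results folder exists first. The following is a minimal sketch, not part of CWAC; the function name is hypothetical, and it assumes both settings are top-level keys of a JSON object in export_report_data_config.json.

```python
import json
from pathlib import Path


def set_export_folders(config_path, results_root, input_folder, output_name):
    """Point the exporter config at one results folder and name its report.

    Assumes both settings are top-level keys in the JSON config file.
    """
    if not (Path(results_root) / input_folder).is_dir():
        raise FileNotFoundError(f"{input_folder!r} not found in {results_root}")
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["input_results_folder_name"] = input_folder
    config["output_report_name"] = output_name
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

For example, set_export_folders("export_report_data_config.json", "results", "my_audit", "my_report") would prepare the exporter for a results folder named my_audit (a made-up name).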
CWAC uses several tools to maintain the quality and integrity of its source code, including:
- black, an uncompromising code formatter
- pydocstyle, for linting docstrings
- isort, for sorting import statements
- bandit, for detecting potential security vulnerabilities
- flake8, for linting
- pylint, for linting
- mypy, for static type checking
Use pre-commit run --all-files to run all pre-commit hooks.
CWAC is designed to be extensible with plugins. This enables CWAC to run multiple different types of audits against web pages.
By default, CWAC has 7 plugins:
- DefaultAudit - a plugin that simply gets basic page information e.g. viewport size, page title. This audit plugin is never used directly; it is always imported by other plugins so they don't have to fetch basic page information.
- AxeCoreAudit - a plugin that runs axe-core on the page
- LanguageAudit - a plugin that estimates text readability using Flesch-Kincaid and SMOG scores. It can also perform sentiment analysis.
- ReflowAudit - a plugin that gives an indicative test for WCAG 1.4.10 Reflow
- FocusIndicatorAudit - a plugin that presses the tab key and detects if pixels changed after pressing tab, which can be an indicative test for WCAG 2.4.7 Focus Visible.
- ScreenshotAudit - a plugin that simply takes screenshots of each web page tested and saves them to a folder in the results directory.
- ElementAudit - a plugin that reports all instances of elements that match a CSS selector.
The code for each plugin is located in /src/audit_plugins/
To specify what audits run during testing, modify the audit_plugins dictionary in config_default.json.
The format of audit_plugins entries requires a snake case name as the key, and a camel case name as the value for the 'class_name' property, e.g.:
"audit_plugins": {
    "axe_core_audit": {
        "class_name": "AxeCoreAudit",
        "best-practice": true,
        "enabled": true
    },
    "language_audit": {
        "class_name": "LanguageAudit",
        "enabled": true,
        "viewport_to_test": "small",
        "run_sentiment_analysis": false
    },
    "reflow_audit": {
        "class_name": "ReflowAudit",
        "enabled": true,
        "viewport_to_test": "small",
        "screenshot_failures": false
    },
    "screenshot_audit": {
        "class_name": "ScreenshotAudit",
        "enabled": true,
        "viewport_to_test": "small"
    },
    "focus_indicator_audit": {
        "class_name": "FocusIndicatorAudit",
        "enabled": true,
        "max_tab_key_presses": 15
    },
    "element_audit": {
        "class_name": "ElementAudit",
        "target_element_css_selector": "input:not([type=\"search\"])",
        "enabled": true
    }
}
This is because the key corresponds to a module name within ./src/audit_plugins/, and the camel case name corresponds to the audit class name contained in each module.
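The key-to-module and class_name-to-class mapping can be illustrated with importlib. The following is a minimal sketch of the general technique, not CWAC's actual loader; the function names are hypothetical, and the package name to pass for CWAC (for example "src.audit_plugins") is an assumption.

```python
import importlib


def load_plugin_class(package, module_name, class_name):
    """Import package.module_name and return the attribute called class_name."""
    module = importlib.import_module(f"{package}.{module_name}")
    return getattr(module, class_name)


def enabled_plugin_classes(audit_plugins, package):
    """Map each enabled plugin key to the class named by its class_name."""
    return {
        key: load_plugin_class(package, key, entry["class_name"])
        for key, entry in audit_plugins.items()
        if entry.get("enabled", False)
    }
```

A misspelt key would raise ModuleNotFoundError here, and a misspelt class_name would raise AttributeError, which is why the two names have to line up with the file and class on disk.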
To add new audit plugins, first develop an appropriate test module/class within ./src/audit_plugins/, and then enable that audit plugin by adding an entry within config_default.json.
Each plugin can have an optional viewport_to_test item, which allows you to run a plugin only at a particular viewport size, if multiple are being tested. The value of this key must match one of the viewport names defined in the viewport_sizes option.
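A mismatch between viewport_to_test and viewport_sizes is easy to introduce when editing a config by hand. The following is a minimal sketch of a check, not part of CWAC; the function name is hypothetical, and it assumes viewport_to_test refers to the names (keys) in viewport_sizes, such as "small".

```python
def invalid_viewport_references(config):
    """Return {plugin key: viewport_to_test} for values missing from viewport_sizes."""
    known = set(config.get("viewport_sizes", {}))
    return {
        key: entry["viewport_to_test"]
        for key, entry in config.get("audit_plugins", {}).items()
        if "viewport_to_test" in entry and entry["viewport_to_test"] not in known
    }
```

Plugins without a viewport_to_test item are skipped, since the item is optional.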
Crown copyright (c) 2024, Department of Internal Affairs on behalf of the New Zealand Government.
This copyright, along with CWAC's GPL-3.0 license, does not extend to the third-party chromedriver binaries located in the /drivers/ folder. Permission to re-use third party copyright material cannot be given by the Department of Internal Affairs.
CWAC includes chromedriver binaries at /drivers/. chromedriver licenses can be found in the /drivers/ folder.
// Copyright 2015 The Chromium Authors
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google LLC nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.