Context
Despite the availability of numerous automatic accessibility testing solutions, web accessibility issues persist on many websites. Moreover, there is a lack of systematic evaluations of the efficacy of current accessibility testing tools. To address this gap, we present the first mutation analysis framework, called Ma11y, designed to assess web accessibility testing tools. Ma11y includes 25 mutation operators derived from WCAG 2.1 that intentionally violate various accessibility principles and an automated oracle to determine whether a mutant is detected by a testing tool.
Problem statement
- 60% of the analyzed websites lack alternative text for their images
- Accessibility testing, when it is done at all, is usually done manually, which is error-prone and time-consuming
- Human evaluators with disabilities may not always be available
- There is no systematic evaluation of how effective automatic accessibility testing tools are
Approach
- Derive mutation operators from WCAG
- Mutate existing code to introduce accessibility issues
- Evaluate accessibility tools (e.g., screen readers): can these tools detect the injected accessibility issues?
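The mutation step in the approach above can be illustrated with a minimal sketch. It assumes a single hypothetical operator, `remove_img_alt`, that deletes the `alt` attribute from an image, which violates the WCAG requirement for text alternatives. The function name, the regex-based matching, and the sample page are all assumptions made for illustration; they are not taken from Ma11y, whose operators may be implemented quite differently.

```python
import re

# Hypothetical operator, not Ma11y's actual code: strip the alt attribute
# from <img> tags. Regex matching is a simplification; a real tool would
# more likely work on a parsed DOM.
_IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
_ALT_ATTR = re.compile(r"\s+alt\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s>]+)", re.IGNORECASE)


def remove_img_alt(html: str) -> list[str]:
    """Return one mutant of `html` per <img> tag that has an alt attribute."""
    mutants = []
    for match in _IMG_TAG.finditer(html):
        tag = match.group(0)
        mutated_tag = _ALT_ATTR.sub("", tag, count=1)
        if mutated_tag == tag:
            # No alt attribute to remove: the mutant would be identical
            # to the original (an equivalent mutant), so skip it.
            continue
        mutants.append(html[: match.start()] + mutated_tag + html[match.end():])
    return mutants


page = '<p><img src="a.png" alt="Logo"><img src="b.png"></p>'
mutants = remove_img_alt(page)
```

Each mutant contains exactly one injected fault, so a tool's report on that mutant can be attributed to a single operator. Skipping images that already lack `alt` is a simple stand-in for the equivalent-mutant checks mentioned in the notes.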
Ma11y
- The first component is the Mutant Generator, which applies 25 mutation operators, derived from a defect model based on the WCAG 2.1 accessibility guidelines, to the website under test. The Mutant Generator also includes checks to avoid generating equivalent mutants.
- The second component is the Tool Runner, which currently integrates six popular accessibility testing tools through a unified interface. Once mutants are generated, the Tool Runner executes the accessibility testing tools on them.
- The final component is the Oracle, which compares the reports generated by each tool for both the original website and its corresponding mutated websites. By analyzing these reports, the oracle determines whether the accessibility issues were successfully identified by the tool. The framework generates a comprehensive report showcasing each tool’s performance in detecting accessibility bugs, along with valuable insights and a mutation score assessment.
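The Oracle's comparison and the mutation score can be sketched as follows. This is a rough illustration under stated assumptions: a tool report is modeled as a set of `(rule id, element locator)` pairs, and a mutant counts as detected when the tool reports at least one issue on the mutant that it did not report on the original page. The report format, the function names, and the rule identifiers in the example are hypothetical; Ma11y's actual oracle may apply different or stricter criteria.

```python
# Assumed report format: each issue is a (rule id, element locator) pair.
Issue = tuple[str, str]


def is_detected(original_report: set[Issue], mutant_report: set[Issue]) -> bool:
    """A mutant is detected if the tool reports an issue absent from the original."""
    return bool(mutant_report - original_report)


def mutation_score(original_report: set[Issue], mutant_reports: list[set[Issue]]) -> float:
    """Fraction of mutants detected by the tool (0.0 when there are no mutants)."""
    if not mutant_reports:
        return 0.0
    detected = sum(is_detected(original_report, report) for report in mutant_reports)
    return detected / len(mutant_reports)


# Illustrative data only: the original page already has one known issue.
original = {("color-contrast", "#footer")}
mutant_reports = [
    {("color-contrast", "#footer"), ("image-alt", "img.logo")},  # new issue reported
    {("color-contrast", "#footer")},                             # nothing new reported
]
score = mutation_score(original, mutant_reports)
```

Comparing against the original page's report matters because real websites usually have pre-existing issues; without that baseline, every mutant would look "detected" as soon as the tool reported anything at all.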
Results
The study revealed that, on average, the tested tools detected fewer than 50% of the injected accessibility issues, underscoring the need for further improvement in web accessibility testing tools.