Web Structure Derived Clustering for Optimised Web Accessibility Evaluation
Published in Proceedings of the ACM Web Conference 2023, 2023
Recommended citation: Alexander Hambley, Yeliz Yesilada, Markel Vigo, Simon Harper. 2023. Web Structure Derived Clustering for Optimised Web Accessibility Evaluation. Proceedings of the ACM Web Conference 2023. https://doi.org/10.1145/3493612.3520452
Web accessibility evaluation is a costly and complex process due to limited time and resources and to ambiguity. To optimise the accessibility evaluation process, we aim to reduce the number of pages auditors must review by employing statistically representative pages, reducing a site of thousands of pages to a manageable review of archetypal pages. Our paper focuses on representativeness, one of six proposed metrics that form our methodology, to address the limitations we have identified with the W3C Website Accessibility Conformance Evaluation Methodology (WCAG-EM). These include the evaluative scope, the non-probabilistic sampling approach, and the potential for bias within the selected sample. Representativeness, in particular, is a metric to assess the quality and coverage of sampling. To measure this, we systematically evaluate five web page representations on a website of 388 pages: tags, structure, the DOM tree, content, and a mixture of structure and content. Our findings highlight the importance of including structural components in representations. We validate our conclusions using the same methodology for three additional random sites of 500 pages. We find that features derived from web content, when used as the sole attribute, are suboptimal and can lead to lower quality and more disparate clustering for optimised accessibility evaluation.
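To make the idea of a tag-based representation and an archetypal page concrete, the following is a minimal sketch, not the paper's method. It assumes each page is represented by its opening-tag counts, that pages are compared by cosine similarity, and that the representative of a group is its medoid (the page most similar, on average, to the others). The names `TagCounter`, `tag_vector`, `cosine` and `medoid` are hypothetical, and the clustering step that would first split a site into groups is omitted.

```python
from collections import Counter
from html.parser import HTMLParser
from math import sqrt


class TagCounter(HTMLParser):
    """Collects how often each opening tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def tag_vector(html):
    """Assumed 'tags' representation: a bag of opening-tag counts."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def cosine(a, b):
    """Cosine similarity between two tag-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def medoid(pages):
    """Return the name of the page most similar, on average, to the others."""
    vectors = {name: tag_vector(html) for name, html in pages.items()}

    def mean_similarity(name):
        others = [n for n in vectors if n != name]
        total = sum(cosine(vectors[name], vectors[o]) for o in others)
        return total / max(len(others), 1)

    return max(vectors, key=mean_similarity)


# Illustrative input only; these pages are made up for the example.
pages = {
    "article-1": "<html><body><h1>A</h1><p>x</p><p>y</p></body></html>",
    "article-2": "<html><body><h1>B</h1><p>x</p><p>y</p><p>z</p></body></html>",
    "gallery": "<html><body><img><img><img><img></body></html>",
}
print(medoid(pages))
```

In a fuller pipeline under the same assumptions, the site would first be partitioned into clusters and `medoid` applied to each cluster, so that an auditor reviews one page per cluster rather than every page.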