I'm not convinced at all by most of the heuristic-driven ARIA scanning tools. I don't want to know if my app appears to have the right ARIA attributes set - I want to know if my features work for screenreader users.
What I really want is for a Claude Code style agent to be able to drive my application in an automated fashion via a screenreader and record audio for me of successful or failed attempts to achieve goals.
Think Playwright browser tests but for popular screenreaders instead.
Every now and then I check to see if this is a solved problem yet.
I think we are close. https://www.guidepup.dev/ looks extremely promising - though I think it only supports VoiceOver on macOS or NVDA on Windows, which is a shame since asynchronous coding agent tools like Codex CLI and Claude Code for web only run Linux.
What I haven't seen yet is someone closing the loop on ensuring agentic tools like Claude Code can successfully drive these mechanisms.