
125 points by robin_reala | 6 comments
simonw (No.46203241)
Something I'm desperately keen to see is AI-assisted accessibility testing.

I'm not convinced at all by most of the heuristic-driven ARIA scanning tools. I don't want to know if my app appears to have the right ARIA attributes set - I want to know if my features work for screenreader users.

What I really want is for a Claude Code style agent to be able to drive my application in an automated fashion via a screenreader and record audio for me of successful or failed attempts to achieve goals.

Think Playwright browser tests but for popular screenreaders instead.

Every now and then I check to see if this is a solved problem yet.

I think we are close. https://www.guidepup.dev/ looks extremely promising - though I think it only supports VoiceOver on macOS or NVDA on Windows, which is a shame since asynchronous coding agent tools like Codex CLI and Claude Code for web only run on Linux.
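
For concreteness, a Guidepup-driven check might look roughly like this (a minimal sketch assuming Guidepup's voiceOver API and a Playwright-launched browser; the URL and step count are illustrative):

    import { chromium } from "playwright";
    import { voiceOver } from "@guidepup/guidepup";

    (async () => {
      // Open the app under test in a real (headed) browser window.
      const browser = await chromium.launch({ headless: false });
      const page = await browser.newPage();
      await page.goto("http://localhost:3000/");

      // Drive VoiceOver and capture what it actually speaks.
      await voiceOver.start();
      const transcript: string[] = [];
      for (let step = 0; step < 10; step++) {
        await voiceOver.next();
        transcript.push(await voiceOver.lastSpokenPhrase());
      }
      await voiceOver.stop();
      await browser.close();

      // An agent (or a plain assertion) can then judge the transcript.
      console.log(transcript.join("\n"));
    })();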

What I haven't seen yet is someone closing the loop on ensuring agentic tools like Claude Code can successfully drive these mechanisms.

1. PebblesHD (No.46203374)
Rather than improving testing for fallible accessibility aids, why not leverage AI to eliminate the need for them? An agent on your device can interpret the same page a sighted or otherwise unimpaired person would, giving you, as a disabled user, the same experience they would have. Why would that not be preferable? It also puts you in control of how you want that agent to interpret pages.
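
A very rough sketch of that architecture, with Playwright capturing the page and a hypothetical describeForUser() standing in for whatever on-device model does the interpretation (nothing here is an existing product):

    import { chromium } from "playwright";

    // Hypothetical stand-in for whatever on-device model the user has
    // chosen, prompted according to their own preferences.
    async function describeForUser(screenshot: Buffer, html: string): Promise<string> {
      throw new Error("plug an on-device multimodal model in here");
    }

    (async () => {
      const browser = await chromium.launch();
      const page = await browser.newPage();
      await page.goto("https://example.com/");

      // Hand the raw page to the user's own agent, which decides how to
      // present it: a summary, a full read-out, a list of actions, etc.
      const description = await describeForUser(
        await page.screenshot({ fullPage: true }),
        await page.content()
      );
      console.log(description);

      await browser.close();
    })();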
2. eru (No.46203409)
What you are describing is something the end user can do.

What simonw was describing is something the author can do, and end users can benefit whether they use AI or not.

3. simonw (No.46203480)
I'm optimistic that modern AI will lead to future improvements in accessibility tech, but for the moment I want to meet existing screenreader users where they are and ensure the products I build are as widely accessible as possible.
4. K0nserv (No.46203790)
It adds loads of latency, for one. If you watch a competent screen reader user, you'll notice they have the speech rate set very high; to you it would be hard to understand anything. Adding an LLM in the middle of this will add at least hundreds of milliseconds of latency to every interaction.
5. 8organicbits (No.46204152)
The golden rule of LLMs is that they can make mistakes and you need to check their work. You're describing a situation where the intended user cannot check the LLM output for mistakes. That violates a safety constraint and is not a good use case for LLMs.
6. devinprater (No.46205099)
I, myself, as a singular blind person, would absolutely love this. But we ain't there yet. On-device AI isn't finetuned for this, and neither Apple nor Google have shown indications of working on this in release software, so I'm sure we're a good 3 years away from the first version of this.