
X and NeWS history

(minnie.tuhs.org)
177 points by colinprince | 2 comments
mwcampbell No.15325908
I've read some of Don Hopkins' comments about all the cool things that were done with PostScript running in the NeWS server. I wonder: if NeWS had survived long enough that Sun had to add accessibility for people with disabilities (to sell to governments and schools, if nothing else), how would they have done it? The current state of the art is very chatty IPC (tens of microseconds per cross-process call), partially mitigated by bulk fetching and caching. To appreciate how bad that can be, think about a screen reader making at least one IPC call per word when reading text, asking the application for word boundaries and bounding rectangles. That's a problem I hope to solve someday. In a parallel universe, though, maybe the screen reader is running inside the NeWS server, in the same address space as a PostScript-based UI toolkit.
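
Roughly, the difference between the chatty pattern and a bulk fetch looks something like this (the calls here are made-up stand-ins, not any real accessibility API):

    // A rough sketch contrasting per-word IPC with a bulk query.
    // "app", "speak", and the get* calls are hypothetical stand-ins.
    async function readChatty(app, nodeId) {
      let offset = 0;
      while (true) {
        const word = await app.getWordAtOffset(nodeId, offset);    // one cross-process call per word
        if (!word) break;
        const rect = await app.getWordExtents(nodeId, word.start); // and another for its rectangle
        speak(word.text, rect);
        offset = word.end;
      }
    }

    async function readBulk(app, nodeId) {
      const words = await app.getAllWordExtents(nodeId);           // one call returns every boundary and rect
      for (const { text, rect } of words) speak(text, rect);
    }
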
replies(1): >>15327767 #
DonHopkins No.15327767
I've written up some thoughts about accessibility and JavaScript, and discussed them a few times here!

I discovered the great work of Morgan Dixon and James Fogarty, which proves that you can do some amazing things with screen scraping, pattern matching, visual deconstruction, augmentation, and reconstruction! Their work needs to be combined with platform-specific accessibility APIs via JavaScript.

I'm proposing "aQuery", a high level scriptable accessibility tool that is to native user interface components what jQuery is to the DOM, for selecting and querying components, matching visual patterns, handling events, abstracting platform dependencies behind high level service interfaces, building and scripting higher level widgets and applications, and so on.

http://donhopkins.com/mediawiki/index.php/AQuery

https://news.ycombinator.com/item?id=11520967
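
To give a rough idea of the flavor I have in mind, here's a sketch of what aQuery calls might look like (all of these names are hypothetical; none of this exists yet):

    // Hypothetical aQuery calls, by analogy with jQuery; nothing here is a real API.
    // Selectors would resolve against the platform accessibility tree instead of the DOM.
    aQuery('button[label="Save"]').press();                  // find and activate a native button
    aQuery('textfield[name="Search"]').setValue('NeWS');     // fill in a native text field
    aQuery('window[title*="Untitled"] *:focusable').each(
      (w) => console.log(w.role, w.label, w.bounds));        // walk the focusable widgets in a window
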

Morgan Dixon's and James Fogarty's work is truly breathtaking and eye-opening, and I would love for it to be a core part of a scriptable hybrid screen scraping / accessibility API approach.

Screen scraping techniques are very powerful, but they have limitations. Accessibility APIs are very powerful, but they have different limitations. Using both approaches together, screencasting and re-composing visual elements, and tightly integrating them with JavaScript, enables a much wider and more interesting range of possibilities.

Think of it like augmented reality for virtualizing desktop user interfaces. The beauty of Morgan's Prefab is how it works across different platforms and web browsers, over virtual desktops, and how it can control, sample, measure, modify, augment, and recompose the GUIs of existing unmodified applications, even performing dynamic language translation, so they're much more accessible and easier to use!

https://news.ycombinator.com/item?id=12425668

This link has the most up-to-date links to Morgan's work, and his demo videos!

https://news.ycombinator.com/item?id=14182061

https://prefab.github.io/

Prefab: The Pixel-Based Reverse Engineering Toolkit

Prefab is a system for reverse engineering the interface structure of graphical interfaces from their pixels. In other words, Prefab looks at the pixels of an existing interface and returns a tree structure, like a web page's Document Object Model, that you can then use to modify the original interface in some way. Prefab works from example images of widgets; it decomposes those widgets into small parts, and exactly matches those parts in screenshots of an interface. Prefab does this many times per second to help you modify interfaces in real time. Imagine if you could modify any graphical interface. With Prefab, you can explore this question!
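
To make the "tree structure, like a web page's Document Object Model" part concrete, here's a hypothetical example of the kind of tree such a system might hand back (illustrative only, not Prefab's actual output format):

    // Hypothetical shape of a pixel-derived interface tree; not Prefab's real data model.
    const tree = {
      role: 'window', bounds: { x: 0, y: 0, w: 1280, h: 800 },
      children: [
        { role: 'toolbar', bounds: { x: 0, y: 0, w: 1280, h: 40 },
          children: [
            { role: 'button', label: 'Bold', bounds: { x: 8, y: 4, w: 32, h: 32 } },
          ] },
        { role: 'scrollbar', orientation: 'vertical', bounds: { x: 1264, y: 40, w: 16, h: 760 } },
      ],
    };
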

https://www.youtube.com/watch?v=w4S5ZtnaUKE

Imagine if every interface were open source. Any of us could modify the software we use every day. Unfortunately, we don't have the source.

Prefab realizes this vision using only the pixels of everyday interfaces. This video shows how we advanced the capabilities of Prefab to understand interface content and hierarchy. We use Prefab to add new functionality to Microsoft Word, Skype, and Google Chrome. These demonstrations show how Prefab can be used to translate the language of interfaces, add tutorials to interfaces, and add or remove content from interfaces solely from their pixels. Prefab represents a new approach to deploying HCI research in everyday software, and is also the first step toward a future where anybody can modify any interface.

More Prefab demos:

https://prefab.github.io/videos.html

replies(1): >>15327883 #
1. mwcampbell No.15327883
Fascinating! What are some of the limitations of screen scraping that are addressed by accessibility APIs? Is it primarily the unreliability? Do you envision using an accessibility API as the primary source of information, with screen scraping as a fallback?
replies(1): >>15327997 #
2. DonHopkins No.15327997
Accessibility APIs expose lots of rich, high level information, like the association of a label with a widget (to name one simple example; it gets MUCH more complex than that), and of course the position, meaning, and state of each object on the screen.

There needs to be a higher level, scriptable way to get a handle on all that complexity, the way jQuery helps automate the creation, manipulation, and abstraction of the HTML DOM and its events and handlers.
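
For example, the kind of record an accessibility API hands back for a single widget looks roughly like this (the field names are generic placeholders; each platform has its own flavor of them):

    // Generic illustration of per-widget accessibility data; these field names
    // are made up, not any one platform's API.
    const node = {
      role: 'checkbox',
      name: 'Remember me',                                    // the label associated with the widget
      state: { checked: true, focused: false, enabled: true },
      bounds: { x: 120, y: 340, w: 18, h: 18 },               // position on screen
      children: [],
    };
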

PhoneGap/Cordova plugins and NativeScript both take a similar approach: they have a generic Objective-C / Java / whatever <=> JavaScript bridge that lets you call all (or most) of the native APIs, including accessibility, directly on each platform. Then you write higher level, platform-independent APIs on top of the native APIs in JavaScript: a different implementation for each platform, but the same high level interface for JavaScript programmers.

https://docs.nativescript.org/core-concepts/accessing-native...
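
The layering might look roughly like this (the platform and nativeBridge objects and their methods are placeholders, not real NativeScript or Cordova APIs):

    // Sketch of the layering; "platform", "nativeBridge", and their methods are placeholders.
    // Each branch would wrap that platform's own accessibility concept
    // (accessibilityLabel on iOS, the content description on Android).
    function getAccessibleLabel(widget) {
      if (platform.isIOS) {
        return nativeBridge.ios.getProperty(widget.handle, 'accessibilityLabel');
      }
      if (platform.isAndroid) {
        return nativeBridge.android.call(widget.handle, 'getContentDescription');
      }
      return null;
    }
    // Code above this layer only ever sees getAccessibleLabel(), the same on every platform.
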

I would try to use the native APIs as much as possible, since they're precise, if tedious. aQuery should have a platform-independent, accessibility-specific selector language, like XPath or jQuery selectors, implemented efficiently in native code like querySelector. The pattern matching engine would send asynchronous events back to JavaScript. You could write JavaScript handlers for patterns of pixels, as well as accessibility path patterns, appearing and disappearing, the way jQuery lets you bind handlers to patterns of DOM elements that don't exist yet, instead of to concrete existing elements. Those handlers would create higher level widgets that could manage those existing pixels and widgets. (Like recognizing a YouTube video player in any web browser window, and wrapping it in an aQuery widget implementing an abstract VideoPlayer interface, for example.)
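
A sketch of that deferred binding idea, with hypothetical aQuery and VideoPlayer names:

    // Hypothetical deferred binding: handlers fire whenever a matching pattern appears,
    // even if nothing matches when the handler is registered (like jQuery delegated events).
    aQuery.on('appear', 'window[role="browser"] :video-player', (element) => {
      const player = new VideoPlayer(element);  // abstract interface: play(), pause(), seek()
      player.on('play', () => console.log('video started in', element.window.title));
    });

    aQuery.on('disappear', ':video-player', (element) => {
      console.log('a recognized player went away');
    });
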

Of course, calling the native accessibility API is lighter weight than screen scraping and pixel matching, and you could use the native API to drill down to just the region of pixels you want to match or screencast, to minimize the amount of screen scraping and pixel matching required.
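
That narrowing step might look something like this (again, all of these names are hypothetical stand-ins):

    // Hypothetical narrowing step: resolve bounds cheaply through the accessibility API,
    // then restrict the expensive pixel matching / screencasting to that rectangle.
    async function matchInWidget(selector, template) {
      const widget = await aQuery(selector).first();       // cheap native accessibility query
      const region = await screen.capture(widget.bounds);  // grab only that widget's pixels
      return pixelMatcher.find(region, template);          // pattern match the small region
    }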