
858 points cryptophreak | 2 comments
taeric No.42934898
I'm growing to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for doing anything.

By and large, I assert this is because the best way to do something is to do that thing. There can be correspondence around the thing, but the artifacts that you are building are separate things.

You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but being a separate entity, it is not necessarily good at making any artifacts.

SoftTalker No.42935611
Yes, agree. Chatting with a computer has all the worst attributes of talking to a person, without any of the intuitive understanding, nonverbal cues, even tone of voice, that all add meaning when two human beings talk to each other.
TeMPOraL No.42936328
That comment made sense 3 years ago. LLMs already solved "intuitive understanding", and the realtime multimodal variants (e.g. the thing behind "Advanced Voice" in the ChatGPT app) handle tone of voice in both directions. As for nonverbal cues, I don't know yet - I got live video enabled in ChatGPT only a few days ago and haven't had time to test it, but I would be surprised if it couldn't read the basics of body language at this point.

Talking to a computer still sucks as a user interface - not because a computer can't communicate on multiple channels the way people do, as it can do that now too. It sucks for the same reason talking to people sucks as a user interface - because the kind of tasks we use computers for (and that aren't just talking with/to/at other people via electronic means) are better handled by doing than by talking about them. We need an interface to operate a tool, not an interface to an agent that operates a tool for us.

As an example, consider driving (as in, realtime control - not just "getting from point A to B"): a chat interface to driving would suck just as badly as being a backseat driver sucks for both people in the car. In contrast, a steering wheel, instead of being a bandwidth-limiting indirection, is an anti-indirection - not only does it let you control the machine with your body, the control is direct enough that over time your brain learns to abstract it away, and the car becomes an extension of your body. We need more tangible interfaces like that for computers.

The steering wheel case, of course, would fail with "AI-level smarts" - but that still doesn't mean we should embrace talking to computers. A good analogy is dance - it's an interaction between two independently smart agents exploring an activity together, and as they do it enough, it becomes fluid.

So dance, IMO, is the steering wheel analogy for AI-powered interfaces, and that is the space we need to explore more.

taeric No.42936620
I think this gets to how a lot of these conversations go past each other? A chat interface for getting a ride from a car is almost certainly doable? So long as the itinerary and other details remain separate things? At large, you are basically using a chat bot to be a travel agent, no?

But, as you say, a chat interface would be a terrible way to actively drive a car. And that is a different thing, but I'm growing convinced many will focus on the first idea while brushing aside complaints about the latter.

In another thread, I assert that chat is probably a fine way to order up something that fits the repertoire a bot was trained on. But I don't think sticking to the chat window is the best way to interface with what it delivers. You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

TeMPOraL No.42940389
> But, I don't think sticking to the chat window is the best way to interface with what it delivers. You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

Yes, this is what I've also tried to hint at in my comment, but failed part-way. In most of the cases I can imagine chat interface to be fine (or even ideal), it's really only good as a starting point. Take two examples based on your reply:

1) Getting a car ride. "Computer, order me a cab home" is a good start. It's even OK if I then get asked to narrow it down between several different services/fares (next time I'll remember to specify that up front). But if I want to inspect the route (or perhaps adjust it, in a hypothetical service that supports it), I'd already prefer an interactive map I can scroll and zoom, with PoIs I can tap on for details, over continuing a verbal chat.

2) Ordering food in a fast food restaurant. I'm fine starting with a conversation if I know what I want. However, getting back the order summary in prose (or worse, read out loud) would already be taxing, and if I wanted to make final adjustments, I'd beg for buttons and numeric input boxes. And in case I don't know what I want, or what is available (and at what prices), a chat interface is a non-starter. An interactive menu is a must.

You sum this up perfectly:

> You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

Chat may be great to get that first artifact, but afterwards, there's almost always a more hands-on interface that would be much better.
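The pattern both commenters converge on - chat produces the first artifact, then direct manipulation takes over - can be sketched in code. This is a hypothetical illustration, not any real ordering API: the menu, prices, and parsing are made up, and `order_from_chat` stands in for an LLM call that turns free-form text into a structured object.

```python
from dataclasses import dataclass, field

@dataclass
class LineItem:
    name: str
    quantity: int
    unit_price: float

@dataclass
class Order:
    items: list = field(default_factory=list)

    def total(self) -> float:
        return sum(i.quantity * i.unit_price for i in self.items)

    # Direct-manipulation edit: what a stepper button or numeric
    # input box would call - no chat round-trip needed.
    def set_quantity(self, name: str, quantity: int) -> None:
        for item in self.items:
            if item.name == name:
                item.quantity = quantity
                return
        raise KeyError(name)

def order_from_chat(utterance: str) -> Order:
    # Stand-in for an LLM parse of free-form text into a structured
    # artifact. Hypothetical menu and prices.
    menu = {"burger": 4.50, "fries": 2.00}
    order = Order()
    for name, price in menu.items():
        if name in utterance.lower():
            order.items.append(LineItem(name, 1, price))
    return order

# Chat gets the first artifact...
order = order_from_chat("Can I get a burger and fries?")
# ...then adjustments go through the structured interface, not prose.
order.set_quantity("fries", 2)
print(f"{order.total():.2f}")
```

The point of the split is that once the `Order` exists, every subsequent tweak is a direct operation on the artifact, which is exactly the "hands on" interface the thread argues for.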

taeric No.42940757
Oh, apologies, I meant my post to be a highlight of how I agree with you! Your post is great!