858 points cryptophreak | 17 comments
taeric ◴[] No.42934898[source]
I'm coming around to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for doing anything.

By and large, I assert this is because the best way to do something is to do that thing. There can be correspondence around the thing, but the artifacts that you are building are separate things.

You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but being a separate entity, it is not necessarily good at making any artifacts.

replies(17): >>42934997 #>>42935058 #>>42935095 #>>42935264 #>>42935288 #>>42935321 #>>42935532 #>>42935611 #>>42935699 #>>42935732 #>>42935789 #>>42935876 #>>42935938 #>>42936034 #>>42936062 #>>42936284 #>>42939864 #
1. SoftTalker ◴[] No.42935611[source]
Yes, agree. Chatting with a computer has all the worst attributes of talking to a person, without any of the intuitive understanding, nonverbal cues, even tone of voice, that all add meaning when two human beings talk to each other.
replies(4): >>42935666 #>>42935682 #>>42936328 #>>42984355 #
2. taeric ◴[] No.42935666[source]
Yeah, this is something I didn't make clear in my post. Chat between people is the same bad UI. People read in the aggression that they bring to their reading, and get mad at people who are legit trying to understand something.

You have some of the same problems with email, of course. Losing threading, in particular, made things worse. It was a "chatification of email" that caused people to lean in to email being bad. Amusing that we are now seeing chat applications rise to replace email.

replies(1): >>42940265 #
3. aylmao ◴[] No.42935682[source]
I would also call it having all the worst attributes of a CLI, without the succinctness, OS integration, and program composability of one.
replies(1): >>42936090 #
4. 1ucky ◴[] No.42936090[source]
You should check out MCP by Anthropic, which solves some of the issues you mentioned.
5. TeMPOraL ◴[] No.42936328[source]
That comment made sense 3 years ago. LLMs already solved "intuitive understanding", and the realtime multimodal variants (e.g. the thing behind "Advanced Voice" in ChatGPT app) handle tone of voice in both directions. As for nonverbal cues, I don't know yet - I got live video enabled in ChatGPT only a few days ago and didn't have time to test it, but I would be surprised if it couldn't read the basics of body language at this point.

Talking to a computer still sucks as a user interface - not because a computer can't communicate on multiple channels the way people do, as it can do it now too. It sucks for the same reason talking to people sucks as a user interface - because the kind of tasks we use computers for (and that aren't just talking with/to/at other people via electronic means) are better handled by doing than by talking about them. We need an interface to operate a tool, not an interface to an agent that operates a tool for us.

As an example, consider driving (as in, realtime control - not just "getting from point A to B"): a chat interface to driving would suck just as badly as being a backseat driver sucks for both people in the car. In contrast, a steering wheel, instead of being a bandwidth-limiting indirection, is an anti-indirection - not only does it let you control the machine with your body, the control is direct enough that over time your brain learns to abstract it away, and the car becomes an extension of your body. We need more tangible interfaces like that with computers.

The steering wheel case, of course, would fail with "AI-level smarts" - but that still doesn't mean we should embrace talking to computers. A good analogy is dance - it's an interaction between two independently smart agents exploring an activity together, and as they do it enough, it becomes fluid.

So dance, IMO, is the steering wheel analogy for AI-powered interfaces, and that is the space we need to explore more.

replies(3): >>42936587 #>>42936620 #>>42936997 #
6. ryandrake ◴[] No.42936587[source]
> We need an interface to operate a tool, not an interface to an agent that operates a tool for us.

Excellent comment and it gets to the heart of something I've had trouble clearly articulating: We've slowly lost the concept that a computer is a tool that the user wields and commands to do things. Now, a computer has its own mind and agency, and we "request" it to do things and "communicate" with it, and ask it to run this and don't run that.

Now, we're negotiating and pleading with the man inside of the computer, Mr. Computer, who has its own goals and ambitions that don't necessarily align with your own as a user. It runs what it wants to run, and if that upsets you, user, well tough shit! Instead of waiting for a command and then faithfully executing it, Mr. Computer is off doing whatever the hell he wants, running system applications in the background, updating this and that, sending you notifications, and occasionally asking you for permission to do even more. And here you are as the user, hobbled and increasingly forced to "chat" with it to get it to do what you want.

Even turning your computer off! You used to throw a hardware switch that interrupts the power to the main board, and _sayonara_ Mr. Computer! Now, the switch does nothing but send an impassioned plea to the operating system to pretty please, with sugar on top, when you're not busy could you possibly power off the computer (or mostly power it off, because off doesn't even mean off anymore).

replies(2): >>42937186 #>>42937995 #
7. taeric ◴[] No.42936620[source]
I think this gets to how a lot of these conversations go past each other? A chat interface for getting a ride from a car is almost certainly doable? So long as the itinerary and other details remain separate things? By and large, you are basically using a chat bot to be a travel agent, no?

But, as you say, a chat interface would be a terrible way to actively drive a car. And that is a different thing, but I'm growing convinced many will focus on the first idea while staving off the complaints of the latter.

In another thread, I assert that chat is probably a fine way to order up something that fits a repertoire that trained a bot. But, I don't think sticking to the chat window is the best way to interface with what it delivers. You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

replies(1): >>42940389 #
8. smj-edison ◴[] No.42936997[source]
This is one reason I love what Bret Victor has been doing with Dynamicland[1]. He's really been going all in on trying to engage as many senses as possible, and make the whole system understandable. One of his big points is that the future in technology is helping us understand more, not deferring our understanding to something else.

[1] https://dynamicland.org/

EDIT: love your analogy to dance!

9. xp84 ◴[] No.42937186{3}[source]
This is a great observation. I've mostly thought of it, not in relation to AI, but in relation to the way Apple and to a lesser extent, Microsoft, act like they are the owners of the computers we "buy." An update will be installed now. Your silly user applications will be closed by force if necessary. System stability depends on it!

The modern OS values the system's theoretical 'system health' metrics far above things like "whether the user can use it to do some user task."

Another great example is how you can't boot a modern Mac laptop, on AC power, until it has decided its battery is sufficiently charged. Why? None of your business.

Anyway to get back on topic, this is an interesting connection you've made, the software vendor will perhaps delegate decisions like "is the user allowed to log into the computer at this time" or "is a reboot mandatory" to an "agent" running on the computer. If we're lucky we'll get to talk to that agent to plead our case, but my guess is Apple and Microsoft will decide we aren't qualified to have input to the decisions.

replies(1): >>42937273 #
10. ryandrake ◴[] No.42937273{4}[source]
An example of where this is going is Apple's so-called "System Integrity Protection"[1] which is essentially an access level to system files that's even higher than root. It's Apple arrogantly protecting "their" system from the user, even from the root user:

    System Integrity Protection is designed to allow modification of these protected parts only by processes that are signed by Apple and have special entitlements to write to system files, such as Apple software updates and Apple installers.
Only Apple can be trusted to operate what is supposed to be your computer.

1: https://support.apple.com/en-us/102149

replies(1): >>42937878 #
11. skydhash ◴[] No.42937878{5}[source]
Which is why I love my FreeBSD installation (and before that Alpine Linux) and why I develop on a VM on macOS. I can trivially modify the system components to get the behavior that I need. I consider macOS as a step up from ChromeOS, but not a general purpose computer OS. Latest annoyance was the fact that signing out of Books.app signs you out of the App Store (I didn’t want epubs to be synced).
12. Karrot_Kream ◴[] No.42937995{3}[source]
> Now, a computer has its own mind and agency, and we "request" it to do things and "communicate" with it, and ask it to run this and don't run that.

FWIW this is what happens with modern steering wheels as well. Power steering is its own complicated subsystem that isn't just about user input. It has many more failure modes than an old-fashioned, analog steering wheel. The reason folks feel like "Mr. Computer" has a mind of its own is because of the mismatch between user desire and effect. This is a UX problem.

I also think chat and RAG are the two biggest UX paradigms we've spent time exploring when it comes to LLMs. It's probably worth folks exploring other UX for LLMs that are enabling for the user. Suggestions in documents and code seem to be a UX that more people enjoy using but even then there's a mismatch.

13. SoftTalker ◴[] No.42940265[source]
Yeah this is part of why RTO is not an entirely terrible idea. Remote work has these downsides -- working with another person over a computer link sucks pretty hard, no matter how you do it (not saying WFH doesn't have other very real upsides).
replies(1): >>42940793 #
14. TeMPOraL ◴[] No.42940389{3}[source]
> But, I don't think sticking to the chat window is the best way to interface with what it delivers. You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

Yes, this is what I've also tried to hint at in my comment, but failed part-way. In most of the cases where I can imagine a chat interface being fine (or even ideal), it's really only good as a starting point. Take two examples based on your reply:

1) Getting a car ride. "Computer, order me a cab home" is a good start. It's even OK if I then get asked to narrow it down between several different services/fares (next time I'll remember to specify that up front). But if I want to inspect the route (or perhaps adjust it, in a hypothetical service that supports it), I'd already prefer an interactive map I can scroll and zoom, with PoIs I can tap on to get their details, than to continue a verbal chat.

2) Ordering food in a fast food restaurant. I'm fine starting it with a conversation if I know what I want. However, getting back the order summary in prose (or worse, read out loud) would already be taxing, and if I wanted to make final adjustments, I'd beg for buttons and numeric input boxes. And, in case I don't know what I want, or what is available (and at what prices), a chat interface is a non-starter. Interactive menu is a must.

You sum this up perfectly:

> You almost certainly want to be much more actively "hands on" in very domain specific ways with the artifacts produced.

Chat may be great to get that first artifact, but afterwards, there's almost always a more hands-on interface that would be much better.
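
To make the "first artifact, then hands-on" split concrete, here is a rough Python sketch for the fast food case. Everything in it is an assumption made up for illustration: the `Order` class, the `draft_order_from_chat` function, and the naive keyword match standing in for whatever model would actually parse the request. The only point is that chat produces a structured object once, and every later adjustment is a direct operation on that object rather than another sentence.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """The artifact: structured data that buttons and number boxes can edit."""
    items: dict[str, int] = field(default_factory=dict)

    def set_quantity(self, item: str, qty: int) -> None:
        # Direct manipulation: what a "+" / "-" button would call.
        if qty <= 0:
            self.items.pop(item, None)
        else:
            self.items[item] = qty


def draft_order_from_chat(utterance: str, menu: set[str]) -> Order:
    """Hypothetical stand-in for a language model: a naive keyword match."""
    order = Order()
    words = utterance.lower().replace(",", " ").split()
    for item in menu:
        if item in words:
            order.set_quantity(item, 1)
    return order


# Chat gets the first draft...
order = draft_order_from_chat("a burger and fries, please", {"burger", "fries", "cola"})
# ...and the final adjustments are plain operations on the artifact.
order.set_quantity("fries", 2)
order.set_quantity("cola", 1)
```

The design choice being illustrated is that the conversation ends as soon as the `Order` exists; nothing after that point needs prose.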

replies(1): >>42940757 #
15. taeric ◴[] No.42940757{4}[source]
Oh, apologies, I meant my post to be a highlight of how I agree with you! Your post is great!
16. taeric ◴[] No.42940793{3}[source]
Agreed.

I'm actually in an awkward position where I was very supportive of RTO two years ago, but have since become very reliant on some things I could not do with a rigid RTO policy.

Regardless of RTO or WFH, patience and persistence remain vital qualities.

17. hakfoo ◴[] No.42984355[source]
The idea of chat interfaces always seemed to be to disguise available functionality.

It's a CLI without the integrity. When you bought a 386, it came with a big book that said "MS-DOS 4.01" and enumerated the 75 commands you can type at the C:\> prompt and actually make something useful happen.

When you argue with ChatGPT, its whole business is to not tell you what those 75 commands are. Maybe your prompt fits its core competency and you'll get exactly what you wanted. Maybe it's hammering what you said into a shape it can parse and producing marginal garbage. Maybe it's going to hallucinate from nothing. But it's going to hide that behind a bunch of cute language and hopefully you'll just keep pulling the gacha and blaming yourself if it's not right.
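
A small Python sketch of the contrast, under heavy assumptions: the command names and the `run` dispatcher below are invented for illustration and are not how MS-DOS or any real shell is implemented. It shows the property being described, where the full set of capabilities lives in one table, `help` can enumerate it, and an unrecognized input fails loudly with the list of what would have worked.

```python
# Hypothetical command table: every capability is registered by name.
COMMANDS = {}


def command(name, summary):
    """Register a function as a named command with a one-line summary."""
    def register(fn):
        COMMANDS[name] = (summary, fn)
        return fn
    return register


@command("ver", "show the version string")
def ver(args):
    return "sketch-shell 0.1"


@command("echo", "print the arguments back")
def echo(args):
    return " ".join(args)


@command("help", "list every available command")
def help_text(args=()):
    return "\n".join(
        f"{name:<6}{summary}" for name, (summary, _) in sorted(COMMANDS.items())
    )


def run(line):
    """Dispatch one input line; unknown input is rejected, never guessed at."""
    parts = line.split()
    if not parts:
        return help_text()
    name, args = parts[0], parts[1:]
    if name not in COMMANDS:
        return f"Unknown command {name!r}. Try one of: {', '.join(sorted(COMMANDS))}"
    _, fn = COMMANDS[name]
    return fn(args)
```

Calling `run("frobnicate")` returns an error that names the commands that do exist, which is the opposite of quietly hammering the input into some shape and answering anyway.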