
858 points | cryptophreak | 3 comments
taeric
I'm coming around to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for actually doing anything.

In large part, I assert this is because the best way to do something is to do that thing directly. There can be correspondence around the thing, but the artifacts you are building are separate things.

You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but since the narrative is separate from the artifact itself, it is not necessarily good at producing one.

tpmoney
I disagree. Chat is a fantastic UI for getting an AI to generate something vague. Specifically I’m thinking of AI image generation. A chat UI is a great interface for iterating on an image and dialing it in over a series of iterations. The key here is that the AI model needs to keep context of both the image generation history and the chat history.

I think this applies to any “fuzzy generation” scenario. It certainly shouldn’t be the only tool, and (at least as it stands today) isn’t good enough to finalize and fine-tune the final result, but a series of “a foo with a bar”, “slightly less orange”, “make the bar a bit more like a fizzbuzz” interactions with a good chat UI can really get you to a good 80% solution.
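To make that concrete, here's a rough TypeScript sketch of what "keeping both contexts" might look like. generateImage is a made-up stand-in for whatever model API you use; the point is just that each refinement resends the full history rather than a fresh prompt:

    // Hypothetical session state for iterative image generation.
    type Turn = { role: "user" | "model"; text: string };

    interface Session {
      turns: Turn[];    // chat history
      images: string[]; // image generation history (ids, URLs, ...)
    }

    // Stub: a real implementation would call an image model with the
    // full chat history and the prior images as context.
    async function generateImage(turns: Turn[], images: string[]): Promise<string> {
      return `image-${images.length + 1}`;
    }

    async function refine(session: Session, request: string): Promise<string> {
      session.turns.push({ role: "user", text: request });
      const image = await generateImage(session.turns, session.images);
      session.images.push(image);
      session.turns.push({ role: "model", text: `generated ${image}` });
      return image;
    }

    // const s: Session = { turns: [], images: [] };
    // await refine(s, "a foo with a bar");
    // await refine(s, "slightly less orange");
    // await refine(s, "make the bar a bit more like a fizzbuzz");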

But like all new toys, AI and AI chat will be hammered into a few thousand places where they make no sense until the hype dies down and we come up with rules and guidelines for where they do and don’t work.

1. badsectoracula
> Specifically I’m thinking of AI image generation

I heavily disagree here: chat - or really, text - is a horrible UI for image generation, unless you have almost zero idea of what you want to achieve and don't really care about the final result.

Typing "make the bar a bit more like a fizzbuzz" in some textbox is awful UX compared to, say, clicking on the "bar" and selecting "fizzbuzz" or drag-and-dropping "fizzbuzz" on the "bar" or really anything that takes advantage of the fact we're interacting with a graphical environment to do work on graphics.

In fact, it is a horrible UI for anything except, perhaps, chatbots and tasks that have to do with text, like grammar correction, altering writing styles, etc.

It is helpful for impressing people (especially people with money), though.

2. tpmoney
> Typing "make the bar a bit more like a fizzbuzz" in some textbox is awful UX compared to, say, clicking on the "bar" and selecting "fizzbuzz" or drag-and-dropping "fizzbuzz" on the "bar" or really anything that takes advantage of the fact we're interacting with a graphical environment to do work on graphics.

That assumes a UX capable of determining what you're clicking on in the generated image (which we could call a given if we assume a sufficiently capable AI model, since we're already instructing it to alter the thing), and also that it can determine from your click whether you intended the "foo", the "greeble" that is on the foo, the shadow covering that part of the foo, or anything else in the same Z stack as your intended target. Pixel-hunting adventure games come to mind as an example of how badly this can go for us. And yes, this is solvable: Squeak has a UI where repeatedly clicking in the same spot iterates through the list of possibilities in that Z stack. But it could also get really messy really quickly.
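For what it's worth, here's a rough TypeScript sketch of that Squeak-style cycling, assuming the generator can report labeled regions with bounds and a Z order (itself a big assumption):

    // Repeated clicks at the same spot cycle down the Z stack:
    // foo -> greeble on the foo -> shadow -> back to foo.
    interface Hit { id: string; label: string; z: number;
                    x: number; y: number; w: number; h: number }

    function hitsAt(px: number, py: number, scene: Hit[]): Hit[] {
      return scene
        .filter(h => px >= h.x && px < h.x + h.w &&
                     py >= h.y && py < h.y + h.h)
        .sort((a, b) => b.z - a.z); // front-most first
    }

    let lastSpot = "";
    let cycle = 0;

    function pick(px: number, py: number, scene: Hit[]): Hit | undefined {
      const spot = `${px},${py}`;
      cycle = spot === lastSpot ? cycle + 1 : 0; // same spot? go one deeper
      lastSpot = spot;
      const hits = hitsAt(px, py, scene);
      return hits[cycle % hits.length];
    }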

Then we have to assume that your UX will be able to generate an entire list of possible things you might want to do with the thing you've clicked, including adding to it, removing it, removing part of it, moving it, transforming its dimensions, altering its colors, altering the material surface, and on and on and on. And that list of possibilities needs to be navigable and searchable in a way that's faster than just typing "make the bar more like a fizzbuzz" into a context-aware chat box.
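Even granting that list, you end up building something like a searchable command palette, which is roughly this (a sketch, not a real API):

    // The palette has to be filterable just to compete with typing a sentence.
    const actions = [
      "add to", "remove", "remove part", "move", "resize",
      "recolor", "change material", "make more like...",
    ];

    function searchActions(query: string): string[] {
      const q = query.toLowerCase();
      return actions.filter(a => a.toLowerCase().includes(q));
    }

    // searchActions("mat") -> ["change material"]
    // The open question: is navigating this ever faster than typing
    // "make the bar more like a fizzbuzz" into a context-aware box?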

Again, I'm not arguing that the chat interface should be the only interface. In fact, as you point out, we're using a graphical system, and it would be great if you could click on things or select them and have the system work on them too. It should be able to take input beyond just chat. But I still think that for iterating on a fuzzy idea, a chat UI is a useful tool.

3. badsectoracula
Yes, you'd need something more involved than a textbox, but that is (or can be) part of AI research too: computer vision is a very old topic in AI research, after all.

As for what the exact UI for multiple objects would look like, this is something professional tools - like 3D engines and editors - already handle. A common solution (aside from the repeated clicking you mentioned) is a modifier key, like Alt+click, that shows a list of the items under the mouse. An AI system could also show alternative interpretations of what was clicked.
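A minimal sketch of that Alt+click menu (TypeScript; the types are invented, and "alternative interpretations" is just extra entries appended by the model):

    // Alt+click: list every candidate under the cursor instead of cycling.
    interface Item { id: string; label: string; z: number }

    function menuFor(underCursor: Item[], interpretations: string[] = []): string[] {
      const labels = [...underCursor]
        .sort((a, b) => b.z - a.z) // front-most first
        .map(i => i.label);
      return labels.concat(interpretations); // model's alternative readings
    }

    // menuFor(itemsUnderMouse, ["the whole bar", "just its outline"])
    //   -> ["shadow", "greeble", "foo", "the whole bar", "just its outline"]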

> Again, I'm not arguing that the chat interface should be the only interface.

The main issue here is that current AIs are trained with chat-like interactivity at their core, with everything else being a "wrapper" around the chat parts, when it should be the other way around.