121 points tylerg | 9 comments
kenjackson ◴[] No.43660091[source]
This is actually no different than for humans once you get past the familiar. It's like the famous project management tree story: https://pmac-agpc.ca/project-management-tree-swing-story

If anything, LLMs have surprised with how much better they are than humans at understanding instructions for text-based activities. But they are MUCH worse than humans when it comes to creating images/videos.

replies(2): >>43662572 #>>43662984 #
barotalomey ◴[] No.43662984[source]
> If anything, LLMs have surprised with how much better they are than humans at understanding instructions for text-based activities.

That's demonstrably false, as proven by both OpenAI's own research [1] and endless independent studies by now.

What is fascinating is how some people cling to false ideas about what an LLM is and isn't.

It's a recurring fallacy that's bound to get its own name any time soon.

1: https://news.ycombinator.com/item?id=43155825

replies(2): >>43663692 #>>43663986 #
kenjackson ◴[] No.43663986[source]
You’re comparing an LLM to expert programmers. Compare an LLM on the same task versus the average college student. And try it for a math problem. A poetry problem. Ask it a more complex question about history or to do an analysis of an essay you wrote.

Put it this way — I’m going to give you a text based question to solve and you have a choice to get another human to solve it (randomly selected from adults in the US) or ChatGPT, and both will be given 30 minutes to read and solve the problem — which would you choose?

replies(1): >>43664125 #
1. aleph_minus_one ◴[] No.43664125[source]
> Put it this way — I’m going to give you a text based question to solve and you have a choice to get another human to solve it (randomly selected from adults in the US) or ChatGPT, and both will be given 30 minutes to read and solve the problem — which would you choose?

You wouldn't randomly select an arbitrary adult from the USA to perform brain surgery on you, so this argument is rabulistic.

replies(2): >>43664319 #>>43666211 #
2. kenjackson ◴[] No.43664319[source]
Brain surgery requires a license.

But I do expect an arbitrary adult to be able to follow instructions.

OK. How about you give me a text-based task where you would pick the random adult over the LLM?

replies(2): >>43664551 #>>43673244 #
3. aleph_minus_one ◴[] No.43664551[source]
> Brain surgery requires a license.

This is rather a red-tape problem. :-)

4. daveguy ◴[] No.43666211[source]
I would choose a random person from my company who was hired to work in that domain to solve problems in that domain. Yes, regardless of the position. Accountant in the domain, yes. Office organizer in the domain, yes. Essentially anyone in the domain, yes. No offense, but by restricting the selection to the general human population you're setting a low bar for LLMs here.
replies(1): >>43667233 #
5. kenjackson ◴[] No.43667233[source]
If the bar is for LLMs to replace domain experts about four years after introduction then yes, they are failing miserably.

But if you were to go back to 2020 and ask whether you’d take a random human over the state-of-the-art AI to answer a text question, you’d take the random human every time except for arithmetic (and you’d have to write it in math notation and not plain English).

And if you were to ask AI experts when you would choose an AI, they’d say not for at least a decade or two, if ever.

replies(1): >>43674003 #
6. nyclounge ◴[] No.43673244[source]
I think you and the parent may be talking about 2 different things.

Do I want to use an LLM to do it from a business owner's perspective? Yeah, probably; it is cheaper and more convenient. Which one I want to use depends on the problem we are solving here, right?

I'm more concerned about the integrity of the current digital infrastructure. In that sense I would NOT trust ANYTHING really important to anything digital, much less to an LLM. Can I use it for exploration and then require an actual human expert's approval/edit? Absolutely!

As long as the digital doesn't result in significant physical or financial damage.

Edit: and for HN ppl, of course the LLM will have to be open weight and all, and run locally on an air-gapped GPU, preferably in a Faraday cage.

7. daveguy ◴[] No.43674003{3}[source]
I wasn't talking about how impressive AI systems are, or how far they've come. I was talking about the fact that any random human with any experience in a specific field -- even though they are not a domain expert -- is going to do better than an LLM. Or, human common sense >>>> what LLMs are doing.
replies(1): >>43675229 #
8. kenjackson ◴[] No.43675229{4}[source]
We will have to agree to disagree about your fundamental point.
replies(1): >>43677015 #
9. daveguy ◴[] No.43677015{5}[source]
Fair enough. We will see.