Figure 03, our 3rd generation humanoid robot

(www.figure.ai)

HAL3000 ◴[09 Oct 25 14:59 UTC] No.45528648[source]▶

>>45527402 (OP) #

All of the examples in videos are cherry picked. Go ask anyone working on humanoid robots today, almost everything you see here, if repeated 10 times, will enter failure mode because the happy path is so narrow. There should really be benchmarks where you invite robots from different companies, ask them beforehand about their capabilities, and then create an environment that is within those capabilities but was not used in the training data, and you will see the real failure rate. These things are not ready for anything besides tech demos currently. Most of the training is done in simulations that approximate physics, and the rest is done manually by humans using joysticks (almost everything they do with hands). Failure rates are staggering.
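A rough way to put numbers on the "repeated 10 times" point, assuming every attempt succeeds independently with the same probability `p` (an assumption made here for illustration, not a measured figure; the function names are hypothetical): even a task that works 90% of the time will usually fail at least once in ten runs.

```python
def prob_all_succeed(p: float, n: int) -> float:
    """Probability that n independent attempts all succeed, each with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return p ** n


def prob_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts fails."""
    return 1.0 - prob_all_succeed(p, n)


if __name__ == "__main__":
    # Illustrative per-attempt success rates only; none of these are measured.
    for p in (0.5, 0.9, 0.99):
        print(f"p={p}: at least one failure in 10 runs = {prob_any_failure(p, 10):.3f}")
```

Real attempts are unlikely to be independent (the same lighting or object placement can break every run), so this is only a first-order picture of why a single clean video says little about reliability.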

replies(17): >>45529270 #>>45529335 #>>45529542 #>>45529760 #>>45529839 #>>45529903 #>>45529962 #>>45530530 #>>45531634 #>>45532178 #>>45532431 #>>45532651 #>>45533534 #>>45533814 #>>45534991 #>>45539498 #>>45542410 #

1. ipnon ◴[09 Oct 25 15:41 UTC] No.45529270[source]▶

>>45528648 #

Now the question is if this is GPT-2 and we’re a decade away from autonomous androids given some scaling and tweaks, or if autonomous androids is just an extremely hard problem.

replies(4): >>45529367 #>>45529610 #>>45529686 #>>45532243 #

2. lossolo ◴[09 Oct 25 15:48 UTC] No.45529367[source]▶

>>45529270 (TP) #

https://www.figure.ai/company

"Building Figure won’t be an easy win; it will require decades of commitment and ingenuity."

"Our focus is on what we can achieve 5, 10, 20+ years from now, not the near-term wins."

At least it's not Musk's forever "next year".

replies(2): >>45529512 #>>45529591 #

3. phatskat ◴[09 Oct 25 15:59 UTC] No.45529512[source]▶

>>45529367 #

Musk really missed an opportunity to promise wrecking the govt “next year” - we all would’ve rolled our eyes a la “fully autonomous driving next year” and been eating our hats by now

4. MountDoom ◴[09 Oct 25 16:04 UTC] No.45529591[source]▶

>>45529367 #

> At least it's not Musk's forever "next year".

The problem with the principled approach to high-uncertainty projects is that if you slowly execute on a sequential multi-year plan, you will almost certainly find out in year 9 that multiple of the late-stage tasks are much harder than you thought.

You just don't know ahead of time. Just look at how many corporations and research labs had decades-long strategies to build human-like AI that went nowhere. And then some guys came up with a novel architecture and all of a sudden, you can ask your computer to write an essay about penguins.

Musk's approach is that if you have an infinite supply of fresh grads who really believe in you and are willing to work crazy hours, giving them a "next year" deadline is more likely to give you what you want than telling them "here's your slow-paced project you're gonna be working on for the next decade". And I guess he thinks to himself that some of them are going to burn out, but it's a sacrifice he's willing to make.

replies(2): >>45530203 #>>45538018 #

5. kibwen ◴[09 Oct 25 16:05 UTC] No.45529610[source]▶

>>45529270 (TP) #

For LLMs, the input is text, and the output is text. By the time of GPT-2, the internet contained enough training data to make training an interesting LLM feasible (as judged by its ability to output convincing text).

We are nowhere near the same for autonomous robots, and it's not even funny. To continue to use the internet as an analogy for LLMs, we are pre-ARPANET, pre-ASCII, pre-transistor. We don't even have the sensors that would make safe household humanoid robots possible. Any theater from robot companies about trying to train a neural net based on motion capture is laughably foolish. At the current rate of progress, we are more than decades away.

replies(5): >>45530490 #>>45530660 #>>45532119 #>>45533182 #>>45536662 #

6. jcims ◴[09 Oct 25 16:10 UTC] No.45529686[source]▶

>>45529270 (TP) #

I don't know if I caught your comment in my peripheral vision or what but GPT-2 is exactly where I conceptually placed this.

Neural networks for motion control are very clearly resulting in some incredible capability in a relatively short amount of time vs. the more traditional control hierarchies used in something like Boston Dynamics. Look at Unitree's G1:

https://www.youtube.com/shorts/mP3Exb1YC8o

https://www.youtube.com/watch?v=bPSLMX_V38E

It's like an agile idiot, very physically capable but no purpose.

The next domain is going to be incorporating goals and intent and short/long term chains of causality into the model, and for that it seems we're presently missing quite a bit of usable training data. That will clearly evolve over time, as will the fidelity of simulations that can be used to train the model and the learned experience of deployed robots.

replies(1): >>45531038 #

7. Judgmentality ◴[09 Oct 25 16:52 UTC] No.45530203{3}[source]▶

>>45529591 #

> Musk's approach is that if you have an infinite supply of fresh grads who really believe in you and are willing to work crazy hours, giving them a "next year" deadline is more likely to give you what you want than telling them "here's your slow-paced project you're gonna be working on for the next decade". And I guess he thinks to himself that some of them are going to burn out, but it's a sacrifice he's willing to make.

This feels incredibly generous. I'm pretty sure his approach is that he needs to keep the hype cycle going for as long as possible. I also believe it's partially his willingness to believe his own bullshit.

replies(2): >>45530687 #>>45533197 #

8. tyre ◴[09 Oct 25 17:16 UTC] No.45530490[source]▶

>>45529610 #

I would guess Amazon has a ridiculous amount of access to training data in its warehouses. Video, package sizes, weights, sorting.

I’m sure they could pretty easily spin up a site with 200 of these processing packages of most sizes (they have a limited number of standardized package sizes) nonstop. Remove ones that it gets right 99.99% of the time and keep training on the more difficult ones, then move to individual items.

Caveat: I have no idea what I’m talking about.

replies(1): >>45533505 #

9. blackoil ◴[09 Oct 25 17:29 UTC] No.45530660[source]▶

>>45529610 #

McD must be selling millions of burgers every day and cameras are cheap and omnipresent, so it should not be difficult to get videos for a single type of task.

replies(1): >>45533795 #

10. blackoil ◴[09 Oct 25 17:31 UTC] No.45530687{4}[source]▶

>>45530203 #

Most likely you are right. Best way to peddle a lie is to believe it.

11. robots0only ◴[09 Oct 25 18:04 UTC] No.45531038[source]▶

>>45529686 #

Locomotion and manipulation are pretty different. The former we know how to do well -- this is what you see in unitree videos. Manipulation still not so much. This is not at all like GPT-2 because we still don't know what to scale (and even the data to scale is not there).

12. bmau5 ◴[09 Oct 25 19:37 UTC] No.45532119[source]▶

>>45529610 #

Does your estimate account for advancements in virtual simulation models that have simultaneously been happening? From people I speak to in the space (which I am very much not in) - they had mentioned these advancements have dramatically improved the rate of training and learning - though they also advised we're some ways off from showtime.

replies(1): >>45533810 #

13. hadlock ◴[09 Oct 25 19:50 UTC] No.45532243[source]▶

>>45529270 (TP) #

This is where I'm at. If you look at Boston Dynamics' first videos, they're 45 second clips of 4 legged robots walking in not even a straight line, just proving they could walk 5 feet over level ground without falling over. The top comment, from 4 years ago is "This was 11 years ago. Now these things are dancing." https://www.youtube.com/watch?v=3gi6Ohnp9x8

If you can make it look believable on camera for 15 seconds under controlled studio conditions... it's probable you can do it autonomously in 10-15 years. I don't think anyone is going to be casually buying these for their house by this time next year, but it certainly demonstrates what is realistically possible.

If they can provably make these things safe, it will have huge implications for in home care in advanced age, where instead of living in an assisted living home at $huge expense for 20+ years, you might be able to live on your own for most of that time.

I am cautiously optimistic.

replies(1): >>45533856 #

14. ACCount37 ◴[09 Oct 25 21:15 UTC] No.45533182[source]▶

>>45529610 #

Robotics has a big training data problem. But your "we don't have the sensors" claim is absolutely laughable.

It was never about the sensors. It was always about AI.

replies(1): >>45533786 #

15. rishabhaiover ◴[09 Oct 25 21:17 UTC] No.45533197{4}[source]▶

>>45530203 #

I’d rather put my faith in his grand illusions than in your sanctimonious high priest pose.

replies(2): >>45533214 #>>45539903 #

16. higginsniggins ◴[09 Oct 25 21:19 UTC] No.45533214{5}[source]▶

>>45533197 #

who do you think spins "grand illusions" if not your "high priest" who doesn't even know you exist?

17. eulgro ◴[09 Oct 25 21:53 UTC] No.45533505{3}[source]▶

>>45530490 #

A more efficient way might be to train them in simulation. If you simulate a warehouse environment and use that to pre-train a million robots in parallel at 100x real time learning would go much faster. Then you can fine tune on reality for details missed by the simulation environment.

18. kibwen ◴[09 Oct 25 22:41 UTC] No.45533786{3}[source]▶

>>45533182 #

No, it doesn't matter if you have a hypergenius superintelligence if it's locked in a body with no hardware support for useful proprioception. You will not go to space today.

replies(2): >>45534060 #>>45536197 #

19. kibwen ◴[09 Oct 25 22:43 UTC] No.45533795{3}[source]▶

>>45530660 #

There is no reason to employ humanoid robots in industrial environments when it will always be easier and cheaper to adapt the environment to a specialized non-humanoid robot than to adapt robots into humanoid shape. This is true for the same reason that no LLM is ever going to beat Stockfish at chess.

20. kibwen ◴[09 Oct 25 22:44 UTC] No.45533810{3}[source]▶

>>45532119 #

As Tesla could tell you with their failure to deliver self-driving cars, it doesn't matter if you have exabytes of training data if it's all the wrong kind of data and if your hardware platform is insufficiently capable.

21. gonzobonzo ◴[09 Oct 25 22:52 UTC] No.45533856[source]▶

>>45532243 #

The robot (BigDog) in that video shows numerous capabilities that Spot still can't do (climbing over terrain like that, being able to respond to a kick like that, the part on the ice, etc.). Even 16 years later.

This only highlights the fact that making a cool prototype do a few cool things on video is far, far easier than making a commercial product that can consistently do these things reliably. It often takes decades to move from the former to the latter. And Figure hasn't even shown us particularly impressive things from its prototypes yet.

replies(1): >>45536233 #

22. ACCount37 ◴[09 Oct 25 23:24 UTC] No.45534060{4}[source]▶

>>45533786 #

Lmao no. Every motor is a sensor. And the better my world model is, the fewer sensors I need to keep it up.

23. serf ◴[10 Oct 25 07:25 UTC] No.45536197{4}[source]▶

>>45533786 #

A 'hypergenius superintelligence' could achieve most, if not all useful proprioception simply by looking at motor amperage draw, or if that's unavailable then total system amperage draw.

An arm moving against gravity has a higher draw, the arc itself creates characteristics, a motion or force against the arm or fingers generates a change in draw -- a superintelligence would need only an ammeter to master proprioception, because human researchers can do this in a lab and they're nowhere near the bar of 'hypergenius superintelligence'.
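A minimal sketch of the current-sensing idea for a single joint, under loud assumptions: the arm is held still (no acceleration), friction is ignored, motor torque is proportional to current through a torque constant `kt`, and the joint angle is already known. That last assumption matters, because knowing the angle is itself a form of sensing that the ammeter alone does not provide. All names and numbers are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def gravity_torque(mass_kg: float, com_m: float, angle_rad: float) -> float:
    """Torque needed to hold one rigid link still; angle is measured from horizontal."""
    return mass_kg * G * com_m * math.cos(angle_rad)


def external_torque(current_a: float, kt_nm_per_a: float,
                    mass_kg: float, com_m: float, angle_rad: float) -> float:
    """Torque the motor produces beyond what the gravity model explains."""
    motor_torque = kt_nm_per_a * current_a  # assumes torque = kt * current
    return motor_torque - gravity_torque(mass_kg, com_m, angle_rad)


def in_contact(current_a: float, kt_nm_per_a: float, mass_kg: float,
               com_m: float, angle_rad: float, threshold_nm: float = 0.2) -> bool:
    """Flag contact when the unexplained torque exceeds a noise threshold."""
    residual = external_torque(current_a, kt_nm_per_a, mass_kg, com_m, angle_rad)
    return abs(residual) > threshold_nm


if __name__ == "__main__":
    # Hypothetical 1 kg link, centre of mass 0.25 m from the joint, held horizontal.
    free_current = gravity_torque(1.0, 0.25, 0.0) / 0.5
    print(in_contact(free_current, 0.5, 1.0, 0.25, 0.0))  # holding its own weight
    print(in_contact(6.0, 0.5, 1.0, 0.25, 0.0))           # extra draw: something pushes back
```

A real arm would need friction, inertia, and multi-joint coupling in the model, which is where the residual gets noisy and the disagreement in this subthread really lives.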

24. serf ◴[10 Oct 25 07:31 UTC] No.45536233{3}[source]▶

>>45533856 #

It's an unfair comparison. Yes, they're both 4 legged 'dogs', but they use radically different design criteria -- design criteria that the BigDog was used to refine.

I'm not surprised that a Honda Civic can't navigate the Dakar Rally route.

25. fragmede ◴[10 Oct 25 08:56 UTC] No.45536662[source]▶

>>45529610 #

Time will tell if that's true. We don't have the same corpus of data, that's true, but what we do have is the ability to make a digital twin, where the robot practices in a virtual world, what would happen. It can do 10,000 jumping jacks every hour, parallelized across a whole GPU supercomputer, and that data can be fed in as training data.

26. 542354234235 ◴[10 Oct 25 12:09 UTC] No.45538018{3}[source]▶

>>45529591 #

>Just look at how many corporations and research labs had decades-long strategies to build human-like AI that went nowhere.

They didn't go nowhere; they just didn't result in human-like AI. They gave us lots of breakthroughs, useful basic knowledge, and knowledge infrastructure that could be built off for related and unrelated projects. Plenty of shoot for the moon corporations didn't result in human-like AI either, but also probably did go nowhere, since they were focused on an all or nothing strategy. The ones that do succeed in a moonshot relied on those breakthroughs from decades-long research.

I'm not going to get into what Musk has been doing because I'm just not.
