Is it supposed to be taking packages and placing them label face down?
I can't understand how a robot doing this is cheaper than a second scanner positioned so the label can be read face up or face down. You could even do that with a mirror.
But I'm not convinced it is even doing that. Several packages arrive already label side down and it just moves them along. Do those packages even have labels? The learned behavior is clearly "label not visible on top", not "label side down". There's no way that is the intended behavior.
If the barcode is the issue, why not switch to a QR code or some other format? A shipping label doesn't need to encode much information, so the QR code can carry plenty of error-correction redundancy, making it readable from many angles and even when significantly damaged (rough sketch below).
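For what it's worth, here's a rough sketch of what I mean using the Python "qrcode" package (just one example library; the tracking number is made up). Its highest error-correction level lets roughly 30% of the symbol be damaged and still decode:

    # Sketch: generate a shipping-style QR code at the highest error-correction
    # level (H), which tolerates roughly 30% symbol damage.
    # Uses the third-party "qrcode" package (pip install qrcode[pil]).
    import qrcode
    from qrcode.constants import ERROR_CORRECT_H

    qr = qrcode.QRCode(
        version=None,                      # let the library pick the smallest size that fits
        error_correction=ERROR_CORRECT_H,  # maximum redundancy
        box_size=10,
        border=4,
    )
    qr.add_data("1Z999AA10123456784")      # made-up tracking number
    qr.make(fit=True)
    qr.make_image().save("label.png")

You'd pay for that redundancy with a slightly larger symbol, but for the handful of characters on a shipping label that seems like a trivial cost.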
The video description also says "approaching human-level dexterity and speed". No way. I'd wager I could do this task at least 10x its speed, if not 20x, and I'd do it better. I watched a few minutes at 2x speed and man is it slow. Sure, this thing might be able to run 24/7 without breaks, but if I'm running 10-20x faster, what does that matter? At 10x, about 2.4 hours of my time matches its 24-hour output, so I could come in a few hours a day and blow through its quota. I'd really like to see an actual human worker for comparison.
But if we did want something to do this very narrow task 24/7, I'm pretty sure there are a hundred cheaper ways to do it. If there aren't, it's because some edge cases matter a lot, and without knowing what those are we can't properly evaluate this video. Besides, this looks like a pretty simple, idealized case. I'm not sure what an actual Amazon sorting process looks like, but I suspect not like this.
Regardless, the results look cool and I'm impressed with Figure, even if this is an oversimplified case.