103 points by jashmota | 27 comments

Hey HN, we're Jash and Mahimana, cofounders of Flywheel AI (https://useflywheel.ai). We're building a remote teleoperation and autonomy stack for excavators.

Here's a video: https://www.youtube.com/watch?v=zCNmNm3lQGk.

Interfacing with existing excavators to enable remote teleop (or autonomy) is hard. Unlike cars, which use drive-by-wire technology, most of the millions of excavators in the field are fully hydraulic machines. The joysticks are connected to a pilot hydraulic circuit, which proportionally moves the cylinders in the main hydraulic circuit, which in turn moves the excavator's joints. This means most excavators have no electronic component for controlling the joints. We solve this by mechanically actuating the joysticks and pedals inside the excavator.
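For intuition, that stick-to-spool chain can be sketched as two proportional maps. This is illustrative only: the pressures, thresholds, and function names below are made-up numbers, not values from any particular machine or from our control code.

```python
# Illustrative sketch of the pilot-hydraulic signal chain described above.
# All constants are invented for illustration; real machines vary by model.

def joystick_to_pilot_pressure(deflection: float, max_pilot_bar: float = 35.0) -> float:
    """Joystick deflection in [-1, 1] -> pilot pressure (bar), proportional."""
    deflection = max(-1.0, min(1.0, deflection))
    return abs(deflection) * max_pilot_bar

def pilot_to_spool_travel(pilot_bar: float, crack_bar: float = 5.0,
                          full_bar: float = 30.0) -> float:
    """Pilot pressure -> main-valve spool travel fraction in [0, 1].

    Below the 'crack' pressure the spool doesn't move at all; above the
    'full' pressure it saturates. In between, travel is proportional.
    """
    if pilot_bar <= crack_bar:
        return 0.0
    return min(1.0, (pilot_bar - crack_bar) / (full_bar - crack_bar))
```

The dead band below the crack pressure is why small stick inputs produce no motion at all, which is part of what makes smooth operation a learned skill.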

We do this with retrofits that work on any excavator make/model, letting us augment existing machines. By enabling remote teleoperation, we improve site safety, productivity, and cost efficiency.

Teleoperation by the operators also lets us prepare training data for autonomy. In robotics, training data comprises observations and actions. While images and videos are abundant on the internet, egocentric (PoV) observation-and-action data is extremely scarce, and this scarcity is what holds back the scaling of robot learning policies.
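To make "observation and action" concrete, a single logged sample can be pictured like this. This is a sketch only; the field names and axes are illustrative, not an actual logging schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    timestamp: float      # seconds since episode start
    images: List[bytes]   # one encoded frame per camera (egocentric views)

@dataclass
class Action:
    timestamp: float
    # Normalized joystick/pedal axes in [-1, 1]
    boom: float
    arm: float
    bucket: float
    swing: float

@dataclass
class Sample:
    observation: Observation
    action: Action
```

A learning policy is then trained to predict the `Action` given the `Observation`, which is why both streams must be captured together and time-aligned.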

Flywheel solves this by collecting training data from the remote teleop-enabled excavators we have already deployed, and we do it with a very minimal hardware setup.

During our time in YC, we went through 25-30 iterations of sensor stacks, placement permutations/combinations, and model hyperparameter variations. We called this the “evolution of the physical form of our retrofit”. Eventually we landed on our current form and have been able to train some levels of autonomy with only a few hours of training data.

The big takeaway was how much more important data is than optimizing the model's hyperparameters. So today, we're open-sourcing a 100-hour excavator dataset that we collected using Flywheel systems on real construction sites, in partnership with Frodobots.ai.

Dataset: https://huggingface.co/datasets/FlywheelAI/excavator-dataset

Machine/retrofit details:

  Volvo EC380 (38-ton excavator)
  4x cameras (25 fps)
  25 Hz expert operator action data
The dataset contains observation data from the 4 cameras and the operator's expert action data, which can be used to train imitation learning models to run an excavator autonomously for the workflows in those demonstrations, like digging and dumping. We were able to train a small autonomy model for bucket pick-and-place on a Kubota U17 from just 6-7 hours of data collected during YC.
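As a toy illustration of the imitation-learning (behavior cloning) setup, here is a sketch that fits a linear policy from stand-in image features to joystick actions. The linear model, the shapes, and the synthetic data are all assumptions for illustration — real policies use learned visual backbones, not least squares.

```python
import numpy as np

# Toy behavior cloning: fit a linear policy mapping "image features" to
# joystick actions by least squares. Shapes are made up for illustration.
rng = np.random.default_rng(0)
n_samples, feat_dim, act_dim = 500, 64, 4   # 4 axes: boom, arm, bucket, swing

features = rng.normal(size=(n_samples, feat_dim))   # stand-in for image embeddings
true_w = rng.normal(size=(feat_dim, act_dim))       # pretend "expert" mapping
actions = features @ true_w + 0.01 * rng.normal(size=(n_samples, act_dim))

# Supervised regression from observations to expert actions
w_hat, *_ = np.linalg.lstsq(features, actions, rcond=None)
mse = float(np.mean((features @ w_hat - actions) ** 2))
```

The point of the sketch is the data shape, not the model: every row pairs an observation with the expert's action, which is exactly what the released dataset provides.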

We’re just getting started. The dataset already has good variation in daylight, weather, and tasks, and we’ll be adding more hours of data and converting it to the LeRobot format soon. We’re doing this so people like you and me can try training models on real-world data, which is very, very hard to get.

So please check out the dataset, and feel free to download it and use it however you like. We would love for people to do things with it! I’ll be around in the thread and look forward to comments and feedback from the community!

1. seabrookmx ◴[] No.45364639[source]
> The joysticks are connected to a pilot hydraulic circuit, which proportionally moves the cylinders in the main hydraulic circuit which ultimately moves the excavator joints

I've actually spent a decent amount of time running an excavator, as my Dad owns a construction / road building company. It was a great summer job!

An important note about the pilot hydraulics is that they _provide feedback to the operator_. I would encourage any system that moves these controls on behalf of a remote human operator or AI to add strain gauges or some other way to measure this force feedback so that this data isn't lost.
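As a sketch of that suggestion, the measurement loop could be as simple as calibrating a strain-gauge ADC reading into stick force and mapping it to a haptic torque on the remote stick. Every constant, name, and scaling factor here is a hypothetical placeholder, not a description of any shipping system.

```python
# Sketch: capture the pilot-circuit resistance described above so it
# isn't lost in teleop. ADC offsets and calibration constants are
# hypothetical placeholders.

def counts_to_newtons(adc_counts: int, offset: int = 2048,
                      newtons_per_count: float = 0.05) -> float:
    """Convert a raw strain-gauge ADC reading into stick force (N)."""
    return (adc_counts - offset) * newtons_per_count

def feedback_torque(force_n: float, gain: float = 0.02,
                    max_torque_nm: float = 1.5) -> float:
    """Map measured stick force to a haptic motor torque, clamped to a
    safe maximum so a spike can't slam the remote operator's hand."""
    torque = force_n * gain
    return max(-max_torque_nm, min(max_torque_nm, torque))
```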

The handful of "drive by wire" pieces of equipment that my Dad or other skilled operators in my family have run were universally panned, because the operators are isolated from this feedback and have a harder time telling when the machine is struggling or when their inputs are not sufficiently smooth. In the automotive world, skilled drivers have similar complaints about fully electronic steering or braking systems, as opposed to traditional vacuum- or hydraulic-boosted approaches where your foot still has a direct hydraulic connection to the brake pads.

replies(4): >>45365403 #>>45365849 #>>45366303 #>>45366966 #
2. jeffbee ◴[] No.45365403[source]
My car with its drive-by-wire brakes has a brake feedback simulator that gives the driver the kind of feeling associated with power-boosted hydraulic brakes. This is by far the most expensive single component in the car. Arguably these are just expensive accommodations for human flaws. A self-driving car wouldn't need them. Can't the self-driving system act directly on data like pressure, flow, and displacement?
replies(3): >>45365470 #>>45366607 #>>45366959 #
3. 01HNNWZ0MV43FF ◴[] No.45365470[source]
Maybe it doesn't matter for a car because feeling the car's motion tells you most of what you need to know. A car is not meant to touch anything but the road, in normal conditions. I think steering is the only case where force feedback is very important for a car - In the winters up here, I can feel the steering go loose when I hit a patch of ice.

I imagine an excavator, meant to touch and dig through things, and lift things, benefits from force feedback for the same reason VR would.

Have you played those VR sword games? BeatSaber works great because you're cutting through abstract blobs that offer no resistance. But the medieval sword-slashing games feel weird because your sword can't impact your opponent.

I saw a video recently of a quadcopter lifting heavy objects. When it's overloaded, it can't maneuver because all its spare power is spent generating lift to maintain altitude. If the controls had force feedback, the copter's computer could tell you "I'm overloaded, I can't move" by putting maximum resistance on the sticks.

replies(1): >>45366682 #
4. opwieurposiu ◴[] No.45365849[source]
Yes, and you also get feedback from your butt as the machine tips and wobbles, particularly on smaller machines. Hearing the engine straining helps also. Often you cannot clearly see what you are digging, and this feedback lets you know if you are running into a rock or something.

One big advantage would be cameras mounted on the boom and rear view cameras, as many machines have obstructed views.

replies(1): >>45366310 #
5. jashmota ◴[] No.45366303[source]
You're right! This is exactly why we like to do mechanical actuation - we are able to achieve bilateral telepresence, which essentially gives the torque (haptic) feedback over the internet! So on small excavators, you can absolutely feel the resistance. We also stream the engine audio, which tells you how hard the hydraulic pump is working. Operators like our system for these reasons :)

I'd like to get a chance to talk to you and your Dad to get feedback. How do I reach you? My email is contact at useflywheel dot ai

replies(1): >>45372030 #
6. jashmota ◴[] No.45366310[source]
We're indeed streaming the audio and have haptic feedback. My hypothesis is that the seat vibration isn't as helpful as it seems: it's a suboptimal channel, and operators would be far more productive without it. We'll publish a paper when we have enough data on this. We're also adding more cameras, but streaming a lot of them at once is tricky.
replies(2): >>45366735 #>>45377591 #
7. jashmota ◴[] No.45366607[source]
That's indeed what we're trying to test to the extreme - how far we can go with just vision. We haven't done heavy excavation workflows yet, but we have some early success on excavation workflows with just vision input and joystick action output (even without joint angle feedback!). We're betting on really huge data with a compact observation input, and experimenting to see if that holds water. If not, we can always dial it back and add more sensors/feedback.
8. jashmota ◴[] No.45366682{3}[source]
Interestingly, we had some people try out VR teleop: https://x.com/Scobleizer/status/1970245161306464667

https://x.com/jash_mota/status/1969091992140304703

I think force feedback is key for small excavators, but not really for 25+ ton excavators - hence how easy it is for operators to accidentally kill someone with one.

9. Redster ◴[] No.45366735{3}[source]
Humbly, have you used excavators of varying sizes on uneven ground? I have and would suggest it's more important than you might think. But if you've operated them, you might know better than I.

Also, teleoperation is likely to produce lower-quality operation data than hooking up to locally operated excavators. Just a thought.

replies(1): >>45366765 #
10. jashmota ◴[] No.45366765{4}[source]
I might be less experienced than you - I've operated up to 38 tons on at most a 15-degree incline, and I wasn't moving the tracks much when I did. I'd like to hear what scenarios you've been in and how you'd describe your experience - maybe I could try those out to learn more!
11. cyberax ◴[] No.45366959[source]
There are no drive-by-wire brakes in the US or Europe for regular cars. Your car's actuator moves the piston that is mechanically linked to your pedal.

So even if the electric system fails completely, you can still actuate the brakes.

replies(3): >>45367128 #>>45369182 #>>45370707 #
12. roamerz ◴[] No.45366966[source]
I have a 20-ton Takeuchi and I don't recall any feedback in the controls at all. The feedback I use is from the seat and the sounds of the machine - well, besides visual, of course.

I cannot imagine this being useful to me unless the virtual operator's cab closely mimicked an actual machine. It would have to have audio from the machine and be on a platform that tilted relative to the real thing. It would also need 270 degrees of monitors with a virtual mirror to see behind. The front monitor, minimally, would also need to show vertically up and down.

I also imagine all of this would be more useful to seasoned operators who can do most things on excavators in their sleep (definitely not me lol)

replies(1): >>45368579 #
13. jashmota ◴[] No.45368579[source]
The way I think about this: we should not have multiple screens. The human field of vision is about 60 degrees for central vision and about 120 degrees binocular. The bucket of the excavator is far narrower than this, which means the actual task doesn't require wide vision.

So if we can have really good autonomous safety layers to ensure safe movements, and dynamically resize remote teleop windows, we can make the operator more efficient. So while we stream a 360-degree view, we get creative in how we show it.

That's on the vision side. We also stream engine audio, and do haptic feedback.

Takeuchis are interesting! Rare ones to have blades even in the bigger sizes - is that why you got one?

replies(2): >>45368860 #>>45369335 #
14. AlotOfReading ◴[] No.45368860{3}[source]
Just a suggestion from someone who's worked on industrial robots and autonomous vehicles, but I think you're underselling a lot of difficulties here.

Skilled humans have a tendency to fully engage all of their senses during a task. For example, human drivers use their entire field of vision at night even though headlights only illuminate tiny portions of their FoV. I've never operated an excavator, but I would be very surprised if skilled operators only used the portion of their vision immediately around the bucket and not the rest of it for situational awareness.

That said, UI design is a tradeoff. There's a paper that has a nice list of teleoperation design principles [0], which does talk about single windows as a positive. On the other hand, a common principle in older aviation HCI literature is the idea that nothing about the system UI should surprise the human. It's hard to maintain a good idea of the system state when you have it resizing windows automatically.

The hardest thing is going to be making really good autonomous safety layers. It's the most difficult part of building a fully autonomous system anyway. The main advantage of teleop is that you can [supposedly] sidestep having one.

[0] https://doi.org/10.1109/THMS.2014.2371048

replies(1): >>45368948 #
15. jashmota ◴[] No.45368948{4}[source]
I definitely agree with you - recreating the scene in teleop is challenging. For excavators, however, it can actually make things better. An excavator has huge blind spots on the right (due to the arm), to the back, and sometimes near the bucket. Hence the workers who are hired to stand around (banksmen, spotters, signalmen) and signal to the operator.

It's like driving a Ford F150 without backup camera. You'd add the backup camera upfront, and not display the back view at the back window.

It's definitely challenging and we're far from something that's perfect. We're iterating towards something that's better everyday.

replies(1): >>45369077 #
16. AlotOfReading ◴[] No.45369077{5}[source]
Yeah, it sounds like a fun challenge. Hope you have lots of success tackling it
17. seabrookmx ◴[] No.45369182{3}[source]
It depends on how you define brake by wire, but the one I'm referencing is the C8 Corvette and its "eBoost" system. This isn't a purely electronic system like throttle by wire is, but it does mean there is no longer a linear relationship between pedal pressure and brake pad pressure. And my point about isolating the driver from feedback still holds true.
18. roamerz ◴[] No.45369335{3}[source]
Well sure if you are just looking at where the bucket is digging but there is often a dump truck sitting on either your right or left flank waiting for what’s in your bucket (don’t forget the beep button lol). Having a monitor to either side duplicates what you are seeing out of your peripheral vision when operating the real thing. Would make transitioning from real to virtual much easier and imho safer.

Yes that is precisely why - makes for a much more versatile machine. TB180FR - it’s med-small, about 10 ton.

replies(1): >>45369362 #
19. jashmota ◴[] No.45369362{4}[source]
I think swinging (which is about 40% of the dig-and-dump workflow by time spent) should not be manual. That's the lowest level of autonomy, which requires roughly centering on the pit/truck - something we have already achieved. Hence the operator only has to look in front!

Those workflow numbers come from multiple observations at different sites, one of the examples is this: https://www.youtube.com/watch?v=orEsvu1CS64
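For intuition, "roughly centering" the swing can be sketched as a simple proportional controller on the swing angle. The gain and rate limit below are made-up numbers for illustration, not our actual controller.

```python
import math

# Toy sketch: rotate the house toward a target heading (pit or truck)
# with a proportional controller. Gains and limits are invented.

def swing_command(current_rad: float, target_rad: float,
                  kp: float = 1.2, max_rate: float = 0.5) -> float:
    """Return a swing-rate command (rad/s) toward the target heading.

    The error is wrapped to [-pi, pi] so the cab always takes the
    shortest rotation, and the command is clamped to the machine's
    maximum swing rate.
    """
    error = math.atan2(math.sin(target_rad - current_rad),
                       math.cos(target_rad - current_rad))
    rate = kp * error
    return max(-max_rate, min(max_rate, rate))
```

Handing control back to the operator smoothly as the error approaches zero is the hard part the comments below discuss.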

I'd love to talk to you, because it's rare to come across someone who has a Takeuchi - is there a way to connect? My email is contact at useflywheel dot ai

replies(2): >>45369428 #>>45371300 #
20. roamerz ◴[] No.45369428{5}[source]
>>I think swinging should not be manual.

I disagree and here's a couple of reasons why I say that:

1. What am I going to do with the time between releasing control and regaining it from the autonomous control?

2. In that break of workflow my first thought is it will cause a break in my concentration.

3. When I am swinging back from the truck to the trench, the bucket is naturally in my control. It seems that in autonomy mode the transition from autonomous to my control would be very unnatural and choppy. I suppose with time it would be okay, but man, it seems to violate the whole smooth-is-fast concept.

I'll shoot you an email.

21. kruador ◴[] No.45370707{3}[source]
Toyota's hybrids, at least, have valves in the hydraulic system. If everything is working, the driver's pedal is isolated from the physical pistons. Pressing the pedal instead moves a 'stroke simulator' (a cylinder with a spring in it), and the pressure is measured with a transducer. The Brake ECU tries to satisfy as much of the braking demand as possible through regenerative braking, applying the rear brakes to keep the car balanced and the front brakes if you brake too hard - that is, if you request more braking than can be regenerated or than the battery can absorb.

If there's a failure of the electrical supply to the brake ECU, or another fault condition occurs, various valves then revert to their normally-open or normally-closed positions to allow hydraulic pressure from the pedal through to the brake cylinders, and isolate the stroke simulator.

Because the engine isn't constantly running and providing a vacuum that can be used to assist with brake force, the system also includes a 'brake accumulator' and pump to boost the brake pressure.

Reference: https://pmmonline.co.uk/technical/blue-prints-insight-into-t...

I don't know for certain, but I would assume that other hybrids and EVs have similar systems to maximise regenerative braking.

22. mschuster91 ◴[] No.45371300{5}[source]
> I think swinging (which is about 40% of dig and dump workflow by time spent) should not be manual.

It's been over a decade since I last operated an excavator, so grains of salt as usual - but I'd say it should be manual, or at least semi-automated. You need to take care where you unload the bucket on a truck, to avoid its weight distribution being off-center, or to keep various kinds of soil separated on the bed (e.g. we'd load the front with topsoil and fill the rear with the gravel or whatever else was below).

replies(1): >>45380786 #
23. IgorPartola ◴[] No.45372030[source]
Not my industry at all.

I am curious if something like this is an opportunity for a whole new type of controls and feedback. Since the operator doesn’t have to be in the excavator physically they could take on any position: standing, sitting, lying down, etc. Instead of sending haptic feedback to the joystick it could be sent to a vibrating wrist band. You could hook up the equivalent of a Nintendo Power Glove to have the scoop operated by the operator simulating scooping action. Turning the excavator can be controlled by the operator turning their head and moving it around can be done by the operator walking on an infinite treadmill. Motor strain can be done via color of light or temperature rather than sound. You could have a VR helmet that can also show you a birds eye view from a companion drone, overlay blueprints, show power and water lines, measure depth, etc. I don’t know if it is possible but maybe you could even measure soil composition somehow to show large rocks, soft soil that is dangerous to drive over, inclination angles where the excavator is about to drive, etc.

I imagine skilled operators prefer familiar controls but perhaps there are interesting improvements unlocked by exploring alternatives. It might also fundamentally change how accessible it is for non-professionals to use these machines. I rented an excavator from Home Depot a few years ago to dig a foundation and the learning curve was not shallow. I wonder if a more “natural” interface would help keep people safer.

replies(1): >>45381020 #
24. fragmede ◴[] No.45377591{3}[source]
The problem is how do you quantify it and metricize it and get some numbers to compare? Because it's absolutely invaluable after you get to a level of mastery with a machine that you're sitting in that the seat moves in. You just feel it, you can't explain it in words, you're just one with the machine. Your butt really needs to feel that in order for that to really happen. Can you do it without it? Absolutely. Does it make it worse to not have that? Also true. It's extra effort, it's extra cost, and it makes things better for the actual users of the product, but it'll never show up as something you can measure.

Do you want to sell Microsoft Teams to the executives or do you want to give joy to the people who actually use the product?

replies(1): >>45380862 #
25. jashmota ◴[] No.45380786{6}[source]
I agree - the dumping and digging itself (where you move the boom, arm, and bucket much more than the swing/tracks) should be manual. But swinging to the truck and back to the pit (a pure swinging motion to center around these areas of interest) doesn't have to be. I agree with your and other comments that the transition has to be smooth, and that's something we're working on.
26. jashmota ◴[] No.45380862{4}[source]
You've touched on one of my favorite things to think about. There will always be people who enjoy fully analog cars/planes/watches over the alternatives, but I don't think they define the whole market. How many new joiners is this industry seeing? Would you rather have a functioning back because you operate these remotely, or go to the site, sit in one of these, and have a broken back by your 50s? That's leaving out the fact that remote operation could save lots of lives. In our launch video we put together some of the news clips on this: https://www.ycombinator.com/launches/O8i-flywheel-ai-waymo-f...
27. jashmota ◴[] No.45381020{3}[source]
These are really, really interesting thoughts on the remote teleop interface, and a few of them have been on my mind. Teleop UX is underdeveloped and its effects are underestimated; it could turn out to be a huge thing for us and for humanoid companies if autonomy is harder than it seems right now. I don't believe construction equipment operation today is optimal - we just go with what we have. To give some context, it's far from intuitive or easy for a newbie to start operating, and it requires hours of practice. There would be a bit of training for remote teleop as well. Might as well make the interface easy and improve on the existing experience, which hasn't really evolved in decades!

I'd like to have a chat with you if you're up for it: contact at useflywheel dot ai