126 points thunderbong | 7 comments
1. gxcode No.42064758
Author of the post here - happy to answer any questions.
replies(4): >>42065216 >>42065649 >>42074214 >>42074366
2. philzook No.42065216
Beautiful stuff, great post!
replies(1): >>42065280
3. gxcode No.42065280
Thank you, really appreciate that.
4. beltranaceves No.42065649
Great job! I'm working on a similar blog post, and it was fun seeing how you approached it. I was surprised the WASM implementation is fast enough; I was even considering writing WebGPU compute shaders for my solver.
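
For context, here's roughly what I had in mind: a minimal sketch against the standard WebGPU API. The WGSL kernel, the `solverStep` entry point, and the update rule are all placeholders of mine, not anything from the post:

```typescript
// Minimal sketch: running one solver iteration as a WebGPU compute dispatch.
// The actual update rule here is a placeholder, not a real solver step.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error("WebGPU not available");
const device = await adapter.requestDevice();

const shader = device.createShaderModule({
  code: /* wgsl */ `
    @group(0) @binding(0) var<storage, read_write> state: array<f32>;

    @compute @workgroup_size(64)
    fn solverStep(@builtin(global_invocation_id) id: vec3<u32>) {
      let i = id.x;
      if (i < arrayLength(&state)) {
        state[i] = state[i] * 0.99; // placeholder update, not a real solver
      }
    }
  `,
});

const pipeline = device.createComputePipeline({
  layout: "auto",
  compute: { module: shader, entryPoint: "solverStep" },
});

const STATE_SIZE = 1024; // number of f32 entries in the solver state
const stateBuffer = device.createBuffer({
  size: STATE_SIZE * 4,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer: stateBuffer } }],
});

// Encode and submit one solver step.
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.dispatchWorkgroups(Math.ceil(STATE_SIZE / 64));
pass.end();
device.queue.submit([encoder.finish()]);
```

The appeal is that each solver iteration becomes a single compute dispatch over a GPU-resident state buffer, so nothing has to round-trip through JavaScript per element.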
5. dartharva No.42074214
OT, but can you share the CSS you're using for your site (the blog)? I love how clean it is.
replies(1): >>42078958
6. exe34 No.42074366
Hi, this is brilliant, thank you! I will definitely go through it soon.

I have been trying to figure something out for a while, but maybe haven't found quite the right paper for it to click yet: how would you combine this with video feedback on a real robot? Do you forward-predict the position and then have some means of checking whether the simulated image and reality overlap?

I've tried grounding models like CogVLM and YOLO, but often the bounding box is only barely useful for turning to face an object, not for actually reaching out and picking it up.

There are grasping datasets, but then I think you still have to train a new model for your given object + gripper pair, so I'm not clear where the MPC part comes in (I've sketched my current mental model below).
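
To make concrete what I mean, here is a toy sketch of how I currently picture sampling-based MPC ("random shooting") closing the loop, on a 2D reaching problem. The model, cost, and constants are all invented for illustration; nothing here is from the post:

```typescript
// Toy sampling-based MPC for a 2D point "gripper" reaching a target.
type Vec2 = [number, number];

const TARGET: Vec2 = [1.0, 0.5];
const DT = 0.1;        // seconds per step
const HORIZON = 10;    // planning horizon in steps
const N_SAMPLES = 256; // candidate action sequences per re-plan

// Forward model: the gripper integrates the commanded velocity.
function step(pos: Vec2, vel: Vec2): Vec2 {
  return [pos[0] + vel[0] * DT, pos[1] + vel[1] * DT];
}

// Cost: squared distance to the target.
function cost(pos: Vec2): number {
  const dx = pos[0] - TARGET[0];
  const dy = pos[1] - TARGET[1];
  return dx * dx + dy * dy;
}

function randomVel(): Vec2 {
  return [(Math.random() - 0.5) * 2, (Math.random() - 0.5) * 2];
}

// One MPC step: sample action sequences, roll the model forward, keep the
// cheapest sequence, and execute only its FIRST action before re-planning.
function mpcStep(observedPos: Vec2): Vec2 {
  let bestFirst: Vec2 = [0, 0];
  let bestCost = Infinity;
  for (let i = 0; i < N_SAMPLES; i++) {
    const first = randomVel();
    let pos = observedPos;
    let vel = first;
    let total = 0;
    for (let t = 0; t < HORIZON; t++) {
      pos = step(pos, vel); // forward-predict in the model
      total += cost(pos);
      vel = randomVel();
    }
    if (total < bestCost) {
      bestCost = total;
      bestFirst = first;
    }
  }
  return bestFirst;
}

// Closed loop: on a real robot, `pos` would be re-estimated from the camera
// every iteration (the video feedback), instead of trusting the model.
let pos: Vec2 = [0, 0];
for (let k = 0; k < 50; k++) {
  pos = step(pos, mpcStep(pos)); // stand-in for the real robot plus camera
}
console.log("final position:", pos);
```

The `step()` call in the closed loop stands in for the real robot plus camera; my question is essentially how that substitution is done in practice, i.e. how to get a pose estimate from video that is accurate enough for the rollouts to be meaningful.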

So I guess I'm just asking for any hints/papers that might make this easier for a beginner to grasp.

thanks :-)

7. gxcode No.42078958
I ended up making my own theme, but my starting point was PicoCSS: https://picocss.com