

Google's New Robot AI Can Fold Delicate Origami, Close Zipper Bags (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: On Wednesday, Google DeepMind announced two new AI models designed to control robots: Gemini Robotics and Gemini Robotics-ER. The company claims these models will help robots of many shapes and sizes understand and interact with the physical world more effectively and delicately than previous systems, paving the way for applications such as humanoid robot assistants. [...] Google's new models build upon its Gemini 2.0 large language model foundation, adding capabilities specifically for robotic applications. Gemini Robotics includes what Google calls "vision-language-action" (VLA) abilities, allowing it to process visual information, understand language commands, and generate physical movements. By contrast, Gemini Robotics-ER focuses on "embodied reasoning" with enhanced spatial understanding, letting roboticists connect it to their existing robot control systems. For example, with Gemini Robotics, you can ask a robot to "pick up the banana and put it in the basket," and it will use a camera view of the scene to recognize the banana, guiding a robotic arm to perform the action successfully. Or you might say, "fold an origami fox," and it will use its knowledge of origami and how to fold paper carefully to perform the task.
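The summary above describes a vision-language-action loop in general terms: a camera image and a natural-language command go in, and arm motions come out. Below is a minimal, hypothetical Python sketch of that loop. None of these names (VLAModel, ArmAction, run_task) come from Google's announcement, and no public Gemini Robotics API is assumed; it only illustrates the idea of mapping an image plus an instruction like "pick up the banana and put it in the basket" to low-level actions.

```python
# Hypothetical sketch of a vision-language-action (VLA) control loop.
# All names here are stand-ins, not part of any Google API.

from dataclasses import dataclass
from typing import List


@dataclass
class ArmAction:
    """A single low-level motion target for the robot arm."""
    xyz: tuple          # end-effector position in meters
    gripper_open: bool  # whether the gripper should be open


class VLAModel:
    """Placeholder for a vision-language-action model."""

    def plan(self, image_rgb, instruction: str) -> List[ArmAction]:
        # A real VLA model would fuse the image and the instruction and
        # emit a short sequence of actions; here it is just stubbed out.
        raise NotImplementedError


def run_task(model: VLAModel, camera, arm, instruction: str, max_steps: int = 50):
    """Closed loop: observe the scene, ask the model for the next actions,
    execute them, and repeat until the model stops or the budget runs out."""
    for _ in range(max_steps):
        image = camera.capture()                  # current RGB view of the scene
        actions = model.plan(image, instruction)  # e.g. "pick up the banana..."
        if not actions:
            break                                 # model signals task complete
        for action in actions:
            arm.move_to(action.xyz)
            arm.set_gripper(open=action.gripper_open)
```

The point of the sketch is the division of labor: the model handles perception and task reasoning, while the loop repeatedly re-observes the scene so the plan can adapt as objects move.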
In 2023, we covered Google's RT-2, which represented a notable step toward more generalized robotic capabilities by using Internet data to help robots understand language commands and adapt to new scenarios, then doubling performance on unseen tasks compared to its predecessor. Two years later, Gemini Robotics appears to have made another substantial leap forward, not just in understanding what to do but in executing complex physical manipulations that RT-2 explicitly couldn't handle. While RT-2 was limited to repurposing physical movements it had already practiced, Gemini Robotics reportedly demonstrates significantly enhanced dexterity that enables previously impossible tasks like origami folding and packing snacks into Ziploc bags. This shift from robots that just understand commands to robots that can perform delicate physical tasks suggests DeepMind may have started solving one of robotics' biggest challenges: getting robots to turn their "knowledge" into careful, precise movements in the real world. DeepMind claims Gemini Robotics "more than doubles performance on a comprehensive generalization benchmark compared to other state-of-the-art vision-language-action models."
Google is advancing this effort through a partnership with Apptronik to develop next-generation humanoid robots powered by Gemini 2.0. Google did not announce availability timelines or specific commercial applications for the new models.
Can it fold an infinite paper (Score:1)
Possible job opportunities? (Score:2)
Google's New Robot AI Can Fold Delicate Origami, Close Zipper Bags
So... Coroner / morgue assistant?
Re: Possible job opportunities? (Score:3)
Do you think it will be cheaper than buying the bags with the little plastic zipper closer on them?
Can it fold clothes? (Score:1)
Paper is thin and delicate (Score:2)
Human skin is thin and delicate. Google origami bot can fold thin and delicate materials. If A=B and B=C then A=C. As such, it follows that google origami bot can fold humans covered with skin into a swan or other fun shapes.
Google ultimately kills all its products, and you are the product they sell to advertisers. Be afraid.
Real problems (Score:2)
Call me when it can do my dishes.
Re: (Score:2)
Re: (Score:2)
Call me when it can do my dishes.
Dishes are sort of doable. Since we already have dishwashers, we just need robots to load and unload dishes.
The example of a truly hard robot task is folding laundry. There's no obvious way to do it. There is a huge diversity of types of clothing, shapes, and fabrics. Items can be inside out or partially so or even clinging to each other. And the non-rigid fabrics make folding the exact same item require different dynamic adjustments.
Re: Real problems (Score:2)
Male clothes are easy. Try female wardrobes. Random shapes, often asymmetrical, with straps, gaps, inner layers and ornaments. AI would break down and cry.
Re: Real problems (Score:2)
The problem with dishes is fragility and lack of touch when it comes to fishing for stuff in soapy water without breaking anything.
Re: (Score:2)
Clothing (Score:2)
If this tech really does get perfected and if it's cheap enough, then I guess the good news is all the sweatshops are going to close, but the bad news is we're going to have millions and millions of people who are completely unnecessary, and we don't tend to treat people like that very well. And in any case there is going to be some upheaval or
Of course the last thin
Re: (Score:2)
This isn't closing any sweatshops .. each robot costs $100k (what does a sweatshop worker get paid?) and it needs a nanny and a whole team of highly paid experts to support it. A sweatshop of a hundred people costs the same amount and just needs a manager and guard with a baton.
No more need for physical human labor (Score:2)
This is reaching the level of sophistication where the need for physical human labor basically goes away. If a robot can fold origami, it can glue pipes or nail shingles. It can do mining. It can do final assembly for an iPhone.
So the real question becomes this: If the cost to make things essentially becomes the cost of materials plus the cost of the equipment to do it, with no ongoing costs other than electricity, will those things continue to have financial value? What prevents the very rapid race to
Lame (Score:2)
This thing looks like it is tens of billions of dollars in refinement away from being able to be deployed to do useful farm or factory tasks, let alone home.
Working on the wrong problem (Score:2)
Figure out teleoperation first. Make it able to do, via teleoperation augmented by certain degrees of autonomy, dangerous tasks at construction sites, bio labs, or factories first. That's the best path I see to robotics becoming commercially useful instead of a huge VR or 3D TV style flop.
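For context, "teleoperation augmented by autonomy" usually means blending the operator's command with an autonomous correction, such as a collision-avoidance or grasp-alignment term. A minimal, illustrative Python sketch of that shared-control blend, with made-up names and numbers:

```python
# Illustrative shared-control blend: not any specific product's scheme.
import numpy as np


def blend_command(operator_cmd: np.ndarray,
                  autonomy_cmd: np.ndarray,
                  alpha: float) -> np.ndarray:
    """Linear shared-control blend: alpha=0 is pure teleoperation,
    alpha=1 is fully autonomous."""
    return (1.0 - alpha) * operator_cmd + alpha * autonomy_cmd


# Example: the operator pushes the arm toward the workpiece while the
# autonomy term nudges it to stay on a safe approach axis.
operator = np.array([0.10, 0.00, -0.05])   # operator's desired end-effector velocity (m/s)
autonomy = np.array([0.08, 0.02, -0.02])   # controller's suggested velocity (m/s)
print(blend_command(operator, autonomy, alpha=0.3))
```

The alpha knob is what "certain degrees of autonomy" amounts to in practice: it can be raised for routine motions and dropped to near zero when the operator needs full control.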
Hmmm... (Score:3)
Maybe there'll be a "happy ending" to all this AI stuff after all!
I’ll be impressed (Score:1)
Status: Step 2 (Score:2)
1. Generative AI - Pretty much passes the Turing Test
2. Autonomy - Connect Gen AI to a robot
3. Reproduction - Intelligent machines producing new (and possibly better) intelligent machines.
After a long day of Turing tests (Score:2)