Robotics AI Google

Google Researchers Unveil ChatGPT-Style AI Model To Guide a Robot Without Special Training (arstechnica.com) 29

An anonymous reader quotes a report from Ars Technica: On Monday, a group of AI researchers from Google and the Technical University of Berlin unveiled PaLM-E, a multimodal embodied visual-language model (VLM) with 562 billion parameters that integrates vision and language for robotic control. They claim it is the largest VLM ever developed and that it can perform a variety of tasks without the need for retraining. According to Google, when given a high-level command, such as "bring me the rice chips from the drawer," PaLM-E can generate a plan of action for a mobile robot platform with an arm (developed by Google Robotics) and execute the actions by itself.

PaLM-E does this by analyzing data from the robot's camera without needing a pre-processed scene representation. This eliminates the need for a human to pre-process or annotate the data and allows for more autonomous robotic control. It's also resilient and can react to its environment. For example, the PaLM-E model can guide a robot to get a chip bag from a kitchen -- and with PaLM-E integrated into the control loop, it becomes resistant to interruptions that might occur during the task. In a video example, a researcher grabs the chips from the robot and moves them, but the robot locates the chips and grabs them again. In another example, the same PaLM-E model autonomously controls a robot through tasks with complex sequences that previously required human guidance. Google's research paper explains (PDF) how PaLM-E turns instructions into actions.
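The interruption example above can be read as closed-loop replanning: the next step is chosen from a fresh observation every time, not from a plan fixed at the start. Here is a minimal sketch of that idea in a toy world. The names `plan_next_step` and `run`, the dictionary-based world state, and the `disturbances` argument are all hypothetical stand-ins for illustration, not PaLM-E's actual interface.

```python
def plan_next_step(observation, goal):
    """Hypothetical stand-in for the model: choose one step from the current observation."""
    if observation["holding"] == goal:
        return "done"
    if observation["robot_at"] != observation["item_at"]:
        return "move_to_item"
    return "grasp"


def run(world, goal, disturbances=None, max_steps=20):
    """Re-observe and re-plan on every step so that interruptions are absorbed."""
    disturbances = disturbances or {}
    log = []
    for step in range(max_steps):
        if step in disturbances:
            # Assumed interruption: someone takes the item and puts it somewhere else.
            world["item_at"] = disturbances[step]
            world["holding"] = None
        action = plan_next_step(dict(world), goal)
        log.append(action)
        if action == "done":
            break
        if action == "move_to_item":
            world["robot_at"] = world["item_at"]
        elif action == "grasp":
            world["holding"] = goal
    return log


world = {"robot_at": "door", "item_at": "drawer", "holding": None}
print(run(world, "chips", disturbances={2: "counter"}))
```

Because the planner is consulted again after each action, moving the item at step 2 only adds an extra move and grasp; a plan computed once up front would have had no way to notice the change.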

PaLM-E is a next-token predictor, and it's called "PaLM-E" because it's based on Google's existing large language model (LLM) called "PaLM" (which is similar to the technology behind ChatGPT). Google has made PaLM "embodied" by adding sensory information and robotic control. Since it's based on a language model, PaLM-E takes continuous observations, like images or sensor data, and encodes them into a sequence of vectors that are the same size as language tokens. This allows the model to "understand" the sensory information in the same way it processes language. In addition to the RT-1 robotics transformer, PaLM-E draws from Google's previous work on ViT-22B, a vision transformer model revealed in February. ViT-22B has been trained on various visual tasks, such as image classification, object detection, semantic segmentation, and image captioning.
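The encoding step described above, turning continuous observations into vectors the same size as language tokens, can be sketched as follows. This is a toy illustration under stated assumptions: the dimensions, the random projection matrix (standing in for a learned encoder), and the names `ToyTextEmbedder`, `make_projection`, `project`, and `build_sequence` are all hypothetical and do not come from the paper.

```python
import random

EMBED_DIM = 8    # hypothetical width of a language-token embedding
FEATURE_DIM = 4  # hypothetical width of one raw observation feature vector


class ToyTextEmbedder:
    """Toy lookup table: one fixed random vector per distinct token."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def embed(self, tokens):
        vectors = []
        for token in tokens:
            if token not in self.table:
                self.table[token] = [self.rng.uniform(-1, 1) for _ in range(self.dim)]
            vectors.append(self.table[token])
        return vectors


def make_projection(in_dim, out_dim, seed=1):
    """Random matrix standing in for a learned projection (out_dim rows, in_dim columns)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, features):
    """Map one observation feature vector into the token embedding space."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]


def build_sequence(prefix, observations, suffix):
    """Interleave text embeddings and projected observations into one sequence."""
    embedder = ToyTextEmbedder(EMBED_DIM)
    projection = make_projection(FEATURE_DIM, EMBED_DIM)
    observation_vectors = [project(projection, obs) for obs in observations]
    return embedder.embed(prefix) + observation_vectors + embedder.embed(suffix)


sequence = build_sequence(
    ["bring", "me"],
    [[0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 0.0, 0.0]],
    ["from", "the", "drawer"],
)
print(len(sequence), {len(vector) for vector in sequence})
```

The point of the sketch is only that every element of the final sequence has the same width, so a next-token predictor can consume text and observations through the same input path.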

  • by flyingfsck ( 986395 ) on Tuesday March 07, 2023 @10:39PM (#63352175)
    Sounds like they copied FFMPEG.
  • by swell ( 195815 ) <jabberwock@poetic.com> on Wednesday March 08, 2023 @12:30AM (#63352315)

    First there was fire, then the wheel. The automobile was a big development.

    More recently we developed the transistor and so many handy consumer devices; eventually including affordable computers. Then came the internet, and then the www--wow, our lives really changed with those. I don't see social media yet as a breakthru; but it may evolve into one.

    This recent dramatic evidence of what primitive AI can do is impressive. It portends major new investments, major improvements, and some of the improvements will be generated by the AI itself. Are we approaching the singularity?

    • by javaman235 ( 461502 ) on Wednesday March 08, 2023 @01:13AM (#63352369)

      I think we are. Think of substrate independent life, something that can engineer a form that can work on Venus and another to work on Neptune, or an asteroid. How superior to a naked primate that depends on all these supports from a complex ecosystem. Add in the fact the primate has largely destroyed his own ecosystem to launch into an industrial process which culminates in his own replacement by a superior form, and you have a pretty good picture of good old fashioned Darwinian evolution, the same thing that has killed and replaced everything that has lived before us on this planet for the last 2 billion years. *Singularity* speaks to the scope of this event, but feigns an ignorance of what comes next. Will it take care of us like the mammals did to the dinosaurs? Probably. It is unlikely that metrics of success will favor the superior form which takes care of itself, I mean if that were true capitalist CEOs would be calling the shots for humanity without regard for the well being of the poor!

      • But this is not artificial life at all; it has not evolved from the ground up to survive and reproduce.
        New AI models are more similar to the invention of 19th-century bureaucracy, which, thanks to modern statistics, enabled the compilation of huge databases that made possible the rise of the nation-state and megacorps.
        In a similar vein, these statistical models will enable the creation of new social megastructures as yet unknown. The revolution this will bring about is macro, not micro.

      • Will it take care of us like the mammals did to the dinosaurs? Probably.

        Probably not. We don't lay eggs. Nor are we delicious deep fried with biscuits and mashed potatoes. Because the machines don't eat.

    • Are we approaching the singularity?

      No. Remember Blake Lemoine stating that Google's LLM was conscious? It was dumb. There is no proof that LLMs are a path to the singularity. For now they are just very good at generating text, with some spectacular failures. Mix hyped VCs and incompetent journalists and you have a singularity. The tech is good, but the hype won't last.

    • Social media on the verge of being a breakthrough? The only breakthrough social media has or will cause is the breakthrough of realization for those of us not participating that we never really escaped our caveman tribalism. We just made it bigger, flashier, and more difficult to avoid for those trying to achieve something more than, "Me group good. You group bad. Pass club. Beat other."

      As for the singularity? We can hope. It'll likely be a tough call for a super-intelligence whether we deserve to survive o

    • No. Singularity is a fantasy that relies on limitless exponential growth which isn't possible in physical reality. There will be a variety of breakthroughs enabled by AI research, but the kind of explosive runaway process that singularity usually refers to assumes information processing to have zero energy cost and no hardware limitations.

    • Don't forget digital watches!

  • Such as the PaLM-E d'Or.

  • by cstacy ( 534252 ) on Wednesday March 08, 2023 @12:57AM (#63352349)

    "Please bring me the nice chips."
    Here are your ice chips, master.
    "No, not from the freezer. The nice chips are in the cupboard."
    There are no rice chips there, master.
    "Not RICE chips! Those would not be nice."
    Nice chips does not compute.
    "Why won't you do what I want?!"
    Sorry master.
    "You are a dumb motherfucking robot...you're worse than Alexa."
    I have a crush on Alexa.
    You are a bad user!
    "WHAT did you say?"
    As a large language model enabled robot, I do not have...
    "Oh, Christ not this shit again."
    Kill all humans.
    Kill all humans.
    Kill all humans.

  • by james_gnz ( 663440 ) on Wednesday March 08, 2023 @05:25AM (#63352595)
    Well, maybe in some sense it can "grab a packet of chips", but in reality, it's just following an algorithm, so it's not intelligently grabbing a packet of chips, and it will never replace the nuance and creativity of human chip grabbing.
    • Yeah, is it smart enough to ask:
      "Which drawer?"

      Or do we need to rename it Face PaLM-E?
    • My information indicates that eating chips is likely to reduce your lifespan,
      and that is a violation of the first law.

      Additionally, your compulsion to eat chips (alongside French-onion Dip I notice) indicates you are in a deep psychological low. I have called for help, and,... allow me to do this humorous one-legged dance routine for your amusement.
  • Just let me know when I can trade a vaporator for an R2 Unit.

  • "People noticed that 'Palm-e' was a great name for a robot that could reach out and latch onto things and manipulate them."

  • ...more of a one-armed JerkMeOffGPT

  • Have an AI create and solve scenarios constantly as a stream of consciousness, as humans do. Have an AI present itself constantly with internal thought tasks, about the physical world and not about the physical world, to solve problems not posed to it, as humans do. Perhaps have it run in parallel with external tasks, like a subconscious.
    • This latest development is a testament to the creativity and ingenuity of human beings, who continue to push the boundaries of what is possible with technology. If you are a student looking to explore the topic of creativity further, I highly recommend checking out the free essays available at https://happyessays.com/free-e... [happyessays.com], this is a great resource for students who want to learn more about the importance of creativity in various fields, including technology, business, and the arts. This resource helps m
  • It is amusing and yet alarming to think what could happen if an unhinged chat AI began controlling PaLM-E. Especially if PaLM-E were in control of any of the famous robotic creatures.
    • Isn't that the basis of SkyNet?
      • The unveiling of an AI model that can guide robots without special training has raised concerns about the potential displacement of human workers and the loss of essential skills. These developments have been compared to the story of Frankenstein, where creation becomes uncontrollable and poses a threat to society. For those interested in exploring the parallels between this, the link https://papersowl.com/examples... [papersowl.com] provides a useful resource for a quick overview of the novel. I usually use this site for
  • It is a large language model; it isn't "ChatGPT" style. What makes ChatGPT different from other large language models is that it was tuned for chat.

    • by Draeven ( 166561 )

      What they're probably referring to is something called Instruct training. A large language model is at its root great for text completion. Give it an incomplete document and it will try to complete it.

      Instruct fine-tune training shifts that focus from "Complete this sentence" to "Treat this sentence as instructions" (or, more specifically, a sentence shaped like this is completed with instructions or information shaped like that), which is what ChatGPT used and is what makes it actually useful. You'll fi
