Hey there! 👋
Welcome back to SavvyMonk, your one-stop for AI and tech news that actually matters.
Today we're looking at a robotics story that quietly dropped this week and didn't get nearly enough attention. A San Francisco startup just published research showing their robot learned to use an appliance it had almost never seen in training.
Let's get into it.
Tech moves fast. Still playing catch-up?
That's exactly why 200K+ engineers working at Google, Meta, and Apple read The Code twice a week.
Here's what you get:
Curated tech news that shapes your career - Filtered from thousands of sources so you know what's coming 6 months early.
Practical resources you can use immediately - Real tutorials and tools that solve actual engineering problems.
Research papers and insights decoded - We break down complex tech so you understand what matters.
All delivered twice a week in just 2 short emails.
TODAY'S DEEP DIVE
The Robot That Figured Things Out on Its Own
Physical Intelligence is a two-year-old startup based in San Francisco, and it has become one of the most closely watched AI companies in the Bay Area without making much noise about it.
The company was founded by AI academics and former Google DeepMind researchers, including co-founder Sergey Levine, a UC Berkeley professor focused on AI for robotics.

Sergey Levine, co-founder of Physical Intelligence
Backers include Khosla Ventures, Sequoia Capital, Thrive Capital, Lux Capital, OpenAI, and Jeff Bezos. The company's last public valuation was $5.6 billion, and Bloomberg reported in late March that it is in talks to raise about $1 billion more at a valuation exceeding $11 billion.
That context matters, because what the company published on April 16 is exactly the kind of result that justifies that level of investor interest.
What π0.7 Actually Does
The new model is called π0.7 (pronounced pi-zero-seven), and the paper published alongside it makes one central claim: the model can direct robots to perform tasks they were never explicitly trained on. The researchers call this compositional generalization, and it is a significant departure from how robot AI has worked until now.
The standard approach to training a robot has always been rote memorization. You collect a large volume of data for one specific task, train a specialist model on it, and repeat that entire process for every new task you want the robot to handle. It is slow, expensive, and rigid, because a robot trained to fold laundry is useless in front of a coffee machine unless you start over from scratch.
π0.7 breaks that pattern by combining web-based pretraining data with physical action data, and then doing something more interesting than just memorizing individual tasks. It remixes and recombines what it has learned across different contexts to handle situations it has never encountered.
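To make "remixing" concrete, here's a toy Python sketch. This is emphatically not the real π0.7 architecture; every name here is hypothetical. The idea it illustrates is that primitive skills, each learned in a different task or dataset, can be chained to handle an object the robot has barely seen:

```python
# Toy illustration of compositional generalization (NOT the actual
# pi-0.7 model): skills learned in unrelated training tasks are
# recombined at run time for a novel appliance.

# Hypothetical primitive skills, each acquired from a different task.
SKILLS = {
    "open_lid":    lambda obj: f"opened {obj}",
    "place_in":    lambda obj: f"placed item in {obj}",  # seen once, with a bottle
    "close_lid":   lambda obj: f"closed {obj}",          # seen once, on another robot
    "press_start": lambda obj: f"started {obj}",
}

def compose(task_plan, obj):
    """Execute a novel task by chaining skills that were never
    trained together on this particular object."""
    return [SKILLS[step](obj) for step in task_plan]

# A task the robot was never trained on end-to-end:
steps = compose(["open_lid", "place_in", "close_lid", "press_start"], "air fryer")
print(steps)
```

The point of the toy is the structure, not the code: none of the four skills was ever trained on an "air fryer" task, yet the composition handles it anyway. The hard part, which the real model solves with learned representations rather than a lookup table, is deciding which skills apply and in what order.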
Levine described the scaling behavior by saying that once a model crosses the threshold from doing only what it was explicitly trained on to remixing things in new ways, capabilities grow faster than the data would predict. That is the same favorable scaling property that transformed language models and computer vision.
The Air Fryer Experiment
The most striking result in the paper involves something as mundane as an air fryer. When researchers looked into how much relevant training data the model had actually seen, they found two episodes: one where a different robot pushed the air fryer closed, and one from an open-source dataset where another robot placed a plastic bottle inside one. That is essentially nothing, and yet with zero coaching, π0.7 made a passable attempt at cooking a sweet potato in the appliance.
What happened next is arguably more interesting than the result itself. In the initial run, the success rate was around 5 percent. After researchers spent about 30 minutes refining how they explained the task in plain language, the success rate climbed to 95 percent.
Researcher Ashwin Balakrishna put it plainly. "Often the cause of failure is not the robot, but the way humans explain." The implication is that how you prompt the model matters as much as what the model knows, which is the exact dynamic that shaped the development of large language models.
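You can see the shape of that dynamic in a small simulation. To be clear, this is an invented toy, not the paper's data or method: it just models a policy whose success probability rises with how explicitly the instruction spells out the steps, which is one plausible reading of why rephrasing moved the needle so far:

```python
import random

# Toy simulation (invented, NOT real robot data): a mock policy whose
# per-episode success probability depends on how explicitly the task
# instruction names the required steps.
def run_episode(instruction, rng):
    # Hypothetical assumption: step-like phrasing is easier to ground.
    clarity = sum(kw in instruction.lower()
                  for kw in ("open", "place", "close", "start"))
    p_success = 0.05 + 0.225 * clarity  # 5% baseline -> 95% fully specified
    return rng.random() < p_success

def success_rate(instruction, n=1000, seed=0):
    rng = random.Random(seed)
    return sum(run_episode(instruction, rng) for _ in range(n)) / n

vague = "cook the sweet potato"
refined = "open the air fryer, place the sweet potato inside, close it, start"
print(success_rate(vague), success_rate(refined))
```

Same mock policy, same task, wildly different outcomes: the vague phrasing lands near the 5 percent baseline while the explicit one lands near 95 percent. The numbers are planted to match the paper's anecdote, but the mechanism, that the instruction is part of the system being evaluated, is the real lesson.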
What Surprised the Researchers
The part of this story that doesn't usually make it into headlines is that the team itself was caught off guard. Balakrishna said he had always been able to predict what a model could do based on what was in the training data.
The last few months changed that. He bought a gear set, asked the robot to rotate it, and it worked without any specific training on gears. Levine drew a comparison to the early days of GPT-2, when it generated a story about unicorns in the Andes and nobody could explain exactly where that knowledge came from.
That kind of unpredictability is a double-edged signal. It suggests that robotic AI may be approaching the same kind of compounding capability curve that language models hit a few years ago. It also means the model's failure modes are harder to predict, which matters a great deal when the thing making decisions is operating in the physical world.
The Competitive Picture
Physical Intelligence is not alone in chasing this. Pittsburgh-based Skild AI, founded in 2023, raised $1.4 billion at a $14 billion valuation and has already deployed its omni-bodied Skild Brain commercially, reportedly generating $30 million in revenue within a few months of launch across security, warehouse, and manufacturing environments.

Skild has also taken direct shots at competitors, arguing that most robotics foundation models rely too heavily on internet-scale pretraining and lack true physical common sense.
Physical Intelligence, by contrast, is still in pure research mode. CEO Lachy Groom has said he does not give investors a commercialization timeline, which is an unusual posture but one the company's backers appear to accept for now. The gap between research capability and deployed product is still wide, and Skild's numbers show that closing it is where the real competition lies.
The Bottom Line
π0.7 is the most credible signal yet that the robotics field is approaching the same kind of capability inflection that transformed language AI. A model that surprises its own researchers and cooks a sweet potato on almost no relevant training data is a meaningful data point.
The open question is whether Physical Intelligence can translate that research lead into a real product before better-capitalized competitors get there first.
AI PROMPT OF THE DAY
Category: Research and Briefing
"I'm trying to understand a recent AI research paper for a non-technical audience. The paper is about [topic or paper name]. Please summarize the core claim in plain language, explain the key experiment used to demonstrate it, describe what makes this a meaningful advance over what existed before, and flag any important caveats or limitations the researchers themselves acknowledged."
ONE LAST THING
The air fryer experiment is a useful reminder that the most important moments in AI research are often buried in the methods section. A 5 percent success rate that becomes 95 percent because someone got better at explaining the task is not just a robotics story; it is a lesson about how much prompting and framing still shape what these models can actually do. The robots are getting smarter, but so is the craft of talking to them.
Hit reply, I read every response.
See you in the next one.
— Vivek
P.S. If you know someone who follows robotics or AI research, they'll want to read this one. They can subscribe at https://savvymonk.beehiiv.com/


