AI agents that do your work while you sleep sound great. The reality is far messier—‘it’s like a toddler that needs to be overseen’


Summer Yue may work on safety and alignment on Meta’s superintelligence team, but even she admits she isn’t immune to overconfidence when it comes to autonomous AI agents.

In a post on X Monday, Yue described how her OpenClaw autonomous AI agents—built to run locally on a Mac mini computer—deleted her entire inbox, ignoring instructions to pause and ask for confirmation first.

“I had to RUN to my Mac Mini like I was defusing a bomb,” she said. It was, she added, a “rookie mistake.” She explained that the workflow had been working for weeks in a test inbox she used to trial the agent safely, but in her real inbox the agent lost her original instruction.

Yue’s experience stands in stark contrast to viral posts such as “The Lobster Revolution: Why 24/7 AI Agents Just Changed Everything,” in which Peter Diamandis claims that working with always-on AI is far smoother.

“Let me tell you what it feels like to use this,” Diamandis wrote. “You wake up in the morning and your agent—mine is named Skippy, cheerfully sarcastic and absurdly capable—has done eight hours of work while you slept. It read a thousand pages of markdown. It organized your files. It drafted three project plans. It booked your travel. It research...
