AI Agents

AI Agent Cost Optimization: 4 Ways to Cut Token Spend

In short: AI agent cost optimization starts with context growth. A long-running agent can move from $50 a month to $2,500 because each turn resends system prompts, tool definitions, memory, files, and earlier messages. Four practices bring the bill back under control: prompt caching, lazy-loaded tools, model routing, and context cleanup. When an agent is new, the system prompt may be only a few hundred tokens, with two or three tools. Then the prompt grows, the tool list expands, memory accumulates, and every turn starts paying for earlier turns. The Claude system prompt leaked in late 2024 was 24,000 tokens, nearly 50 times larger than the starting point. OpenClaw users have reported sending more than 150,000 input tokens to Gemini 3.1 Pro, only to get 29 output tokens in the first turn. An unoptimized agent handling 100 messages a day with 166K input tokens can cost about $996 a month on Gemini 3.1 Pro and about $2,490 a month on Claude Opus 4.6. There are ways to push that cost back down to $50-$100 a month. ...
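The monthly figures above can be checked with a few lines of arithmetic. This is a minimal sketch: the per-million-token prices are assumptions back-derived from the excerpt's own totals ($996 and $2,490), not quoted from any price list, and the function name `monthly_input_cost` is purely illustrative. It also assumes a 30-day month and counts input tokens only.

```python
# Back-of-the-envelope monthly input-token cost for an agent.
# ASSUMPTION: the USD-per-million-token prices are back-derived from the
# article's totals, not taken from a provider's price list.

def monthly_input_cost(messages_per_day: int,
                       input_tokens_per_message: int,
                       usd_per_million_tokens: float,
                       days: int = 30) -> float:
    """Return the monthly cost of input tokens in USD (output tokens ignored)."""
    total_tokens = messages_per_day * input_tokens_per_message * days
    return total_tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical input prices implied by the article's numbers.
ASSUMED_PRICES = {"Gemini 3.1 Pro": 2.00, "Claude Opus 4.6": 5.00}

for model, price in ASSUMED_PRICES.items():
    cost = monthly_input_cost(100, 166_000, price)
    print(f"{model}: ${cost:,.0f}/month")
```

At 100 messages a day and 166K input tokens per message, the agent sends about 498 million input tokens a month, which is why even a small change in tokens per turn moves the bill so much.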

MUSE-Autoskill: ByteDance's Fix for AI Agents That Forget What They Learn

In short: ByteDance’s MUSE-Autoskill treats an agent skill as a lifecycle asset that gets created, remembered, tested, patched, and migrated, instead of a throwaway prompt. On the SkillsBench benchmark, human-written skills lifted it from 53.19% to 68.40%, and on the 35 tasks where it generated its own skill, accuracy reached 87.94%. Your coding agent spends twenty minutes working out a tricky deploy step. It works. The next day you hand it a nearly identical task, and it starts from zero: the same dead ends, the same twenty minutes. It read the docs, ran the commands, and even jotted down a lesson, but that lesson stayed trapped inside one task. When the task ended, the experience went with it. Agents forget, and for anyone who uses them daily this is a familiar, costly habit. On May 27, 2026, the ByteDance team released MUSE-Autoskill to attack exactly that: how an agent turns the experience it builds while doing tasks into skills it can reuse over the long term. There is more than one way to solve continual learning for agents. Some update the model weights, some optimize the outer workflow, and some externalize experience into memory and skills. This article focuses first on the two approaches most closely tied to skills. ...

SkillOpt: Stop Hand-Writing AI Agent Skills. Train Them.

In short: Microsoft Research’s SkillOpt turns AI agent skills into trainable artifacts. Instead of hand-writing CLAUDE.md, AGENTS.md, or best_skill.md and hoping the rules work, SkillOpt runs the agent, studies its failures, applies bounded text edits, validates the candidate skill, and keeps only changes that improve performance. Every serious AI agent user eventually starts writing instruction files: CLAUDE.md, AGENTS.md, best_skill.md, project rules, tool-use notes, formatting constraints, debugging routines. The pattern is familiar. You watch the agent fail a few times, write a better rule, rerun the task, then add another note. After a while, the instruction file becomes a small operating manual. If you work with Claude Code, Codex, Cursor, or any agent that lives inside a real project, this file quickly becomes part of the product. It tells the agent how to inspect files, when to run tests, how to format answers, which tool calls are safe, what to avoid in production code, and how to recover from common mistakes. The problem is that most of these files are written by feel. You notice a failure, write a rule, and hope the next run behaves better. Sometimes it does. Sometimes the new rule helps one task and harms another. Sometimes the instruction sounds precise to you but remains too vague for the model that has to act on it. ...
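The loop described in the summary (run, study failures, apply a bounded edit, validate, keep only improvements) can be sketched as a simple accept-if-better search. This is not SkillOpt's actual API or algorithm: `optimize_skill`, `run_agent`, `propose_edit`, and the toy stand-ins below are all hypothetical, the edit bound is a crude character-count proxy, and a real setup would validate on held-out tasks instead of reusing one scorer.

```python
from typing import Callable, List, Tuple

def optimize_skill(
    skill: str,
    run_agent: Callable[[str], Tuple[float, List[str]]],  # -> (score, failures)
    propose_edit: Callable[[str, List[str]], str],
    max_edit_chars: int = 200,
    rounds: int = 5,
) -> str:
    """Keep a candidate skill only if it scores strictly better (hypothetical sketch)."""
    best_score, failures = run_agent(skill)
    for _ in range(rounds):
        if not failures:
            break  # nothing left to learn from
        candidate = propose_edit(skill, failures)
        # Bounded edit: skip candidates whose length changes too much.
        if abs(len(candidate) - len(skill)) > max_edit_chars:
            continue
        score, candidate_failures = run_agent(candidate)
        if score > best_score:  # validation gate: accept improvements only
            skill, best_score, failures = candidate, score, candidate_failures
    return skill

# Toy stand-ins so the sketch runs; a real agent and task suite would replace them.
REQUIRED = ["run tests", "read the file first"]

def toy_run_agent(skill: str) -> Tuple[float, List[str]]:
    failures = [rule for rule in REQUIRED if rule not in skill]
    return 1 - len(failures) / len(REQUIRED), failures

def toy_propose_edit(skill: str, failures: List[str]) -> str:
    return skill + "\n- Always " + failures[0] + "."

result = optimize_skill("Rules:", toy_run_agent, toy_propose_edit)
print(result)
```

The strict `score > best_score` check is the point of the design: a rule that sounds sensible but does not measurably help is discarded instead of accumulating in the file.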

Humans Domesticate AI. AI Is Domesticating Us Too.

In short: AI agents do more than automate work. As humans domesticate AI with prompts, evals, and workflows, AI is also domesticating us by taking over the first move: the first outline, the first judgment, the first messy sentence, the first uncomfortable question.
Wheat, Humans, and the Direction of Domestication
In Sapiens: A Brief History of Humankind, Yuval Noah Harari makes a slightly uncomfortable point about the agricultural revolution: perhaps humans did not domesticate wheat so much as wheat domesticated humans. It sounds like a clever reversal at first, but the accounting is fairly plain. Wheat started as a wild grass in the Middle East. Over time it spread across the world, occupied enormous amounts of land, and got humans to clear fields, bend their backs, pull weeds, dig channels, build granaries, and stop wandering. Wheat did well. Human backs, less so. That story is useful before talking about AI, because it cuts through a lot of vague language about technology changing the world. A tool is not always something you use and then put back on the table. Stay with it long enough and it starts changing your movements, your schedule, and your sense of what feels normal. Wheat changed posture and settlement. The internet changed attention. AI is reaching a little further inward. It is changing how we begin to think about things. ...

Code as a Trained Output: The New Model of AI Coding

In short: AI coding agents are changing the status of code. In mature agentic workflows, code is no longer only written by humans; it is repeatedly generated, tested, corrected, and selected by an optimization loop. That makes tests look like loss functions, production failures look like generalization failures, architecture look like inductive bias, and harness engineering look like optimizer design.
Introduction: A Shift We Have Not Yet Named Precisely
Over the past eighteen months, software development has undergone a quiet but forceful restructuring. Tools such as Cursor, Claude Code, and Codex are pushing us away from the old workflow of “humans write code, machines assist with completion” toward something structurally different: humans describe intent, define constraints, and provide feedback, while agents repeatedly generate, run, and revise code until some convergence condition is met. Most industry commentary still frames this shift in productivity terms: “AI makes us write code N times faster.” That framing misses a more basic ontological question: in this new workflow, what has happened to the nature of code itself? ...

Why AI Agents Drift: Belief State Is the Real Bottleneck, Not Context Length

In short: Many AI agents look productive but are actually drifting — confidently executing the wrong moves on a wrong picture of the situation. The bottleneck for the next phase of agent systems is not larger context windows or stronger base models; it is whether the system can construct and maintain a stable belief state. This piece argues why belief state quality is the right optimization target, proposes five proxy metrics to measure it, and lays out where to put incremental engineering resources next. AI agents that look productive often turn out to be drifting — confidently executing the wrong moves on a wrong picture of the situation. Competition in agent systems is shifting from “whose model is stronger” toward “who can keep producing higher-quality belief state.” If you accept that framing, several seemingly unrelated problems suddenly line up: the same model behaves very differently inside different product shells; long-running agents fail not because they cannot answer but because their judgment of the situation is wrong; context windows keep growing, but system capability does not scale linearly with them; and scattered engineering pieces — skill, memory, retrieval, tool use, trace, summary — all start to matter at the same time. ...

5,000 Feeds, 20 Highlights: Your AI Agent Is Killing Your Serendipity

A friend recently showed me his new tool, beaming with excitement. He follows about 5,000 people on X. Researchers, founders, investors, developers, media figures — after years of accumulating, his feed had long since become a bottomless waterfall. He’d tried “read later” apps before, bookmarking over a thousand articles and actually reading five. Like most people. Now he uses an AI agent that reads the full output of all 5,000 accounts, compressing everything into 20 curated highlights per day. Fifty-four structured briefings in ten days. What used to take two hours to skim now takes five minutes. Ninety-five percent of noise, filtered out. “The root of information anxiety is the cost of filtering,” he said. “Hand the filtering to an agent, and the anxiety disappears.” He’s right. But only about the first half. The anxiety does disappear. What also disappears is everything you didn’t know you needed to know. Five thousand tweets compressed to twenty. Among the 4,980 discarded, there might have been one from a field you’ve never followed, using logic you’ve never encountered, explaining a problem you thought you’d already figured out. ...

SDD Was the Start. Harness Engineering Is the Real Game.

Last year, the AI coding conversation had a clear hero: Spec-Driven Development (SDD). This year, people are talking about harness engineering instead. That may look like a passing trend, but it is a signal that the bottleneck moved. SDD is about making intent explicit so an agent can start in the right direction. Harness engineering is about building the environment, constraints, feedback, and governance that keep the agent on track after the 50th or 100th step. If you have ever watched an agent do impressive work for 20 minutes and then slowly degrade into a mess, you already understand why the vocabulary changed.
TL;DR
- SDD helps agents start correctly
- Harness engineering keeps them correct over time
- The bottleneck moved from generation to verification
- Long-running reliability is now the real problem
The SDD moment: why it caught on
Early “agentic coding” had a predictable failure mode. You’d say: “Add user auth,” or “Make a dashboard,” or “Fix onboarding.” The agent would produce something that looked plausible. It might even compile. Then you’d try to use it, and realize half the work was guesswork. ...

Will AI Kill Software? Why the SaaSpocalypse Is Wrong (And What's Actually Changing)

Apps may fade into the background. Software won’t. Wall Street has a new consensus: AI is about to kill the software industry. Software stocks are down nearly 30% since the start of the year; pundits call it the SaaSpocalypse. But the story is wrong. AI is rewriting who builds software and how we pay for it—not eliminating it. This piece looks at why the real moats (data, workflows, habits) are getting deeper, how “Software as a Service” is turning into “Service as Software,” and what that means for builders and buyers. Two 19-year-old high schoolers built an AI calorie-tracking app called Cal AI that brought in over $30 million a year; it was recently acquired by MyFitnessPal. The deal size was not disclosed, but the two clearly came out on top. On another front, Cursor, the fastest-growing AI coding company in history and less than five years old, was reported in February to have passed $2 billion in annualized revenue. Whether we talk about AI companies that build apps or the AI-powered apps already in the world, the outlook seems bright. ...

AGI Won't Send You a Notification

Technological revolutions rarely announce themselves. The agricultural revolution had no press release. The industrial revolution had no countdown. Even the internet only became obvious in hindsight. Artificial General Intelligence will likely arrive the same way. There will be no moment when the world collectively agrees that AGI has appeared. No headline. No global notification. Instead, there will only be a moment years later when people look back and say: That was when everything started to change. By then, the transformation will already be underway. And this time, we may have far less time to adapt.
The Speed of This Revolution
Technological revolutions have always accelerated. When the steam engine entered factories in the late 18th century, it began replacing manual labor. Yet Britain did not pass its first meaningful labor protection law until 1833, almost seventy years later. The Second Industrial Revolution moved faster. Electricity, steel, and chemical industries reshaped entire economies within decades. Germany transformed from an agrarian country into an industrial power in less than thirty years. The internet accelerated things again. ...