MUSE-Autoskill: ByteDance's Fix for AI Agents That Forget What They Learn

In short: ByteDance’s MUSE-Autoskill treats an agent skill as a lifecycle asset that gets created, remembered, tested, patched, and migrated, instead of a throwaway prompt. On the SkillsBench benchmark, human-written skills lifted its accuracy from 53.19% to 68.40%, and on the 35 tasks where it generated its own skill, accuracy reached 87.94%.

Your coding agent spends twenty minutes working out a tricky deploy step. It works. The next day you hand it a nearly identical task, and it starts from zero: the same dead ends, the same twenty minutes. It read the docs, ran the commands, and even jotted down a lesson, but that lesson stayed trapped inside one task. When the task ended, the experience went with it. Agents forget, and for anyone who uses them daily this is a familiar, costly habit.

On May 27, 2026, the ByteDance team released MUSE-Autoskill to attack exactly that problem: how an agent turns the experience it builds while doing tasks into skills it can reuse over the long term. There is more than one way to solve continual learning for agents. Some approaches update the model weights, some optimize the outer workflow, and some externalize experience into memory and skills. This article focuses first on the two approaches most closely tied to skills. ...

 · 17 min · hohoda

SkillOpt: Stop Hand-Writing AI Agent Skills. Train Them.

In short: Microsoft Research’s SkillOpt turns AI agent skills into trainable artifacts. Instead of hand-writing CLAUDE.md, AGENTS.md, or best_skill.md and hoping the rules work, SkillOpt runs the agent, studies its failures, applies bounded text edits, validates the candidate skill, and keeps only changes that improve performance.

Every serious AI agent user eventually starts writing instruction files: CLAUDE.md, AGENTS.md, best_skill.md, project rules, tool-use notes, formatting constraints, debugging routines. The pattern is familiar. You watch the agent fail a few times, write a better rule, rerun the task, then add another note. After a while, the instruction file becomes a small operating manual.

If you work with Claude Code, Codex, Cursor, or any agent that lives inside a real project, this file quickly becomes part of the product. It tells the agent how to inspect files, when to run tests, how to format answers, which tool calls are safe, what to avoid in production code, and how to recover from common mistakes.

The problem is that most of these files are written by feel. You notice a failure, write a rule, and hope the next run behaves better. Sometimes it does. Sometimes the new rule helps one task and harms another. Sometimes the instruction sounds precise to you but remains too vague for the model that has to act on it. ...
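The loop in the summary above (run the agent, study failures, apply a bounded edit, validate, keep only improvements) can be sketched roughly as follows. This is a minimal illustration and not SkillOpt's actual code: `optimize_skill`, `propose_edits`, and `score` are hypothetical names, and the toy scorer stands in for running a real agent on validation tasks.

```python
from typing import Callable, Iterable


def optimize_skill(
    skill: str,
    propose_edits: Callable[[str], Iterable[str]],  # hypothetical: yields edited skill texts
    score: Callable[[str], float],                  # hypothetical: validation score, higher is better
    rounds: int = 3,
) -> str:
    """Greedy loop: accept a candidate skill only if it scores strictly higher."""
    best, best_score = skill, score(skill)
    for _ in range(rounds):
        improved = False
        for candidate in propose_edits(best):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
                improved = True
        if not improved:
            break  # no edit helped this round, so stop early
    return best


# Toy stand-ins (assumptions for illustration only).
RULES = ["Run tests before committing.", "Never edit generated files."]
NOISE = "Be helpful."


def toy_score(skill: str) -> float:
    # Pretend each useful rule fixes one validation task.
    return sum(rule in skill for rule in RULES)


def toy_propose(skill: str) -> Iterable[str]:
    # A "bounded edit" here means appending a single line.
    for line in RULES + [NOISE]:
        if line not in skill:
            yield skill + "\n" + line


result = optimize_skill("Project rules:", toy_propose, toy_score)
print(result)
```

In this toy run the two useful rules are kept and the vague line is rejected, because it never raises the validation score. A real setup would replace `toy_score` with agent runs on held-out tasks.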

 · 14 min · hohoda

10 Claude Code Skill-Writing Patterns the Docs Don't Teach You

On March 31, Anthropic accidentally shipped a source map file in their Claude Code npm package, and for a brief window the complete TypeScript source (512,000 lines across ~1,900 files) was publicly accessible. The community archived it before Anthropic could pull it down.

I spent a few days going through the built-in skills: simplify, batch, skillify, and a dozen others. Most of the community attention went to the hidden feature flags and the easter egg pet system. What caught my eye was less flashy: the way Anthropic’s engineers write their own skills differs from what the official docs teach.

Claude Code Skills has two official references: the Skills docs and the Agent Skills Best Practices guide. Both are worth reading. Neither prepares you for what the built-in skills actually look like.

This post distills 10 patterns that are in the source but not in the docs. Each one shows a ❌ typical doc-style approach vs ✅ the actual built-in skill approach. If you write SKILL.md files for Claude Code, these patterns will change how you structure them. ...

 · 10 min · hohoda