Original title: A Harness for Every Task: Dynamic Workflows in Claude Code

Original author: @trq212

Photo by Peggy

Editor's note: Claude Code is evolving from a coding assistant into a programmable agent workbench.

The core value of the workflows described here is that Claude no longer has to think and act inside a single context window. Instead, it can dynamically generate an execution harness suited to the task: splitting the work, dispatching subagents, running steps in parallel, cross-validating, looping, even having different agents compete with one another, and finally consolidating the results.

This means Claude Code's use cases are clearly spilling beyond coding. It is not limited to code migration, refactoring, test reproduction, and code review; it can also handle deep research, fact-checking, resume screening, incident postmortems, rule distillation, business plan evaluation, and non-technical tasks such as brainstorming names. Many complex tasks are structurally similar to programming: they require breaking down the problem, isolating context, validating hypotheses, handling a great deal of detail, and choosing among multiple candidate paths.

Dynamic workflows try to address failure modes that are common when large models run long tasks: "agentic laziness", where the model stops before the work is done; "self-preference bias", the tendency to accept its own conclusions; and "goal drift", the gradual deviation from the original goal over many rounds of execution. By delegating tasks to several Claudes with separate contexts, workflows turn a complex task from a long run by a single agent into a collaboration among many agents.

Of course, workflows are not a panacea. They usually consume more tokens and are not necessarily suited to every ordinary coding task. But they point in an important direction: future competition among AI tools may hinge not only on how smart a single model is, but also on its ability to organize a reliable, reusable, and reviewable execution process around complex goals.

The following is the original text:

While Claude Code's default harness is built for programming, it also works for many other kinds of tasks, because many tasks turn out to be structurally similar to programming. However, to get the best performance on certain task types, such as research, security analysis, agent team collaboration, or code review, we still need to build a custom harness on top of Claude Code.

Workflows let Claude dynamically create a harness inside Claude Code to solve these problems and many others. You can also share these workflows with others and reuse them.

In this article, I will share my own experience and thinking from early use of workflows, to help you get more out of them.

Note, however, that best practices are still being developed. Dynamic workflows often consume more tokens, so you need to consider when and how to use them.

Note: This article is also published on the Claude blog.

Examples

Before getting into the technical details, here are some example prompts to help you see what workflows make possible:

"This test fails about once every 50 runs. Build a workflow to reproduce it, form hypotheses, and run adversarial tests in separate worktrees. /goal don't stop until one hypothesis has been verified."

"Use a workflow to review my last 50 sessions, dig out the corrections I keep repeating, and turn those recurring problems into CLAUDE.md rules."

"Search the #incidents channel in Slack over the last six months and find root causes that keep recurring but that no one has filed a ticket for."

"Take my business plan and run a workflow in which different agents attack it from the perspectives of investors, customers, and competitors."

"Here is a folder containing 80 resumes. Use a workflow to rank them against the requirements for a back-end role and review the top 10. Ask me questions through the AskUserQuestion tool to help you set the evaluation criteria."

"I need to name this CLI tool. Brainstorm a range of options, then pick the top three through a tournament."

"Use a workflow to rename our User model to Account everywhere."

"Read my draft blog post and use a workflow to verify each technical claim against the codebase. I don't want to publish anything wrong."

How dynamic workflows work

A dynamic workflow executes a JavaScript file that contains several special functions for spawning and coordinating subagents.

Dynamic workflows can also use standard JavaScript built-ins, such as JSON, Math, and Array, to process data.

Notably, the dynamic workflow can decide which model each subagent uses and whether that subagent runs in its own worktree. This lets Claude choose the level of intelligence and isolation that the task requires.

If a workflow is interrupted, for example by the user or by the terminal exiting, it can resume from where it stopped.
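The mechanics above can be sketched roughly as follows. The article does not name the special functions, so `runAgent` and its `model` and `worktree` options are hypothetical stand-ins, stubbed here so the script runs on its own. Treat this as an illustration of the shape of a workflow script, not the actual Claude Code API.

```javascript
// Hypothetical stand-in for the workflow's agent-spawning function.
// The real function names and options are not given in this article.
async function runAgent({ prompt, model = "sonnet", worktree = false }) {
  // A real harness would start a subagent here; this stub echoes its configuration.
  return { prompt, model, worktree, findings: [`checked: ${prompt}`] };
}

async function auditModules(modules) {
  // Spawn one subagent per module, choosing model and isolation per task.
  const results = await Promise.all(
    modules.map((m) =>
      runAgent({
        prompt: `Audit ${m.name}`,
        model: m.risky ? "opus" : "sonnet", // more capable model for riskier modules
        worktree: m.risky, // isolate risky work in its own worktree
      })
    )
  );
  // Plain JavaScript built-ins (Array, JSON) handle the data plumbing.
  const findings = results.flatMap((r) => r.findings);
  return JSON.stringify({ count: findings.length, findings });
}
```

The point of the sketch is the division of labor: ordinary JavaScript holds the control flow and data handling, while each agent call receives a narrow prompt plus its own model and isolation settings.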

Why you need dynamic workflows

When you ask Claude Code's default harness to handle a task, it has to plan and execute in the same context window. This works very well for many programming tasks, but it can fall short on tasks that run for a long time, are highly parallel, are adversarial, or are highly structured.

The reason is that the longer Claude works on a complex task in a single context window, the more likely it is to hit a few specific failure modes:

Agentic laziness: when handling a particularly complex, multi-part task, Claude stops before the work is really finished and claims completion after only partial progress. For example, it processes only 20 of 50 items in a security review and declares the work done.

Self-preference bias: Claude tends to favor its own results or findings, especially when asked to validate or judge its own output against a set of evaluation criteria.

Goal drift: over many rounds of execution, Claude's fidelity to the original goal gradually declines, especially after the context has been compacted. Each summary loses some information, and detailed requirements such as edge cases or constraints like "do not do X" can get lost.

Creating a workflow helps mitigate these problems because it orchestrates separate Claudes, each with its own context window and a focused, isolated task.

Dynamic and static workflows

You may already have built static workflows with the Claude Agent SDK or claude -p to coordinate several Claude Code instances.

However, because static workflows have to cover a wide range of edge cases, they tend to be more generic. With Claude Opus 4.8 and dynamic workflows, Claude is now smart enough to build a custom harness for your specific use case.

Practical patterns for using dynamic workflows

You can directly ask Claude to create a dynamic workflow, or you can use the trigger word "interracode" to make sure Claude Code creates one.

But if you build a mental model of how dynamic workflows work, it becomes easier to judge when to use one and to steer Claude through your prompt.

Claude often uses and combines the following patterns when building workflows:

Classify and act: use a classifier agent to determine the type of task, then route it to different agents or behaviors based on that type. A classifier can also be used at the end of the process to judge the output.

Fan out and synthesize: break a task into smaller steps, have one agent handle each step, and synthesize the results at the end. This is particularly useful when the task contains a large number of small steps, or when each step needs a clean context window to avoid interference or cross-contamination. The synthesis step acts as a "barrier": it waits for all the fanned-out agents to finish and then merges their structured outputs into one result.

Adversarial validation: for each agent that produces output, launch a separate, independent agent that adversarially validates that output against a set of evaluation criteria or guidelines.

Generate and filter: generate a large number of ideas around a topic, then filter them through evaluation criteria or a validation process, remove duplicates, and return only the highest-quality ideas that survive testing.

Tournament: instead of splitting the work, make agents compete. Spawn N agents that attempt the same task in different ways, for example with different prompts or models, then have a judging agent compare the results until a winner is selected.

Loop until done: for tasks whose amount of work is unknown, do not set a fixed number of rounds. Instead, keep launching agents until a stop condition is met, such as no new findings or no more errors in the logs.
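Two of these patterns, classifying before acting and looping until nothing new turns up, can be sketched in plain JavaScript. The `classify` function and the `handlers` table are hypothetical stubs standing in for subagents, and `maxRounds` is only a safety cap added for this example, not a fixed round count.

```javascript
// Hypothetical stubs: a real workflow would call subagents here.
async function classify(task) {
  return /fail|error|crash/i.test(task) ? "bug" : "question";
}
const handlers = {
  bug: async (task) => `opened fix attempt for: ${task}`,
  question: async (task) => `drafted answer for: ${task}`,
};

// Classify and act: route each task to the handler for its type.
async function classifyAndAct(task) {
  const kind = await classify(task);
  return handlers[kind](task);
}

// Loop until done: stop as soon as a round produces no new findings.
async function loopUntilDone(findNew, maxRounds = 20) {
  const seen = new Set();
  for (let round = 0; round < maxRounds; round++) {
    const fresh = (await findNew(round)).filter((x) => !seen.has(x));
    if (fresh.length === 0) break; // stop condition: nothing new this round
    fresh.forEach((x) => seen.add(x));
  }
  return [...seen];
}
```

In both cases the deterministic script owns the routing and the stop condition, so neither depends on an agent remembering the overall plan.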

Use cases

You can think creatively about when and how to have Claude Code create dynamic workflows. I have found that workflows are sometimes even more useful for non-technical work.

Migration and refactoring

Bun used workflows for a rewrite from Zig to Rust. You can read Jarred's post on X to learn about the process.

The key is to break the task into a series of units that need to be fixed, such as call sites, failing tests, or modules. Start a subagent in its own worktree for each unit to make the fix, have another agent adversarially review it, and then merge the results. Consider explicitly telling the agents not to use commands that consume too many resources, so that you can maximize parallelism without exhausting your local machine.
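One way to picture this is a fan-out with a concurrency cap, followed by a review gate and a merge step. In this minimal sketch, `fixInWorktree` and `adversarialReview` are hypothetical stubs for the fixing and reviewing subagents, and the rule that rejects units with "legacy" in the name exists only to make the example deterministic.

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight at once.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function lane() {
    while (next < items.length) {
      const i = next++; // claim the next index (JavaScript is single-threaded here)
      results[i] = await worker(items[i], i);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, lane));
  return results;
}

// Hypothetical stubs for the fixing and reviewing subagents.
async function fixInWorktree(unit) {
  return { unit, patch: `patch for ${unit}` };
}
async function adversarialReview(fix) {
  return { ...fix, approved: !fix.unit.includes("legacy") };
}

async function migrate(units, limit = 4) {
  // Each unit is fixed, then reviewed by a separate agent before merging.
  const reviewed = await mapWithLimit(units, limit, async (u) =>
    adversarialReview(await fixInWorktree(u))
  );
  return {
    merged: reviewed.filter((r) => r.approved).map((r) => r.unit),
    rejected: reviewed.filter((r) => !r.approved).map((r) => r.unit),
  };
}
```

The `limit` parameter is where the resource concern shows up: raising it increases parallelism, while lowering it protects the local machine.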

Deep research

We released a deep research capability in Claude Code that uses dynamic workflows. Specifically, it fans out agents to run web searches, extract sources, adversarially validate claims, and produce a consolidated report with citations.

However, this kind of research is not limited to web search. For example, you can have Claude compile a status report from Slack context, or study how a feature works by exploring the codebase in depth.

Deep verification

Conversely, if you have a report and want to verify every factual claim and cited source in it, you can generate a workflow: first, one agent identifies all the factual claims, and then a subagent is launched for each claim. You can also have another agent check the verification subagents' work to make sure it is of sufficient quality.
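A minimal sketch of this two-stage shape is shown below. Both `extractClaims` and `verifyClaim` are hypothetical stubs: the first naively treats every sentence as a claim, and the second checks against a fixed set of known facts instead of consulting sources.

```javascript
// Hypothetical stubs standing in for subagents.
async function extractClaims(report) {
  // Treat each sentence as one factual claim (a real agent would be more selective).
  return report.split(".").map((s) => s.trim()).filter(Boolean);
}
async function verifyClaim(claim, knownFacts) {
  return { claim, supported: knownFacts.has(claim) };
}

async function verifyReport(report, knownFacts) {
  const claims = await extractClaims(report);
  // One verifier per claim, each with its own narrow task.
  const verdicts = await Promise.all(claims.map((c) => verifyClaim(c, knownFacts)));
  // Return only the claims that could not be supported.
  return verdicts.filter((v) => !v.supported).map((v) => v.claim);
}
```

Because each verifier sees exactly one claim, no single context has to hold the whole report along with all of its evidence.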

Ranking

You might have a set of items that you want to rank by some qualitative measure that you trust Claude Code to evaluate well. For example, ranking support tickets by the severity of the bug.

But if you try to rank 1,000 rows in a single prompt, quality degrades and the context window cannot hold them all. It is better to run a tournament, building a pipeline of pairwise comparisons, since comparative judgments are usually more reliable than absolute scores; or to rank in parallel buckets and then merge the results. Each comparison is done by an independent agent, so a deterministic loop can hold the overall bracket structure, and only the current running order needs to be kept in context.
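The bucket-and-merge variant maps naturally onto a merge sort whose comparator is an agent. In this sketch, `judge` is a hypothetical stub that compares a numeric `severity` field; in a real workflow it would be an independent agent making a qualitative pairwise judgment.

```javascript
// Hypothetical judge: resolves to true when `a` should rank ahead of `b`.
// A real workflow would ask an independent agent to compare the two items.
async function judge(a, b) {
  return a.severity >= b.severity;
}

// Merge two ranked lists using only pairwise judgments.
async function merge(left, right, better) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    out.push((await better(left[i], right[j])) ? left[i++] : right[j++]);
  }
  return out.concat(left.slice(i), right.slice(j));
}

// Merge sort: the two halves are ranked in parallel, then merged.
async function rank(items, better = judge) {
  if (items.length <= 1) return items;
  const mid = Math.floor(items.length / 2);
  const [left, right] = await Promise.all([
    rank(items.slice(0, mid), better),
    rank(items.slice(mid), better),
  ]);
  return merge(left, right, better);
}
```

The script holds the whole ordering, and each call to the judge only ever sees two items, which is what keeps any single context small.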

Memory and rules

If you have a set of specific rules that Claude often misses or follows poorly even though they are in CLAUDE.md, you can create a workflow that lists them and checks them one by one, with one verifier per rule. Creating a "skeptic" persona as a subagent to review whether each reported violation is legitimate also helps avoid excessive false positives.

The reverse also works: mine your recent sessions and code reviews for corrections you keep repeating; have parallel agents cluster these issues; adversarially test each candidate rule to see whether it would actually have prevented a real mistake; and finally write the rules that survive back into CLAUDE.md.

Root-cause analysis

The most effective way to debug is to propose several independent hypotheses and test them one by one. But if you use only one context window, Claude may fall into self-preference bias.

A workflow prevents this structurally: it can launch multiple agents that each generate hypotheses from non-overlapping evidence, for example by giving different agents the logs, the files, and the data separately. Each hypothesis can then be reviewed by a panel of verifiers and refuters.
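The structure described here can be sketched as two fan-out stages. The functions `proposeHypothesis`, `verifier`, and `refuter` are hypothetical stubs using simple string checks; real agents would read the evidence and reason about it.

```javascript
// Hypothetical stubs; real agents would analyze the evidence they are given.
async function proposeHypothesis(source, evidence) {
  return { source, text: `cause suggested by ${source}: ${evidence[0]}` };
}
async function verifier(h) {
  return h.text.includes("timeout");
}
async function refuter(h) {
  return h.text.includes("unrelated");
}

async function investigate(evidenceBySource) {
  // Each agent sees only one evidence source, so hypotheses stay independent.
  const hypotheses = await Promise.all(
    Object.entries(evidenceBySource).map(([src, ev]) => proposeHypothesis(src, ev))
  );
  // Every hypothesis faces both a verifier and a refuter.
  const reviewed = await Promise.all(
    hypotheses.map(async (h) => {
      const [confirmed, refuted] = await Promise.all([verifier(h), refuter(h)]);
      return { ...h, survives: confirmed && !refuted };
    })
  );
  return reviewed.filter((h) => h.survives);
}
```

No agent in this sketch ever reviews a hypothesis it proposed, which is the structural guard against self-preference bias.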

This is not just for code. Workflows can also be used for sales analysis, such as "why did sales fall in March?"; for data engineering, such as "why did this pipeline fail?"; or for anything else that calls for a postmortem.

Triage at scale

Every team has support queues, bug reports, or other backlogs that humans cannot fully keep up with. A triage workflow can classify each item, deduplicate it against issues that are already tracked, and take action, which might mean attempting a fix or escalating to a human.

A useful pattern for triage workflows is quarantine (isolation): agents that read untrusted public content are barred from performing high-privilege operations, which are instead carried out by agents dedicated to taking actions.

You can combine triage workflows with /loop to keep Claude working on this kind of task continuously.
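The quarantine idea can be illustrated with two roles and a narrow channel between them. Everything here is a hypothetical sketch: `readerAgent` and `actorAgent` are stubs, and `ALLOWED_ACTIONS` is an allowlist invented for this example.

```javascript
const ALLOWED_ACTIONS = new Set(["label", "escalate", "ignore"]);

// Reader agent (hypothetical stub): sees untrusted text, has no privileged tools,
// and returns only a small structured verdict.
async function readerAgent(untrustedText) {
  const action = /crash|data loss/i.test(untrustedText) ? "escalate" : "label";
  return { action, summary: untrustedText.slice(0, 80) };
}

// Actor agent (hypothetical stub): holds the privileges, never sees the raw input.
async function actorAgent(verdict, log) {
  if (!ALLOWED_ACTIONS.has(verdict.action)) {
    throw new Error(`refusing unknown action: ${verdict.action}`);
  }
  log.push(verdict.action);
}

async function triage(reports) {
  const log = [];
  for (const text of reports) {
    // Only the action name crosses the quarantine line; free text stays behind.
    const { action } = await readerAgent(text);
    await actorAgent({ action }, log);
  }
  return log;
}
```

Even if a report contains injected instructions, the privileged side only ever receives one value from a fixed set, so the text itself cannot steer the action.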

Exploration and taste judgement

Workflows are useful when you need to explore different paths to a solution, especially for design, naming, and other work that benefits from aesthetic judgment.

You can have Claude explore many options and give it a set of evaluation criteria describing what a good solution looks like. The task is complete when the reviewing agent judges that the results meet the criteria. Different candidates can also be ranked or filtered through a tournament based on those criteria.
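As a rough illustration, the explore-then-review loop might look like the sketch below. The generator and reviewer are hypothetical stubs with a toy rubric (names of at least four characters pass), and `maxRounds` is a safety cap added for the example.

```javascript
// Hypothetical stubs: a generator agent and a reviewing agent with a toy rubric.
async function generateCandidates(round) {
  return [`swift-${round}`, `x${round}`, `swift-${round}`]; // includes a duplicate
}
async function reviewCandidate(name) {
  return name.length >= 4; // toy criterion standing in for a real rubric
}

async function explore(wanted, maxRounds = 10) {
  const accepted = [];
  const seen = new Set();
  for (let round = 1; round <= maxRounds && accepted.length < wanted; round++) {
    // Deduplicate within the batch and against earlier rounds.
    const batch = [...new Set(await generateCandidates(round))];
    const fresh = batch.filter((c) => !seen.has(c));
    fresh.forEach((c) => seen.add(c));
    // Review every fresh candidate independently.
    const verdicts = await Promise.all(fresh.map(reviewCandidate));
    fresh.forEach((c, i) => {
      if (verdicts[i]) accepted.push(c);
    });
  }
  return accepted.slice(0, wanted);
}
```

The loop ends when the reviewer has accepted enough candidates, which mirrors the completion condition described above.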

Evals

You can run lightweight evals for specific tasks by launching independent agents in worktrees, then launching a comparison agent that compares and scores their outputs against evaluation criteria. For example, you can evaluate and improve a skill you have created to see whether it meets specific criteria.

Model and intelligence routing

You can create a classifier agent tailored to your task and let it decide which model to use. This is useful when a task involves a large number of tool calls and some upfront research can help identify the most suitable model.

For example, for the task "explain how the auth module works", the most suitable model depends on how many files the auth module has and how the codebase is structured. A classifier agent can do that research and then route the task to Sonnet or Opus based on the expected complexity.
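The auth-module example can be sketched as a small routing function. Here `estimateComplexity` is a hypothetical stub that just counts files under a directory, and the threshold of 3 is arbitrary; a real classifier agent would explore the codebase before deciding.

```javascript
// Hypothetical classifier: a real one would be an agent that explores the code.
async function estimateComplexity(task, repoPaths) {
  const files = repoPaths.filter((path) => path.startsWith(task.dir));
  return { fileCount: files.length };
}

async function chooseModel(task, repoPaths, threshold = 3) {
  const { fileCount } = await estimateComplexity(task, repoPaths);
  // Model names follow the Sonnet/Opus example in the text; the threshold is arbitrary.
  return fileCount > threshold ? "opus" : "sonnet";
}
```

For instance, `chooseModel({ dir: "auth/" }, ["auth/a.js", "auth/b.js", "ui/c.js"])` resolves to `"sonnet"` under the default threshold, because only two files match.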

When to use dynamic workflows

Workflows are still new. In many use cases they can far outperform the conventional approach, but not every task needs them, and they can significantly increase token consumption.

Workflows are best used on tasks that extend Claude Code in new ways. For routine programming tasks, ask yourself: does this task really need more compute? For example, most traditional programming tasks do not need a panel of five reviewers.

Tips for building dynamic workflows

Prompt design

The more detail you provide, the better the results, especially if you name the specific patterns described above.

Workflows are not only for large tasks. You can also ask for a "quick workflow". For example, you can create a quick adversarial review pass to check a hypothesis.

Use with /goal and /loop

When you have workflows that can be run repeatedly, such as triage, research, or validation workflows, you can pair them with /loop to run them at regular intervals, and use /goal to set a hard completion requirement.

Token budget

You can set an explicit token budget for a dynamic workflow to limit how many tokens the task consumes. For example, writing a requirement like "use 10k tokens" in the prompt sets the ceiling at 10k tokens.

Save and share dynamic workflows

You can save a workflow by pressing "s" in the workflow menu. You can also store workflows in ~/.claude/workflows or distribute them through a skill.

If you want to share them through a skill, put the JavaScript workflow file in the skill folder and reference it in SKILL.md. For more flexibility, you can also tell Claude to treat the workflows in the skill as templates rather than scripts that must be run verbatim.

A whole new world

Workflows are a useful new way to extend Claude Code, and I encourage you to treat this article as a starting point. There is much more to explore about how best to use them, and we look forward to seeing what you discover.

Thariq Shihipar and Sid Bidasaria (@sidbid) are members of the Anthropic technical team responsible for Claude Code.

[Original link]

Claude Code introduces dynamic workflows: letting AI learn to assemble its own team