feat(agents): add task completion rules, skills discovery, and Edit tool #294

Perlover · 2025-12-18T14:41:54Z

Summary

Add strict task completion rules, improve skills discovery, and enable full tool inheritance for all subagents. This prevents agents from falsely marking tasks as complete without actually executing them, aligns skills usage with Claude Code's built-in mechanisms, and gives subagents access to all tools including MCP servers.

Linked item

Closes: # (must be an open Issue) OR
Implements: # (must be a Discussion)

Checklist

Linked to related Issue/Discussion
Documented steps to test (below)
Drafted "how to use" docs (if this adds new behavior)
Backwards compatibility considered (notes if applicable)

Documented steps to test

Create a spec with tasks that include "Run tests" or "Verify in browser" requirements
Run implementer agent on the spec
Verify that implementer reads CLAUDE.md from project root
Verify that implementer uses Skill(skill_name) tool to invoke relevant skills from <available_skills> section
Verify that tasks requiring test execution are NOT marked complete without actual test output
Verify that parent tasks remain [ ] if any subtasks are incomplete
Verify that subagents have access to Edit tool and other inherited tools
Verify that subagents have access to MCP tools (e.g., Playwright, Context7) when configured

Notes for reviewers

Problem 1: False task completion

Problem solved:
Implementer agents were marking tasks as completed without actually executing them. For example:

Tasks saying "Run E2E tests" were marked [x] without running tests
Parent tasks marked complete while subtasks remained [ ]
TDD GREEN phase marked complete without test execution output

Root causes addressed:

Implementer did not read CLAUDE.md (project-specific rules)
Implementer did not leverage available skills (may contain mandatory requirements like E2E testing)
No explicit rules about what constitutes task completion

Changes:

Added instruction to read CLAUDE.md from project root
Added instruction to use available skills via Skill(skill_name) tool
- Initially added hardcoded paths (.claude/skills/, ~/.claude/skills/)
- Replaced with reference to <available_skills> section and Skill() tool
- Reason: Claude Code already pre-loads skills as YAML frontmatter in the system context and subagents inherit that context, so reading the files manually is redundant
Added 6 critical task completion rules:
- Rule 1: Tasks complete only when actually executed
- Rule 2: Handle missing prerequisites (start services, but don't install deps)
- Rule 3: Parent tasks require all subtasks complete
- Rule 4: TDD phases require actual test execution
- Rule 5: Verification tasks require evidence
- Rule 6: Incomplete tasks must stay incomplete
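
Rule 3 can be checked mechanically. The following is a minimal sketch, assuming the task list is a Markdown checklist in which subtasks are indented under their parent; the helper name `find_false_completions` and the sample task titles are hypothetical and not part of this PR.

```python
import re

# Matches "- [ ] title" / "- [x] title" with optional leading indentation.
TASK_RE = re.compile(r"^(\s*)- \[( |x|X)\] (.+)$")


def find_false_completions(tasks_md: str) -> list[str]:
    """Return titles of checked parent tasks that still have unchecked subtasks."""
    parsed = []  # (indent width, is checked, title)
    for line in tasks_md.splitlines():
        match = TASK_RE.match(line)
        if match:
            parsed.append((len(match.group(1)), match.group(2) != " ", match.group(3)))

    violations = []
    for i, (indent, done, title) in enumerate(parsed):
        if not done:
            continue
        # Following entries are descendants for as long as they are indented deeper.
        for child_indent, child_done, _ in parsed[i + 1:]:
            if child_indent <= indent:
                break
            if not child_done:
                violations.append(title)
                break
    return violations


example = """\
- [x] 1.0 Build login form
  - [x] 1.1 Write tests
  - [ ] 1.2 Run E2E tests
- [ ] 2.0 Add logout
  - [x] 2.1 Write tests
"""
print(find_false_completions(example))  # ['1.0 Build login form']
```

A check like this only covers the parent/subtask rule; the evidence requirements in Rules 1, 4, and 5 still depend on the agent actually producing test output.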

Problem 2: Limited tool access for subagents

Problem solved:
All agent files previously had an explicit tools: field in their YAML frontmatter listing only specific tools (Read, Write, etc.). This restricted subagents to the listed tools, preventing access to:

Edit tool - for targeted string replacements
MCP tools - Playwright, Context7, or any user-installed MCP servers
Future tools - any new tools added to Claude Code

Solution:
Removed the tools: field entirely from all 8 agent files:

implementer.md
spec-writer.md
tasks-list-creator.md
spec-shaper.md
spec-verifier.md
implementation-verifier.md
product-planner.md
spec-initializer.md

Rationale:
Per Claude Code documentation: "Omit the tools field to inherit all tools from the main thread (default), including MCP tools." This approach:

Enables full tool inheritance - subagents automatically get all tools available to main Claude Code instance
Includes MCP tools - Playwright for browser automation, Context7 for docs, any user-configured MCP servers
Zero maintenance - new tools or MCP servers don't require updating agent files
Edit tool now available - subagents can use targeted string replacements instead of full file rewrites
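
To illustrate the change, here is a minimal sketch of stripping the tools: line from an agent file's frontmatter. It assumes the field sits on a single line between the two `---` delimiters; the function name `remove_tools_field` and the sample frontmatter values are hypothetical, and the PR itself edited the files by hand.

```python
def remove_tools_field(agent_md: str) -> str:
    """Drop a single-line `tools:` entry from the YAML frontmatter, if present."""
    lines = agent_md.split("\n")
    if not lines or lines[0].strip() != "---":
        return agent_md  # no frontmatter, nothing to do
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return agent_md  # unterminated frontmatter, leave the file untouched
    front = [ln for ln in lines[1:end] if not ln.startswith("tools:")]
    return "\n".join([lines[0], *front, *lines[end:]])


before = """---
name: implementer
description: Implements a task group
tools: Write, Read, Bash
---
You are an implementer.
"""
after = remove_tools_field(before)
print(after)
```

With the field gone, the subagent falls back to the default described in the quoted documentation and inherits the main thread's tools.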

Backwards compatibility

Fully backwards compatible
Existing workflows continue to work
New rules only add stricter validation, don't break existing behavior
Rule 2 explicitly prevents dependency installation conflicts when running parallel agents
Removing tools: field is additive - subagents gain access to more tools, not fewer

References

Documentation sources used for these changes:

Claude Code Tools Documentation

Frontmatter Reference (tools/allowed-tools configuration):
https://github.com/anthropics/claude-code/blob/main/plugins/plugin-dev/skills/command-development/references/frontmatter-reference.md
Agent Creator (agent file structure with tools field):
https://github.com/anthropics/claude-code/blob/main/plugins/plugin-dev/agents/agent-creator.md

Key Documentation Excerpts

From the agent creator documentation regarding tool inheritance:

"Omit the tools field to inherit all tools from the main thread (default), including MCP tools"

This is the recommended approach for maximum flexibility, allowing subagents to access:

All built-in Claude Code tools (Read, Write, Edit, Bash, Glob, Grep, etc.)
All configured MCP server tools (Playwright, Context7, etc.)
Any future tools without configuration changes

…group

With Claude Opus 4.5, the previous wording caused two issues:

1. Multiple task groups in single implementer - Claude often attempted to pass all task groups to a single implementer subagent, exhausting context
2. No subagent at all - Sometimes Claude implemented task groups directly in the current conversation window without spawning any implementer

This fix adds:

- CRITICAL instruction to spawn SEPARATE implementer for EACH task group
- Execution strategy based on task group dependencies
- Examples of parallel execution (independent groups)
- Examples of sequential execution (dependent groups)

Each subagent now gets its own context window for focused implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
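
The dependency-based execution strategy in the commit message above can be pictured as grouping task groups into "waves": every group in a wave is independent of the others and could get its own implementer in parallel, while later waves wait for earlier ones. This is only a sketch of that idea; the function `plan_waves` and the group names are hypothetical, and the actual change is a prompt instruction rather than code.

```python
def plan_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Split task groups into waves; each wave depends only on earlier waves.

    `deps` maps a task group to the groups it depends on. Every dependency
    must itself be a key of `deps`, otherwise it can never be satisfied.
    """
    remaining = {group: set(needed) for group, needed in deps.items()}
    waves = []
    while remaining:
        ready = sorted(group for group, needed in remaining.items() if not needed)
        if not ready:
            raise ValueError("circular or unsatisfiable dependency between task groups")
        waves.append(ready)
        for group in ready:
            del remaining[group]
        for needed in remaining.values():
            needed.difference_update(ready)
    return waves


groups = {"db": set(), "api": {"db"}, "ui": {"api"}, "docs": set()}
print(plan_waves(groups))  # [['db', 'docs'], ['api'], ['ui']]
```

Here "db" and "docs" would each get a separate implementer at the same time, followed by "api" and then "ui", one implementer per group.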

…etion rules

Problem:
Implementer agents were marking tasks as completed without actually executing them. For example, tasks requiring "Run E2E tests" were marked done without running tests, and parent tasks were marked complete while subtasks remained incomplete. This led to false completion reports and unverified implementations.

Root causes identified:

1. Implementer did not read CLAUDE.md which contains project-specific rules
2. Implementer did not read project or global skills which may contain mandatory requirements (e.g., E2E testing requirements)
3. No explicit rules about what constitutes task completion
4. No rules about parent/subtask relationships

Changes:

1. Added instruction to read CLAUDE.md from project root
2. Added instruction to read project skills from .claude/skills/
3. Added instruction to read global skills from ~/.claude/skills/
4. Added 6 critical task completion rules:
- Rule 1: Tasks complete only when actually executed
- Rule 2: Start services yourself, but don't install dependencies (may conflict with parallel agents)
- Rule 3: Parent tasks require all subtasks complete
- Rule 4: TDD phases require actual test execution output
- Rule 5: Verification tasks require evidence (screenshots)
- Rule 6: Incomplete tasks must stay marked incomplete with explanation

This ensures implementer agents:

- Follow project-specific conventions and requirements
- Don't falsely mark tasks as complete
- Provide evidence for verification tasks
- Handle parallel execution safely (no dependency conflicts)

…ference

Problem:
The previous commit added instructions for implementer to read skills from specific directories (.claude/skills/ and ~/.claude/skills/). However, this approach has issues:

1. Hardcoded paths are Claude Code internal implementation details
2. Claude Code already pre-loads available skills as YAML frontmatter in the <available_skills> section of the system context
3. Subagents inherit the same context with available skills already visible
4. Reading files manually duplicates what Claude Code already provides

Solution:
Replace the two lines with specific paths with a single instruction that:

- Points to <available_skills> section in system context
- Instructs to use the Skill(skill_name) tool to invoke relevant skills
- Removes dependency on internal Claude Code directory structure

This makes the instructions more portable and aligned with how Claude Code actually handles skills discovery and invocation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…tions

All agent files previously had only Write and Read tools, which meant agents could only create/overwrite entire files but couldn't make targeted edits. This was inefficient when working with existing code.

The Edit tool allows precise string replacements (old_string → new_string) without rewriting entire files. This is the preferred approach in Claude Code for modifying existing files, as documented in the official Claude Code skills reference.

Changes:

- implementer.md: added Edit tool
- spec-writer.md: added Edit tool
- tasks-list-creator.md: added Edit tool
- spec-shaper.md: added Edit tool
- spec-verifier.md: added Edit tool
- implementation-verifier.md: added Edit tool
- product-planner.md: added Edit tool
- spec-initializer.md: added Edit tool

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…eritance

Remove the `tools:` field from all 8 subagent YAML frontmatter configurations.

Why this change was made:

1. **Default behavior is better**: When the `tools` field is omitted, subagents automatically inherit ALL tools from the main thread, including all MCP tools from user-configured MCP servers.
2. **MCP tools access**: With explicit `tools` field, subagents were restricted only to the listed tools. This meant MCP tools (like Playwright, Context7, or any user-installed MCP servers) required manual addition to each agent.
3. **Maintenance burden eliminated**: Previously, every new tool or MCP server required updating all agent files. Now subagents automatically get access to any tools available to the main Claude Code instance.
4. **Per Claude Code docs**: "Omit the tools field to inherit all tools from the main thread (default), including MCP tools" - this is the recommended approach for maximum flexibility.

Affected agents:

- implementer.md
- implementation-verifier.md
- spec-initializer.md
- spec-shaper.md
- spec-verifier.md
- spec-writer.md
- tasks-list-creator.md
- product-planner.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…and compatibility

In Claude Code 2.1.*, skills can be invoked via slash commands (typing `/` displays all skill names). However, skills with spaces in their `name` field fail to be found when selected from the menu.

This change adds a new substep (3.1) to the improve-skills command that converts skill names to kebab-case format (lowercase, spaces replaced with hyphens). This ensures skills are properly discoverable and invocable via the slash command interface.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

tonydehnke · 2026-01-09T03:07:30Z

Do you have a fork of Agent-OS that has all your improvements? I've been copying over bits and pieces here and there, but with none of these getting merged into Agent-OS and no recent releases of it, it's getting harder to remember what I've updated and what I haven't.

@CasJam what is the plan for getting PRs reviewed for Agent-OS?

Perlover · 2026-01-09T08:23:20Z

> Do you have a fork of Agent-OS that has all your improvements? I've been copying over bits and pieces here and there, but with none of these getting merged into Agent-OS and no recent releases of it, it's getting harder to remember what I've updated and what I haven't.
>
> @CasJam what is the plan for getting PRs reviewed for Agent-OS?

Yes, of course, I have a fork of this project; otherwise, I would not be able to propose patches in this issue/pull request to which you have just replied. If you want to install my agent-os (that is, my version) “very easily and simply,” then I suggest the following:

curl -sSL "https://raw.githubusercontent.com/Perlover/agent-os/fix/implementer-task-completion-rules/scripts/base-install.sh" \
    | sed 's|buildermethods/agent-os|Perlover/agent-os|g; s|/raw/main/|/raw/fix/implementer-task-completion-rules/|g; s|branch="main"|branch="fix/implementer-task-completion-rules"|g' \
    > /tmp/install-fixed.sh && bash /tmp/install-fixed.sh

Perlover · 2026-01-09T08:28:18Z

Also, just yesterday I added another commit that I believe is very important. The latest Claude Code started exposing skills as slash commands. When agent-os converts profile standards into skills, and you then additionally run the /improve-skills command, you still end up with skills that have a name field, but its value is plain English words separated by spaces. This is exactly what causes the problem with the new feature I mentioned earlier.

When you press the slash, you will see all skills, including the standards that were converted into skills. However, if you select any of those skills, Claude Code reports that the command was not found. This appears to be related to an implicit or undocumented rule in Claude Code that skill names must be in kebab-case. That requirement is not met when agent-os converts standards into skills (even when /improve-skills is invoked). The latest commit should also address this issue by patching the /improve-skills command.
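
As a rough illustration of the conversion, here is a minimal sketch of a kebab-case normalizer. The commit only specifies lowercasing and replacing spaces with hyphens; treating underscores as separators and dropping other punctuation are extra assumptions made here, and `to_kebab_case` is a hypothetical name (the actual fix is an instruction inside the /improve-skills command, not a function).

```python
import re


def to_kebab_case(name: str) -> str:
    """Lowercase a skill name and join its words with single hyphens."""
    lowered = name.strip().lower()
    # Each run of whitespace (or underscores, an assumption) becomes one hyphen.
    hyphenated = re.sub(r"[\s_]+", "-", lowered)
    # Assumption: anything other than a-z, 0-9 and hyphen is dropped.
    cleaned = re.sub(r"[^a-z0-9-]", "", hyphenated)
    # Collapse hyphen runs left behind by dropped characters.
    return re.sub(r"-{2,}", "-", cleaned).strip("-")


print(to_kebab_case("Frontend Components"))     # frontend-components
print(to_kebab_case("  Global  Coding Style "))  # global-coding-style
```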

… link

Removed external Claude documentation links and embedded comprehensive skill-writing rules directly into the improve-skills command.

Added:

- Three-Phase Loading Model explanation
- Supported/unsupported YAML frontmatter fields
- Description patterns for maximum discoverability
- Compactness Principle and File Structure Options
- Quick Template and Quality Checklist

Preserved Agent-OS specific content:

- "Include all relevant instructions, details and directives" recommendation
- Explicit mention of Agent OS standards files for linking

Co-Authored-By: Claude <noreply@anthropic.com>

Perlover · 2026-01-09T09:10:23Z

I have also just improved the improve-skills command. The original version contained only a link to the Claude documentation website, stating that the rules for writing skills were located there. I consider this approach incorrect, because it implicitly assumes that every time this command is run, Claude Code will go to the internet, download those rules again, and study them. I do not think it actually does this, and even if it does, it would only be in a very superficial way.

I reworked this command. I already had my own well-developed skill for writing skills, which I had carefully refined based on the Claude Code documentation. I used it to enhance the command, so that improve-skills does everything it did before but now carries the skill-writing rules directly within the command itself. This way, it does not need to search the internet for anything, and its behavior is always aligned with a single, consistent prompt.

Perlover · 2026-01-09T16:05:24Z

Last commit: e7856f1

Add Impact Analysis to Spec Writing Workflow

Problem

The current workflow searches for reusable code, but not for all affected code.

When refactoring shared constants/types, specs miss:

Duplicate definitions in other packages
Hardcoded values that should be updated
Subsystems not obvious from the main codebase

Result: specs look complete but implementation breaks things.

Solution

Add impact analysis at three stages:

research-spec.md - Ask users about OTHER areas using the same constants (background jobs, scripts, config, tests)
write-spec.md - New Step 2.5 with grep commands to find ALL usages and detect duplicates
create-tasks-list.md - New Step 1.5 to verify spec completeness before creating tasks

Why

Without this, a spec for "modify constant X" might update the main package but miss another package that has its own copy of the same constant. The code compiles, tests pass in isolation, but production breaks.
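
The duplicate-definition check can be sketched as follows. This is only an illustration of the idea: the PR adds grep-based instructions to the prompts, whereas this Python version, the helper name `find_definitions`, the regex heuristic, and the `MAX_RETRIES` demo tree are all assumptions made for the example.

```python
import re
import tempfile
from pathlib import Path


def find_definitions(root: Path, name: str, suffixes=(".py", ".ts", ".js")) -> list[str]:
    """List files under `root` that appear to define (not merely use) `name`."""
    # Heuristic: the name starts a statement, optionally after export/const/let/var
    # and an optional type annotation, and is followed by a single "=".
    pattern = re.compile(
        rf"^\s*(?:export\s+)?(?:(?:const|let|var)\s+)?{re.escape(name)}\s*(?::[^=\n]+)?=(?!=)",
        re.MULTILINE,
    )
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if pattern.search(text):
                hits.append(path.relative_to(root).as_posix())
    return hits


# Demo on a throwaway tree: two packages each carry their own copy of the constant.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg_a").mkdir()
    (root / "pkg_b").mkdir()
    (root / "pkg_a" / "constants.py").write_text("MAX_RETRIES = 3\n", encoding="utf-8")
    (root / "pkg_b" / "config.ts").write_text("export const MAX_RETRIES = 5;\n", encoding="utf-8")
    (root / "pkg_b" / "worker.py").write_text("if attempts == MAX_RETRIES:\n    pass\n", encoding="utf-8")
    definitions = find_definitions(root, "MAX_RETRIES")

if len(definitions) > 1:
    print("RED FLAG: duplicate definitions in", definitions)
```

In the demo, `worker.py` only uses the constant and is not reported, while the two defining files are flagged so the spec can list both.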

Problem:
When writing specs for refactoring tasks that modify shared constants or types, the current workflow only searches for "reusable" code, not ALL affected code. This caused specs to miss critical areas like duplicate definitions in different packages.

Solution:
Add three layers of impact analysis:

1. write-spec.md - New Step 2.5 "Impact Analysis":
- Search for ALL usages of constants/types being modified
- Detect duplicate definitions across packages (RED FLAG)
- Find hardcoded values that should use constants
- Document all affected packages/modules
2. research-spec.md - New impact analysis question:
- Ask users about OTHER packages that might be affected
- Record affected areas for spec-writer to investigate
3. create-tasks-list.md - New Step 1.5 "Verify Spec Completeness":
- Independent search before creating tasks
- Compare results with files mentioned in spec
- Flag and document any gaps found
- Create additional task groups for missing areas

This ensures refactoring specs capture ALL code that needs updates, preventing production issues from partially applied changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When changing constants or types, the existing impact analysis finds direct usages, duplicate definitions, and hardcoded values. However, it missed related data structures like lookup tables and transformation functions that depend on those values.

Added Step 2.5 point 5: "Find related mappings and transformations"

- Search for *Mapping*, *Map*, *transform*, *convert* patterns
- Explains why this matters (lookup tables, config objects, converters)
- Added to "Document your findings" checklist
- Added constraint #4 about checking related mappings

Example: when changing a shared constant, a lookup table that maps external values to that constant's values would be missed by direct grep searches but found by searching for *Mapping* patterns.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
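
The pattern search described above could look roughly like this. It is a sketch only: the commit describes glob-style grep patterns in the prompt, and the regex, the case-insensitive matching, the function name `find_related_identifiers`, and the sample source are assumptions for illustration.

```python
import re

# Identifiers containing "mapping", "map", "transform" or "convert" (any case).
RELATED_RE = re.compile(r"\b\w*(?:mapping|map|transform|convert)\w*\b", re.IGNORECASE)


def find_related_identifiers(source: str) -> list[str]:
    """Return the distinct identifiers in `source` that look like mappings or converters."""
    return sorted(set(RELATED_RE.findall(source)))


sample = (
    "const statusMapping = { A: 1 };\n"
    "function convertStatus(s) { return statusMapping[s]; }\n"
    "const total = 1;\n"
)
print(find_related_identifiers(sample))  # ['convertStatus', 'statusMapping']
```

Neither identifier mentions the constant being changed, which is why a direct search for the constant's name would not surface them.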

…me MCP tools

Resolved conflicts: kept tools field removed from agent files (from PR buildermethods#294) while incorporating the Chrome tool expansion function in common-functions.sh

…pstream PR buildermethods#294

Adapt improvements from buildermethods/agent-os PR buildermethods#294 to work with our JSON-based workflow system. These changes prevent false task completion and improve refactoring task handling.

Changes to implement-tasks.md:

- Add CLAUDE.md reading instruction for project-specific rules
- Add skills discovery via <available_skills> section
- Add 6 critical task completion rules:
- Rule 1: Tasks complete only when actually executed
- Rule 2: Handle missing prerequisites (start services, don't install deps)
- Rule 3: Task groups require all tasks complete
- Rule 4: TDD phases require actual test execution
- Rule 5: Verification tasks require evidence
- Rule 6: Incomplete tasks must stay incomplete

Changes to create-tasks-list.md:

- Add Step 1.5: Spec completeness verification for refactoring tasks
- Run grep searches to find all affected files before creating tasks
- Document gaps between spec and actual codebase impact

Changes to research-spec.md:

- Add Question 8 for impact analysis during requirements gathering
- Add Impact Analysis section to requirements.md template
- Capture affected areas for refactoring tasks

Changes to write-spec.md:

- Add Step 2.5: Comprehensive impact analysis for refactoring
- Add "Files Requiring Modification" section to spec.md template
- Check for duplicate definitions and related mappings

Note: PRs buildermethods#301 (Chrome alias) and buildermethods#292 (separate subagents) were already merged. PR buildermethods#293 (init-spec clarity) not applicable due to JSON workflow rewrite.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Perlover and others added 4 commits December 17, 2025 16:20

Perlover changed the title ~~Fix/implementer task completion rules~~ feat(agents): add task completion rules, skills discovery, and Edit tool Dec 19, 2025

Perlover and others added 2 commits December 19, 2025 16:28

implementation-verifier.md: Added the Skill tools too.

e40f2e7

This was referenced Jan 6, 2026

Add Chrome tool alias for Claude in Chrome MCP tools #300

Open

Add Chrome tool alias for Claude in Chrome MCP tools #301

Open

Perlover force-pushed the fix/implementer-task-completion-rules branch from fbcb1f4 to a6027e7 January 9, 2026 08:57

Perlover force-pushed the fix/implementer-task-completion-rules branch from a6027e7 to a386cf3 January 9, 2026 09:09

Perlover force-pushed the fix/implementer-task-completion-rules branch from c623d14 to e7856f1 January 9, 2026 16:13

Perlover force-pushed the fix/implementer-task-completion-rules branch from 9e28df3 to 1db26af January 9, 2026 16:19

dchuk added a commit to dchuk/agent-os that referenced this pull request Jan 11, 2026

Merge PR buildermethods#294: Add task completion rules, skills discovery, and Edit tool

de78def

fix(improve-skills): clarify name field is required in YAML frontmatter docs

1c5da73

Changed `name` field from ambiguous "auto" to "**yes**" in Required column. Added clarification that name must match directory name.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
