1
Build a Multi-Tool Agent with Escalation Logic
Practice designing an agentic loop with tool integration, structured error handling, and escalation patterns.
- 1Define 3-4 MCP tools with detailed descriptions that clearly differentiate purpose, inputs, and boundaries. Include at least two similar tools requiring careful description to avoid selection confusion.
- 2Implement an agentic loop that checks stop_reason to decide whether to continue tool execution or present the final response. Handle both "tool_use" and "end_turn" correctly.
- 3Add structured error responses: errorCategory (transient/validation/permission), isRetryable boolean, and human-readable descriptions. Test that the agent retries transient errors and explains business errors.
- 4Implement a programmatic hook intercepting tool calls to enforce a business rule (e.g., blocking operations above a threshold), redirecting to an escalation workflow.
- 5Test with multi-concern messages and verify the agent decomposes the request, handles each concern, and synthesizes a unified response.
Refor莽a:Domain 1Domain 2Domain 5
2
Configure Claude Code for a Team Development Workflow
Practice configuring CLAUDE.md hierarchies, custom slash commands, path-specific rules, and MCP server integration.
- 1Create a project-level CLAUDE.md with universal coding and testing standards. Verify project-level instructions apply across all team members.
- 2Create .claude/rules/ files with YAML frontmatter glob patterns (e.g., paths: ["src/api/**/*"], paths: ["**/*.test.*"]). Test that rules load only when editing matching files.
- 3Create a project-scoped skill in .claude/skills/ with context: fork and allowed-tools restrictions. Verify it runs in isolation without polluting the main conversation.
- 4Configure an MCP server in .mcp.json with environment-variable expansion for credentials. Add a personal server in ~/.claude.json and verify both are available simultaneously.
- 5Test plan mode versus direct execution on a single-file bug fix, a multi-file library migration, and a new feature with multiple valid approaches. Observe when plan mode provides value.
Refor莽a:Domain 3Domain 2
3
Build a Structured Data Extraction Pipeline
Practice designing JSON schemas, using tool_use for structured output, validation-retry loops, and batch processing.
- 1Define an extraction tool with required and optional fields, an enum with an "other" + detail pattern, and nullable fields. Process documents where some fields are absent and verify the model returns null rather than fabricating values.
- 2Implement a validation-retry loop: on validation failure, send a follow-up including the document, failed extraction, and specific error. Track which errors are resolvable (format) vs not (info absent).
- 3Add few-shot examples demonstrating extraction from varied formats (inline citations vs bibliographies, narrative vs tables) and verify improved handling.
- 4Submit a batch of 100 documents via the Message Batches API, handle failures by custom_id, resubmit with modifications (e.g., chunking oversized docs), and calculate processing time relative to SLA constraints.
- 5Have the model output field-level confidence scores, route low-confidence extractions to human review, and analyze accuracy by document type and field.
Refor莽a:Domain 4Domain 5
4
Design and Debug a Multi-Agent Research Pipeline
Practice orchestrating subagents, managing context passing, error propagation, and synthesis with provenance tracking.
- 1Build a coordinator delegating to at least two subagents (web search, document analysis). Ensure allowedTools includes "Task" and each subagent receives findings directly in its prompt.
- 2Implement parallel subagent execution via multiple Task calls in a single response. Measure latency improvement over sequential.
- 3Design structured subagent output separating content from metadata: each finding includes a claim, evidence excerpt, source URL/name, and publication date. Verify synthesis preserves attribution.
- 4Simulate a subagent timeout and verify the coordinator receives structured error context, can proceed with partial results, and annotates coverage gaps.
- 5Test with conflicting source data and verify synthesis preserves both values with attribution rather than arbitrarily selecting one, distinguishing well-established from contested findings.
Refor莽a:Domain 1Domain 2Domain 5