Practical patterns for
Claude Code
Software Architect at (软件架构师 @ 
claude-code-best-practice
Fetch weather of Karachi (获取卡拉奇天气)
By solving this problem statement, we’ll learn different concepts of agentic engineering along the way.
通过解决这个问题,我们将学习 智能体工程 的不同概念。
progressive disclosure — feeding the AI only what it needs right now, not everything upfront
orchestration — coordinating several AI agents like a conductor leads a band
dumb zone — the stretch where AI has too much context and starts thinking worse, not better
agentic workflows — AI that plans, acts, checks its work, and adapts — multi-step on its own
harness — the scaffolding around the model — files, terminal, tools — that turns a chatbot into a worker
compaction — auto-summarizing old chat so the AI keeps going without hitting its memory ceiling
context window — the AI’s working memory — how much it can “see” at once before older details fall off
ralph wiggum loop — when the AI repeats the same broken step in circles, like a confused kid who can’t stop
MCP — a universal adapter letting AI talk to your tools (GitHub, Slack, databases)
hooks — auto-triggers that run your rules before or after the AI does anything
context rot — quality decay as the conversation drags on and earlier details blur
prompt engineering — the craft of phrasing requests so the AI understands exactly what you mean
AI slop — low-effort, generic AI output that looks polished but says nothing
inference — the moment the model actually runs to produce an answer. Training is when the model learned (once, long ago). Inference is the model answering you, right now
hallucination — when AI confidently makes up facts that sound true but aren’t
context bloat — overstuffing the AI’s memory so it slows down and loses focus
one-shot prompting — giving the AI one example and asking it to follow the same pattern
token burn — wasting expensive AI “words” on unnecessary back-and-forth or bloated prompts
vibe coding — describing what you want in plain English and hoping the AI nails it
agentic engineering — building guardrails so AI acts like a reliable teammate, not a gamble
discuss
Model (Brain 🧠 — e.g. Opus, GPT) + Harness (Body 💪 — e.g. tools, MCP, memory)
模型 (大脑) + 工具链 (身体)
A horse. A model.
一匹马。一个模型。
The model is the horse. Raw power, no direction.
模型就是那匹马。原始的力量,没有方向。
Every turn is a fresh API call.
每一轮都是一次全新的 API 调用。
Memory only exists if the harness replays the transcript.
只有当工具链重放对话记录时,记忆才存在。
Stochastic means random/probabilistic — models don’t know the answer, they sample from a probability distribution.
随机意味着概率性 — 模型不"知道"答案,它们从概率分布中采样。
Prompt: “The sky is ___”
Each run the model samples — temperature controls how widely it samples.
Source: Bender, Gebru, McMillan-Major, Mitchell — On the Dangers of Stochastic Parrots (2021)
You set it to zero. You expect the same answer every time. You’re wrong.
你设为0。你期望每次得到相同答案。你错了。
Qwen3-235B at temperature = 0 — first divergence at token 103 (“Queens, New York” vs “New York City”)
Server load varies → batch size varies → kernel reductions reorder → numerics shift. Not GPU randomness — arithmetic order.
Batch-invariant kernels → consistent reduction order → identical numerics every run.
Determinism is engineered in — at every layer.
Source: Thinking Machines — Defeating Nondeterminism in LLM Inference (2025)
“What is the capital of Japan?”
“When did Pakistan gain independence?”
“Who wrote Romeo and Juliet?”
“Who won yesterday’s match?”
“What’s today’s USD → PKR rate?”
“What did Anthropic release yesterday?”
Every model — no matter how new — has a knowledge cut-off. Events after that date simply do not exist inside the model.
每个模型 — 无论多新 — 都有知识截止日期。该日期之后的事件在模型中根本不存在。
Knowledge cut-off: January 2026
Released 2026-04-17
Knowledge cut-off: December 1, 2025
Released 2026-04-23 — brand-new, but still has a cut-off.
Knowledge cut-off: January 2025
Released 2026-02-19
The raw model has no real-time access — no internet, no files, no clock.
原始模型没有实时访问能力 — 没有互联网、没有文件、没有时钟。
A horse harness. A model harness.
马的缰绳。模型的工具链。
The model is the horse. Raw power, no direction. The harness is everything else.
模型就是那匹马。原始的力量,没有方向。工具链是一切其他。
The origin is Old French harneis — gear, equipment, armor.
源自古法语 harneis — 装备、器具、盔甲。
In the diagram above: Turn × 1 · Inference × 2
Turn — one round from the user’s view: you ask, the assistant answers. The entire flow above — your request, the assistant’s tool calls, and the final reply — is one turn.
回合 — 从用户视角看的一轮:你提问,助手回答。上面的整个流程 — 你的请求、助手的工具调用、最终回复 — 是一个回合。
Inference — one call to the language model. The model wakes up, reads the input it was given, writes a reply, then forgets everything. Every arrow touching the “Language Model” column above is a separate inference. One turn can contain many inferences.
推理 — 一次对语言模型的调用。模型醒来,读取给定的输入,写出回复,然后忘掉一切。上面每个触碰"Language Model"列的箭头都是一次独立的推理。一个回合可以包含多次推理。
Source: Anthropic — Claude Code in Action: What is a coding assistant?
A dedicated Claude worker — own context, tools, focus.
一个专属的 Claude 工作者 — 独立的上下文、工具、关注点。
✅ fresh working memory per run 每次运行都有全新的工作记忆
What the specialist (or Claude) can actually do.
专家(或 Claude)实际能做什么。
✅ progressive disclosure — loaded on demand 渐进式披露 — 按需加载
Repeatable step-by-step recipes — like an AC install guide.
可重复的逐步配方 — 就像空调安装指南。
✅ reproducible recipes 可复现的配方
Knowledge you provide to the model.
你提供给模型的知识。
⚠️ 200-line problem 200行问题
What Claude holds in his head now — fresh every new chat session.
Claude 现在脑子里装的东西 — 每次新对话都是全新的。
⚠️ dumb-zone problem 盲区问题
Built-in: Read, Edit, Bash, WebSearch.
内置:读取、编辑、Bash、网页搜索。
USB-C for AI — plug in external tools (databases, browsers, APIs).
AI 的 USB-C — 插入外部工具(数据库、浏览器、API)。
e.g. 👁️ Claude in Chrome
Allow / ask / deny for tool use.
对工具使用的允许 / 询问 / 拒绝。
Deterministic scripts that fire on events.
事件触发时执行的确定性脚本。
Knowledge Anthropic bakes in.
Anthropic 内置的知识。
e.g. identity · tone · safety
例如:身份 · 语调 · 安全
✅ always on 始终开启
The harness reaches out via WebSearch and fetches a real answer from live sources.
工具链通过网页搜索从实时来源获取真实答案。
Really?
真的吗?
Similar prompt — but this time the model decided not to use the tool.
相似的提示词 — 但这一次模型决定不使用工具。
The model first tried one source — it failed (403) — so it fell back to another.
模型先尝试了一个来源 — 它失败了(403) — 于是回退到另一个。
The model does not always follow the same path.
模型并不总是遵循相同的路径。
Model sometimes failed/forgot to use the tool.
模型有时失败/忘记使用工具。
Andrej Karpathy — OpenAI founding team · former Director of AI at Tesla · founder of Eureka Labs.
Uncle Bob warns that “vibe coding” — generating code from prompts without understanding what the LLM produces — is hazardous for novices.
Uncle Bob 警告说,"氛围编码" — 从提示词生成代码而不理解 LLM 产出什么 — 对新手来说是危险的。 LLMs are mathematical functions that predict the next most likely token via matrix multiplications, trained on internet text and GitHub code. They are powerful tools — but, as he puts it, “novices using power tools lose fingers.”
Robert C. “Uncle Bob” Martin — author of Clean Code · Clean Architecture · co-author of the Agile Manifesto.
Source: Robert C. Martin on X
Examples: weather reporter, front-end engineer, QA engineer.
例如:天气播报员、前端工程师、QA 工程师。
root/ ├── .claude/ │ └── agents/ │ └── weather-agent.md └── README.md
/agentsType /agents.
Opens an interactive menu — pick "Create new agent" and the CLI drafts the agent file for you.
打开交互式菜单 — 选择"Create new agent",CLI 会为你起草智能体文件。
Creates .claude/agents/<name>.md — a plain markdown file anyone can edit.
创建 .claude/agents/<name>.md — 任何人都可以编辑的纯 Markdown 文件。
Not so fast...
别急...
The agent will always call the Open-Meteo API — no more source drift.
智能体将始终调用 Open-Meteo API — 不再有来源漂移。
It’s not guaranteed that Claude will always call this agent.
无法保证 Claude 始终会调用这个智能体。
weather-agent
harness
"PROACTIVELY" for auto-invocation
prompt
haiku, sonnet, opus, or inherit (default). weather-agent uses sonnet
harness
WebFetch, Read, Write, etc.
harness
weather-fetcher
harness
5
harness
user, project, or local. weather-agent uses project
harness
green
harness
The skills: field is what makes the agent special. It preloads any-skill directly into the agent’s brain at startup — before the agent has received a single instruction.
claude-code-best-practice Tips & Tricks
claude-code-best-practice Tips & Tricks
Knowledge you provide to the model — read every session.
你提供给模型的知识 — 每次会话都会读取。
root/ ├── CLAUDE.md └── README.md
/initType /init in Claude Code.
Claude scans your codebase and drafts a starter CLAUDE.md for you — project conventions, key patterns, common commands.
Claude 扫描你的代码库并为你起草一个初始的 CLAUDE.md — 项目约定、关键模式、常用命令。
Creates CLAUDE.md in your repo root — a plain markdown file you edit and commit.
在你的仓库根目录创建 CLAUDE.md — 一个你可以编辑并提交的纯 Markdown 文件。
claude-code-best-practice Tips & Tricks
Examples: weather fetching, sorting CSV rows, generating SVG cards.
例如:获取天气、排序 CSV 行、生成 SVG 卡片。
root/ ├── .claude/ │ └── skills/ │ └── weather-fetcher/ │ └── SKILL.md └── README.md
Just describe what you want Claude to do. It drafts the skill file for you.
只需描述你想让 Claude 做什么。它会为你起草技能文件。
"Create a skill that fetches weather from Open-Meteo for a given city."
"创建一个技能,为指定城市从 Open-Meteo 获取天气。"
Anthropic ships an official skill-creator skill. Invoke it and it walks you through generating a properly-structured skill.
Anthropic 提供了一个官方的 skill-creator 技能。调用它,它会引导你生成结构正确的技能。
Recommended — always produces the correct SKILL.md format.
推荐 — 始终生成正确的 SKILL.md 格式。
⚠️ Watch out (method 1) 注意(方法1)
The prompting method sometimes creates the wrong structure. Instead of generating a folder with SKILL.md inside (e.g. weather-fetcher/SKILL.md), it creates a plain weather-fetcher.md file. The wrong form isn’t recognized as a skill by Claude Code.
提示词方法有时会创建错误的结构。不是生成一个包含 SKILL.md 的文件夹(例如 weather-fetcher/SKILL.md),而是创建一个普通的 weather-fetcher.md 文件。错误的格式不会被 Claude Code 识别为技能。
Most fields control how and when the skill loads — enforced by the harness. Only description lives in prompt-land.
大多数字段控制技能如何以及何时加载 — 由工具链强制执行。只有 description 属于提示词层面。
/slash-command (defaults to directory name)
harness
/ menu (e.g. [city-name])
harness
true to prevent Claude from invoking this skill automatically
harness
false to hide from / menu — background knowledge only
harness
haiku, sonnet, opus
harness
fork to run skill in an isolated subagent context
harness
Small description, full body loaded on demand — this is progressive disclosure. Claude’s main context stays lean until the skill is actually needed.
The model's working memory — what it can see in this moment.
模型的工作记忆 — 它此刻能看到的内容。
/compact
Imagine Claude has a brain that holds everything it's aware of right now — your question, every file it's opened, every tool result, every word it's said back to you. If a thought isn't in the brain, Claude can't use it. Simple as that.
想象 Claude 有一个大脑,装着它此刻意识到的所有东西 — 你的问题、它打开的每个文件、每个工具结果、它对你说的每个字。如果一个想法不在大脑里,Claude 就无法使用它。就这么简单。
1. The brain is finite. It can hold about 1 million tokens — roughly 750,000 words. Big, but not infinite. 2. The brain empties at the end of every chat. When you start a new conversation, Claude remembers nothing from the last one unless you tell it again.
1. 大脑是有限的。它大约能装100万个token — 约75万个词。很大,但不是无限的。2. 大脑在每次对话结束时清空。当你开始新对话时,Claude 对上一次对话什么都不记得,除非你重新告诉它。
The moment you open Claude Code, certain things land in Claude's brain before you've typed a word. The rest waits in the wings — only loaded when you actually need it. This is called progressive disclosure.
当你打开 Claude Code 的那一刻,某些东西会在你还没打一个字之前就进入 Claude 的大脑。其余的在后台等待 — 只有当你真正需要时才加载。这就是渐进式披露。
Only descriptions of skills and agents are loaded at startup — the rest is fetched on-demand. That's progressive disclosure. It keeps the brain light.
只有技能和智能体的描述在启动时加载 — 其余按需获取。这就是渐进式披露。它让大脑保持轻盈。
by Nelson F. Liu · Stanford University · 2023
作者:Nelson F. Liu · 斯坦福大学 · 2023
CLAUDE.md) or near the user's most recent message. A bigger context window doesn't help if your payload lands in the middle.CLAUDE.md 中)或靠近用户最近的消息。如果你的内容落在中间,更大的上下文窗口也没有帮助。
This is the "dumb-zone problem" the deck has been warning about — now you know where it came from.
这就是这个演示文稿一直在警告的"盲区问题" — 现在你知道它从何而来。
Source: Liu et al. — Lost in the Middle: How Language Models Use Long Contexts (arXiv:2307.03172)
Repeatable step-by-step recipes — the instruction manual that makes Claude run the same playbook every time.
可重复的逐步配方 — 让 Claude 每次运行相同剧本的操作手册。