> A practical guide to the Nous Research open-source framework.
> Based on Hermes Agent **v0.8.0** (v2026.4.8 — "the intelligence release"). Updated: April 9, 2026.
> Original content: https://github.com/alchaincyf/hermes-agent-orange-book
## Part 1: Concepts
### 01 Not Another Agent: From Harness to Hermes
**What is Harness Engineering?**
In early 2026, a consensus emerged in the AI coding world: the bottleneck isn't the model, it's the environment. The LangChain team ran an experiment with a fixed model (GPT-5.2-Codex), adjusting only the surrounding "harness" configuration. The score rose from 52.8% to 66.5% and the ranking climbed from the Top 30 into the Top 5, without any change to the model itself.
Mitchell Hashimoto (creator of Terraform) named this: **Harness Engineering**. His approach: every time the AI made a mistake, add a rule so it would never make the same mistake again.
**The five-component mapping:**
| Harness Component | Manual Implementation | Hermes Built-in System |
|---|---|---|
| Instruction Layer | Hand-write CLAUDE.md / AGENTS.md | Skill system (markdown skill files, auto-created + self-improving) |
| Constraint Layer | Configure hooks / linter / CI | Tool permissions + sandbox + toolset enabled on demand |
| Feedback Layer | Manual review / evaluator Agent | Self-improving Learning Loop (auto-retrospective after each task) |
| Memory Layer | Manually maintain knowledge base | Three-layer memory (session/persistent/Skill) + Honcho user modeling |
| Orchestration Layer | Build your own multi-Agent pipeline | Sub-Agent delegation + cron scheduling |
**Hermes vs OpenClaw vs Claude Code:**
- **Claude Code** - interactive coding. Your pair-programming partner at the terminal.
- **OpenClaw** - configuration-as-behavior. Write a SOUL.md and it becomes what you want. 5,700+ community Skills on ClawHub.
- **Hermes** - autonomous background work + self-improvement. Runs on its own, learns on its own. Online 24/7 via Telegram/Discord/Slack.
All three tools use the **agentskills.io standard**, so Skills are interoperable.
### 02 Hermes at a Glance: 60 Seconds
**Architecture in one line:**
```
Learning Loop --> Three-Layer Memory --> Skill System --> 40+ Tools --> Multi-Platform Gateway
```
**Key numbers (v0.8.0, released April 8, 2026):**
| Metric | Data |
|---|---|
| GitHub stars | 41,200+ |
| Built-in tools | 40+ |
| Supported platforms | 14 |
| MCP integrations | 6,000+ apps |
| Sub-Agent concurrency | Up to 3 |
| Minimum deployment cost | $5/month VPS |
| Memory usage | <500MB (without local LLM) |
| License | MIT (fully open source) |
**Key differences from OpenClaw:**
| Dimension | Hermes Agent | OpenClaw |
|---|---|---|
| Core philosophy | Self-improving Learning Loop | Configuration-as-behavior (SOUL.md) |
| Memory | Three-layer self-improving | Multi-layer, primarily manually maintained |
| Skill maintenance | Agent auto-creates + self-improves | Manually written and maintained |
| User modeling | Honcho dialectical modeling (12-layer identity inference) | Based on SOUL.md configuration |
| Multi-platform | 14-platform Gateway | 50+ messaging platforms |
| Ecosystem | 40+ built-in tools + MCP 6,000+ | ClawHub 5,700+ community Skills |
| Deployment | Self-hosted (from $5 VPS) | Official hosting / self-hosted |
| Skill interop | Both use agentskills.io standard | Both use agentskills.io standard |
## Part 2: Core Mechanisms
### 03 The Learning Loop
The Learning Loop has five steps that form a continuous improvement flywheel:
```
Curate Memory --> Create Skill --> Skill Self-Improvement --> FTS5 Recall --> User Modeling
```
**Step 1: Memory curation**
After each conversation, Hermes actively decides what is worth remembering. This is not passive storage: it writes the valuable information into SQLite with FTS5 full-text indexing, much like a person keeping a diary.
**Step 2: Autonomous Skill creation**
When a complex task is complete, Hermes asks: "will this solution be useful again?" If yes, it distills it into a Skill file at `~/.hermes/skills/`.
**Step 3: Skill self-improvement**
Every time a Skill is used and you provide feedback, Hermes modifies the Skill itself. It updates documentation and standards - not just the current output.
**Step 4: FTS5 cross-session recall**
Hermes uses SQLite's FTS5 extension for full-text indexing. Before each new conversation, it searches historical memory based on the current topic and loads only the relevant parts. Indexing and search run entirely on your machine, so the stored history itself never has to leave it.
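The recall step can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch, assuming a Python build whose SQLite includes FTS5 (most do); the table name `messages`, its columns, and the `recall` helper are hypothetical and are not Hermes's actual schema.

```python
import sqlite3

# In-memory database for illustration; the table and column names are
# hypothetical and do not reflect Hermes's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, content)")

conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [
        ("s1", "Compared Hetzner and Vultr pricing for the VPS deployment"),
        ("s1", "Decided to run Hermes in Docker with a mounted state directory"),
        ("s2", "Drafted commit message rules for the git workflow"),
    ],
)


def recall(query: str, limit: int = 3) -> list[tuple[str, str]]:
    """Return the best-matching (session_id, content) rows for a topic query."""
    rows = conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


hits = recall("docker")
print(hits)
```

Only the rows that match the topic are returned, which is the point of on-demand recall: the prompt receives a handful of relevant snippets rather than the whole history.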
**Step 5: User modeling**
Honcho user modeling (by Plastic Labs) infers what kind of person you are across 12 identity layers - not just what you said, but deeper patterns from behavior.
**Manual vs automated comparison:**
| Dimension | Mitchell's Way (Manual) | Hermes's Way (Automated) |
|---|---|---|
| Rule source | Human spots a problem, writes it down | Agent extracts from its own feedback |
| Storage | CLAUDE.md (single file) | Multiple Skill files + memory database |
| Improvement trigger | Only when human remembers | Automatic evaluation after every use |
| Cross-project portability | Manually copy CLAUDE.md | Skills are global, shared across all projects |
| Improvement speed | Depends on human diligence | Continuous and automatic |
---
### 04 Three-Layer Memory
**Layer 1: Session memory (Episodic)**
- Answers: "What happened?"
- Every conversation's content, tool calls, and results written to SQLite with FTS5
- On-demand retrieval - not all history loaded at once
- Purely local, no network dependency
**Layer 2: Persistent memory (Semantic)**
- Answers: "Who are you?"
- Stores durable state: coding preferences, project structure habits, toolchain
- Stored under `~/.hermes/memories/` (MEMORY.md + USER.md)
- Portable: back up the directory and continue on any machine
**Layer 3: Skill memory (Procedural)**
- Answers: "How to do things?"
- Each Skill is a markdown file in `~/.hermes/skills/`
- Human-readable and editable
**Cognitive science analogy:**
| Memory Type | What Hermes Stores | Human Analogy |
|---|---|---|
| Episodic | What happened | Remembering falling off a bike |
| Semantic | Who you are + project context | Knowing to keep center of gravity low |
| Procedural | How to do things (Skills) | Body automatically balancing |
**Honcho (optional add-on):**
- Dialectical user modeling with 12 identity layers
- Infers technical level, work rhythm, communication style, goals, emotional patterns
- Catches inconsistencies between stated and revealed preferences
- Injected as invisible context into subsequent prompts
**Memory plugins (expanded in v0.8.0):**
The built-in SQLite memory is solid for solo use, but v0.8.0 turns the memory layer into a proper plugin system. You pick your backend:
| Plugin | What It Does | Best For |
|---|---|---|
| **Built-in SQLite + FTS5** | Default. Local, fast, private | Solo use, privacy-first setups |
| **Supermemory** | Cloud-hosted semantic memory, multi-container, per-user scoping | Teams, multi-platform deployments |
| **mem0** (v2 API) | Managed long-term memory with semantic search | Production agents, API-first setups |
| **Hindsight** | Reflective memory that learns from past sessions | Research workflows, iterative projects |
| **RetainDB** | Structured memory with dialectic mode | Data-heavy agents |
| **ByteRover** | Pre-LLM-call context injection | Latency-sensitive pipelines |
| **OpenViking** | Multi-tenant server mode with tenant-scoping headers | Enterprise / shared deployments |
All plugins now receive the gateway `user_id` for per-user memory scoping. This matters when your Hermes instance serves multiple people across Telegram or Discord.
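A rough sketch of what per-user scoping means in practice, written as a toy in-memory backend. The `ScopedMemory` class and its methods are hypothetical illustrations, not the plugin interface Hermes exposes.

```python
from collections import defaultdict


class ScopedMemory:
    """Toy memory backend that keeps a separate list of notes per user_id."""

    def __init__(self) -> None:
        self._notes: dict[str, list[str]] = defaultdict(list)

    def remember(self, user_id: str, note: str) -> None:
        self._notes[user_id].append(note)

    def recall(self, user_id: str) -> list[str]:
        # Only this user's notes are visible; other users' notes never leak in.
        return list(self._notes.get(user_id, []))


memory = ScopedMemory()
memory.remember("telegram:1001", "prefers concise answers")
memory.remember("discord:2002", "works mostly in Rust")
print(memory.recall("telegram:1001"))
```

Keying every read and write on the `user_id` is what keeps one person's preferences from shaping another person's replies on a shared instance.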
> Warning: Memory has no automatic expiration by default. Audit `~/.hermes/` periodically and clean up outdated Skill files. The silent `/new` and `/resume` memory flush failure is fixed in v0.8.0.
**What to keep in memory:**
| Remember | Don't Remember |
|---|---|
| User preferences and habits | One-off task details |
| Project context | Outdated information |
| Validated solutions (Skills) | Wrong inferences (clean these up) |
| Recurring patterns | Sensitive info (passwords, keys) |
---
### 05 The Skill System
**Three sources of Skills:**
| Source | Description | Scale |
|---|---|---|
| Bundled Skills | Pre-built capabilities shipping with install | 40+ |
| Agent-Created | Automatically distilled after complex tasks | Grows with usage |
| Skills Hub | Community-contributed, installable with one click | Continuously growing |
**agentskills.io standard:**
- Supported by 30+ tools including Claude Code, Cursor, Copilot, Codex CLI, Gemini CLI
- Skills you wrote for Claude Code work directly in Hermes, and vice versa
- Not a walled garden - like a USB port, one Skill plugs in anywhere
**Skill self-improvement cycle:**
1. Execute the Skill
2. Collect feedback (user reactions logged into session memory)
3. Agent analyzes feedback and modifies the Skill file
4. Next execution uses the new version
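The cycle above can be sketched as a small revision loop. This is an illustration under loud assumptions: the `Skill` dataclass and `apply_feedback` function are hypothetical, and appending feedback verbatim stands in for the LLM-driven rewrite a real agent would perform.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    rules: list[str] = field(default_factory=list)
    revision: int = 1


def apply_feedback(skill: Skill, feedback: str) -> Skill:
    """Return a new revision of the skill with the feedback folded in as a rule.

    A real agent would ask an LLM to rewrite the skill file; appending the
    feedback verbatim is a stand-in so the sketch stays runnable.
    """
    feedback = feedback.strip()
    if not feedback or feedback in skill.rules:
        return skill  # nothing actionable, keep the current revision
    return Skill(skill.name, skill.rules + [feedback], skill.revision + 1)


v1 = Skill("git-commit-style", ["First line: type(scope): summary"])
v2 = apply_feedback(v1, "Body explains WHY, not WHAT")
print(v2.revision, v2.rules)
```

Returning a new object instead of mutating the old one keeps every revision inspectable, which mirrors the idea that each Skill change should be visible as a diff.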
**OpenClaw vs Hermes Skills:**
| Dimension | OpenClaw Skills | Hermes Skills |
|---|---|---|
| Creation | Manually written SOUL.md | Agent-created + manually written |
| Maintenance | Manual updates | Auto-evolution + manual intervention |
| Personalization | Generic templates, fork to customize | Grows organically from usage habits |
| Ecosystem Size | 5,700+ (large) | 40+ bundled + community (growing) |
> Note: Skill self-improvement requires clear feedback. Vague "something's off" doesn't help. Good feedback = good evolution direction.
---
### 06 40+ Tools and MCP
**Five tool categories:**
| Category | Core Tools | What They Do |
|---|---|---|
| Execution | terminal, code_execution, file | Run commands, execute code (sandboxed), read/write files |
| Information | web, browser, session_search | Web search, browser automation, search conversation history |
| Media | vision, image_gen, tts | Understand images, generate images, text-to-speech |
| Memory | memory, skills, todo, cronjob | Operate memory layer, manage Skills, task planning, scheduled jobs |
| Coordination | delegation, moa, clarify | Delegate to sub-agents, multi-model reasoning, ask user for clarification |
Notable tools:
- **session_search** - FTS5 full-text indexing of conversation history with LLM summarization
- **moa** (Mixture of Agents) - calls multiple LLMs simultaneously and synthesizes their responses
- **cronjob** - natural language scheduled tasks ("check my GitHub notifications every morning at 9am")
- **notify_on_complete** (new in v0.8.0) - background processes auto-notify the agent when they finish. Start a long-running build, test suite, or AI training run and walk away. The agent picks up results when they land without polling.
**Toolsets mechanism:**
Tools are grouped and enabled/disabled in `config.yaml`. Fewer enabled tools = more focused agent, faster response, fewer tokens consumed. Toolsets also serve as security boundaries.
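To make the "toolsets as security boundaries" idea concrete, here is a minimal sketch. The toolset names and their membership follow the category table above, but the exact keys Hermes uses in `config.yaml`, and the helper names `enabled_tools` and `can_call`, are assumptions.

```python
# Toolset membership mirrors the category table in this guide; the real
# config keys Hermes uses may differ.
TOOLSETS = {
    "execution": {"terminal", "code_execution", "file"},
    "information": {"web", "browser", "session_search"},
    "media": {"vision", "image_gen", "tts"},
}


def enabled_tools(enabled_toolsets: list[str]) -> set[str]:
    """Union of all tools in the enabled toolsets."""
    tools: set[str] = set()
    for name in enabled_toolsets:
        tools |= TOOLSETS[name]  # raises KeyError on an unknown toolset name
    return tools


def can_call(tool: str, enabled_toolsets: list[str]) -> bool:
    """A tool is callable only if one of the enabled toolsets contains it."""
    return tool in enabled_tools(enabled_toolsets)


print(can_call("web", ["information"]))
print(can_call("terminal", ["information"]))
```

An agent configured with only the `information` toolset simply has no path to the terminal, which is a stronger guarantee than instructing it not to use one.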
**MCP (Model Context Protocol):**
- Open standard proposed by Anthropic in late 2024
- Hermes supports stdio or HTTP connection to any MCP Server
- 6,000+ applications covered: GitHub, Slack, Jira, Google Drive, databases, etc.
- Per-server tool filtering: specify which tools each server can expose
- Full MCP OAuth 2.1 PKCE authentication added in v0.8.0
- Automatic OSV malware scanning of MCP extension packages on install (v0.8.0)
**Sub-Agent delegation:**
- Up to 3 concurrent sub-agents
- Each has independent context, restricted toolset, isolated terminal sessions
- Results relayed back to main agent for consolidation
- Best for: "do several unrelated things and then combine results"
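The concurrency cap can be sketched with an `asyncio.Semaphore`. This is an illustration of the pattern only; `run_subagent` and `delegate` are hypothetical names, and the short sleep stands in for a sub-agent's real work.

```python
import asyncio

MAX_CONCURRENT_SUBAGENTS = 3  # the limit quoted in this guide


async def run_subagent(task: str, gate: asyncio.Semaphore, stats: dict) -> str:
    async with gate:  # waits here when three sub-agents are already running
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # stand-in for the sub-agent's real work
        stats["running"] -= 1
        return f"done: {task}"


async def delegate(tasks: list[str]) -> tuple[list[str], int]:
    """Run all tasks, never more than three at once; return results and peak."""
    gate = asyncio.Semaphore(MAX_CONCURRENT_SUBAGENTS)
    stats = {"running": 0, "peak": 0}
    results = await asyncio.gather(*(run_subagent(t, gate, stats) for t in tasks))
    return list(results), stats["peak"]


results, peak = asyncio.run(delegate(["research", "code", "test", "docs", "report"]))
print(results, peak)
```

`asyncio.gather` returns results in submission order, so the main agent can consolidate them deterministically even though the sub-agents finish at different times.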
---
## Part 3: Hands-On Setup
### 07 Installation and Configuration
**Option 1: Local Install (5 minutes)**
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
Then launch with: `hermes`
**Option 2: Docker**
```bash
docker pull nousresearch/hermes-agent:latest
docker run -v ~/.hermes:/opt/data nousresearch/hermes-agent:latest
```
Key: `-v ~/.hermes:/opt/data` maps state to host. All state lives in one directory.
**Option 3: $5 VPS for 24/7 uptime**
| VPS Provider | Monthly Cost | Notes |
|---|---|---|
| Hetzner CX22 | ~$4/mo | Best value, European nodes |
| DigitalOcean Droplet | $5/mo | Singapore/US West nodes |
| Vultr | $5/mo | Tokyo node, low latency |
Pick Ubuntu 22.04 LTS, SSH in, run the install script.
**config.yaml structure:**
```yaml
model:
  provider: openrouter
  api_key: sk-or-xxxxx
  model: anthropic/claude-sonnet-4
terminal: local  # local/docker/ssh/daytona/modal
gateway:
  telegram:
    token: YOUR_BOT_TOKEN
  discord:
    token: YOUR_BOT_TOKEN
```
**Model providers:**
| Provider | Recommended Models | Best For |
|---|---|---|
| OpenRouter | Claude Sonnet 4 / GPT-4o | 200+ models, flexible switching |
| Nous Portal | Hermes 3 series + MiMo v2 Pro (free tier) | Officially recommended |
| OpenAI | GPT-4o / o3 | Direct API |
| Google AI Studio | Gemini 2.5 Pro / Flash | Native Gemini, auto context detection via models.dev |
| Ollama | Hermes 3 8B/70B | Fully offline, privacy first |
> Note: As of April 2026, Anthropic banned third-party tools from accessing Claude through Pro/Max subscriptions. Use API keys (pay-as-you-go) or OpenRouter/Nous Portal instead.
**Live model switching (new in v0.8.0):**
Use `/model` mid-session from the CLI, Telegram, Discord, or Slack. No restart needed. Telegram and Discord get an interactive inline button picker. Aggregator-aware: stays on OpenRouter/Nous when possible, falls back cross-provider automatically.
**Terminal backends:**
- `local` - runs directly on your machine
- `docker` - runs inside a container (isolated, secure)
- `ssh` - connects to a remote server
- `daytona` / `modal` - serverless, spins up on demand
- `singularity` - for HPC clusters
---
### 08 First Conversation
After launching, `~/.hermes/` structure:
```
~/.hermes/
├── config.yaml      # Your configuration
├── state.db         # SQLite database (conversation history + FTS5 index)
├── skills/          # Skills directory
│   └── bundled/     # Built-in Skills
├── memories/        # Persistent memory (MEMORY.md + USER.md)
└── logs/            # Centralized logs (new in v0.8.0)
    ├── agent.log    # INFO+ level events
    └── errors.log   # WARNING+ level events
```
Use `hermes logs` to tail and filter logs from the CLI. Config structure validation now catches malformed YAML at startup before it causes cryptic failures.
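As a rough idea of what a structural check on the parsed config might look like, here is a sketch that validates an already-loaded dictionary. The `validate_config` function and the specific rules it enforces are assumptions for illustration, not Hermes's actual validator.

```python
def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the shape is OK."""
    problems = []
    model = config.get("model")
    if not isinstance(model, dict):
        problems.append("model: expected a mapping")
    else:
        for key in ("provider", "api_key", "model"):
            if not isinstance(model.get(key), str) or not model[key]:
                problems.append(f"model.{key}: expected a non-empty string")
    gateway = config.get("gateway", {})
    if not isinstance(gateway, dict):
        problems.append("gateway: expected a mapping")
    else:
        for platform, settings in gateway.items():
            if not isinstance(settings, dict) or not settings.get("token"):
                problems.append(f"gateway.{platform}.token: missing")
    return problems


good = {
    "model": {"provider": "openrouter", "api_key": "sk-or-xxxxx",
              "model": "anthropic/claude-sonnet-4"},
    "gateway": {"telegram": {"token": "YOUR_BOT_TOKEN"}},
}
print(validate_config(good))
```

Collecting every problem into a list, rather than failing on the first one, lets a startup check report all mistakes in a single pass.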
What happens behind the scenes from your first message:
1. Conversation written to `state.db` with FTS5 index
2. Preferences detected and written to persistent memory layer
3. After complex tasks, Skill files auto-created in `~/.hermes/skills/`
4. Skill improves when you give corrective feedback
---
### 09 Multi-Platform Access
**Supported platforms (14 total):** Telegram, Discord, Slack, WhatsApp, Signal, Email, SMS (Twilio), Home Assistant, Mattermost, Matrix, DingTalk, Feishu/Lark, WeCom, Open WebUI
**Telegram setup (3 steps, <2 minutes):**
1. Message @BotFather in Telegram, send `/newbot`, get Token
2. Add to `config.yaml` under `gateway.telegram.token`
3. Launch `hermes` - it auto-connects
**Cross-platform continuity:**
All platforms share the same Agent instance and memory. A conversation started on Telegram can be continued in the CLI. There is one brain, regardless of which door you walk through.
**Practical deployment architecture:**
```
$5 VPS (Ubuntu 22.04)
├── Hermes Agent Core
├── Messaging Gateway
│ ├── Telegram Bot (phone)
│ ├── Discord Bot (team)
│ └── Slack App (enterprise)
├── ~/.hermes/
│ ├── state.db
│ ├── skills/
│ └── config.yaml
└── Model calls --> OpenRouter API
```
Total cost: VPS $5/month + model API fees (~$2-5/month for light usage).
---
### 10 Custom Skills
**Skill file structure:**
```
~/.hermes/skills/
└── my-skill/
    ├── SKILL.md        # Entry point
    ├── references/     # Supporting reference files
    ├── templates/      # Templates
    └── scripts/        # Scripts
```
**Anatomy of a good Skill:**
| Section | Purpose | Required? |
|---|---|---|
| Title | Quick identification | Yes |
| Trigger | When to activate | Strongly recommended |
| Rules | Concrete steps, constraints, formats | Yes |
| Example | Complete input-to-output | Strongly recommended |
| Don'ts | Explicit boundaries | Optional |
**Example Skill (git-commit-style):**
```markdown
---
name: git-commit-style
description: Enforce a consistent Git commit message format
version: "1.0.0"
---
# Git Commit Style
## Trigger
Activate when the user asks me to commit code, write a commit message, or review commit history.
## Rules
### Commit Message Format
- First line: type(scope): summary (50 chars max)
- Blank line
- Body: explain WHY, not WHAT
### Type Enum
- feat: new feature
- fix: bug fix
- refactor: restructure (no behavior change)
- docs: documentation
- test: tests
- chore: build/toolchain
```
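The rules in this example Skill are precise enough to check mechanically. The sketch below is one possible checker, assuming the 50-character limit applies to the whole first line and that the scope is always present; `check_commit_message` is a hypothetical helper, not part of Hermes.

```python
import re

TYPES = ("feat", "fix", "refactor", "docs", "test", "chore")
# type(scope): summary, with one of the allowed types and a non-empty scope
FIRST_LINE = re.compile(rf"^({'|'.join(TYPES)})\(([^()\s]+)\): (\S.*)$")


def check_commit_message(message: str) -> list[str]:
    """Return the rule violations found in a commit message (empty if none)."""
    lines = message.split("\n")
    problems = []
    if not FIRST_LINE.match(lines[0]):
        problems.append("first line must look like type(scope): summary")
    if len(lines[0]) > 50:
        problems.append("first line exceeds 50 characters")
    if len(lines) > 1 and lines[1] != "":
        problems.append("second line must be blank")
    return problems


print(check_commit_message("fix(auth): handle expired tokens\n\nSessions failed silently."))
```

The "explain WHY, not WHAT" rule is deliberately left out: it needs judgment, which is the part the Skill delegates to the agent.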
**Installing from Skills Hub:**
Ask Hermes "What community Skills are available?" and then "Install XX Skill." The Skill is active immediately, with no restart needed.
**Porting Claude Code Skills:**
Skills follow the agentskills.io standard, so you can copy a Skill to `~/.hermes/skills/skill-name/SKILL.md` with no format changes. Adjust tool references only if the Skill uses Claude Code-specific MCP servers.
> Note: Skills can conflict if two have overlapping triggers. If behavior seems off, check for Skill conflicts in `~/.hermes/logs/`.
---
### 11 MCP Integration
**Two connection modes:**
| Mode | Server Location | Best For | Performance |
|---|---|---|---|
| stdio | Local subprocess | Local tools, file system, databases | Fast, no network overhead |
| HTTP (StreamableHTTP) | Remote server | Cloud services, shared team servers | Depends on network |
**Approval buttons (new in v0.8.0):**
Dangerous commands no longer require you to type `/approve` in chat. Slack and Telegram now surface native inline buttons. Slack preserves full thread context; Telegram uses emoji reactions for approval status. Less friction, same safety boundary.
**GitHub MCP setup:**
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
```
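The `${GITHUB_TOKEN}` placeholder keeps the secret out of the config file. How Hermes resolves such placeholders is not described here; the sketch below shows one plausible approach using `string.Template` from the standard library, with a hypothetical `expand_env` helper.

```python
import os
from string import Template


def expand_env(env_block: dict[str, str], environ=None) -> dict[str, str]:
    """Replace ${NAME} placeholders with values from the process environment.

    Raises KeyError if a referenced variable is not set, which is safer than
    silently passing an empty token to the server.
    """
    source = os.environ if environ is None else environ
    return {key: Template(value).substitute(source) for key, value in env_block.items()}


resolved = expand_env(
    {"GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"},
    {"GITHUB_TOKEN": "ghp_example"},
)
print(resolved)
```

One caveat of this approach: `string.Template` treats every `$` as special, so a literal dollar sign in a value would need to be written as `$$`.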
**Database MCP (PostgreSQL):**
```yaml
mcp_servers:
  postgres:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-postgres"]
    env:
      POSTGRES_CONNECTION_STRING: "postgresql://user:pass@localhost:5432/mydb"
```
**Per-server tool filtering (principle of least privilege):**
```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxx"
    allowed_tools:
      - "list_issues"
      - "create_issue"
      - "get_pull_request"
      - "create_pull_request_review"
```
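Conceptually, an allow-list like this is an intersection between what the server advertises and what you permit. A minimal sketch follows, assuming that an absent `allowed_tools` key exposes everything (that default is a guess); `filter_tools` and the `delete_repository` tool name are hypothetical.

```python
from typing import Optional


def filter_tools(advertised: list[str], allowed: Optional[list[str]]) -> list[str]:
    """Keep only tools on the allow-list; no allow-list means expose everything."""
    if allowed is None:
        return list(advertised)
    allowed_set = set(allowed)
    return [name for name in advertised if name in allowed_set]


advertised = ["list_issues", "create_issue", "delete_repository"]
print(filter_tools(advertised, ["list_issues", "create_issue"]))
```

Note the difference between `None` and an empty list: the first means "no restriction configured", the second means "expose nothing".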
**When to use MCP vs native tools:**
- **Native tools** for: terminal commands, file operations, web search, image generation, memory management, sub-Agent delegation
- **MCP** for: GitHub, databases, Slack, Jira, Google Drive, and other external services requiring specific API protocols
> Practical advice: Don't connect a dozen MCP Servers on day one. Start with one or two you use most (GitHub, database), get comfortable, then add more.
**Browser backend change (v0.8.0):**
The managed browser provider switched from Browserbase to [Browser Use](https://browser-use.com). Firecrawl is now available as an additional cloud browser option. If your config referenced Browserbase explicitly, update it.
**MCP + Skills combo:** MCP solves "what can I connect to," Skills solve "how to use it." Example: GitHub MCP provides PR diffs, a "Code Review" Skill defines review criteria - together Hermes auto-reviews code against your standards.
## Part 4: Real-World Scenarios
### 12 Personal Knowledge Assistant
**The cross-session memory advantage:**
Traditional AI: Re-explain context every session (3-5 minutes of setup per conversation).
With Hermes, after week one's conversations, the three memory layers have recorded:
| Memory Layer | What It Records |
|---|---|
| Session memory (SQLite + FTS5) | Exact conversation text, precise retrieval when details needed |
| Persistent memory | "User is researching AI Agent deployment, ruled out option X, prefers low cost" |
| Skill memory | "Research tasks: list dimensions first -> dig into each -> summarize per round" |
**Retrieval vs full-context loading:**
- Traditional: Stuff all history into prompt -> token costs explode, information overload
- Hermes: Persistent memory stores summaries (few hundred words), FTS5 retrieves specific snippets on demand
**Three ways cross-session memory pays off:**
1. Zero startup cost - say "continue" and you continue
2. Research has continuity - ruled-out options don't get re-recommended
3. Methodology compounds - research approach from project one gets reused in project two automatically
---
### 13 Dev Automation
**A developer's morning (hypothetical but achievable):**
- Hermes sends 3 Telegram messages before you open your laptop
- PR merged notification with review findings
- CI pipeline failure report
- Daily standup notes drafted from commits and PRs
**Automated code review setup:**
1. Connect GitHub MCP
2. Set up cron: "Check main branch for new PRs every 6 hours and do a code review"
3. Define review standards as a Skill (evolves automatically from your feedback)
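An "every 6 hours" schedule corresponds to the standard cron expression `0 */6 * * *`. Whether Hermes translates the natural-language request into exactly this expression is an assumption; the sketch below only shows when such a schedule would fire next, using a hypothetical `next_run` helper.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, every_hours: int = 6) -> datetime:
    """Next firing time of the cron expression `0 */N * * *` after `now`.

    That expression fires at minute 0 of every hour divisible by N,
    so N=6 gives 00:00, 06:00, 12:00, and 18:00.
    """
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    candidate = top_of_hour + timedelta(hours=1)
    while candidate.hour % every_hours != 0:
        candidate += timedelta(hours=1)
    return candidate


print(next_run(datetime(2026, 4, 9, 10, 30)))
```

The schedule is aligned to the clock, not to when the job was created, so a job set up at 10:30 first runs at 12:00 rather than 16:30.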
**Claude Code vs Hermes division of labor:**
| Dimension | Claude Code | Hermes Agent |
|---|---|---|
| Interaction mode | Real-time conversation | Background, reports on schedule |
| Strengths | Writing code, refactoring, debugging | Monitoring, auditing, summarizing, scheduling |
| Time horizon | Single session | Continuous across days and weeks |
| Trigger | You initiate it | Cron or event-driven |
"Claude Code is the craftsman, Hermes is the butler."
**Pipeline:**
```
Claude Code writes code + opens PR
--> Hermes auto-reviews PR
--> Hermes runs tests to verify
--> Hermes generates daily report
```
---
### 14 Content Creation
**Writing series with Hermes:**
- After first article: records series positioning, target audience, editing preferences, concepts already explained
- On second article: "write the next one in this series" - it knows style, what to skip, what you disliked last time
- By fifth article: remarkably precise understanding of writing preferences, learned from feedback alone
**Parallel research with sub-agents:**
Three sub-agents simultaneously researching different products/angles. Research that used to take 60+ minutes done in 20.
**Skills that accumulate writing style:**
- Style rules stored as a Skill, not in prompts
- Skill self-improves from edits you make to drafts
- A month later: dozens of rules, all from real feedback, maintained automatically
**Claude Code vs Hermes for content:**
| Dimension | Claude Code | Hermes Agent |
|---|---|---|
| Best for | Standalone articles, one-off tasks | Content series, ongoing projects |
| Style control | CLAUDE.md + manual maintenance | Skills that auto-accumulate and evolve |
| Research efficiency | Linear search | Parallel research via sub-agents |
| Learning ability | Doesn't learn; rules manually written | Learns automatically from feedback |
---
### 15 Multi-Agent Orchestration
**Why multiple agents:**
- Context explosion: one agent handling research + coding + testing = all information interfering
- Time bottleneck: 3 tasks sequentially = A+B+C minutes; in parallel = max(A, B, C)
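The timing argument can be worked through numerically. The sketch below estimates wall-clock time when tasks are started in order as soon as one of three slots frees up; `wall_time` is a hypothetical helper and the durations are made-up examples.

```python
import heapq


def wall_time(durations: list[float], max_parallel: int = 3) -> float:
    """Estimated wall-clock time when each task starts as soon as a slot frees up."""
    if not durations:
        return 0.0
    # Each entry is the time at which that slot next becomes free.
    slots = [0.0] * min(max_parallel, len(durations))
    heapq.heapify(slots)
    for d in durations:
        free_at = heapq.heappop(slots)  # earliest available slot
        heapq.heappush(slots, free_at + d)
    return max(slots)


print(wall_time([20, 20, 20], max_parallel=1))  # sequential: A + B + C
print(wall_time([20, 20, 20], max_parallel=3))  # parallel: max(A, B, C)
```

With more tasks than slots the result falls between the two extremes, which is why a cap of three still helps but does not make a fourth task free.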
**delegate_task features:**
| Feature | Description |
|---|---|
| Independent context | Sub-agents have their own conversation history |
| Restricted toolset | You specify which tools each sub-agent can use |
| Isolated terminal sessions | No interference between sub-agents |
| Max 3 concurrent | Hard-coded limit (beyond 3, the main agent's attention is spread too thin to consolidate well) |
| Result relay | Results returned to main agent for consolidation |
**Security design:** Research sub-agents should only get web+browser. Coding sub-agents only terminal+file+code_execution. Consolidation sub-agents: no external tools.
**vs Anthropic's three-agent architecture:**
| Dimension | Anthropic Three-Agent | Hermes delegate_task |
|---|---|---|
| Role assignment | Fixed (plan/execute/evaluate) | Task-driven, flexible |
| Communication | Chain | Star topology (main agent <-> sub-agents) |
| Parallelism | Typically sequential | Up to 3 concurrent |
| Memory | No built-in memory | Main agent maintains full memory |
> Rule of thumb: If you find yourself writing lengthy consolidation instructions for the main agent, the task decomposition is probably wrong. Good decomposition makes consolidation simple.
## Part 5: Deep Thinking
### 16 Hermes vs OpenClaw vs Claude Code: Not a Choice
**Three design philosophies:**
| Dimension | Claude Code | OpenClaw | Hermes Agent |
|---|---|---|---|
| Core philosophy | Interactive coding | Configuration as behavior | Autonomous background + self-improvement |
| Your role | Sitting at the terminal directing | Writing config files to define behavior | Deploy and check in occasionally |
| Memory mechanism | CLAUDE.md + auto-memory | Multi-layer (SOUL.md + Daily Logs + semantic search) | Three-layer self-improving memory |
| Skill source | Manually installed community Hub | ClawHub 5,700+ | Agent-created + community Hub |
| Run mode | On-demand | On-demand | 24/7 background |
| Deployment | Local CLI (subscription) | Local CLI (free + API costs) | $5 VPS / Docker / Serverless |
**Scenario recommendations:**
| Scenario | Recommended Tool | Why |
|---|---|---|
| Building new features, refactoring | Claude Code | Needs real-time feedback and human judgment |
| Standardized agents for a team | OpenClaw | SOUL.md is transparent, auditable, reproducible |
| 24/7 code review | Hermes | Cron scheduling + GitHub MCP, runs unattended |
| Personal knowledge assistant | Hermes | Three-layer memory accumulates across sessions |
| Building a community bot | Hermes | Native 14-platform Gateway |
| Rapid product idea validation | Claude Code | Fast to start, fast to iterate |
| Enterprise scenarios needing control | OpenClaw | Transparent config, predictable behavior |
| Long-term content creation | Hermes + Claude Code | Hermes for accumulation, Claude Code for writing |
**agentskills.io convergence:**
16+ tools now support the standard (Claude Code, Cursor, OpenAI Codex, Gemini CLI, Hermes). Skills are portable - your Skill library is your own asset, not a platform's appendage.
**HuaShu's workflow:**
- Claude Code = day shift (tasks needing presence: writing articles, code, product decisions)
- Hermes = night shift (doesn't need presence: monitoring repos, scheduled research, maintaining knowledge bases)
- OpenClaw's SOUL.md = standardized configuration language for behavioral constraints
### 17 The Boundaries of Self-Improving Agents
**Hermes's self-improvement constraints (technically controlled):**
- Skill files are readable markdown - you can see every diff
- Memory data is local SQLite - you can inspect and delete
- Tool permissions are sandboxed - can't arbitrarily acquire new permissions
**The practical control problem:**
- The whole appeal of Hermes is "not having to babysit it"
- But safety requires watching the self-improvement results
- This contradiction is fundamental
**Nous Research's position:**
- User control first
- MIT license - you own all source code
- You can turn off automatic Skill creation entirely
**Open source vs closed source trust:**
| | Closed Source (Claude Code) | Open Source (Hermes) |
|---|---|---|
| Trust basis | Business incentives to keep behavior predictable | Your ability to audit |
| If things go wrong | Commercial obligation to fix | MIT license - you bear consequences |
| Best for | People who don't want to touch code | People with technical chops who want control |
**The ceiling of self-improvement:**
The ceiling isn't technical - it's the feedback signal. Self-improvement works when you're there giving feedback (supervised). Without you, the agent uses its own evaluation criteria - which may not catch domain-specific errors.
**HuaShu's conclusion:**
Let the agent self-improve on the "how." You own the "what" and the "don't." That's not being lazy - it's a different kind of "on the loop."
**Open questions:**
- How much autonomous self-improvement are you comfortable with?
- Who audits the results of self-improvement?
- Do self-improving agents need a "forgetting" mechanism?
- If the agent designs its own reins, who judges if the reins are designed correctly?
## Related
- [[systems/hermes-agent]] - Technical implementation notes
- [[concepts/harness-engineering]] - The underlying methodology
- [[concepts/agentskills-standard]] - agentskills.io cross-tool portability
- [[concepts/honcho-user-modeling]] - Dialectical user modeling system