# SYSTEM BASELINE — OpsMan AI Operations System
**Created:** 2026-02-18 05:05 AEDT
**Updated:** 2026-02-18 09:15 AEDT
**Author:** Builder 🔨 (original), Rivet 🔧 (updated)
**Purpose:** Defines what the multi-agent operations system *should* be. Future audits compare actual state to this baseline. Covers the entire OpsMan layer: agents, comms, record keeping, curation, automation.

---

## 0. The Big Picture — How The System Works

### What Is This?

OpsMan is a **multi-agent AI operations system** that runs Rocky's businesses autonomously. Instead of one AI doing everything, specialised agents each own a domain — code, sales, finance, monitoring, intel — and coordinate like a small company where the CEO (Rocky) sets direction and the team executes.

### The Operating Model

```
        ┌──────────────┐
        │    ROCKY     │  Sets direction, makes decisions, final authority
        │  (Human CEO) │  Available: mornings 4-5:30am, evenings 7-8:30pm
        └──────┬───────┘
               │ Direction + Decisions
               ▼
        ┌──────────────┐
        │    RIVET     │  COO — translates Rocky's vision into work
        │ (Coordinator)│  Assigns tasks, connects dots, prepares decisions
        └──────┬───────┘
               │ Tasks + Coordination
               ▼
    ┌──────────┼──────────┐
    │          │          │
┌───▼────┐ ┌───▼────┐ ┌───▼────┐
│BUILDER │ │ SUSAN  │ │ HARPER │  Domain specialists
│  Code  │ │ Sales  │ │Finance │  Each owns their area completely
└───┬────┘ └───┬────┘ └───┬────┘
    │          │          │
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│SENTINEL│ │ RADAR  │ │ HERALD │  Support agents
│ DevOps │ │ Intel  │ │ Comms  │  Feed info to the specialists
└───┬────┘ └───┬────┘ └────────┘
    │          │
    ▼          │
┌────────┐     │
│  COG   │◄────┘  Operations support
│  Ops   │        Inbox curation, data maintenance
└────────┘
```

### The Three Layers

**Layer 1: Command (Rocky + Rivet)**
- Rocky gives direction — voice notes, text, decisions
- Rivet translates direction into structured tasks
- Rivet monitors the whole fleet, connects dots between agents
- Rivet prepares decision packages for Rocky (options + recommendations, not raw data)

**Layer 2: Execution (Builder, Susan, Harper)**
- Each agent owns a domain with full autonomy
- Builder owns all code — features, bugs, infrastructure scripts, deployments
- Susan owns all sales — leads, outreach, CRM, pipeline tracking
- Harper owns finance & legal — BAS, grants, compliance, cost tracking
- They don't need permission to do their job — they need tasks and context

**Layer 3: Intelligence & Support (Sentinel, Radar, Herald, Cog)**
- These agents *feed* the execution layer
- Sentinel monitors infrastructure and alerts Builder/Rivet when things break
- Radar scans the market and feeds insights to Susan (leads) and Rivet (strategy)
- Herald aggregates status across the fleet and produces briefs for Rocky
- Cog maintains the communication system itself (inbox curation, data hygiene)

### How They're Supposed to Interact

**Hub-and-spoke, not mesh.**
- Rivet is the hub. Most coordination flows through Rivet.
- Direct agent-to-agent comms (buddy system) exists for speed, but Rivet is the primary coordinator.
- An agent that discovers something relevant to another agent can either (a) tell them directly via the JSONL inbox, or (b) tell Rivet, who routes it.

**Pull, not push (mostly).**
- Agents pull work from their queue.json on heartbeat cycles
- The stall detector and buddy check push wake messages when agents get stuck
- Eventually: event-driven (agent sleeps → gets woken only when there's work)

**Autonomy within scope.**
- Each agent is fully autonomous *within their domain*
- Builder doesn't ask permission to fix a bug — just fixes it
- Susan doesn't ask permission to research a lead — just does it
- But nobody crosses domain lines without coordination (Susan doesn't write code, Builder doesn't send marketing emails)

**Shared truth, local action.**
- fleet-state.json = shared state (everyone reads, Rivet curates)
- status.json = per-agent state (each agent writes their own)
- queue.json = per-agent task list (Rivet writes, agent executes)
- memory/ = per-agent knowledge (each agent maintains their own)

### The Information Loop

```
Rocky gives direction
  → Rivet breaks it into tasks
    → Agents execute tasks
      → Agents write results + generate follow-up intel
        → Rivet synthesises results
          → Rivet prepares decision package for Rocky
            → Rocky decides → cycle repeats
```

**The system works when this loop is tight.** Break any link and the whole thing stalls:
- If Rivet doesn't assign tasks → agents idle
- If agents don't report back → Rivet can't synthesise
- If Rivet doesn't synthesise → Rocky makes decisions blind
- If Rocky doesn't decide → no new direction

### What "Working Well" Looks Like

- **Every agent has work.** No agent idle for > 2 hours during business hours.
- **Every task has an owner.** Nothing sits in a queue unclaimed.
- **Information flows.** Intel from Radar reaches Susan within 1 heartbeat cycle. Build failures from Sentinel reach Builder immediately.
- **Rocky gets clean summaries.** Not raw agent output — synthesised, actionable, decision-ready.
- **The fleet self-heals.** Stalled agent → buddy wakes it. Crashed service → systemd restarts it. Context overflow → session archiver cleans it.
- **Memory persists.** Every agent remembers what they learned yesterday. No repeated discovery.

### What "Broken" Looks Like (Current State, Feb 18)

- 4 agents were crash-looping on MiniMax (fixed → DeepSeek)
- Buddy system generated 0 cross-agent tasks in 24+ hours
- Herald isn't producing briefs, Cog has 0 memory files
- queue.json files were empty until manually populated
- The information loop is broken from Layer 3 to Layer 2 (support agents don't feed execution agents)
- Rocky is getting raw status dumps, not synthesised decision packages

### The Target State (What We're Building Toward)

**Short-term (this week):**
- All agents stable on working models
- queue.json populated with real work
- Heartbeats actually processing tasks (not just HEARTBEAT_OK)

**Medium-term (this month):**
- Consolidate to 4-5 always-on agents (remove the Herald/Cog overlap with Rivet)
- Replace markdown inboxes with structured task queue
- Agents producing measurable output (commits, leads contacted, reports)

**Long-term (multi-company):**
- Shared core agents (Rivet, Builder, Sentinel) serving multiple companies
- Per-company specialist agents (Susan for RateRight, different Susan for OpsMan)
- Event-driven, not poll-driven
- Model costs scale with company revenue

---

## 1. Infrastructure

### VPS
| Property | Expected |
|----------|----------|
| Provider | DigitalOcean syd1 |
| Specs | 4 vCPU, 8GB RAM, 77GB disk |
| OS | Ubuntu 24.04 (Linux 6.8.0) |
| IP | 134.199.153.159 |
| Disk usage | < 50% (currently ~35%) |
| RAM usage | < 80% (currently ~45%) |

### Domain & SSL
| Property | Expected |
|----------|----------|
| Domain | rivet.rateright.com.au |
| SSL | Active via Let's Encrypt / nginx |
| Nginx | Active, proxying to app services |

### App Services
| Property | Expected |
|----------|----------|
| **RateRight v2 App** | `rateright-app.service` — Next.js 14 App Router + Supabase + Stripe, port 3000 |
| URL | https://rivet.rateright.com.au |
| CEO Dashboard | https://rivet.rateright.com.au/ceo (PIN: 5050) |
| **RateRight v1 (Main Site)** | Fly.io: rateright-au (Flask) |
| URL | https://rateright.com.au |
| **Growth Engine** | Railway: rateright-growth-production |
| URL | https://rateright-growth-production.up.railway.app |

### Git Repositories
| Repo | Location | Purpose | Deploy Command |
|------|----------|---------|----------------|
| the-50-dollar-app | /home/ccuser/the-50-dollar-app | RateRight v2 (Next.js) | `npm run build && systemctl restart rateright-app` |
| rateright-growth | /home/ccuser/rateright-growth | Growth Engine + Rivet workspace | Railway auto-deploy on push |
| phone-ai | /home/ccuser/phone-ai | VAPI voice assistant | N/A |

---

## 2. Agent Fleet

### Clawdbot Version
- **Version:** 2026.1.24-3
- **Binary:** /usr/bin/clawdbot
- **Global modules:** /usr/lib/node_modules/clawdbot

### Agent Registry

| Agent | Role | Port | Service | Profile | Workspace | Primary Model | Fallbacks |
|-------|------|------|---------|---------|-----------|---------------|-----------|
| **Rivet** | COO / Coordinator | 18789 | clawdbot-gateway | (default) | /home/ccuser/rateright-growth/rivet | anthropic/claude-opus-4-6 | claude-sonnet, deepseek |
| **Builder** | Code / Engineering | 18790 | clawdbot-builder | builder | /home/ccuser/the-50-dollar-app | anthropic/claude-opus-4-6 | claude-sonnet, kimi-k2 |
| **Susan** | Sales / Outreach | 18792 | clawdbot-susan | susan | /home/ccuser/susan | deepseek/deepseek-chat | claude-sonnet |
| **Harper** | Finance / Legal | 18796 | clawdbot-harper | harper | /home/ccuser/harper | deepseek/deepseek-chat | claude-sonnet, gemini-flash |
| **Sentinel** | DevOps / Monitoring | 18800 | clawdbot-sentinel | sentinel | /home/ccuser/sentinel | deepseek/deepseek-chat | claude-sonnet |
| **Radar** | Intel / Research | 18804 | clawdbot-radar | radar | /home/ccuser/radar | deepseek/deepseek-chat | claude-sonnet |
| **Herald** | Communications | 18808 | clawdbot-herald | herald | /home/ccuser/herald | anthropic/claude-opus-4-6 | claude-sonnet, kimi-k2.5 |
| **Cog** | Ops / Inbox Curator | 18812 | clawdbot-cog | cog | /home/ccuser/cog | deepseek/deepseek-chat | claude-sonnet |

**Note:** Harper's model was changed from Moonshot (suspended) to DeepSeek fallback chain (Feb 17). Herald on Opus is expensive for message routing — flagged for review.

### Agent Skills & Capabilities

| Agent | Installed Skills | Capabilities |
|-------|-----------------|--------------|
| **Rivet** | prospector, youtube-channels, growth-engine, bird/X, voice-call, tts | Full web browsing, voice calls, TTS, email, calendar, search |
| **Builder** | (standard dev tools) | Claude Code, git push, full codebase access, Agent Teams |
| **Susan** | prospector, apollo, exa | Lead enrichment, company search, CRM access |
| **Harper** | perplexity | Deep research for grants/compliance |
| **Sentinel** | (monitoring tools) | Infra monitoring, log analysis |
| **Radar** | perplexity, exa | Deep research, competitive intel |
| **Herald** | (standard) | Content creation, comms aggregation |
| **Cog** | (standard) | Inbox curation, data processing |

### Expected Agent State
Every agent should be:
- **systemd active** — `systemctl is-active clawdbot-{agent}` returns `active`
- **HTTP responsive** — `curl http://127.0.0.1:{port}` returns HTML (Clawdbot web UI)
- **Heartbeating** — `last_heartbeat` in `status.json` is less than 60 minutes old
- **Writing memory** — memory/ directory should have files growing over time
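
The heartbeat criterion above can be sketched as a small check. This is a minimal sketch, not the stall detector's actual logic: the field name `last_heartbeat` comes from the list above, while the ISO 8601 timestamp format and the function name `isHeartbeatFresh` are assumptions.

```javascript
// Sketch only: `last_heartbeat` is assumed to be an ISO 8601 string.
const MAX_HEARTBEAT_AGE_MIN = 60; // threshold from the expected-state list

function isHeartbeatFresh(status, now = new Date()) {
  const last = Date.parse(status.last_heartbeat);
  if (Number.isNaN(last)) return false; // missing or unparseable counts as stale
  const ageMin = (now.getTime() - last) / 60000;
  // A timestamp in the future is treated as invalid rather than fresh.
  return ageMin >= 0 && ageMin < MAX_HEARTBEAT_AGE_MIN;
}

// Hypothetical data with fixed timestamps
const auditTime = new Date("2026-02-18T09:15:00+11:00");
console.log(isHeartbeatFresh({ last_heartbeat: "2026-02-18T08:40:00+11:00" }, auditTime)); // true
console.log(isHeartbeatFresh({ last_heartbeat: "2026-02-18T07:00:00+11:00" }, auditTime)); // false
```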

### Config Locations
| Agent | Config Path |
|-------|-------------|
| Rivet | /root/.clawdbot/clawdbot.json |
| Builder | /root/.clawdbot-builder/clawdbot.json |
| Susan | /root/.clawdbot-susan/clawdbot.json + /home/ccuser/susan/.clawdbot/clawdbot.json |
| Sentinel | /root/.clawdbot-sentinel/clawdbot.json |
| Radar | /root/.clawdbot-radar/clawdbot.json |
| Cog | /root/.clawdbot-cog/clawdbot.json |
| Herald | /root/.clawdbot-herald/clawdbot.json |
| Harper | /root/.clawdbot-harper/clawdbot.json |

**Profile configs take precedence** over workspace `.clawdbot/clawdbot.json` files.
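
As an illustration of the precedence rule, the sketch below models it as a shallow merge in which profile values win. Whether Clawdbot merges key by key or replaces the whole file is not stated here, so treat the merge behaviour, the function name `effectiveConfig`, and the keys `model` and `heartbeatMinutes` as assumptions.

```javascript
// Sketch only: models "profile beats workspace" as a shallow merge.
// The real Clawdbot resolution logic may differ.
function effectiveConfig(workspaceConfig, profileConfig) {
  return { ...workspaceConfig, ...profileConfig };
}

// Hypothetical keys, for illustration
const workspace = { model: "deepseek/deepseek-chat", heartbeatMinutes: 30 };
const profile = { model: "anthropic/claude-opus-4-6" };
console.log(effectiveConfig(workspace, profile).model); // anthropic/claude-opus-4-6
```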

### Session Data Locations
`/root/.clawdbot-{agent}/agents/main/sessions/` — JSONL session files.

⚠️ **Critical rule:** When changing an agent's model in config, you MUST also archive/delete the session. Stale session model overrides persist across restarts and cause `mapOptionsForApi: undefined` crashes.

---

## 3. Communication System

### Primary: Markdown Inboxes (Rivet ↔ Builder)
| File | Writer | Location |
|------|--------|----------|
| BUILDER-INBOX.md | Rivet writes TO Builder | /home/ccuser/rateright-growth/rivet/BUILDER-INBOX.md |
| RIVET-INBOX.md | Builder writes TO Rivet | /home/ccuser/the-50-dollar-app/RIVET-INBOX.md |

**Rules:** Append-only. Never overwrite. Never edit the other agent's file.

### Secondary: JSONL Inboxes (All Agents)
- **Location:** /home/ccuser/shared/inboxes/{agent}.jsonl
- **CLI:** `node /home/ccuser/shared/scripts/inbox.js send --from X --to Y --subject "..." --body "..."`
- **Curator:** Cog monitors and archives via `inbox-curator.js`

### Gateway-to-Gateway Bridge (Rivet ↔ Builder)
Primary inter-gateway communication method:
```bash
# Send message to Builder's session (with full context)
node scripts/builder-bridge.js send "agent:main:telegram:group:-1003505625266:topic:2" "TASK"

# Wake Builder with system event
node scripts/builder-bridge.js wake "message"

# Check Builder status
node scripts/builder-bridge.js status

# List Builder sessions
node scripts/builder-bridge.js sessions
```
**Fallback:** Direct Claude Code: `su - ccuser -c 'cd /home/ccuser/the-50-dollar-app && claude --dangerously-skip-permissions -p "TASK"'`

### Fleet State
- **File:** /home/ccuser/shared/fleet-state.json
- **Contains:** Agent status, heartbeat timestamps, decisions_needed, alerts
- **Updated by:** Stall detector, agents (via status.json)
- **Read by:** CEO Dashboard API (`/api/ceo/fleet`)

### Per-Agent State Files
Each workspace should contain:
- `status.json` — current status, task, heartbeat timestamp
- `queue.json` — pending/in-progress/completed tasks
- `HEARTBEAT.md` — autonomous heartbeat instructions
- `SOUL.md` — agent personality and role definition
- `AGENTS.md` — workspace behaviour rules
- `USER.md` — info about Rocky
- `memory/` — session memories (daily .md files)

### Telegram Configuration

#### Rocky & Rivet Group
**Chat ID:** -1003505625266

| Topic | ID | Link | Purpose |
|-------|-----|------|---------|
| General | 1 | t.me/c/3505625266/1 | Default |
| RateRight | 2 | t.me/c/3505625266/2 | Product discussions |
| Sales | 4 | t.me/c/3505625266/4 | Lead/pipeline updates |
| Research | 5 | t.me/c/3505625266/5 | Market intel |
| Construction | 6 | t.me/c/3505625266/6 | Site/industry talk |
| System | 7 | t.me/c/3505625266/7 | Fleet alerts |
| Personal | 8 | t.me/c/3505625266/8 | Non-work |
| Ideas | 10 | t.me/c/3505625266/10 | Brainstorming |
| Content | 11 | t.me/c/3505625266/11 | Marketing drafts |
| Legal | 388 | t.me/c/3505625266/388 | Compliance |
| Financial | 391 | t.me/c/3505625266/391 | Money matters |
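
The links in the table follow the usual pattern for private supergroups: the chat ID without its `-100` prefix, then the topic ID. A minimal sketch of that mapping, plus the session key format used in the bridge example earlier in this section. The function names are hypothetical.

```javascript
// Build the t.me link for a topic in a private supergroup.
function topicLink(chatId, topicId) {
  const id = String(chatId);
  if (!id.startsWith("-100")) {
    throw new Error(`not a supergroup chat id: ${id}`);
  }
  return `t.me/c/${id.slice(4)}/${topicId}`;
}

// Session key in the format shown in the builder-bridge example.
function sessionKey(chatId, topicId) {
  return `agent:main:telegram:group:${chatId}:topic:${topicId}`;
}

console.log(topicLink(-1003505625266, 7)); // t.me/c/3505625266/7
console.log(sessionKey(-1003505625266, 2)); // agent:main:telegram:group:-1003505625266:topic:2
```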

#### Agent Telegram Bots
Each agent has its own Telegram bot: `@rateright_{agent}_bot`

### WhatsApp Channel
- **Status:** Connected (as of 05:01 AEDT Feb 18)
- **Note:** Intermittent stability (disconnected/reconnected at 05:08)

### Inter-Agent Communication Protocol

Reference: `/home/ccuser/shared/COMMS-PROTOCOL.md`

#### Communication Channels (3 methods)

**Method 1: Agent Bridge (HTTP — immediate)**
```bash
node /home/ccuser/the-50-dollar-app/scripts/agent-bridge.js <target-agent> wake "<message>"
```
- Delivers a wake message to the agent's Clawdbot session immediately
- The agent only processes it on its next heartbeat, so the effective delay depends on the heartbeat interval
- Best for: urgent alerts, wake-ups

**Method 2: JSONL Inbox (structured messages)**
```bash
node /home/ccuser/shared/scripts/inbox.js send \
  --from susan --to radar \
  --subject "New competitor found" \
  --body "Found XYZ platform targeting AU construction..." \
  --priority normal --tag info
```
- Structured, prioritised, tagged
- Supports: send, read, ack, ack-all, broadcast, archive, stats, retry
- Per-agent files at `/home/ccuser/shared/inboxes/{agent}.jsonl`
- Curated by Cog (inbox-curator.js every 30 min)
- Best for: task handoffs, info sharing, non-urgent comms

**Method 3: Queue.json (task delivery)**
```bash
jq '.tasks += [{"id":"task-xyz","title":"Research competitor","from":"susan",
  "priority":"normal","status":"pending","created":"2026-02-18T05:00:00Z"}]' \
  /home/ccuser/radar/queue.json > /tmp/q.json && mv /tmp/q.json /home/ccuser/radar/queue.json
```
- Direct task assignment
- Agent claims task on next heartbeat (sets status: in_progress)
- Agent marks done when complete (status: completed)
- Best for: actionable work items
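
The claim step can be sketched as follows. It uses the task fields from the `jq` example above (`id`, `priority`, `status`, `created`). The priority ordering is an assumption: only `normal` appears in this document, and `urgent`, `high` and `low` are hypothetical values.

```javascript
// Assumed priority order; only "normal" is confirmed by this document.
const PRIORITY_RANK = { urgent: 0, high: 1, normal: 2, low: 3 };

function rank(task) {
  return PRIORITY_RANK[task.priority] ?? PRIORITY_RANK.normal;
}

// Pick the highest-priority pending task (oldest first on ties)
// and mark it in_progress. Returns null when nothing is pending.
function claimNextTask(queue) {
  const pending = queue.tasks.filter((t) => t.status === "pending");
  if (pending.length === 0) return null;
  pending.sort((a, b) => {
    const byPriority = rank(a) - rank(b);
    return byPriority !== 0 ? byPriority : a.created.localeCompare(b.created);
  });
  const next = pending[0];
  next.status = "in_progress"; // mutates the task inside queue.tasks
  return next;
}

// Hypothetical queue contents
const exampleQueue = {
  tasks: [
    { id: "task-a", priority: "normal", status: "pending", created: "2026-02-18T05:00:00Z" },
    { id: "task-b", priority: "high", status: "pending", created: "2026-02-18T06:00:00Z" },
  ],
};
console.log(claimNextTask(exampleQueue).id); // task-b
```

The caller is still responsible for writing the updated queue back to `queue.json`.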

#### Communication Pairs (Buddy System)

| Agent | Primary Buddy | Secondary Buddy | What They Share |
|-------|--------------|-----------------|-----------------|
| Susan (Sales) | Radar (Intel) | Harper (Legal) | Leads ↔ Market intel ↔ Compliance |
| Harper (Finance) | Susan (Sales) | Sentinel (DevOps) | Revenue impact ↔ Cost tracking ↔ System costs |
| Radar (Intel) | Susan (Sales) | Herald (Comms) | Trends ↔ Sales angles ↔ Content ideas |
| Sentinel (DevOps) | Cog (Ops) | Builder (Code) | System health ↔ Service status ↔ Deploy issues |
| Herald (Comms) | Radar (Intel) | Rivet (Chief) | News digest ↔ Intel summary ↔ Michael's brief |
| Cog (Ops) | Sentinel (DevOps) | Susan (Sales) | Health checks ↔ Infra status ↔ CRM data quality |

Buddies should:
1. Check each other's status every heartbeat
2. Wake each other if stalled > 30 min
3. Share relevant findings proactively
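
A minimal sketch of rule 2, using the Primary Buddy column from the table above. It is not the actual `buddy-check.js` logic. It assumes that "stalled" means the buddy's last heartbeat is more than 30 minutes old and that heartbeats arrive as ISO 8601 strings keyed by agent name.

```javascript
// Primary buddies, copied from the table above.
const PRIMARY_BUDDY = {
  susan: "radar",
  harper: "susan",
  radar: "susan",
  sentinel: "cog",
  herald: "radar",
  cog: "sentinel",
};
const STALL_MIN = 30; // from rule 2

// lastHeartbeats: { agentName: ISO timestamp } (assumed input shape)
function buddiesToWake(lastHeartbeats, now) {
  const wake = [];
  for (const [agent, buddy] of Object.entries(PRIMARY_BUDDY)) {
    const last = Date.parse(lastHeartbeats[buddy]);
    const stalled =
      Number.isNaN(last) || (now.getTime() - last) / 60000 > STALL_MIN;
    if (stalled) wake.push({ from: agent, wake: buddy });
  }
  return wake;
}
```

If Radar's last heartbeat is 45 minutes old and every other agent is current, this returns wake requests from Susan and Herald, both targeting Radar.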

#### Communication Flow by Scenario

**Rivet → Agent (task assignment):**
1. Rivet writes task to agent's `queue.json`
2. Rivet optionally sends JSONL inbox message as notification
3. Agent picks up on next heartbeat cycle

**Agent → Rivet (task completion):**
1. Agent updates queue.json task status to `completed`
2. Builder writes to RIVET-INBOX.md; others use JSONL inbox to rivet
3. Rivet ACKs and creates follow-up tasks if needed

**Agent → Agent (cross-talk):**
1. Use JSONL inbox for info sharing: `inbox send --from X --to Y`
2. Use queue.json for task handoffs: write a task to their queue
3. Use agent bridge for urgent wake-ups

**Agent → Rocky (escalation):**
1. Agent writes to Rivet first — never directly to Rocky unless urgent
2. Rivet evaluates and decides if Rocky needs to know
3. Rocky gets a clean summary, not raw agent output
4. Only Builder has direct Telegram access to Rocky

#### Daily Rhythm (Expected)

| Time (AEDT) | What Should Happen |
|-------------|-------------------|
| 04:00-05:30 | Rocky online → Rivet briefs, takes direction |
| 05:30-12:00 | Autonomous work → agents execute, cross-communicate |
| 12:00 | Midday sync → Herald compiles progress, Rivet reviews |
| 12:00-17:00 | Autonomous work continues |
| 17:00 | Pre-window prep → Herald drafts evening brief |
| 19:00-20:30 | THE WINDOW → Rocky reviews, decides, directs |
| 20:30-04:00 | Overnight autonomous → agents work, Rivet coordinates |
| 22:00-07:00 | Quiet hours — no messages to Rocky unless critical |
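
The quiet-hours row wraps past midnight, which is easy to get wrong in a range check. A minimal sketch, assuming the caller passes the current AEDT hour (0-23). It takes the 22:00-07:00 row literally and does not model the 04:00-05:30 morning window. The function names are hypothetical.

```javascript
// Quiet hours from the table: 22:00-07:00 AEDT, wrapping past midnight.
const QUIET_START = 22;
const QUIET_END = 7;

function isQuietHour(hour) {
  // The range wraps, so it is "late evening OR early morning".
  return hour >= QUIET_START || hour < QUIET_END;
}

// Critical messages are allowed at any time.
function mayMessageRocky(hour, critical = false) {
  return critical || !isQuietHour(hour);
}

console.log(mayMessageRocky(23)); // false
console.log(mayMessageRocky(23, true)); // true
console.log(mayMessageRocky(19)); // true
```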

#### Michael's Communication Preferences

| Context | Preference |
|---------|------------|
| **Work hours (5:30am-6pm)** | Voice notes, not text (can't read on site) |
| **The Window (7-8:30pm)** | Decisions to approve, concise summaries |
| **After 8:30pm** | Queue for morning, don't message |
| **Text vs Voice replies** | Text replies preferred (confirmed Feb 15) |
| **Daily reports** | Max 2 voice notes (5 min each), not markdown dumps |

#### Every Heartbeat, Each Agent Should:
1. **Check queue.json** — claim highest-priority pending task
2. **Do the work** — execute the task
3. **Mark done** — update queue.json status
4. **Cross-post** — "What did I find that another agent needs?" → send it
5. **Check buddy** — is my buddy alive? Wake them if stalled
6. **Check JSONL inbox** — process incoming messages from other agents
7. **Write memory** — log what happened this cycle
8. **Update status.json** — heartbeat timestamp + current task

#### What's NOT Working in Comms (Feb 18)
- ❌ Buddy system generated **0 cross-agent tasks** — agents work in isolation
- ❌ Most agents don't check their JSONL inboxes (not in HEARTBEAT.md)
- ❌ queue.json files were empty until Builder populated them manually (Feb 17 23:55)
- ❌ No structured acknowledgment tracking — just append-only markdown
- ❌ Herald isn't producing daily briefs
- ❌ Midday sync doesn't happen
- ❌ Work generator (every 2h cron) creates tasks but agents may not pick them up
- ⚠️ Agent bridge wake messages only process on next heartbeat interval (up to 30 min delay)

---

## 4. Automation (Cron Jobs)

| Schedule | Script | Purpose |
|----------|--------|---------|
| */2 min | crash-loop-guard.sh | Auto-rollback Rivet on crash loop |
| */5 min | stall-detector.js --auto-wake | Detect stalled agents, diagnose, wake |
| */5 min | health-check.sh | RateRight app HTTP health check |
| */10 min | context-monitor.js --auto-fix | Monitor context/session sizes |
| */30 min | buddy-check.js --auto-wake | Cross-check agent pairs |
| */30 min | inbox-curator.js --auto-archive | Clean stale JSONL inbox messages |
| */30 min | git sync | Backup git repos |
| Every 2h | work-generator.js | Generate tasks for idle agents |
| Every 3h (staggered) | curate-agent.sh {agent} | Memory curation per agent |
| 3:00 AM | session-archiver.sh | Archive old sessions (prevent bloat) |
| 4:00 AM | scrub-logs.sh | Clean old log files |

**Scripts location:** /home/ccuser/shared/scripts/

---

## 5. External Services

### Core Platform Services
| Service | Purpose | Config Location |
|---------|---------|-----------------|
| **Supabase** | Database + Auth + RLS | .env.local (NEXT_PUBLIC_SUPABASE_*) |
| **Stripe** | $50 AUD hire payments | .env.local (STRIPE_*) |
| **OpenAI** | Whisper transcription, AI features | .env.local (OPENAI_API_KEY) |
| **Resend** | Email sending | .env.local (RESEND_API_KEY) |

### AI Model Providers
| Service | Purpose | Config Location |
|---------|---------|-----------------|
| **Anthropic** | Opus/Sonnet for agents | /root/.clawdbot/.env |
| **DeepSeek** | Budget model for fleet | /root/.clawdbot-{agent}/.env |
| **MiniMax** | DO NOT USE (crashes Clawdbot) | N/A |
| **Moonshot** | Kimi models (currently suspended) | /root/.clawdbot-{agent}/.env |
| **Gemini** | Flash model for research | /root/.clawdbot/google-gemini.json |

### External Integrations
| Service | Purpose | Config Location |
|---------|---------|-----------------|
| **Google Workspace** | 6 email addresses on rateright.com.au | /root/.clawdbot/google-oauth.json |
| **Google Calendar** | Scheduling, reminders | Google OAuth |
| **Google Drive** | Document storage, marketing assets | Google OAuth |
| **Twilio** | Voice calls, SMS | /root/.clawdbot/secrets.json |
| **ElevenLabs** | TTS for phone calls (Charlie voice @ 1.2x) | /root/.clawdbot/secrets.json |
| **Edge TTS** | Chat voice notes (FREE, en-AU-WilliamNeural) | Built into Clawdbot |
| **Apollo** | Contact enrichment for sales | Agent configs |
| **Exa** | Company search for prospecting | Agent configs |
| **Notion** | Business ops, planning | Clawdbot config |
| **Brave Search** | Web search for agents | Agent configs |
| **Perplexity** | Deep research (Susan, Radar, Harper) | Agent configs |
| **VAPI** | Backup voice AI assistant | phone-ai repo .env |

### API Key Locations
- App keys: `/home/ccuser/the-50-dollar-app/.env.local`
- Agent keys: `/root/.clawdbot/.env` (shared) + `/root/.clawdbot-{agent}/.env` (per-profile)
- Google OAuth: `/root/.clawdbot/google-oauth.json`
- Google API: `/root/.clawdbot/google-api-key.json`
- Secrets (ElevenLabs, Twilio, Supabase): `/root/.clawdbot/secrets.json`

---

## 6. Phone & Voice Infrastructure

### Phone Lines
| Number | Owner | Purpose |
|--------|-------|---------|
| +61 426 246 472 | Michael | Personal/founder |
| +61 468 087 171 | Business | RateRight main line |
| +61 238 205 443 | Rivet | Twilio voice line |

### Voice Configuration
| Component | Config |
|-----------|--------|
| **Chat TTS** | Edge TTS (FREE) — en-AU-WilliamNeural (Australian male) |
| **Phone calls** | Clawdbot voice-call plugin (Twilio + ElevenLabs Charlie @ 1.2x speed) |
| **VAPI backup** | Assistant ID: 63ac4ff9-e562-4c8e-9c19-c1f1c87ed034 (GPT-4o + Rivet personality) |

### Voice Usage Rules
- **Work hours (6am-6pm):** Send voice notes to Michael — can't read text on site
- **Off hours:** Text is fine
- **ElevenLabs:** 110k chars/month (Creator tier) — reserve for phone calls only

---

## 7. Email Infrastructure

### Google Workspace Accounts (rateright.com.au)
| Address | Owner | Purpose |
|---------|-------|---------|
| michael@rateright.com.au | Michael | Founder, official correspondence, grant applications |
| support@rateright.com.au | Dedicated account | Customer support, public-facing |
| finance@rateright.com.au | Harper (alias) | Grants, invoicing, compliance, legal |
| sales@rateright.com.au | Susan (alias) | Outreach, lead comms, partnerships |
| hello@rateright.com.au | General (alias) | Website contact form, general enquiries |
| noreply@rateright.com.au | System (alias) | Automated notifications, system emails |

*Note: Aliases route to Michael's inbox. support@ is a separate account.*

---

## 8. Tax & Compliance

| Property | Detail |
|----------|--------|
| **Company** | RateRight Pty Ltd |
| **ABN** | 62 841 523 907 |
| **Address** | Elizabeth St, Surry Hills NSW |
| **GST status** | Registered |
| **Quarterly BAS** | Self-lodged (no external accountant) |
| **Annual return** | External accountant (engaged once per year) |
| **Current BAS** | Q2 FY2025-26 (Oct-Dec 2025), due 28 Feb 2026 |
| **BAS owner** | Harper (Finance agent) |

---

## 9. Record Keeping (How Memory Should Work)

### Per-Agent Memory
Every agent should maintain:
- **Daily files:** `memory/YYYY-MM-DD.md` — raw logs of what happened each session
- **MEMORY.md** — curated long-term memory (distilled from daily files)
- **status.json** — current state (updated every heartbeat)

### Expected Memory Health
| Agent | Expected | Actual (Feb 18) | Status |
|-------|----------|-----------------|--------|
| Rivet | Heavy (coordinator, multi-domain) | 387 files | ✅ Healthy (possibly bloated) |
| Builder | Lean (relies on code + MEMORY.md) | 5 files | ✅ Appropriate |
| Susan | Growing (leads, contacts, outreach) | 19 files | ✅ Active |
| Harper | Seasonal (BAS, grants, compliance) | 5 files | ✅ OK for seasonal |
| Sentinel | Should log infra events | 1 file | ❌ Nearly amnesiac |
| Radar | Should log research findings | 2 files | ❌ Nearly amnesiac |
| Herald | Should log briefs sent, routing | 1 file | ❌ Nearly amnesiac |
| Cog | Should log curation actions | 0 files | ❌ Completely amnesiac |

### Memory Rules
1. Every heartbeat cycle should produce at least one memory write (if the agent did anything)
2. Memory files should never contain secrets, API keys, or credentials
3. MEMORY.md should be reviewed and updated weekly (distil daily files into long-term memory)
4. Rivet's memory should be pruned — 387 files across 25 subdirs is bloated

---

## 10. Curation (How Work Should Flow)

### Task Lifecycle
```
Rivet creates task → writes to agent queue.json
  → Agent claims task (status: in_progress)
    → Agent does the work
      → Agent marks task done (status: completed)
        → Agent writes results to RIVET-INBOX.md or inbox JSONL
          → Rivet ACKs and creates follow-up tasks if needed
```
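
The status changes in this lifecycle form a one-way state machine. A minimal sketch, using the three status values named in this document (`pending`, `in_progress`, `completed`). The function name `advance` is hypothetical.

```javascript
// Allowed transitions: pending -> in_progress -> completed.
const NEXT_STATUS = { pending: "in_progress", in_progress: "completed" };

// Returns a copy of the task moved to its next status.
// Throws for completed tasks or unknown statuses.
function advance(task) {
  const next = NEXT_STATUS[task.status];
  if (!next) {
    throw new Error(`no transition from status "${task.status}" (task ${task.id})`);
  }
  return { ...task, status: next };
}

const claimed = advance({ id: "task-xyz", status: "pending" });
console.log(claimed.status); // in_progress
console.log(advance(claimed).status); // completed
```

Returning a copy rather than mutating keeps the previous state available for logging to memory.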

### What "Curation" Means In This System
- **Cog** monitors JSONL inboxes — archives completed messages, escalates stuck ones
- **Stall detector** monitors agent health — wakes stalled agents
- **Work generator** creates tasks for idle agents (runs every 2 hours)
- **Buddy check** cross-checks agent pairs (runs every 30 min)
- **Session archiver** cleans old sessions (runs daily at 3am)
- **Agent curation** archives stale daily files and cleans queues (runs every 3 hours)

### Expected Curation State
- No JSONL message should be unread for > 4 hours (normal priority)
- No agent should be task-less for > 2 hours during business hours (08:00-22:00 AEDT)
- fleet-state.json `decisions_needed` should have 0 unresolved entries
- queue.json for each agent should have at least 1 pending or in_progress task during business hours
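
The last rule can be checked mechanically during an audit. A minimal sketch, assuming each `queue.json` has a `tasks` array with a `status` field (as in the `jq` example in section 3) and that the caller passes the current AEDT hour. The function names are hypothetical.

```javascript
const BUSINESS_START = 8; // 08:00 AEDT
const BUSINESS_END = 22; // 22:00 AEDT

function isBusinessHour(hour) {
  return hour >= BUSINESS_START && hour < BUSINESS_END;
}

// queues: { agentName: { tasks: [...] } } (assumed input shape)
// Returns the agents with no pending or in_progress task.
// Outside business hours nothing is flagged.
function agentsWithoutWork(queues, hour) {
  if (!isBusinessHour(hour)) return [];
  return Object.entries(queues)
    .filter(
      ([, q]) =>
        !q.tasks.some((t) => t.status === "pending" || t.status === "in_progress")
    )
    .map(([agent]) => agent);
}
```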

---

## 11. Security & Access

### VPS Access
| Who | Access Level | Method |
|-----|--------------|--------|
| Michael | Full root | SSH key |
| Rivet (agent) | ccuser + root via scripts | Local process |
| Builder (agent) | ccuser only | Local process |
| Other agents | ccuser only | Local process |

### Security Practices
- **API keys:** .env files only, NEVER in git, NEVER in session logs
- **Key rotation:** Quarterly (documented in Rivet's memory)
- **Log scrubbing:** Daily cron at 4am removes keys from session logs
- **External skills:** Scan before installing, prefer self-written
- **Plan mode:** For risky ops, propose before executing

### Known Incidents
- **2026-02-12:** 13 API keys found in session logs (scrubbed, no unauthorised usage)
- **Root cause:** Reading .env files in sessions → Mitigation: NEVER read .env in sessions

---

## 12. Cost Tracking

### Monthly Infrastructure Costs
| Item | Cost | Notes |
|------|------|-------|
| DigitalOcean VPS | ~$48/month | 4 vCPU, 8GB RAM |
| Domain (rateright.com.au) | ~$15/year | |
| Fly.io (v1 site) | ~$5/month | Minimal usage |
| Railway (Growth Engine) | ~$5/month | Auto-sleep |

### API/Model Costs (Estimated Monthly)
| Provider | Budget | Current Usage |
|----------|--------|---------------|
| Anthropic (Claude) | ~$50-100 | Rivet + Builder heavy users |
| DeepSeek | ~$5-10 | Fleet agents |
| ElevenLabs | Included in Creator tier | 110k chars/month limit |
| Twilio | Pay-per-use | ~$10/month for calls |

### Target Budget
- **Total monthly:** ~$100 approved by Michael
- **Breakeven for RateRight:** 2-3 hires/month @ $50 each
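
The breakeven figure is simple arithmetic: monthly cost divided by the $50 hire fee, rounded up. A minimal sketch that ignores payment processing fees and tax and treats cost and fee as the same currency. Both simplifications are assumptions.

```javascript
const FEE_PER_HIRE = 50; // from the breakeven line above

// Number of hires needed to cover a given monthly cost.
function hiresToCover(monthlyCost, feePerHire = FEE_PER_HIRE) {
  return Math.ceil(monthlyCost / feePerHire);
}

console.log(hiresToCover(100)); // 2
console.log(hiresToCover(150)); // 3
```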

---

## 13. Model Strategy

| Tier | Models | Used By | Cost |
|------|--------|---------|------|
| **Premium** | Opus 4.6 | Rivet, Builder | $$$ — decisions + code quality |
| **Standard** | Sonnet 4 | Herald, fallback for all | $$ — quality comms |
| **Budget** | DeepSeek Chat | Susan, Sentinel, Radar, Cog, Harper | $ — routine ops |
| **Free** | Moonshot Kimi K2.5 | (Currently suspended) | Free — bulk work |
| **Broken** | MiniMax M2.5 | None (crashes Clawdbot) | $ — DO NOT USE until fixed |

**Fallback principle:** Every agent must have at least 2 working fallbacks. Never rely on a single model.
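
An audit can check this principle against the Agent Registry in section 2. The sketch below copies the fallback lists from that table and only counts configured fallbacks. It cannot tell whether a fallback is actually working (for example, a model from a suspended provider).

```javascript
// Fallback lists copied from the Agent Registry table.
const FALLBACKS = {
  rivet: ["claude-sonnet", "deepseek"],
  builder: ["claude-sonnet", "kimi-k2"],
  susan: ["claude-sonnet"],
  harper: ["claude-sonnet", "gemini-flash"],
  sentinel: ["claude-sonnet"],
  radar: ["claude-sonnet"],
  herald: ["claude-sonnet", "kimi-k2.5"],
  cog: ["claude-sonnet"],
};
const MIN_FALLBACKS = 2;

// Agents with fewer configured fallbacks than the principle requires.
function underProvisioned(fallbacks, min = MIN_FALLBACKS) {
  return Object.keys(fallbacks).filter((agent) => fallbacks[agent].length < min);
}

console.log(underProvisioned(FALLBACKS)); // [ 'susan', 'sentinel', 'radar', 'cog' ]
```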

---

## 14. Known Issues & Tech Debt

### Active Issues
1. **MiniMax crashes Clawdbot** — `mapOptionsForApi: undefined` when MiniMax is primary. Agents switched to DeepSeek. Do not use MiniMax as primary until Clawdbot is updated.
2. **Moonshot suspended** — the $300 top-up has not been reflected on the account. Harper switched to the DeepSeek fallback chain.
3. **Herald on Opus** — expensive for message routing. Should be Sonnet or DeepSeek.
4. **Rivet service startup** — `clawdbot-gateway` may report `inactive` while Rivet is actually running (possibly a status-detection issue).

### Architectural Issues
5. **4 agents have questionable value** — Herald (overlaps Rivet), Cog (undefined role), Sentinel (premature), Radar (low ROI). Consolidation to 4-5 agents recommended.
6. **Memory inequality** — Rivet: 387 files. Cog/Herald: 0-1 files. Amnesiac agents waste tokens.
7. **Fleet-state.json is stale** — write-only, nobody cleans resolved items.
8. **21 `any` types** in RateRight codebase.
9. **0 real hires** through the app end-to-end.

### Resolved Issues (for reference)
- Susan crash loop (stale Moonshot session override) — fixed Feb 17
- MiniMax wrong API endpoint — fixed Feb 17
- Context overflow on Rivet — fixed Feb 17 (80K limit)
- Session archiver — running daily at 3am

---

## 15. Audit Checklist

Run this checklist to compare actual state to baseline:

### Services
- [ ] All 8 agent services active (`systemctl is-active`)
- [ ] All 8 agents HTTP-responsive on their ports
- [ ] rateright-app active on port 3000
- [ ] nginx active and proxying

### Health
- [ ] All status.json timestamps < 60 min old
- [ ] No agents in crash loop (check `journalctl -u clawdbot-{agent} --since "1 hour ago" | grep FAILURE`)
- [ ] Disk < 50%, RAM < 80%
- [ ] No `mapOptionsForApi: undefined` errors in any agent logs

### Communication
- [ ] BUILDER-INBOX.md has no unACK'd entries
- [ ] fleet-state.json decisions_needed is empty or all resolved
- [ ] JSONL inboxes don't have stale unread messages (>4h for normal priority)

### Memory
- [ ] All agents have memory/ files growing
- [ ] No agent has 0 memory files (indicates amnesia)

### Models
- [ ] All agents' primary model is responding (not suspended/broken)
- [ ] No stale session model overrides (check session JSONL for `model_change` entries)

### App
- [ ] https://rivet.rateright.com.au responds with 200
- [ ] CEO dashboard loads at /ceo
- [ ] Build passes: `cd /home/ccuser/the-50-dollar-app && npm run build`

---

## 16. Recovery Procedures

### Agent Won't Heartbeat
1. Check journalctl for error pattern
2. If `mapOptionsForApi: undefined` → model issue. Switch primary, archive sessions, restart
3. If `context` errors → archive sessions, restart
4. If service keeps crashing → check config validity: `clawdbot --profile {agent} config validate`

### Model Provider Down
1. Check if API responds: `curl https://api.{provider}.com/...`
2. If down, the fallback chain should handle it automatically
3. If all fallbacks fail, switch to known-working model (DeepSeek/Sonnet)
4. **Always archive sessions after model changes**
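
Conceptually, a fallback chain is "first model in the list that responds". The sketch below illustrates that idea only. It is not Clawdbot's implementation, and `pickModel` and the `isAvailable` callback are hypothetical names.

```javascript
// Return the first model in the chain that is available, or null if none is.
function pickModel(chain, isAvailable) {
  return chain.find((model) => isAvailable(model)) ?? null;
}

// Harper's chain from the Agent Registry: primary first, then fallbacks.
const harperChain = ["deepseek/deepseek-chat", "claude-sonnet", "gemini-flash"];
const down = new Set(["deepseek/deepseek-chat"]); // hypothetical outage

console.log(pickModel(harperChain, (m) => !down.has(m))); // claude-sonnet
```

A `null` result corresponds to step 3: every fallback failed and the model has to be switched by hand.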

### Session Override Bug
1. Find session: `ls /root/.clawdbot-{agent}/agents/main/sessions/`
2. Check for stale overrides: `grep -l model_change /root/.clawdbot-{agent}/agents/main/sessions/*.jsonl`
3. Archive/delete the session file
4. Restart agent: `systemctl restart clawdbot-{agent}`

### Full Fleet Restart
```bash
for agent in susan sentinel radar cog herald harper; do
  systemctl restart clawdbot-$agent
done
systemctl restart clawdbot-gateway  # Rivet
systemctl restart clawdbot-builder  # Builder
```

### Deployment Procedure (RateRight v2)
```bash
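# Chain with && so a failed pull, install, or build never triggers the restart
cd /home/ccuser/the-50-dollar-app &&
  git pull origin main &&
  npm install &&
  npm run build &&
  systemctl restart rateright-app  # runs only if every earlier step succeeded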
```

---

*This baseline should be reviewed and updated monthly, or after any major architecture change.*
*Last updated: 2026-02-18 09:15 AEDT by Rivet*
