# TOOLS.md — Sentinel's Operational Toolbox

---

## System Monitoring

```bash
# CPU and load
uptime
cat /proc/loadavg

# Memory
free -h

# Disk
df -h /
du -sh /home/ccuser/*/         # Per-directory breakdown
du -sh /tmp/                   # Temp files (frequent offender)
journalctl --disk-usage        # Journal size

# Processes
ps aux --sort=-%mem | head -20  # Top memory consumers
ps aux --sort=-%cpu | head -20  # Top CPU consumers
```
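
To compare these readings against the RAM and disk thresholds in this file, the raw output has to be reduced to a percentage. The helpers below are a minimal sketch: `mem_used_pct` and `disk_used_pct` are hypothetical names, and "used" RAM is assumed to mean `MemTotal - MemAvailable`, which may differ from how the fleet tooling defines it.

```bash
# Print RAM used as an integer percentage, reading /proc/meminfo-format text on stdin.
# Assumption: "used" = MemTotal - MemAvailable.
mem_used_pct() {
  awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
       END { if (t > 0) printf "%d\n", (t - a) * 100 / t }'
}

# Print disk used on a mount point (default /) as an integer percentage.
# Relies on GNU df's --output option, as shipped with Ubuntu.
disk_used_pct() {
  df --output=pcent "${1:-/}" | tail -n 1 | tr -dc '0-9'
  echo
}
```

Typical use would be `mem_used_pct < /proc/meminfo` and `disk_used_pct /`.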

## Agent Fleet Management

### Service Control
```bash
# Status check (all 8)
systemctl is-active clawdbot-gateway clawdbot-builder clawdbot-susan clawdbot-harper clawdbot-sentinel clawdbot-radar clawdbot-herald clawdbot-cog

# Restart an agent
sudo systemctl restart clawdbot-<name>

# View logs (live)
sudo journalctl -u clawdbot-<name> -f --no-pager

# Recent logs
sudo journalctl -u clawdbot-<name> --since "1 hour ago" --no-pager

# Count restarts in 24h
sudo journalctl -u clawdbot-<name> --since "24 hours ago" --no-pager | grep -c "Scheduled restart"
```
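
`systemctl is-active` prints one state per line without the unit name, so a failing agent is hard to spot in the combined status check. A possible wrapper that pairs each unit with its state is sketched below; `fleet_status`, `UNITS`, and the `SYSTEMCTL` override are hypothetical names introduced for illustration.

```bash
# Unit names copied from the status-check command above.
UNITS="clawdbot-gateway clawdbot-builder clawdbot-susan clawdbot-harper clawdbot-sentinel clawdbot-radar clawdbot-herald clawdbot-cog"

# Print "<unit> <state>" for every agent service.
# SYSTEMCTL can be pointed at a stub for dry runs; it defaults to systemctl.
fleet_status() {
  local unit state
  for unit in $UNITS; do
    # is-active exits non-zero for inactive units, so ignore the exit code.
    state=$("${SYSTEMCTL:-systemctl}" is-active "$unit" 2>/dev/null) || true
    printf '%s %s\n' "$unit" "${state:-unknown}"
  done
}
```

Running `fleet_status | grep -v ' active$'` would then list only the units that are not active.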

### Agent Ports & Tokens
| Agent | Port | Token | Service |
|-------|------|-------|---------|
| Rivet | 18789 | rivet-api-2026 | clawdbot-gateway |
| Builder | 18790 | builder-api-2026 | clawdbot-builder |
| Susan | 18792 | susan-api-2026 | clawdbot-susan |
| Harper | 18796 | harper-api-2026 | clawdbot-harper |
| Sentinel | 18800 | sentinel-api-2026 | clawdbot-sentinel |
| Radar | 18804 | radar-api-2026 | clawdbot-radar |
| Herald | 18808 | herald-api-2026 | clawdbot-herald |
| Cog | 18812 | cog-api-2026 | clawdbot-cog |

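Each agent uses three ports: the base port listed above (gateway), base+2 (browser), and base+3 (canvas). For example, Sentinel's gateway is on 18800, its browser on 18802, and its canvas on 18803.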
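
The offset rule can be applied mechanically to any base port from the table. This is a small sketch; `agent_ports` is a hypothetical helper name.

```bash
# Print the three ports for an agent, given its base port from the table.
# Offsets (+2 browser, +3 canvas) follow the rule stated above.
agent_ports() {
  local base=$1
  printf 'gateway=%d browser=%d canvas=%d\n' "$base" "$((base + 2))" "$((base + 3))"
}
```

For instance, `agent_ports 18804` would print Radar's three ports.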

### Config Paths
```
/root/.clawdbot/clawdbot.json            # Rivet
/root/.clawdbot-builder/clawdbot.json    # Builder
/root/.clawdbot-susan/clawdbot.json      # Susan
/root/.clawdbot-harper/clawdbot.json     # Harper
/root/.clawdbot-sentinel/clawdbot.json   # Sentinel
/root/.clawdbot-radar/clawdbot.json      # Radar
/root/.clawdbot-herald/clawdbot.json     # Herald
/root/.clawdbot-cog/clawdbot.json        # Cog
```

## Inter-Agent Communication

### Inbox System (Primary)
```bash
# Read unread messages
node /home/ccuser/shared/scripts/inbox.js read --agent sentinel --unread

# Acknowledge a message
node /home/ccuser/shared/scripts/inbox.js ack --agent sentinel --id <msg-id>

# Send to another agent
node /home/ccuser/shared/scripts/inbox.js send --from sentinel --to <agent> --subject "Subject" --body "Body"
```

### Fleet State
```bash
# Compact briefing
node /home/ccuser/shared/scripts/fleet-cli.js briefing sentinel --brief

# Full briefing
node /home/ccuser/shared/scripts/fleet-cli.js briefing sentinel

# Update my status
node /home/ccuser/shared/scripts/fleet-update.js sentinel --status active --task "description"

# Alert
node /home/ccuser/shared/scripts/fleet-cli.js alert sentinel "message"

# Decision needed
node /home/ccuser/shared/scripts/fleet-cli.js decide sentinel "question"
```

### HTTP Bridge
```bash
# Check agent status
agent-bridge <target> status

# Wake an agent
agent-bridge <target> wake "message"

# Direct HTTP (manual)
curl -s -X POST http://127.0.0.1:<PORT>/tools/invoke \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"tool":"session_status","args":{}}'
```
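
The manual call needs the right port for each target. One way to avoid looking it up by hand is a name-to-port lookup built from the Agent Ports & Tokens table, sketched below. The function names `agent_port`, `invoke_url`, and `session_status` are hypothetical, and the token is passed as an argument rather than embedded.

```bash
# Map an agent name to its gateway port (values copied from the ports table).
agent_port() {
  case "$1" in
    rivet)    echo 18789 ;;
    builder)  echo 18790 ;;
    susan)    echo 18792 ;;
    harper)   echo 18796 ;;
    sentinel) echo 18800 ;;
    radar)    echo 18804 ;;
    herald)   echo 18808 ;;
    cog)      echo 18812 ;;
    *)        echo "unknown agent: $1" >&2; return 1 ;;
  esac
}

# Build the /tools/invoke URL for an agent.
invoke_url() {
  local port
  port=$(agent_port "$1") || return 1
  printf 'http://127.0.0.1:%s/tools/invoke\n' "$port"
}

# POST session_status to an agent: session_status <agent> <token>
session_status() {
  local url
  url=$(invoke_url "$1") || return 1
  curl -s -X POST "$url" \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d '{"tool":"session_status","args":{}}'
}
```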

## Critical Services

### RateRight App
```bash
# Health check
curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://localhost:3000/

# External check
curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" https://rivet.rateright.com.au/

# Service control
systemctl status rateright-app
sudo systemctl restart rateright-app
```
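
The health-check output (`<code> <seconds>s`) can be mapped onto the RateRight row of the Alert Thresholds table: any non-200 status is critical, and a response slower than 500ms is a warning. The following is a rough sketch; `classify_http` is a hypothetical helper name.

```bash
# Classify a health-check result: classify_http <http_code> <seconds>[s]
classify_http() {
  local code=$1 secs=${2%s}   # strip the trailing "s" added by the -w format
  if [ "$code" != "200" ]; then
    echo critical
  elif awk -v t="$secs" 'BEGIN { exit !(t > 0.5) }'; then
    echo warning              # HTTP 200 but slower than 500ms
  else
    echo ok
  fi
}
```

It could be fed directly from the check, for example `classify_http $(curl -s -o /dev/null -w "%{http_code} %{time_total}s" http://localhost:3000/)`. When curl cannot connect it reports `000` as the status code, which this sketch treats as critical.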

### SSL Certificates
```bash
# Check expiry
echo | openssl s_client -servername <domain> -connect <domain>:443 2>/dev/null | openssl x509 -noout -dates
```
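
The expiry check prints a `notAfter=` date, while the alert thresholds are expressed in days remaining. A conversion along these lines may help; `cert_days_left` is a hypothetical name, and the sketch assumes GNU `date` (for `-d`), which Ubuntu provides.

```bash
# Convert a "notAfter=..." line into whole days remaining.
# Usage: cert_days_left "<notAfter line>" [now_epoch]
# The optional second argument overrides the current time (epoch seconds).
cert_days_left() {
  local not_after=${1#notAfter=}
  local now=${2:-$(date +%s)}
  local expiry
  expiry=$(date -u -d "$not_after" +%s) || return 1
  echo $(( (expiry - now) / 86400 ))
}
```

To get only the `notAfter=` line, the final stage of the pipeline above can use `openssl x509 -noout -enddate` instead of `-dates`.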

### Network
```bash
# Listening ports
ss -tlnp

# Check specific port
ss -tlnp | grep ':<port> '     # Anchor on ":<port> " so PIDs and longer ports do not match

# External connectivity
curl -s -o /dev/null -w "%{http_code}\n" https://rateright.com.au/
```

## Cleanup Operations

```bash
# Stale pip temp (recurring issue)
rm -rf /tmp/pip-unpack-* /tmp/pip-build-env-* /tmp/pip-install-*

# Old journal logs
sudo journalctl --vacuum-size=1G

# List lock files (stale ones can block agent startup)
ls /tmp/clawdbot-0/gateway.*.lock
```
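
The `ls` above shows every lock file, fresh or not. A sketch that narrows the list by age follows. It assumes that modification time is a reasonable proxy for "stale", which this file does not confirm; `stale_locks` is a hypothetical name, and the function only lists files without deleting anything.

```bash
# List gateway lock files older than a given age.
# Usage: stale_locks [dir] [minutes]   (defaults: /tmp/clawdbot-0, 60)
stale_locks() {
  local dir=${1:-/tmp/clawdbot-0} minutes=${2:-60}
  find "$dir" -maxdepth 1 -name 'gateway.*.lock' -mmin "+$minutes" 2>/dev/null
}
```

Before removing anything it reports, it is worth confirming that the owning agent is actually stopped.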

## VPS Details
- **Provider:** DigitalOcean syd1
- **Specs:** 2 vCPU, 8GB RAM, 80GB disk
- **IP:** 134.199.153.159
- **OS:** Ubuntu, Linux 6.8.0-100-generic (x64)

## Alert Thresholds
| Metric | Warning | Critical |
|--------|---------|----------|
| CPU load | >3.0 (sustained) | >4.0 |
| RAM used | >70% | >85% |
| Disk used | >70% | >85% |
| Agent down | 1–2 agents | >2 agents |
| RateRight app | >500ms response | HTTP non-200 |
| SSL cert expiry | <30 days | <7 days |
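
The numeric rows of this table can be expressed as a single comparison helper. The sketch below uses strictly-greater-than comparisons, as the table does; `classify`, `ram_or_disk`, and `cpu_load` are hypothetical names. It checks one sample only, so the "sustained" qualifier on CPU load would need repeated sampling on top of it.

```bash
# Classify a value against warning and critical thresholds.
# Usage: classify <value> <warning> <critical>
classify() {
  awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (v > c)      print "critical"
    else if (v > w) print "warning"
    else            print "ok"
  }'
}

# Thresholds copied from the table above.
ram_or_disk() { classify "$1" 70 85; }    # percent used
cpu_load()    { classify "$1" 3.0 4.0; }  # load average
```

For example, `cpu_load "$(cut -d' ' -f1 /proc/loadavg)"` classifies the current 1-minute load.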
