# Inbound Intel Mining - Plan

## Problem
Inbound voicemails, SMS, and calls contain valuable intel (locations, headcount, timelines, objections, decision makers) that is currently logged but not analyzed. Reps have to read and remember everything manually.

## Solution
Auto-extract intel from every inbound communication and add it to the lead profile. No user effort is required; extraction runs in the background.

---

## What Already Exists (Reusable)

| Function | Location | What it does |
|----------|----------|--------------|
| `extractTranscriptIntel()` | src/services/ai.js:1147 | Extracts companies, projects, people, intel nuggets from text |
| `extractPersonalIntelFromConversations()` | src/services/ai.js:1211 | Deep analysis for personal + business intel |
| `saveExtractedIntel()` | src/services/ai.js:1376 | Persists to lead_intel table with merge |
| `extractAndSaveCallIntel()` | src/services/ai.js:1510 | Orchestrates extraction + save (used for outbound calls) |
| Intent classification | src/utils/intent.js | Classifies SMS as positive/negative/question |
| Buying signal detection | src/routes/webhooks.js | Detects high/medium/low signals |
| Voicemail transcription | src/routes/voice.js:509 | Deepgram transcribes recordings |

**Key insight:** The extraction functions already exist and work. We just need to CALL them from the inbound webhooks.

---

## What Needs to Be Built

### 1. SMS Intel Extraction Hook

**Location:** `src/routes/webhooks.js` - POST /api/webhooks/twilio/inbound

**Current flow:**
```
SMS arrives → normalize phone → find lead → classify intent → detect buying signal → log to communications → Slack alert
```

**New flow:**
```
SMS arrives → normalize phone → find lead → classify intent → detect buying signal → log to communications → Slack alert → TRIGGER INTEL EXTRACTION (async)
```

**Implementation:**
```javascript
// After logging communication (line ~180 in webhooks.js)
// Don't await - run async to not slow webhook response
if (lead) {
  extractInboundIntel(lead.id, messageBody, 'sms').catch(err =>
    console.error('Intel extraction failed:', err)
  );
}
```

### 2. Voicemail Intel Extraction Hook

**Location:** `src/routes/voice.js` - transcribeRecording() function

**Current flow:**
```
Voicemail recorded → Deepgram transcribes → Update communications with transcript → Done
```

**New flow:**
```
Voicemail recorded → Deepgram transcribes → Update communications with transcript → TRIGGER INTEL EXTRACTION (async)
```

**Implementation:**
```javascript
// After transcript saved to communications (line ~540 in voice.js)
if (communication.lead_id) {
  extractInboundIntel(communication.lead_id, transcript, 'voicemail').catch(err =>
    console.error('Voicemail intel extraction failed:', err)
  );
}
```

### 3. Inbound Call Intel Extraction Hook

**Location:** `src/routes/voice.js` - after inbound call recording is transcribed

**Current flow:**
```
Inbound call ends → Recording saved → Transcribed (if enabled) → Done
```

**New flow:**
```
Inbound call ends → Recording saved → Transcribed (if enabled) → TRIGGER INTEL EXTRACTION (async)
```
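
**Implementation (sketch):** A minimal sketch of the hook, mirroring the voicemail one. It assumes the communication row carries a `direction` field and that the extractor is passed in as an argument; the helper name `triggerInboundCallIntel` is hypothetical, not existing code.

```javascript
// Hypothetical helper; `direction` on the communication row is an assumption.
function triggerInboundCallIntel(communication, transcript, extractInboundIntel) {
  // Only inbound calls that are tied to a lead and have a transcript qualify.
  if (!communication || !communication.lead_id) return false;
  if (communication.direction !== 'inbound') return false;
  if (!transcript) return false;

  // Fire and forget: the promise is not awaited, so the caller returns at once.
  // Wrapping the call in a Promise turns a synchronous throw into a rejection.
  new Promise(resolve =>
    resolve(extractInboundIntel(communication.lead_id, transcript, 'inbound_call'))
  ).catch(err => console.error('Inbound call intel extraction failed:', err));

  return true;
}
```

Passing the extractor in keeps the hook easy to test with a stub; in `voice.js` it would presumably be the `extractInboundIntel()` function from section 4.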

### 4. New Unified Extraction Function

**Location:** `src/services/ai.js`

```javascript
/**
 * Extract intel from any inbound communication
 * @param {string} leadId - Lead UUID
 * @param {string} content - Message text or transcript
 * @param {string} source - 'sms', 'voicemail', 'inbound_call'
 */
async function extractInboundIntel(leadId, content, source) {
  // Skip if content too short
  if (!content || content.length < 20) return;

  // Get lead context
  const lead = await getLead(leadId);
  if (!lead) return;

  // Extract quick intel from this specific message
  const quickIntel = await extractQuickIntel(content, source);

  // If significant intel found, do deep extraction
  if (quickIntel.hasSignificantIntel) {
    await extractAndSaveCallIntel(leadId); // Reuse existing function
  } else {
    // Just save the quick intel
    await saveQuickIntel(leadId, quickIntel, source);
  }

  // Escalate the lead when the quick pass reports high urgency
  // (the quick-intel schema returns `urgency`, not `urgentFlags`)
  if (quickIntel.urgency === 'high') {
    await updateLeadUrgency(leadId, quickIntel.urgency);
  }
}
```

### 5. Quick Intel Extraction (Lightweight)

For every inbound message, run a fast, low-cost extraction with GPT-4o-mini instead of the full deep-analysis pass:

```javascript
async function extractQuickIntel(content, source) {
  // Use GPT-4o-mini for speed/cost
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{
      role: 'system',
      content: `Extract key intel from this ${source} message. Be concise.

Return JSON:
{
  "locations": ["suburb or area mentioned"],
  "headcount": "number of workers needed if mentioned",
  "timeline": "when they need it",
  "trade": "trade type if mentioned",
  "objection": "any concern or pushback",
  "question": "any question they asked",
  "decisionMaker": "other person mentioned (partner, boss)",
  "urgency": "high/medium/low",
  "hasSignificantIntel": true/false
}`
    }, {
      role: 'user',
      content: content
    }],
    response_format: { type: 'json_object' },
    max_tokens: 300
  });

  return JSON.parse(response.choices[0].message.content);
}
```
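
The model reply is not guaranteed to match the schema, and a reply cut off by `max_tokens` can be invalid JSON, so the parse step may be worth guarding. A minimal sketch, assuming a hypothetical `parseQuickIntel()` helper that falls back to an empty result; the defaults (`'low'` urgency, `null` for missing fields) are assumptions rather than existing behavior:

```javascript
// Hypothetical helper: normalizes the raw model reply into the quick-intel shape.
function parseQuickIntel(raw) {
  // Fallback used when the reply is unusable (these defaults are assumptions).
  const empty = {
    locations: [], headcount: null, timeline: null, trade: null,
    objection: null, question: null, decisionMaker: null,
    urgency: 'low', hasSignificantIntel: false
  };

  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return empty; // truncated or malformed reply
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) return empty;

  // Keep non-empty strings (numbers are stringified); anything else becomes null.
  const text = value => {
    if (typeof value === 'number') return String(value);
    return typeof value === 'string' && value.trim() ? value.trim() : null;
  };

  return {
    locations: Array.isArray(data.locations) ? data.locations.map(text).filter(Boolean) : [],
    headcount: text(data.headcount),
    timeline: text(data.timeline),
    trade: text(data.trade),
    objection: text(data.objection),
    question: text(data.question),
    decisionMaker: text(data.decisionMaker),
    urgency: ['high', 'medium', 'low'].includes(data.urgency) ? data.urgency : 'low',
    hasSignificantIntel: data.hasSignificantIntel === true
  };
}
```

With a helper like this, `extractQuickIntel()` could return `parseQuickIntel(response.choices[0].message.content)` so that a bad reply results in "no intel" rather than a thrown error.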

---

## Database Changes

**None required!** All storage tables already exist:
- `lead_intel` - personal and business intel
- `communications.metadata` - per-message intel
- `lead.metadata` - aggregated signals
- `lead_dossier` - living document (optional enhancement)

**Optional enhancement:** Add `intel_sources` tracking
```sql
-- Track where intel came from
ALTER TABLE lead_intel ADD COLUMN IF NOT EXISTS
  intel_sources JSONB DEFAULT '[]';
-- Example: [{"source": "sms", "date": "2026-01-18", "excerpt": "need 12 concreters"}]
```
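
If the optional column is added, each extraction could append one entry in the shape shown in the SQL comment. A minimal sketch, assuming a hypothetical `appendIntelSource()` helper and an 80-character excerpt cap (both are illustrations, not existing code):

```javascript
// Hypothetical helper: returns a new intel_sources array with one entry appended.
// The 80-character excerpt cap is an assumption, not a documented limit.
function appendIntelSource(existing, source, excerpt, at = new Date()) {
  const entry = {
    source,                              // 'sms', 'voicemail', or 'inbound_call'
    date: at.toISOString().slice(0, 10), // UTC date, YYYY-MM-DD
    excerpt: String(excerpt).trim().slice(0, 80)
  };
  // Tolerate a null or missing column value by starting from an empty list.
  return [...(Array.isArray(existing) ? existing : []), entry];
}
```

The function does not mutate its input, so the result can be written back to the JSONB column in a single update.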

---

## What Gets Extracted

| From SMS | Example |
|----------|---------|
| Locations | "we're in Parramatta" → Location: Parramatta |
| Headcount | "need about 12 guys" → Headcount: 12 |
| Timeline | "starting next month" → Timeline: Next month |
| Decision makers | "checking with my partner" → Decision maker: Has partner |
| Objections | "bit pricey" → Objection: Price concern |
| Questions | "do you cover Newcastle?" → Question: Service area |
| Trades | "need sparkies" → Trade: Electrical |

| From Voicemail | Example |
|----------------|---------|
| All of the above | Same extraction |
| Callback request | "give us a bell" → Wants callback |
| Urgency | "ASAP" or "no rush" → Urgency level |
| Name confirmation | "it's Tony here" → Confirmed first name |
| Company name | "from Acme Formwork" → Company: Acme Formwork |

---

## UI Changes

### Lead Profile - Intel Section

Add "Auto-captured" badge to intel that came from inbound:

```
+------------------------------------------+
| INTEL                                     |
+------------------------------------------+
| 📍 Location: Parramatta [Auto-captured]  |
| 👷 Needs: 12 concreters [Auto-captured]  |
| 📅 Timeline: Next month [Auto-captured]  |
| 👥 Decision: Has business partner        |
| 🚧 Objection: Price concern              |
+------------------------------------------+
| Last updated: 2 hours ago from SMS       |
+------------------------------------------+
```
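
The footer line in the mockup could be produced by a small formatter. A minimal sketch, assuming a hypothetical `formatLastUpdated()` helper that receives the update time as epoch milliseconds; the display labels for each source are assumptions:

```javascript
// Hypothetical helper for the "Last updated" footer in the Intel section.
function formatLastUpdated(updatedAtMs, source, nowMs = Date.now()) {
  // Display labels per source (assumed wording).
  const labels = { sms: 'SMS', voicemail: 'voicemail', inbound_call: 'inbound call' };
  const plural = (n, unit) => `${n} ${unit}${n === 1 ? '' : 's'} ago`;

  const minutes = Math.floor((nowMs - updatedAtMs) / 60000);
  let age;
  if (minutes < 1) age = 'just now';
  else if (minutes < 60) age = plural(minutes, 'minute');
  else if (minutes < 1440) age = plural(Math.floor(minutes / 60), 'hour');
  else age = plural(Math.floor(minutes / 1440), 'day');

  return `Last updated: ${age} from ${labels[source] || source}`;
}
```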

### Call Prep Page Enhancement

Show auto-captured intel prominently:

```
+------------------------------------------+
| WHAT WE KNOW (Auto-captured)             |
+------------------------------------------+
| From their voicemail yesterday:          |
| "Big job in Parramatta, need 12 guys"    |
|                                          |
| From SMS 3 days ago:                     |
| "Checking with business partner first"   |
+------------------------------------------+
```

---

## Async Processing

**Critical:** Don't slow down webhooks. All extraction runs async.

```javascript
// In webhook handler
res.status(200).send('OK'); // Respond to Twilio immediately

// Then process async (fire and forget with error handling)
setImmediate(async () => {
  try {
    await extractInboundIntel(leadId, content, source);
  } catch (err) {
    console.error('Async intel extraction failed:', err);
    // Log to error tracking but don't crash
  }
});
```

---

## Cost Control

| Measure | Implementation |
|---------|----------------|
| Min content length | Skip extraction if < 20 chars |
| Use GPT-4o-mini | Fast and cheap for quick extraction |
| Cache results | Don't re-extract same message |
| Throttle per lead | Max 5 extractions per lead per hour |
| Skip duplicates | Don't extract if identical recent message |
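
The length, throttle, and duplicate rules above can be combined into one gate that runs before any model call. A minimal in-memory sketch, assuming a single Node process and a hypothetical `shouldExtract()` helper; a multi-instance deployment would need a shared store instead of the `Map`:

```javascript
// In-memory sketch only: state is lost on restart and not shared across processes.
const MIN_LENGTH = 20;          // skip extraction below this many characters
const MAX_PER_HOUR = 5;         // max extractions per lead per hour
const HOUR_MS = 60 * 60 * 1000;
const recentByLead = new Map(); // leadId -> [{ at, text }]

// Hypothetical gate: returns true if this message should be sent for extraction.
function shouldExtract(leadId, content, now = Date.now()) {
  if (!content || content.length < MIN_LENGTH) return false;

  // Normalize so trivially different copies count as the same message.
  const text = content.trim().toLowerCase();

  // Drop entries older than one hour, then apply throttle and duplicate checks.
  const recent = (recentByLead.get(leadId) || []).filter(e => now - e.at < HOUR_MS);
  const allowed = recent.length < MAX_PER_HOUR && !recent.some(e => e.text === text);

  if (allowed) recent.push({ at: now, text });
  recentByLead.set(leadId, recent);
  return allowed;
}
```

Calling `shouldExtract(leadId, content)` at the top of `extractInboundIntel()` would replace the bare length check there.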

---

## Implementation Order

1. [ ] Create `extractInboundIntel()` function in ai.js
2. [ ] Create `extractQuickIntel()` lightweight extractor
3. [ ] Hook into SMS webhook (webhooks.js)
4. [ ] Hook into voicemail transcription (voice.js)
5. [ ] Hook into inbound call transcription (voice.js)
6. [ ] Add async processing wrapper
7. [ ] Test with real inbound SMS
8. [ ] Test with voicemail
9. [ ] Update Lead Profile UI to show auto-captured intel
10. [ ] Update Call Prep to surface recent intel

---

## Success Metrics

- Intel extracted per day (should track inbound volume, less the messages skipped by the cost controls)
- % of leads with auto-captured intel
- Intel accuracy (spot check)
- Time to first intel capture after signup
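
The coverage metric could be computed from lead rows once source tracking exists. A minimal sketch, assuming the optional `intel_sources` column is in place and that rows are already loaded in memory; `autoCaptureCoverage()` is a hypothetical helper, and a production version would more likely be a SQL aggregate:

```javascript
// Hypothetical helper: percentage of leads with at least one auto-captured intel entry.
// Assumes each row exposes the optional `intel_sources` array.
function autoCaptureCoverage(leads) {
  if (leads.length === 0) return 0;
  const withIntel = leads.filter(
    lead => Array.isArray(lead.intel_sources) && lead.intel_sources.length > 0
  ).length;
  // Round to one decimal place.
  return Math.round((withIntel / leads.length) * 1000) / 10;
}
```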

---

## Estimated Effort

| Phase | Effort |
|-------|--------|
| Backend extraction functions | 0.5 day |
| Hook into webhooks | 0.5 day |
| Testing | 0.5 day |
| UI updates (optional) | 0.5 day |
| **Total** | **1.5-2 days** |

---

## Why This Should Be High Priority

1. **Foundational** - Better intel makes EVERY other feature smarter:
   - Call Prep briefs become richer
   - Live Copilot has more context
   - Actionable Playbook messages are more personalized
   - AI scoring is more accurate

2. **Zero friction** - Works in background, no user action needed

3. **Low effort** - Most code already exists, just wiring it up

4. **Compounds over time** - Every conversation makes the lead profile richer

**Recommended priority: #15** (before Tooltips, Battleground, Playbook)

---

*Created: Jan 18, 2026*
