# AI Call Scoring

> Auto-score calls 1-10, flag best/worst for coaching

**Created:** Jan 20, 2026
**Status:** Complete ✅
**Built:** Jan 21, 2026
**Effort:** 8-12 hours
**Ongoing Cost:** ~$20-40/mo (additional GPT-4 calls)

---

## Problem

No objective way to measure call quality:
- Which calls were excellent? Which need coaching?
- What patterns lead to successful calls?
- How is each rep improving over time?

Gong charges $1000+/user for this. We can build it for ~$30/mo.

---

## Solution

AI analyzes every call transcript and scores on multiple dimensions.

### Scoring Dimensions

| Dimension | Weight | What It Measures |
|-----------|--------|------------------|
| **Rapport Building** | 15% | Personal connection, warmth, listening |
| **Discovery** | 20% | Questions asked, needs uncovered |
| **Value Proposition** | 20% | Benefits explained, differentiation |
| **Objection Handling** | 20% | Responses to concerns, pivot skill |
| **Close Attempt** | 15% | Asked for commitment, next steps |
| **Professionalism** | 10% | Tone, language, confidence |

### Score Output

```json
{
  "call_id": "uuid",
  "overall_score": 7.2,
  "grade": "B",
  "dimensions": {
    "rapport": { "score": 8, "notes": "Good opener, asked about weekend" },
    "discovery": { "score": 6, "notes": "Only 2 questions asked, missed budget" },
    "value_prop": { "score": 7, "notes": "Mentioned exclusive leads, forgot pricing" },
    "objection_handling": { "score": 8, "notes": "Handled 'too expensive' well" },
    "close": { "score": 5, "notes": "No clear ask for next step" },
    "professionalism": { "score": 9, "notes": "Confident, friendly tone" }
  },
  "highlights": [
    "Great rapport - lead mentioned their kids",
    "Strong objection pivot on pricing"
  ],
  "improvements": [
    "Ask more discovery questions before pitching",
    "Always end with clear next step"
  ],
  "coaching_tip": "Try the 3-question rule: ask 3 questions before presenting value",
  "similar_successful_calls": ["uuid1", "uuid2"]
}
```

### Grade Scale

| Grade | Score | Label |
|-------|-------|-------|
| A+ | 9.5-10 | Exceptional |
| A | 8.5-9.4 | Excellent |
| B+ | 7.5-8.4 | Good |
| B | 6.5-7.4 | Solid |
| C+ | 5.5-6.4 | Needs Work |
| C | 4.5-5.4 | Below Average |
| D | 3.0-4.4 | Poor |
| F | <3.0 | Coaching Required |

---

## Database Schema

```sql
CREATE TABLE IF NOT EXISTS call_scores (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  communication_id UUID REFERENCES communications(id) ON DELETE CASCADE,
  lead_id UUID REFERENCES leads(id) ON DELETE SET NULL,
  user_id UUID REFERENCES auth.users(id) ON DELETE SET NULL,

  -- Scores (1-10)
  overall_score DECIMAL(3,1) NOT NULL,
  grade VARCHAR(2) NOT NULL,
  rapport_score DECIMAL(3,1),
  discovery_score DECIMAL(3,1),
  value_prop_score DECIMAL(3,1),
  objection_score DECIMAL(3,1),
  close_score DECIMAL(3,1),
  professionalism_score DECIMAL(3,1),

  -- Analysis
  dimension_notes JSONB,
  highlights TEXT[],
  improvements TEXT[],
  coaching_tip TEXT,

  -- Metadata
  transcript_length INTEGER,
  call_duration_seconds INTEGER,
  had_objection BOOLEAN DEFAULT FALSE,
  outcome VARCHAR(50),
  scored_at TIMESTAMPTZ DEFAULT NOW(),

  UNIQUE(communication_id)
);

-- Indexes
CREATE INDEX idx_call_scores_user ON call_scores(user_id);
CREATE INDEX idx_call_scores_date ON call_scores(scored_at DESC);
CREATE INDEX idx_call_scores_grade ON call_scores(grade);
CREATE INDEX idx_call_scores_overall ON call_scores(overall_score DESC);
```

---

## Scoring Prompt

```javascript
const SCORING_PROMPT = `
You are a sales call analyst. Score this call transcript on a 1-10 scale.

TRANSCRIPT:
${transcript}

CONTEXT:
- Lead: ${leadName}, ${trade}
- Call Duration: ${duration} minutes
- Outcome: ${outcome}

Score each dimension (1-10):
1. RAPPORT: Did rep build personal connection? Listen actively?
2. DISCOVERY: Did rep ask good questions? Uncover needs/pain?
3. VALUE PROP: Did rep explain benefits clearly? Differentiate?
4. OBJECTION HANDLING: How well did rep address concerns?
5. CLOSE: Did rep ask for commitment? Set clear next step?
6. PROFESSIONALISM: Tone, confidence, language quality?

Respond in JSON format:
{
  "overall_score": 7.2,
  "grade": "B",
  "dimensions": {
    "rapport": { "score": 8, "notes": "..." },
    ...
  },
  "highlights": ["...", "..."],
  "improvements": ["...", "..."],
  "coaching_tip": "..."
}
`;
```

---

## API Endpoints

```
GET  /api/calls/:id/score              # Get call score
POST /api/calls/:id/score              # Trigger scoring (manual)
GET  /api/calls/scores?user=id         # List scores for user
GET  /api/calls/scores/best            # Top 10 calls this month
GET  /api/calls/scores/needs-coaching  # Calls scoring <5
GET  /api/calls/scores/stats           # Aggregate stats
```

---

## Auto-Scoring Integration

```javascript
// In src/routes/voice.js after recording completes
async function onRecordingComplete(callData) {
  // Existing: Save transcript
  await saveTranscript(callData);

  // New: Queue for scoring
  await scoreQueue.add({
    communicationId: callData.id,
    transcript: callData.transcript,
    leadId: callData.leadId,
    userId: callData.userId,
    duration: callData.duration
  });
}

// Score processor (can be async/background)
async function processCallScore(job) {
  const score = await analyzeCallWithAI(job.data);
  await saveCallScore(score);

  // Notify if exceptional or needs coaching
  if (score.grade === 'A+') {
    await notifyExceptionalCall(job.data.userId, score);
  } else if (score.overall_score < 5) {
    await flagForCoaching(job.data.userId, score);
  }
}
```

---

## UI Components

### CallScoreCard.jsx
- Shows on Lead Profile next to call in history
- Grade badge (A/B/C/D/F with color)
- Click to expand full analysis

### CallScoreModal.jsx
- Full breakdown of all dimensions
- Highlights and improvements
- "Watch similar successful calls" links

### CallScoresDashboard.jsx
- Rep's score trends over time
- Best/worst calls list
- Dimension breakdown chart
- Compare to team average

### CoachingQueue.jsx (Manager View)
- List of calls needing review
- Filter by rep, grade, date
- Quick actions: Send feedback, Schedule 1:1

---

## Implementation Steps

### Phase 1: Database + Scoring (3-4 hours)
1. [x] Create migration
2. [x] Create scoring service with GPT-4
3. [x] Create score calculation logic
4. [ ] Add to post-call flow (needs frontend wiring)

### Phase 2: API + Background Job (2-3 hours)
1. [x] Create call scores routes
2. [ ] Set up background scoring queue (future enhancement)
3. [x] Implement scoring processor
4. [ ] Add notification triggers (future enhancement)

### Phase 3: Frontend - Call View (2-3 hours)
1. [x] Create AICallScoreCard
2. [x] Create score modal (integrated into card)
3. [ ] Add to LeadProfile call history (needs integration)

### Phase 4: Frontend - Dashboard (2-3 hours)
1. [ ] Create CallScoresDashboard (use existing CallQualityDashboard)
2. [ ] Add trends chart (use existing trends)
3. [ ] Create CoachingQueue for managers (use existing coaching alerts)

---

## Cost Estimate

| Volume | GPT-4 Cost |
|--------|------------|
| 50 calls/day | ~$15/mo |
| 100 calls/day | ~$30/mo |
| 200 calls/day | ~$60/mo |

Assumes ~500 tokens input, ~300 tokens output per call.

---

## Success Metrics

- Every call scored within 5 minutes of completion
- Rep average scores improve 10% over 30 days
- Coaching interventions on <5 score calls
- Top calls shared for team learning