# Fix Missing API Timeouts Plan

## Problem
External API calls have no timeout configured. If an external service hangs, requests hang indefinitely.

**Impact:**
- Resource exhaustion (connections held open)
- Worker thread blocking
- User-facing requests time out at the load balancer instead of failing fast (bad UX)
- Memory growth from pending promises that never settle

## Severity
**CRITICAL** - Can cause cascading failures and service outages.

## Affected Files

### Services Using fetch() Without Timeout
| File | Line | External Service |
|------|------|------------------|
| `src/services/perplexity.js` | 24 | Perplexity AI API |
| `src/services/slack.js` | 36 | Slack Webhooks |
| `src/services/platformSync.js` | 33 | RateRight Platform API |
| `src/services/qualityAudit.js` | 147, 242, 272 | Health checks, OpenAI |

### Services Using OpenAI SDK Without Timeout
| File | Line | Issue |
|------|------|-------|
| `src/services/ai.js` | 9-11 | OpenAI client is created without an explicit `timeout` (the SDK default of 10 minutes applies) |

## Solution

### 1. Add Timeout Helper for fetch()
Create a utility function:

```javascript
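// src/utils/fetchWithTimeout.js

/**
 * fetch() wrapper that aborts the request if no response arrives within timeoutMs.
 * On timeout the returned promise rejects with an error named 'AbortError'.
 */
async function fetchWithTimeout(url, options = {}, timeoutMs = 30000) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);

  try {
    // Note: this signal replaces any `signal` the caller passed in options.
    const response = await fetch(url, {
      ...options,
      signal: controller.signal,
    });
    return response;
  } finally {
    // The timer is cleared as soon as the response headers arrive, so reading
    // the body (response.json(), response.text()) is not covered by the timeout.
    clearTimeout(timeoutId);
  }
}

module.exports = { fetchWithTimeout };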
```

### 2. Configure OpenAI Client Timeout
```javascript
// src/services/ai.js
openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  timeout: 60000, // 60 seconds per attempt (AI calls can be slow)
  maxRetries: 2,  // Timed-out requests are retried, so the worst case is about 3 x 60s
});
```

### 3. Update fetch() Calls

**perplexity.js:**
```javascript
const { fetchWithTimeout } = require('../utils/fetchWithTimeout');

const response = await fetchWithTimeout(PERPLEXITY_API_URL, {
  method: 'POST',
  headers: {...},
  body: JSON.stringify({...}),
}, 45000); // 45 second timeout
```

**slack.js:**
```javascript
const response = await fetchWithTimeout(webhookUrl, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
}, 10000); // 10 second timeout (Slack is fast)
```

**platformSync.js:**
```javascript
const response = await fetchWithTimeout(url, {
  method: 'GET',
  headers: {...},
}, 30000); // 30 second timeout
```
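**qualityAudit.js:**

The plan lists `qualityAudit.js` health checks with a 5 second timeout but does not show the call. A minimal sketch of how such a check might look follows. `checkHealth`, `HEALTH_CHECK_TIMEOUT_MS`, and the returned object shape are hypothetical names chosen for illustration. The `fetcher` parameter stands in for `fetchWithTimeout` so the snippet is self-contained.

```javascript
// Hypothetical health-check helper; not the actual qualityAudit.js code.
const HEALTH_CHECK_TIMEOUT_MS = 5000; // 5 second timeout (quick validation)

// `fetcher` is expected to have the same signature as fetchWithTimeout:
// (url, options, timeoutMs) => Promise<Response>
async function checkHealth(url, fetcher) {
  try {
    const response = await fetcher(url, { method: 'GET' }, HEALTH_CHECK_TIMEOUT_MS);
    return { healthy: response.ok, status: response.status };
  } catch (error) {
    // A timeout surfaces as AbortError; anything else is treated as a network failure.
    const reason = error.name === 'AbortError' ? 'timeout' : 'network_error';
    return { healthy: false, reason };
  }
}
```

In the service itself the call would presumably be `checkHealth(url, fetchWithTimeout)`. Passing the fetcher as an argument also makes the timeout path easy to exercise in tests.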

## Timeout Values

| Service | Timeout | Reason |
|---------|---------|--------|
| OpenAI | 60s | AI generation can be slow |
| Perplexity | 45s | Research queries take time |
| Slack | 10s | Should be very fast |
| Platform Sync | 30s | Database queries |
| Health Checks | 5s | Quick validation |
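
One way to keep these values from drifting apart across services is a single shared constant. The sketch below is only a suggestion: `TIMEOUTS_MS` and `getTimeoutMs` are hypothetical names that do not exist in the codebase, and the numbers are copied from the table above.

```javascript
// Hypothetical shared config; names are illustrative, values come from the table above.
const TIMEOUTS_MS = Object.freeze({
  openai: 60000,       // AI generation can be slow
  perplexity: 45000,   // Research queries take time
  slack: 10000,        // Should be very fast
  platformSync: 30000, // Database queries
  healthCheck: 5000,   // Quick validation
});

// Look up a timeout and fail loudly on an unknown service name,
// rather than silently passing `undefined` as the timeout.
function getTimeoutMs(service) {
  if (!Object.prototype.hasOwnProperty.call(TIMEOUTS_MS, service)) {
    throw new Error(`No timeout configured for "${service}"`);
  }
  return TIMEOUTS_MS[service];
}
```

A call site would then read `fetchWithTimeout(webhookUrl, options, getTimeoutMs('slack'))` instead of repeating the literal.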

## Implementation Steps
- [x] Phase 1: Create `src/utils/fetchWithTimeout.js` helper ✅
- [x] Phase 2: Add timeout to OpenAI client in `src/services/ai.js` ✅
- [x] Phase 3: Update `src/services/perplexity.js` to use fetchWithTimeout ✅
- [x] Phase 4: Update `src/services/slack.js` to use fetchWithTimeout ✅
- [x] Phase 5: Update `src/services/platformSync.js` to use fetchWithTimeout ✅
- [x] Phase 6: Update `src/services/qualityAudit.js` to use fetchWithTimeout ✅
- [ ] Phase 7: Test timeout behavior (mock slow response) - QA

## Build Progress

### ✅ All Code Changes Complete
**Files created:**
- `src/utils/fetchWithTimeout.js` - fetchWithTimeout helper with isTimeoutError utility

**Files modified:**
- `src/services/ai.js` - OpenAI client timeout (60s) + maxRetries (2)
- `src/services/perplexity.js` - Uses fetchWithTimeout (45s timeout)
- `src/services/slack.js` - Uses fetchWithTimeout (10s timeout)
- `src/services/platformSync.js` - Uses fetchWithTimeout (30s timeout)
- `src/services/qualityAudit.js` - Uses fetchWithTimeout (5s timeout for health checks)

## Files to Create/Modify

| File | Change |
|------|--------|
| `src/utils/fetchWithTimeout.js` | NEW - Timeout wrapper |
| `src/services/ai.js` | Add timeout to OpenAI config |
| `src/services/perplexity.js` | Use fetchWithTimeout |
| `src/services/slack.js` | Use fetchWithTimeout |
| `src/services/platformSync.js` | Use fetchWithTimeout |
| `src/services/qualityAudit.js` | Use fetchWithTimeout |

## Database Migration
None required.

## Success Criteria
1. Each OpenAI request attempt times out after 60s (instead of hanging indefinitely)
2. Slack calls time out after 10s
3. All fetch() calls use the timeout wrapper
4. Timeout errors are logged and handled gracefully

## Notes for Builder

### fetchWithTimeout Error Handling
```javascript
try {
  const response = await fetchWithTimeout(url, options, 30000);
  // Handle response
} catch (error) {
  if (error.name === 'AbortError') {
    console.error('Request timed out');
    return { success: false, error: 'Request timed out' };
  }
  throw error;
}
```
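
The build notes mention an `isTimeoutError` utility alongside `fetchWithTimeout`, but its body is not shown in this plan. A minimal sketch of what such a helper might look like, assuming it only inspects the error name (the real implementation may differ). `toErrorResult` is a hypothetical name added for illustration.

```javascript
// Hypothetical sketch; the real isTimeoutError in src/utils/fetchWithTimeout.js may differ.
function isTimeoutError(error) {
  if (!error || typeof error !== 'object') return false;
  // controller.abort() makes fetch reject with an error named 'AbortError';
  // AbortSignal.timeout() would produce 'TimeoutError' instead.
  return error.name === 'AbortError' || error.name === 'TimeoutError';
}

// Example: map a caught error to the result shape used in the snippet above.
// Non-timeout errors are rethrown so they are not silently swallowed.
function toErrorResult(error) {
  if (isTimeoutError(error)) {
    return { success: false, error: 'Request timed out' };
  }
  throw error;
}
```

Note that `AbortError` is also what a caller-initiated abort produces, so a name check alone cannot distinguish a timeout from a deliberate cancellation.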

### OpenAI SDK Timeout
The OpenAI Node.js SDK v4+ supports:
```javascript
new OpenAI({
  apiKey: '...',
  timeout: 60000,      // Request timeout in ms
  maxRetries: 2,       // Retry count
});
```
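
Because the SDK's `timeout` applies to each attempt and timed-out requests are retried, the total wait can exceed the configured value. The arithmetic below is a rough illustration only: `worstCaseWaitMs` is a hypothetical helper, and it ignores the backoff delay the SDK adds between attempts.

```javascript
// Illustrative arithmetic only; ignores the SDK's backoff delay between attempts.
function worstCaseWaitMs(timeoutMs, maxRetries) {
  const attempts = maxRetries + 1; // the initial request plus each retry
  return attempts * timeoutMs;
}

const openaiWorstCaseMs = worstCaseWaitMs(60000, 2);
console.log(`${openaiWorstCaseMs / 1000}s`); // prints "180s"
```

If a hard 60s ceiling is required for a user-facing path, setting `maxRetries: 0` for that call is one option to consider.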

## Notes for QA
1. Test with network throttling to simulate slow responses
2. Verify timeout errors return proper error messages
3. Check logs show timeout events
4. Ensure UI shows appropriate error to user
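
For Phase 7, the timeout path can also be exercised without a real network by substituting a fetch that never responds. The sketch below is one possible approach, not the project's test suite. It assumes a variant of `fetchWithTimeout` that takes the fetch implementation as a fourth parameter (`fetchImpl`, hypothetical). `hangingFetch` and `measureTimeout` are illustrative names as well.

```javascript
// Variant of fetchWithTimeout with an injectable fetch (hypothetical 4th parameter).
async function fetchWithTimeout(url, options = {}, timeoutMs = 30000, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    clearTimeout(timeoutId);
  }
}

// Mock slow service: never resolves, but rejects with AbortError when the signal fires.
function hangingFetch(_url, { signal }) {
  return new Promise((_resolve, reject) => {
    signal.addEventListener('abort', () => {
      const error = new Error('The operation was aborted');
      error.name = 'AbortError';
      reject(error);
    }, { once: true });
  });
}

// Runs one request against the hanging mock and reports what happened.
async function measureTimeout(timeoutMs) {
  const startedAt = Date.now();
  try {
    await fetchWithTimeout('https://example.invalid/slow', {}, timeoutMs, hangingFetch);
    return { timedOut: false, elapsedMs: Date.now() - startedAt };
  } catch (error) {
    return { timedOut: error.name === 'AbortError', elapsedMs: Date.now() - startedAt };
  }
}
```

A test can call `measureTimeout(50)` and assert that `timedOut` is `true` and that `elapsedMs` is close to the configured value. Network throttling is still needed to cover the real `fetch` path end to end.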

## Why This Matters
- **Resource exhaustion:** Hanging requests hold connections, threads, and memory
- **Cascading failure:** One slow service can bring down the whole app
- **Bad UX:** Users see a spinner indefinitely instead of an error message
- **Cost:** Hanging OpenAI requests still count towards rate limits
- **Debugging:** Hanging requests are hard to diagnose in production