DB refactor

This commit is contained in:
Max Kowalski
2025-06-27 23:05:46 +02:00
parent 185bb6da58
commit 2dfc49f840
20 changed files with 1607 additions and 339 deletions

View File

@ -1,24 +1,28 @@
# Scheduler Workflow Documentation
## Overview
The LiveDash system has two main schedulers that work together to fetch and process session data:
1. **Session Refresh Scheduler** - Fetches new sessions from CSV files
2. **Processing Scheduler** - Processes session transcripts with AI
1. **Session Refresh Scheduler** - Fetches new sessions from CSV files
2. **Processing Scheduler** - Processes session transcripts with AI
## Current Status (as of latest check)
- **Total sessions**: 107
- **Processed sessions**: 0
- **Sessions with transcript**: 0
- **Ready for processing**: 0
- **Total sessions**: 107
- **Processed sessions**: 0
- **Sessions with transcript**: 0
- **Ready for processing**: 0
## How the `processed` Field Works
The ProcessingScheduler picks up sessions where `processed` is **NOT** `true`, which includes:
- `processed = false`
- `processed = null`
- `processed = false`
- `processed = null`
**Query used:**
```javascript
{ processed: { not: true } } // Either false or null
```
@ -26,50 +30,60 @@ The ProcessingScheduler picks up sessions where `processed` is **NOT** `true`, w
## Complete Workflow
### Step 1: Session Refresh (CSV Fetching)
**What it does:**
- Fetches session data from company CSV URLs
- Creates session records in database with basic metadata
- Sets `transcriptContent = null` initially
- Sets `processed = null` initially
- Fetches session data from company CSV URLs
- Creates session records in database with basic metadata
- Sets `transcriptContent = null` initially
- Sets `processed = null` initially
**Runs:** Every 30 minutes (cron: `*/30 * * * *`)
### Step 2: Transcript Fetching
**What it does:**
- Downloads full transcript content for sessions
- Updates `transcriptContent` field with actual conversation data
- Sessions remain `processed = null` until AI processing
- Downloads full transcript content for sessions
- Updates `transcriptContent` field with actual conversation data
- Sessions remain `processed = null` until AI processing
**Runs:** As part of session refresh process
### Step 3: AI Processing
**What it does:**
- Finds sessions with transcript content where `processed != true`
- Sends transcripts to OpenAI for analysis
- Extracts: sentiment, category, questions, summary, etc.
- Updates session with processed data
- Sets `processed = true`
- Finds sessions with transcript content where `processed != true`
- Sends transcripts to OpenAI for analysis
- Extracts: sentiment, category, questions, summary, etc.
- Updates session with processed data
- Sets `processed = true`
**Runs:** Every hour (cron: `0 * * * *`)
## Manual Trigger Commands
### Check Current Status
```bash
node scripts/manual-triggers.js status
```
### Trigger Session Refresh (Fetch new sessions from CSV)
```bash
node scripts/manual-triggers.js refresh
```
### Trigger AI Processing (Process unprocessed sessions)
```bash
node scripts/manual-triggers.js process
```
### Run Both Schedulers
```bash
node scripts/manual-triggers.js both
```
@ -77,36 +91,42 @@ node scripts/manual-triggers.js both
## Troubleshooting
### No Sessions Being Processed?
1. **Check if sessions have transcripts:**
1. **Check if sessions have transcripts:**
```bash
node scripts/manual-triggers.js status
```
2. **If "Sessions with transcript" is 0:**
- Sessions exist but transcripts haven't been fetched yet
- Run session refresh: `node scripts/manual-triggers.js refresh`
2. **If "Sessions with transcript" is 0:**
- Sessions exist but transcripts haven't been fetched yet
- Run session refresh: `node scripts/manual-triggers.js refresh`
3. **If "Ready for processing" is 0 but "Sessions with transcript" > 0:**
- All sessions with transcripts have already been processed
- Check if `OPENAI_API_KEY` is set in environment
3. **If "Ready for processing" is 0 but "Sessions with transcript" > 0:**
- All sessions with transcripts have already been processed
- Check if `OPENAI_API_KEY` is set in environment
### Common Issues
#### "No sessions found requiring processing"
- All sessions with transcripts have been processed (`processed = true`)
- Or no sessions have transcript content yet
- All sessions with transcripts have been processed (`processed = true`)
- Or no sessions have transcript content yet
#### "OPENAI_API_KEY environment variable is not set"
- Add OpenAI API key to `.env.development` file
- Restart the application
- Add OpenAI API key to `.env.development` file
- Restart the application
#### "Error fetching transcript: Unauthorized"
- CSV credentials are incorrect or expired
- Check company CSV username/password in database
- CSV credentials are incorrect or expired
- Check company CSV username/password in database
## Database Field Mapping
### Before AI Processing
```javascript
{
id: "session-uuid",
@ -120,6 +140,7 @@ node scripts/manual-triggers.js both
```
### After AI Processing
```javascript
{
id: "session-uuid",
@ -141,15 +162,17 @@ node scripts/manual-triggers.js both
## Scheduler Configuration
### Session Refresh Scheduler
- **File**: `lib/scheduler.js`
- **Frequency**: Every 30 minutes
- **Cron**: `*/30 * * * *`
- **File**: `lib/scheduler.js`
- **Frequency**: Every 30 minutes
- **Cron**: `*/30 * * * *`
### Processing Scheduler
- **File**: `lib/processingScheduler.js`
- **Frequency**: Every hour
- **Cron**: `0 * * * *`
- **Batch size**: 10 sessions per run
- **File**: `lib/processingScheduler.js`
- **Frequency**: Every hour
- **Cron**: `0 * * * *`
- **Batch size**: 10 sessions per run
## Environment Variables Required
@ -167,19 +190,22 @@ NEXTAUTH_URL="http://localhost:3000"
## Next Steps for Testing
1. **Trigger session refresh** to fetch transcripts:
1. **Trigger session refresh** to fetch transcripts:
```bash
node scripts/manual-triggers.js refresh
```
2. **Check status** to see if transcripts were fetched:
2. **Check status** to see if transcripts were fetched:
```bash
node scripts/manual-triggers.js status
```
3. **Trigger processing** if transcripts are available:
3. **Trigger processing** if transcripts are available:
```bash
node scripts/manual-triggers.js process
```
4. **View results** in the dashboard session details pages
4. **View results** in the dashboard session details pages