Files
livedash-node/docs/scheduler-fixes.md
Kaj Kowalski e2301725a3 feat: complete development environment setup and code quality improvements
- Set up pre-commit hooks with husky and lint-staged for automated code quality
- Improved TypeScript type safety by replacing 'any' types with proper generics
- Fixed markdown linting violations (MD030 spacing) across all documentation
- Fixed compound adjective hyphenation in technical documentation
- Fixed invalid JSON union syntax in API documentation examples
- Automated code formatting and linting on commit
- Enhanced error handling with better type constraints
- Configured biome and markdownlint for consistent code style
- All changes verified with successful production build
2025-07-13 14:44:05 +02:00

81 lines
2.4 KiB
Markdown

# Scheduler Error Fixes
## Issues Identified and Resolved
### 1. Invalid Company Configuration
**Problem**: Company `26fc3d34-c074-4556-85bd-9a66fafc0e08` had an invalid CSV URL (`https://example.com/data.csv`) with no authentication credentials.
**Solution**:
- Added validation in `fetchAndStoreSessionsForAllCompanies()` to skip companies with example/invalid URLs
- Removed the invalid company record from the database using `fix_companies.js`
### 2. Transcript Fetching Errors
**Problem**: Multiple "Error fetching transcript: Unauthorized" messages were flooding the logs when individual transcript files couldn't be accessed.
**Solution**:
- Improved error handling in `fetchTranscriptContent()` function
- Added probabilistic logging (only ~10% of errors logged) to prevent log spam
- Added timeout (10 seconds) for transcript fetching
- Made transcript fetching failures non-blocking (sessions are still created without transcript content)
### 3. CSV Fetching Errors
**Problem**: "Failed to fetch CSV: Not Found" errors for companies with invalid URLs.
**Solution**:
- Added URL validation to skip companies with `example.com` URLs
- Improved error logging to be more descriptive
## Current Status
**Fixed**: No more "Unauthorized" error spam
**Fixed**: No more "Not Found" CSV errors
**Fixed**: Scheduler runs cleanly without errors
**Improved**: Better error handling and logging
## Remaining Companies
After cleanup, only valid companies remain:
- **Demo Company** (`790b9233-d369-451f-b92c-f4dceb42b649`)
- CSV URL: `https://proto.notso.ai/jumbo/chats`
- Has valid authentication credentials
- 107 sessions in database
## Files Modified
1. **lib/csvFetcher.js**
- Added company URL validation
- Improved transcript fetching error handling
- Reduced error log verbosity
2. **fix_companies.js** (cleanup script)
- Removes invalid company records
- Can be run again if needed
## Monitoring
The scheduler now runs cleanly every 15 minutes. To monitor:
```bash
# Check scheduler logs
node debug_db.js
# Test manual refresh
node -e "import('./lib/csvFetcher.js').then(m => m.fetchAndStoreSessionsForAllCompanies())"
```
## Future Improvements
1. Add health check endpoint for scheduler status
2. Add metrics for successful/failed fetches
3. Consider retry logic for temporary failures
4. Add alerting for persistent failures