Enhance data integration and transcript parsing

- Improved date parsing in fetch_and_store_chat_data to support multiple formats and added error logging for unparseable dates.
- Enhanced parse_and_store_transcript_messages to handle empty transcripts and expanded message pattern recognition for both User and Assistant.
- Implemented transcript splitting based on detected message patterns and timestamps, with fallback mechanisms for unrecognized formats.
- Updated documentation for Celery and Redis setup, troubleshooting, and project structure.
- Added markdown linting configuration and scripts for code formatting.
- Updated Nginx configuration to change the web server port.
- Added xlsxwriter dependency for Excel file handling in project requirements.
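
The first three bullets can be illustrated with a minimal sketch. It is not the code from this commit: the helper names `parse_chat_date` and `split_transcript`, the list of date formats, and the `User:` / `Assistant:` line prefixes are all assumptions made for illustration, and timestamp-based splitting is left out.

```python
import logging
import re
from datetime import datetime
from typing import List, Optional, Tuple

logger = logging.getLogger(__name__)

# Assumed formats, for illustration only; the commit does not list them.
DATE_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%d-%m-%Y %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
)


def parse_chat_date(raw: str) -> Optional[datetime]:
    """Try each known format in turn; log and return None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    logger.error("Unparseable date: %r", raw)
    return None


# Hypothetical speaker prefix at the start of a line, e.g. "User: hello".
SPEAKER_RE = re.compile(r"^(User|Assistant)\s*:\s*", re.MULTILINE)


def split_transcript(text: str) -> List[Tuple[str, str]]:
    """Split a transcript into (speaker, message) pairs."""
    # Empty transcripts produce no messages.
    if not text or not text.strip():
        return []
    # With one capture group, re.split returns
    # [preamble, speaker1, body1, speaker2, body2, ...].
    parts = SPEAKER_RE.split(text)
    if len(parts) == 1:
        # Fallback for unrecognized formats: keep the text as one message.
        return [("Unknown", text.strip())]
    messages = []
    for speaker, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if body:
            messages.append((speaker, body))
    return messages
```

Returning `None` for an unparseable date, rather than raising, lets a batch import continue while the logged error records which value was skipped.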
2025-05-18 19:18:31 +00:00
parent 8bbbb109bd
commit f0ae061fa7
24 changed files with 1672 additions and 931 deletions


@ -6,10 +6,10 @@ This document explains how to set up and use Redis and Celery for background tas
The data integration module uses Celery to handle:
- Periodic data fetching from external APIs
- Processing and storing CSV data
- Downloading and parsing transcript files
- Manual data refresh triggered by users
- Periodic data fetching from external APIs
- Processing and storing CSV data
- Downloading and parsing transcript files
- Manual data refresh triggered by users
## Installation
@ -31,32 +31,33 @@ redis-cli ping # Should output PONG
After installation, check if Redis is properly configured:
1. Open Redis configuration file:
1. Open Redis configuration file:
```bash
sudo nano /etc/redis/redis.conf
```
```bash
sudo nano /etc/redis/redis.conf
```
2. Ensure the following settings:
2. Ensure the following settings:
```bash
# For development (localhost only)
bind 127.0.0.1
```bash
# For development (localhost only)
bind 127.0.0.1
# For production (accept connections from specific IP)
# bind 127.0.0.1 your.server.ip.address
# For production (accept connections from specific IP)
# bind 127.0.0.1 your.server.ip.address
# Protected mode (recommended)
protected-mode yes
# Protected mode (recommended)
protected-mode yes
# Port
port 6379
```
# Port
port 6379
```
3. Restart Redis after any changes:
```bash
sudo systemctl restart redis-server
```
3. Restart Redis after any changes:
```bash
sudo systemctl restart redis-server
```
#### macOS
@ -79,7 +80,7 @@ If Redis is not available, the application will automatically fall back to using
Set these environment variables in your `.env` file or deployment environment:
```env
```sh
# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
@ -126,28 +127,29 @@ docker-compose up -d
Development requires multiple terminal windows:
1. **Django Development Server**:
1. **Django Development Server**:
```bash
make run
```
```bash
make run
```
2. **Redis Server** (if needed):
2. **Redis Server** (if needed):
```bash
make run-redis
```
```bash
make run-redis
```
3. **Celery Worker**:
3. **Celery Worker**:
```bash
make celery
```
```bash
make celery
```
4. **Celery Beat** (for scheduled tasks):
```bash
make celery-beat
```
4. **Celery Beat** (for scheduled tasks):
```bash
make celery-beat
```
Or use the combined command:
@ -161,12 +163,12 @@ make run-all
If you see connection errors:
1. Check that Redis is running: `redis-cli ping` should return `PONG`
2. Verify firewall settings are not blocking port 6379
3. Check Redis binding in `/etc/redis/redis.conf` (should be `bind 127.0.0.1` for local dev)
1. Check that Redis is running: `redis-cli ping` should return `PONG`
2. Verify firewall settings are not blocking port 6379
3. Check Redis binding in `/etc/redis/redis.conf` (should be `bind 127.0.0.1` for local dev)
### Celery Workers Not Processing Tasks
1. Ensure the worker is running with the correct app name: `celery -A dashboard_project worker`
2. Check the Celery logs for errors
3. Verify broker URL settings in both code and environment variables
1. Ensure the worker is running with the correct app name: `celery -A dashboard_project worker`
2. Check the Celery logs for errors
3. Verify broker URL settings in both code and environment variables
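
The connection and broker checks above can also be scripted. The sketch below uses only the Python standard library and assumes the broker URL is derived from the `REDIS_HOST` and `REDIS_PORT` variables shown earlier; the function names and the database index `0` are hypothetical, and the project may build its broker URL differently.

```python
import os
import socket
from typing import Mapping


def build_broker_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a Redis URL from REDIS_HOST and REDIS_PORT, with local defaults."""
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    # Database index 0 is an assumption; the source does not show it.
    return f"redis://{host}:{port}/0"


def redis_responds(host: str, port: int, timeout: float = 1.0) -> bool:
    """Send an inline PING and report whether the server answers +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Broker URL from environment:", build_broker_url())
    host = os.environ.get("REDIS_HOST", "localhost")
    port = int(os.environ.get("REDIS_PORT", "6379"))
    print("Redis answers PING:", redis_responds(host, port))
```

If the printed URL differs from the one the worker logs at startup, the code and the environment disagree about the broker. A `False` from `redis_responds` on a server that requires authentication only means the unauthenticated `PING` was rejected.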