refactor: fix biome linting issues and update project documentation

- Fix 36+ biome linting issues reducing errors/warnings from 227 to 191
- Replace explicit 'any' types with proper TypeScript interfaces
- Fix React hooks dependencies and useCallback patterns
- Resolve unused variables and parameter assignment issues
- Improve accessibility with proper label associations
- Add comprehensive API documentation for admin and security features
- Update README.md with accurate PostgreSQL setup and current tech stack
- Create complete documentation for audit logging, CSP monitoring, and batch processing
- Fix outdated project information and missing developer workflows
This commit is contained in:
2025-07-11 21:50:53 +02:00
committed by Kaj Kowalski
parent 3e9e75e854
commit 1eea2cc3e4
121 changed files with 28687 additions and 4895 deletions

View File

@ -0,0 +1,531 @@
# Batch Processing Monitoring Dashboard
This document describes the batch processing monitoring dashboard and API endpoints for tracking OpenAI Batch API operations in the LiveDash application.
## Overview
The Batch Monitoring Dashboard provides real-time visibility into the OpenAI Batch API processing pipeline, including job status tracking, cost analysis, and performance monitoring. This system enables 50% cost reduction on AI processing while maintaining comprehensive oversight.
## Features
### Real-time Monitoring
- **Job Status Tracking**: Monitor batch jobs from creation to completion
- **Queue Management**: View pending, running, and completed batch queues
- **Processing Metrics**: Track throughput, success rates, and error patterns
- **Cost Analysis**: Monitor API costs and savings compared to individual requests
### Performance Analytics
- **Batch Efficiency**: Analyze batch size optimization and processing times
- **Success Rates**: Track completion and failure rates across different job types
- **Resource Utilization**: Monitor API quota usage and rate limiting
- **Historical Trends**: View processing patterns over time
### Administrative Controls
- **Manual Intervention**: Pause, resume, or cancel batch operations
- **Priority Management**: Adjust processing priorities for urgent requests
- **Error Handling**: Review and retry failed batch operations
- **Configuration Management**: Adjust batch parameters and thresholds
## API Endpoints
### Batch Monitoring API
Retrieve comprehensive batch processing metrics and status information.
```http
GET /api/admin/batch-monitoring
```
#### Query Parameters
| Parameter | Type | Description | Default | Example |
|-----------|------|-------------|---------|---------|
| `timeRange` | string | Time range for metrics | `24h` | `?timeRange=7d` |
| `status` | string | Filter by batch status | - | `?status=completed` |
| `jobType` | string | Filter by job type | - | `?jobType=ai_analysis` |
| `includeDetails` | boolean | Include detailed job information | `false` | `?includeDetails=true` |
| `page` | number | Page number for pagination | 1 | `?page=2` |
| `limit` | number | Records per page (max 100) | 50 | `?limit=25` |
#### Example Request
```javascript
const response = await fetch('/api/admin/batch-monitoring?' + new URLSearchParams({
timeRange: '24h',
status: 'completed',
includeDetails: 'true'
}));
const data = await response.json();
```
#### Response Format
```json
{
"success": true,
"data": {
"summary": {
"totalJobs": 156,
"completedJobs": 142,
"failedJobs": 8,
"pendingJobs": 6,
"totalRequests": 15600,
"processedRequests": 14200,
"costSavings": {
"currentPeriod": 234.56,
"projectedMonthly": 7038.45,
"savingsPercentage": 48.2
},
"averageProcessingTime": 1800000,
"successRate": 95.2
},
"queues": {
"pending": 12,
"processing": 3,
"completed": 142,
"failed": 8
},
"performance": {
"throughput": {
"requestsPerHour": 650,
"jobsPerHour": 6.5,
"averageBatchSize": 100
},
"efficiency": {
"batchUtilization": 87.3,
"processingEfficiency": 92.1,
"errorRate": 4.8
}
},
"jobs": [
{
"id": "batch-job-123",
"batchId": "batch_abc123",
"status": "completed",
"jobType": "ai_analysis",
"requestCount": 100,
"completedCount": 98,
"failedCount": 2,
"createdAt": "2024-01-01T10:00:00Z",
"startedAt": "2024-01-01T10:05:00Z",
"completedAt": "2024-01-01T10:35:00Z",
"processingTimeMs": 1800000,
"costEstimate": 12.50,
"errorSummary": [
{
"error": "token_limit_exceeded",
"count": 2,
"percentage": 2.0
}
]
}
]
}
}
```
## Dashboard Components
### BatchMonitoringDashboard Component
The main dashboard component (`components/admin/BatchMonitoringDashboard.tsx`) provides:
#### Key Metrics Cards
```tsx
// Real-time overview cards
<MetricCard
title="Total Jobs"
value={data.summary.totalJobs}
change={"+12 from yesterday"}
trend="up"
/>
<MetricCard
title="Success Rate"
value={`${data.summary.successRate}%`}
change={"+2.1% from last week"}
trend="up"
/>
<MetricCard
title="Cost Savings"
value={`$${data.summary.costSavings.currentPeriod}`}
change={`${data.summary.costSavings.savingsPercentage}% vs individual API`}
trend="up"
/>
```
#### Queue Status Visualization
```tsx
// Visual representation of batch job queues
<QueueStatusChart
pending={data.queues.pending}
processing={data.queues.processing}
completed={data.queues.completed}
failed={data.queues.failed}
/>
```
#### Performance Charts
```tsx
// Processing throughput over time
<ThroughputChart
data={data.performance.throughput}
timeRange={timeRange}
/>
// Cost savings trend
<CostSavingsChart
savings={data.summary.costSavings}
historical={data.historical}
/>
```
#### Job Management Table
```tsx
// Detailed job listing with actions
<BatchJobTable
jobs={data.jobs}
onRetry={handleRetryJob}
onCancel={handleCancelJob}
onViewDetails={handleViewDetails}
/>
```
## Usage Examples
### Monitor Batch Performance
```javascript
async function monitorBatchPerformance() {
const response = await fetch('/api/admin/batch-monitoring?timeRange=24h');
const data = await response.json();
const performance = data.data.performance;
// Check if performance is within acceptable ranges
if (performance.efficiency.errorRate > 10) {
console.warn('High error rate detected:', performance.efficiency.errorRate + '%');
// Get failed jobs for analysis
const failedJobs = await fetch('/api/admin/batch-monitoring?status=failed');
const failures = await failedJobs.json();
// Analyze common failure patterns
const errorSummary = failures.data.jobs.reduce((acc, job) => {
job.errorSummary?.forEach(error => {
acc[error.error] = (acc[error.error] || 0) + error.count;
});
return acc;
}, {});
console.log('Error patterns:', errorSummary);
}
}
```
### Cost Savings Analysis
```javascript
async function analyzeCostSavings() {
const response = await fetch('/api/admin/batch-monitoring?timeRange=30d&includeDetails=true');
const data = await response.json();
const savings = data.data.summary.costSavings;
return {
currentSavings: savings.currentPeriod,
projectedAnnual: savings.projectedMonthly * 12,
savingsRate: savings.savingsPercentage,
totalProcessed: data.data.summary.processedRequests,
averageCostPerRequest: savings.currentPeriod / data.data.summary.processedRequests
};
}
```
### Retry Failed Jobs
```javascript
async function retryFailedJobs() {
// Get failed jobs
const response = await fetch('/api/admin/batch-monitoring?status=failed');
const data = await response.json();
const retryableJobs = data.data.jobs.filter(job => {
// Only retry jobs that failed due to temporary issues
const hasRetryableErrors = job.errorSummary?.some(error =>
['rate_limit_exceeded', 'temporary_error', 'timeout'].includes(error.error)
);
return hasRetryableErrors;
});
// Retry jobs individually
for (const job of retryableJobs) {
try {
await fetch(`/api/admin/batch-monitoring/${job.id}/retry`, {
method: 'POST'
});
console.log(`Retried job ${job.id}`);
} catch (error) {
console.error(`Failed to retry job ${job.id}:`, error);
}
}
}
```
### Real-time Dashboard Updates
```javascript
function useRealtimeBatchMonitoring() {
const [data, setData] = useState(null);
const [isLoading, setIsLoading] = useState(true);
useEffect(() => {
const fetchData = async () => {
try {
const response = await fetch('/api/admin/batch-monitoring?timeRange=1h');
const result = await response.json();
setData(result.data);
} catch (error) {
console.error('Failed to fetch batch monitoring data:', error);
} finally {
setIsLoading(false);
}
};
// Initial fetch
fetchData();
// Update every 30 seconds
const interval = setInterval(fetchData, 30000);
return () => clearInterval(interval);
}, []);
return { data, isLoading };
}
```
## Configuration
### Batch Processing Settings
Configure batch processing parameters in environment variables:
```bash
# Batch Processing Configuration
BATCH_PROCESSING_ENABLED="true"
BATCH_CREATE_INTERVAL="*/5 * * * *" # Create batches every 5 minutes
BATCH_STATUS_CHECK_INTERVAL="*/2 * * * *" # Check status every 2 minutes
BATCH_RESULT_PROCESSING_INTERVAL="*/1 * * * *" # Process results every minute
# Batch Size and Limits
BATCH_MAX_REQUESTS="1000" # Maximum requests per batch
BATCH_TIMEOUT_HOURS="24" # Batch timeout in hours
BATCH_MIN_SIZE="10" # Minimum batch size
# Monitoring Configuration
BATCH_MONITORING_RETENTION_DAYS="30" # How long to keep monitoring data
BATCH_ALERT_THRESHOLD_ERROR_RATE="10" # Alert if error rate exceeds 10%
BATCH_ALERT_THRESHOLD_PROCESSING_TIME="3600" # Alert if processing takes >1 hour
```
### Dashboard Refresh Settings
```javascript
// Configure dashboard update intervals
const DASHBOARD_CONFIG = {
refreshInterval: 30000, // 30 seconds
alertRefreshInterval: 10000, // 10 seconds for alerts
detailRefreshInterval: 60000, // 1 minute for detailed views
maxRetries: 3, // Maximum retry attempts
retryDelay: 5000 // Delay between retries
};
```
## Alerts and Notifications
### Automated Alerts
The system automatically generates alerts for:
```javascript
const alertConditions = {
highErrorRate: {
threshold: 10, // Error rate > 10%
severity: 'high',
notification: 'immediate'
},
longProcessingTime: {
threshold: 3600000, // > 1 hour
severity: 'medium',
notification: 'hourly'
},
lowThroughput: {
threshold: 0.5, // < 0.5 jobs per hour
severity: 'medium',
notification: 'daily'
},
batchFailure: {
threshold: 1, // Any complete batch failure
severity: 'critical',
notification: 'immediate'
}
};
```
### Custom Alert Configuration
```javascript
// Configure custom alerts through the admin interface
async function configureAlerts(alertConfig) {
const response = await fetch('/api/admin/batch-monitoring/alerts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
errorRateThreshold: alertConfig.errorRate,
processingTimeThreshold: alertConfig.processingTime,
notificationChannels: alertConfig.channels,
alertSuppression: alertConfig.suppression
})
});
return response.json();
}
```
## Troubleshooting
### Common Issues
#### High Error Rates
```javascript
// Investigate high error rates
async function investigateErrors() {
const response = await fetch('/api/admin/batch-monitoring?status=failed&includeDetails=true');
const data = await response.json();
// Group errors by type
const errorAnalysis = data.data.jobs.reduce((acc, job) => {
job.errorSummary?.forEach(error => {
if (!acc[error.error]) {
acc[error.error] = { count: 0, jobs: [] };
}
acc[error.error].count += error.count;
acc[error.error].jobs.push(job.id);
});
return acc;
}, {});
console.log('Error analysis:', errorAnalysis);
return errorAnalysis;
}
```
#### Slow Processing
```javascript
// Analyze processing bottlenecks
async function analyzePerformance() {
const response = await fetch('/api/admin/batch-monitoring?timeRange=24h&includeDetails=true');
const data = await response.json();
const slowJobs = data.data.jobs
.filter(job => job.processingTimeMs > 3600000) // > 1 hour
.sort((a, b) => b.processingTimeMs - a.processingTimeMs);
console.log('Slowest jobs:', slowJobs.slice(0, 5));
// Analyze patterns
const avgByType = slowJobs.reduce((acc, job) => {
if (!acc[job.jobType]) {
acc[job.jobType] = { total: 0, count: 0 };
}
acc[job.jobType].total += job.processingTimeMs;
acc[job.jobType].count++;
return acc;
}, {});
Object.keys(avgByType).forEach(type => {
avgByType[type].average = avgByType[type].total / avgByType[type].count;
});
return avgByType;
}
```
### Performance Optimization
#### Batch Size Optimization
```javascript
// Analyze optimal batch sizes
async function optimizeBatchSizes() {
const response = await fetch('/api/admin/batch-monitoring?timeRange=7d&includeDetails=true');
const data = await response.json();
// Group by batch size ranges
const sizePerformance = data.data.jobs.reduce((acc, job) => {
const sizeRange = Math.floor(job.requestCount / 50) * 50; // Group by 50s
if (!acc[sizeRange]) {
acc[sizeRange] = {
jobs: 0,
totalTime: 0,
totalRequests: 0,
successRate: 0
};
}
acc[sizeRange].jobs++;
acc[sizeRange].totalTime += job.processingTimeMs;
acc[sizeRange].totalRequests += job.requestCount;
acc[sizeRange].successRate += job.completedCount / job.requestCount;
return acc;
}, {});
// Calculate averages
Object.keys(sizePerformance).forEach(range => {
const perf = sizePerformance[range];
perf.avgTimePerRequest = perf.totalTime / perf.totalRequests;
perf.avgSuccessRate = perf.successRate / perf.jobs;
});
return sizePerformance;
}
```
## Integration with Existing Systems
### Security Audit Integration
All batch monitoring activities are logged through the security audit system:
```javascript
// Automatic audit logging for monitoring activities
await securityAuditLogger.logPlatformAdmin(
'batch_monitoring_access',
AuditOutcome.SUCCESS,
context,
'Admin accessed batch monitoring dashboard'
);
```
### Rate Limiting Integration
Monitoring API endpoints use the existing rate limiting system:
```javascript
// Protected by admin rate limiting
const rateLimitResult = await rateLimiter.check(
`admin-batch-monitoring:${userId}`,
60, // 60 requests
60 * 1000 // per minute
);
```
## Related Documentation
- [Batch Processing Optimizations](./batch-processing-optimizations.md)
- [Security Monitoring](./security-monitoring.md)
- [Admin Audit Logs API](./admin-audit-logs-api.md)
- [OpenAI Batch API Integration](../lib/batchProcessor.ts)
The batch monitoring dashboard provides comprehensive visibility into the AI processing pipeline, enabling administrators to optimize performance, monitor costs, and ensure reliable operation of the batch processing system.