feat: complete development environment setup and code quality improvements

- Set up pre-commit hooks with husky and lint-staged for automated code quality
- Improved TypeScript type safety by replacing 'any' types with proper generics
- Fixed markdown linting violations (MD030 spacing) across all documentation
- Fixed compound adjective hyphenation in technical documentation
- Fixed invalid JSON union syntax in API documentation examples
- Automated code formatting and linting on commit
- Enhanced error handling with better type constraints
- Configured biome and markdownlint for consistent code style
- All changes verified with successful production build
2025-07-13 14:44:05 +02:00
parent 1d4e695e41
commit e2301725a3
54 changed files with 2335 additions and 1863 deletions


@ -6,11 +6,11 @@ This document describes the extracted scheduler architecture that enables horizo
The scheduler system has been refactored from a monolithic approach to a service-oriented architecture that supports:
- **Individual Scheduler Services** - Each scheduler runs as a separate service
- **Horizontal Scaling** - Multiple instances of the same scheduler can run across different machines
- **Health Monitoring** - Built-in health checks for load balancers and orchestrators
- **Graceful Shutdown** - Proper handling of shutdown signals for zero-downtime deployments
- **Centralized Management** - Optional scheduler manager for coordinated operations
## Components
@ -34,11 +34,11 @@ export abstract class BaseSchedulerService extends EventEmitter {
**Features:**
- Status management (STOPPED, STARTING, RUNNING, PAUSED, ERROR)
- Metrics collection (run counts, timing, success/failure rates)
- Event emission for monitoring
- Configurable intervals and timeouts
- Automatic retry handling
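
As a rough illustration of the features listed above, the following is a minimal sketch of a status-tracking, metrics-collecting base class. It is not the actual `BaseSchedulerService`: the class name `SketchSchedulerService`, the metrics fields, the event names `statusChange` and `taskError`, and the `runOnce` retry loop are all assumptions made for the example. Only the `EventEmitter` base class and the status names come from the document.

```typescript
import { EventEmitter } from "node:events";

// Status values taken from the feature list; modelled here as a string union (assumption).
type SchedulerStatus = "STOPPED" | "STARTING" | "RUNNING" | "PAUSED" | "ERROR";

// Hypothetical metrics shape; the real field names are not shown in this diff.
interface SchedulerMetrics {
  totalRuns: number;
  successfulRuns: number;
  failedRuns: number;
  lastRunDurationMs: number;
}

abstract class SketchSchedulerService extends EventEmitter {
  protected status: SchedulerStatus = "STOPPED";
  protected metrics: SchedulerMetrics = {
    totalRuns: 0,
    successfulRuns: 0,
    failedRuns: 0,
    lastRunDurationMs: 0,
  };

  // Subclasses implement the work performed on each scheduler tick.
  protected abstract executeTask(): Promise<void>;

  getStatus(): SchedulerStatus {
    return this.status;
  }

  // Returns a copy so callers cannot mutate internal counters.
  getMetrics(): SchedulerMetrics {
    return { ...this.metrics };
  }

  // Updates the status and notifies listeners with the new and previous values.
  protected setStatus(next: SchedulerStatus): void {
    const previous = this.status;
    this.status = next;
    this.emit("statusChange", next, previous);
  }

  // Runs one tick, retrying up to maxRetries times before marking the run as failed.
  async runOnce(maxRetries = 2): Promise<boolean> {
    const startedAt = Date.now();
    this.metrics.totalRuns += 1;
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        await this.executeTask();
        this.metrics.successfulRuns += 1;
        this.metrics.lastRunDurationMs = Date.now() - startedAt;
        return true;
      } catch (error) {
        // "taskError" is used instead of "error" because EventEmitter throws
        // on an unhandled "error" event.
        this.emit("taskError", error, attempt);
      }
    }
    this.metrics.failedRuns += 1;
    this.metrics.lastRunDurationMs = Date.now() - startedAt;
    this.setStatus("ERROR");
    return false;
  }
}
```

A monitoring component could subscribe with `scheduler.on("statusChange", ...)` and poll `getMetrics()` without touching scheduler internals.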
### 2. Individual Scheduler Services
@ -57,16 +57,16 @@ const csvScheduler = new CsvImportSchedulerService({
**Features:**
- Batch processing with configurable concurrency
- Duplicate detection
- Company-specific error handling
- Progress monitoring
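
To make "batch processing with configurable concurrency" and "duplicate detection" concrete, here is a small sketch under stated assumptions. The `ImportRow` shape, the `externalId` key, and the helper names `dedupeRows`, `toBatches`, and `processInBatches` are hypothetical and do not come from `CsvImportSchedulerService`. The batching strategy shown (run one batch at a time, items within a batch in parallel) is just one simple way to bound concurrency.

```typescript
// Hypothetical record shape; the real import rows are not shown in this diff.
interface ImportRow {
  externalId: string;
  payload: string;
}

// Drops rows whose externalId was already seen, keeping the first occurrence.
function dedupeRows(rows: ImportRow[]): ImportRow[] {
  const seen = new Set<string>();
  return rows.filter((row) => {
    if (seen.has(row.externalId)) return false;
    seen.add(row.externalId);
    return true;
  });
}

// Splits items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Processes batches sequentially; items inside a batch run concurrently,
// so `concurrency` bounds the number of in-flight tasks.
// Promise.allSettled keeps one failing item from aborting the rest.
async function processInBatches<T>(
  items: T[],
  concurrency: number,
  worker: (item: T) => Promise<void>,
): Promise<{ succeeded: number; failed: number }> {
  let succeeded = 0;
  let failed = 0;
  for (const batch of toBatches(items, concurrency)) {
    const results = await Promise.allSettled(batch.map((item) => worker(item)));
    for (const result of results) {
      if (result.status === "fulfilled") {
        succeeded += 1;
      } else {
        failed += 1;
      }
    }
  }
  return { succeeded, failed };
}
```

The returned counts could feed progress monitoring. A per-company variant would group items by company before batching so that one company's failures are reported separately.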
#### Additional Schedulers (To Be Implemented)
- `ImportProcessingSchedulerService` - Process imported CSV data into sessions
- `SessionProcessingSchedulerService` - AI analysis and categorization
- `BatchProcessingSchedulerService` - OpenAI Batch API integration
### 3. SchedulerManager
@ -88,10 +88,10 @@ await manager.startAll();
**Features:**
- Automatic restart of failed critical schedulers
- Health monitoring across all schedulers
- Coordinated start/stop operations
- Event aggregation and logging
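
The restart and event-aggregation behaviour described above might look roughly like the sketch below. It is an assumption-laden illustration, not the real `SchedulerManager`: the `ManagedScheduler` interface, the `failed`, `schedulerFailed`, and `restartFailed` event names, and the `maxRestarts` limit are all hypothetical. Only `startAll()` appears in the document itself.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical minimal surface a managed scheduler would expose.
interface ManagedScheduler extends EventEmitter {
  start(): Promise<void>;
  stop(): Promise<void>;
}

interface Registration {
  scheduler: ManagedScheduler;
  critical: boolean;
  restarts: number;
}

class SketchSchedulerManager extends EventEmitter {
  private readonly registry = new Map<string, Registration>();
  private readonly maxRestarts: number;

  constructor(maxRestarts = 3) {
    super();
    this.maxRestarts = maxRestarts;
  }

  register(id: string, scheduler: ManagedScheduler, critical = false): void {
    const entry: Registration = { scheduler, critical, restarts: 0 };
    this.registry.set(id, entry);
    // Event aggregation: re-emit each scheduler failure with its id attached.
    scheduler.on("failed", (error: unknown) => {
      this.emit("schedulerFailed", id, error);
      if (this.shouldRestart(id)) {
        entry.restarts += 1;
        scheduler.start().catch((restartError: unknown) => {
          this.emit("restartFailed", id, restartError);
        });
      }
    });
  }

  // Only critical schedulers are restarted, and only up to maxRestarts times.
  shouldRestart(id: string): boolean {
    const entry = this.registry.get(id);
    return entry !== undefined && entry.critical && entry.restarts < this.maxRestarts;
  }

  // Coordinated start: fails fast if any scheduler cannot start.
  async startAll(): Promise<void> {
    await Promise.all([...this.registry.values()].map((e) => e.scheduler.start()));
  }

  // Coordinated stop: attempts to stop every scheduler even if some fail.
  async stopAll(): Promise<void> {
    await Promise.allSettled([...this.registry.values()].map((e) => e.scheduler.stop()));
  }
}
```

Capping restarts avoids a tight crash loop when a critical scheduler fails immediately on every start; a production version would likely add a backoff delay as well.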
### 4. Standalone Scheduler Runner
@ -107,10 +107,10 @@ npx tsx lib/services/schedulers/StandaloneSchedulerRunner.ts --list
**Features:**
- Independent process execution
- Environment variable configuration
- Graceful shutdown handling
- Health reporting for monitoring
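
For environment variable configuration and graceful shutdown, a standalone process could follow a pattern like the one sketched here. The variable names `SCHEDULER_INTERVAL_MS` and `SCHEDULER_TIMEOUT_MS`, the defaults, and the helper names are assumptions for illustration; the actual variables used by `StandaloneSchedulerRunner` are not visible in this diff.

```typescript
// Hypothetical config shape; real option names are not shown in this diff.
interface RunnerConfig {
  intervalMs: number;
  timeoutMs: number;
}

// Parses a positive integer, falling back when the value is missing or invalid.
// Note: parseInt is lenient, so "5000abc" is read as 5000.
function readPositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Builds the runner config from an environment-like object (assumed variable names).
function loadConfig(env: Record<string, string | undefined>): RunnerConfig {
  return {
    intervalMs: readPositiveInt(env["SCHEDULER_INTERVAL_MS"], 60_000),
    timeoutMs: readPositiveInt(env["SCHEDULER_TIMEOUT_MS"], 30_000),
  };
}

// Registers SIGTERM/SIGINT handlers that stop the scheduler once, then exit.
function installShutdownHandlers(stop: () => Promise<void>): void {
  let shuttingDown = false;
  const shutdown = (signal: string): void => {
    if (shuttingDown) return; // ignore repeated signals during shutdown
    shuttingDown = true;
    console.log(`Received ${signal}, stopping scheduler...`);
    stop().then(
      () => process.exit(0),
      (error: unknown) => {
        console.error("Shutdown failed:", error);
        process.exit(1);
      },
    );
  };
  process.on("SIGTERM", () => shutdown("SIGTERM"));
  process.on("SIGINT", () => shutdown("SIGINT"));
}
```

A runner would typically call `loadConfig(process.env)` at startup and pass the scheduler's stop method to `installShutdownHandlers`, so that an orchestrator's SIGTERM lets the in-flight run finish before the process exits.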
## Deployment Patterns
@ -127,15 +127,15 @@ await initializeSchedulers();
**Pros:**
- Simple deployment
- Lower resource usage
- Easy local development
**Cons:**
- Limited scalability
- Single point of failure
- Resource contention
### 2. Separate Processes
@ -154,15 +154,15 @@ npm run scheduler:session-processing
**Pros:**
- Independent scaling
- Fault isolation
- Resource optimization per scheduler
**Cons:**
- More complex deployment
- Higher resource overhead
- Inter-process coordination needed
### 3. Container Orchestration (Recommended for Production)
@ -193,16 +193,16 @@ services:
**Pros:**
- Full horizontal scaling
- Independent resource allocation
- Health monitoring integration
- Zero-downtime deployments
**Cons:**
- Complex orchestration setup
- Network latency considerations
- Distributed system challenges
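
Health monitoring integration with an orchestrator usually means exposing an HTTP endpoint that returns 200 when the scheduler is healthy and a 5xx status otherwise. The sketch below shows one possible shape. The `/health` path, the `HealthSnapshot` fields, and the staleness rule are assumptions for illustration, not the project's actual health check endpoints.

```typescript
import { createServer } from "node:http";
import type { Server } from "node:http";

// Hypothetical health snapshot; field names are assumptions.
interface HealthSnapshot {
  status: string;
  lastSuccessAt: number | null; // epoch milliseconds of the last successful run
}

// Healthy means: the scheduler is RUNNING and has succeeded recently enough.
function evaluateHealth(
  snapshot: HealthSnapshot,
  now: number,
  maxStalenessMs: number,
): { healthy: boolean; httpStatus: number } {
  const fresh =
    snapshot.lastSuccessAt !== null && now - snapshot.lastSuccessAt <= maxStalenessMs;
  const healthy = snapshot.status === "RUNNING" && fresh;
  return { healthy, httpStatus: healthy ? 200 : 503 };
}

// Creates (but does not start) an HTTP server answering GET /health.
function createHealthServer(
  getSnapshot: () => HealthSnapshot,
  maxStalenessMs = 5 * 60_000,
): Server {
  return createServer((req, res) => {
    if (req.url !== "/health") {
      res.writeHead(404).end();
      return;
    }
    const snapshot = getSnapshot();
    const { healthy, httpStatus } = evaluateHealth(snapshot, Date.now(), maxStalenessMs);
    res.writeHead(httpStatus, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ healthy, ...snapshot }));
  });
}
```

Calling `createHealthServer(getSnapshot).listen(port)` inside each scheduler container would give a load balancer or orchestrator probe a target; returning 503 for a stale scheduler lets the orchestrator restart or drain it.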
## Configuration
@ -379,46 +379,46 @@ csv-import-scheduler-eu:
### From Current Architecture
1. **Phase 1: Extract Schedulers**
- ✅ Create BaseSchedulerService
- ✅ Implement CsvImportSchedulerService
- ✅ Create SchedulerManager
- ⏳ Implement remaining scheduler services
2. **Phase 2: Deployment Options**
- ✅ Add ServerSchedulerIntegration for backwards compatibility
- ✅ Create StandaloneSchedulerRunner
- ✅ Add health check endpoints
3. **Phase 3: Container Support**
- ⏳ Create Dockerfile for scheduler containers
- ⏳ Add Kubernetes manifests
- ⏳ Implement distributed coordination
4. **Phase 4: Production Migration**
- ⏳ Deploy separate scheduler containers
- ⏳ Monitor performance and stability
- ⏳ Gradually increase horizontal scaling
### Breaking Changes
- Scheduler initialization moved from `server.ts` to `ServerSchedulerIntegration`
- Individual scheduler functions replaced with service classes
- Configuration moved to environment variables
## Benefits
1. **Scalability**: Independent scaling of different scheduler types
2. **Reliability**: Fault isolation prevents cascading failures
3. **Performance**: Optimized resource allocation per scheduler
4. **Monitoring**: Granular health checks and metrics
5. **Deployment**: Zero-downtime updates and rollbacks
6. **Development**: Easier testing and debugging of individual schedulers
## Next Steps
1. Implement remaining scheduler services (ImportProcessing, SessionProcessing, BatchProcessing)
2. Add distributed coordination for multi-instance schedulers
3. Create Kubernetes operators for automatic scaling
4. Implement scheduler-specific metrics and dashboards
5. Add scheduler performance optimization tools