Replaces the fragile regex-based HTML cleaning logic with a proper HTML parser using `golang.org/x/net/html`. The previous implementation was unreliable and could not correctly handle malformed tags, script content, or a wide range of HTML entities.
This new approach provides several key improvements:
- Skips the content of `
Adds a comprehensive `Taskfile.yml` to centralize all project scripts for building, testing, linting, and Docker image management.
The GitHub Actions CI workflow is refactored to utilize these `task` commands, resulting in a cleaner, more readable, and maintainable configuration. This approach ensures consistency between local development and CI environments.
- Implement tests for the app service, including course processing from file and URI.
- Create mock implementations for `CourseParser` and `Exporter` to facilitate testing.
- Add tests for HTML cleaner service to validate HTML content cleaning functionality.
- Develop tests for the parser service, covering course fetching and loading from files.
- Introduce tests for utility functions in the main package, covering URI validation and string joining.
- Include benchmarks for performance evaluation of key functions.