Replaces the fragile regex-based HTML cleaning logic with a proper HTML parser using `golang.org/x/net/html`. The previous implementation was unreliable and could not correctly handle malformed tags, script content, or a wide range of HTML entities.
This new approach provides several key improvements:
- Skips the content of `
The `strings.Title` function is deprecated because it does not handle Unicode punctuation correctly.
This change replaces its usage in the DOCX, HTML, and Markdown exporters with the recommended `golang.org/x/text/cases` package. This ensures more robust and accurate title-casing for item headings.
- Implement tests for the app service, including course processing from file and URI.
- Create mock implementations for CourseParser and Exporter to facilitate testing.
- Add tests for HTML cleaner service to validate HTML content cleaning functionality.
- Develop tests for the parser service, covering course fetching and loading from files.
- Introduce tests for utility functions in the main package, ensuring URI validation and string joining.
- Include benchmarks for performance evaluation of key functions.
Updates code of conduct formatting and adds Dependabot schedule for
Monday at 07:00 in the Europe/Amsterdam timezone.
Introduces release config for automatic note generation categorized
by type and adds a LICENSE file with MIT License.
Renames the Go module for clarity and updates the README with badges
for project tracking and visibility.