articulate-parser/go.mod
Kaj Kowalski e6977d3374 refactor(html_cleaner): adopt robust HTML parsing for content cleaning
Replaces the fragile regex-based HTML cleaning logic with a proper HTML parser using `golang.org/x/net/html`. The previous implementation was unreliable and could not correctly handle malformed tags, script content, or a wide range of HTML entities.

This new approach provides several key improvements:
- Skips the content of `
2025-11-06 04:26:51 +01:00


module github.com/kjanat/articulate-parser

go 1.24.0

require (
	github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b
	golang.org/x/net v0.46.0
	golang.org/x/text v0.30.0
)

require (
	github.com/fumiama/imgsz v0.0.4 // indirect
	golang.org/x/image v0.32.0 // indirect
)