26 Commits

Author SHA1 Message Date
a1a49a75b7 chore: Enhance developer tooling and documentation
Adds `actionlint` to the pre-commit configuration to validate GitHub Actions workflows.

Significantly expands the `AGENTS.md` file with a comprehensive summary of new features and changes in Go 1.24 and 1.25, along with actionable recommendations for the project.

Additionally, normalizes markdown list formatting across various documentation files for consistency.
2025-11-07 07:50:09 +01:00
8d606706e2 chore(tooling): Improve CI pipeline and expand pre-commit hooks
Expands the pre-commit configuration with a wider range of hooks to enforce file quality, validation, security, and Git safety checks.

The CI pipeline is updated to:
- Correct the `golangci-lint` format command to `fmt`.
- Enable CGO for test execution to support the race detector.
- Improve the robustness of test report parsing scripts.

Additionally, this commit includes minor stylistic and formatting cleanups across various project files.
2025-11-07 07:30:11 +01:00
e7de5d044a chore(ci): remove Go versions below 1.24 from CI matrix
Remove CI test runs for Go 1.21.x, 1.22.x, and 1.23.x as the minimum
supported version is 1.24.0 (as defined in go.mod).

This change:
- Removes outdated Go versions from the test matrix
- Aligns CI testing with the minimum supported version
- Reduces CI execution time by removing unnecessary test runs
- Maintains testing coverage for supported versions (1.24.x, 1.25.x)

fix(ci): remove impossible dependencies from docker job

The docker job runs on push events to master/develop branches, but it
was depending on docker-test and dependency-review jobs which only run
on pull_request events. This created an impossible dependency chain
that prevented the docker job from ever running.

This change:
- Removes docker-test and dependency-review from docker job dependencies
- Keeps only the test job as a dependency (which runs on both events)
- Allows docker build & push to run correctly on push events
- Maintains PR-specific checks (docker-test, dependency-review) for PRs

chore(tooling): add pre-commit configuration

Introduces a `.pre-commit-config.yaml` file to automate code quality checks before commits.

This configuration includes standard hooks for file hygiene (e.g., trailing whitespace, end-of-file fixes) and integrates `golangci-lint` to lint and format Go code on staged files. This helps enforce code style and catch issues early in the development process.
2025-11-07 07:06:10 +01:00
bd308e4dfc refactor(exporter): rewrite HTML exporter to use Go templates
Replaces the manual string-building implementation of the HTML exporter with a more robust and maintainable solution using Go's `html/template` package. This improves readability, security, and separation of concerns.

- HTML structure and CSS styles are moved into their own files and embedded into the binary using `go:embed`.
- A new data preparation layer adapts the course model for the template, simplifying rendering logic.
- Tests are updated to reflect the new implementation, removing obsolete test cases for the old string-building methods.

Additionally, this commit:
- Adds an `AGENTS.md` file with development and contribution guidelines.
- Updates `.golangci.yml` to allow standard Go patterns for interface package naming.
2025-11-07 06:33:38 +01:00
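The template-based approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's exporter: `pageTmpl`, `pageData`, and `renderPage` are hypothetical names, and the template is inlined as a string constant so the sketch runs on its own, whereas the commit describes embedding separate HTML and CSS files with `go:embed`.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// pageTmpl is inlined so the sketch is self-contained. The commit describes
// loading the HTML structure and CSS from embedded files instead.
const pageTmpl = `<h1>{{.Title}}</h1>
<ul>{{range .Lessons}}<li>{{.}}</li>{{end}}</ul>
`

// pageData is a hypothetical view model prepared for the template,
// standing in for the "data preparation layer" the commit mentions.
type pageData struct {
	Title   string
	Lessons []string
}

// renderPage parses and executes the template. html/template escapes
// interpolated values according to their HTML context.
func renderPage(data pageData) (string, error) {
	tmpl, err := template.New("page").Parse(pageTmpl)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPage(pageData{
		Title:   "Safety <basics>",
		Lessons: []string{"Introduction", "Summary"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because escaping happens inside the template engine, a title containing `<` or `>` is rendered as text rather than markup, which is the security benefit the commit message alludes to.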
227f88cb9b chore(lint): fix golangci-lint issues
- Remove duplicate package comments (godoclint)
- Improve code style (gocritic: assignOp, elseif, emptyStringTest)
- Extract repeated format strings to constants (goconst)
- Fix naming conventions: OriginalUrl -> OriginalURL (revive)
- Wrap external errors with context (wrapcheck)
- Disable gocognit for test files in .golangci.yml

Remaining issues by design:
- funlen: getDefaultCSS (CSS content)
- revive: interfaces package name (meaningful in context)
2025-11-06 16:50:44 +01:00
b56c9fa29f refactor: Align with Go conventions and improve maintainability
Renames the `OriginalUrl` field to `OriginalURL` across media models to adhere to Go's common initialisms convention. The `json` tag is unchanged to maintain API compatibility.

Introduces constants for exporter formats (e.g., `FormatMarkdown`, `FormatDocx`) to eliminate the use of magic strings, enhancing type safety and making the code easier to maintain.

Additionally, this commit includes several minor code quality improvements:
- Wraps file-writing errors in exporters to provide more context.
- Removes redundant package-level comments from test files.
- Applies various minor linting fixes throughout the codebase.
2025-11-06 16:48:00 +01:00
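A small sketch of the two ideas in this commit: renaming a field to follow the `URL` initialism while leaving its `json` tag alone, and replacing magic strings with constants. The `Media` type, the `originalUrl` key, the constant values, and `isKnownFormat` are all assumptions made for illustration; only the names `OriginalURL`, `FormatMarkdown`, and `FormatDocx` come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Format constants. The names come from the commit message; the string
// values are assumed for this sketch.
const (
	FormatMarkdown = "markdown"
	FormatDocx     = "docx"
)

// Media is a hypothetical model. The Go field uses the URL initialism,
// while the JSON key (assumed here to be "originalUrl") is unchanged,
// so existing payloads still decode.
type Media struct {
	OriginalURL string `json:"originalUrl"`
}

// decodeMedia unmarshals a JSON document into Media.
func decodeMedia(raw string) (Media, error) {
	var m Media
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return Media{}, fmt.Errorf("decode media: %w", err)
	}
	return m, nil
}

// isKnownFormat compares against the constants instead of string literals.
func isKnownFormat(format string) bool {
	switch format {
	case FormatMarkdown, FormatDocx:
		return true
	default:
		return false
	}
}

func main() {
	m, err := decodeMedia(`{"originalUrl":"https://example.com/image.png"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.OriginalURL, isKnownFormat(FormatDocx), isKnownFormat("pdf"))
}
```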
d8e4d97841 chore: Apply modern Go idioms and perform code cleanup
This commit introduces a series of small refactorings and style fixes across the codebase to improve consistency and leverage modern Go features.

Key changes include:
- Adopting the Go 1.22 `reflect.TypeFor` generic function.
- Replacing `interface{}` with the `any` type alias for better readability.
- Using the explicit `http.NoBody` constant for HTTP requests.
- Updating octal literals for file permissions to the `0o` prefix syntax.
- Standardizing comment formatting and fixing minor typos.
- Removing redundant blank lines and organizing imports.
2025-11-06 15:59:11 +01:00
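The idioms listed in this commit look roughly like this in isolation. The helper names (`typeName`, `describe`, `newGet`, `writePrivate`) are hypothetical and do not appear in the repository; the sketch needs Go 1.22 or newer for `reflect.TypeFor`.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"path/filepath"
	"reflect"
)

// typeName reports the name of T without needing a value of that type,
// using the generic reflect.TypeFor helper.
func typeName[T any]() string {
	return reflect.TypeFor[T]().String()
}

// describe accepts any value; `any` is an alias for interface{}.
func describe(v any) string {
	return fmt.Sprintf("%T", v)
}

// newGet builds a GET request and states explicitly that it has no body.
func newGet(url string) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, url, http.NoBody)
}

// writePrivate writes data with owner-only permissions, spelled with the
// 0o octal prefix.
func writePrivate(path string, data []byte) error {
	return os.WriteFile(path, data, 0o600)
}

func main() {
	fmt.Println(typeName[[]string](), describe(42))

	req, err := newGet("https://example.com/course")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Host)

	dir, err := os.MkdirTemp("", "sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	if err := writePrivate(filepath.Join(dir, "out.txt"), []byte("ok")); err != nil {
		panic(err)
	}
}
```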
fe588dadda chore(ci): add linting and refine workflow dependencies
Adds a golangci-lint job to the CI pipeline to enforce code quality and style. The test job is now dependent on the new linting job.

The final image build job is also updated to depend on the successful completion of the test, docker-test, and dependency-review jobs, ensuring more checks pass before publishing.

Additionally, Go 1.25 is added to the testing matrix.
2025-11-06 15:56:29 +01:00
68c6f4e408 chore!: prepare for v1.0.0 release
Bumps the application version to 1.0.0, signaling the first stable release. This version consolidates several new features and breaking API changes.

This commit also includes various code quality improvements:
- Modernizes tests to use t.Setenv for safer environment variable handling.
- Addresses various linter warnings (gosec, errcheck).
- Updates loop syntax to use Go 1.22's range-over-integer feature.

BREAKING CHANGE: The public API has been updated for consistency and to introduce new features like context support and structured logging.
- `GetSupportedFormat()` is renamed to `SupportedFormat()`.
- `GetSupportedFormats()` is renamed to `SupportedFormats()`.
- `FetchCourse()` now requires a `context.Context` parameter.
- `NewArticulateParser()` constructor signature has been updated.
2025-11-06 05:59:52 +01:00
37927a36b6 refactor(core)!: Add context, config, and structured logging
Introduces `context.Context` to the `FetchCourse` method and its call chain, allowing for cancellable network requests and timeouts. This improves application robustness when fetching remote course data.

A new configuration package centralizes application settings, loading them from environment variables with sensible defaults for base URL, request timeout, and logging.

Standard `log` and `fmt` calls are replaced with a structured logging system built on `slog`, supporting both JSON and human-readable text formats.

This change also includes:
- Extensive benchmarks and example tests.
- Simplified Go doc comments across several packages.

BREAKING CHANGE: The `NewArticulateParser` constructor signature has been updated to accept a logger, base URL, and timeout, which are now supplied via the new configuration system.
2025-11-06 05:14:14 +01:00
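The combination of cancellable calls, environment-driven defaults, and `slog` output might look like the sketch below. Everything here is hypothetical: `fetch` simulates a network call with a timer, `getenvDefault` and the `SKETCH_BASE_URL` variable are invented for illustration, and none of it reflects the actual `FetchCourse` or configuration package signatures.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"os"
	"time"
)

// getenvDefault returns the environment value for key, or fallback when
// the variable is unset or empty.
func getenvDefault(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// fetch is a stand-in for a network request. It finishes after delay
// unless the context is cancelled or its deadline passes first.
func fetch(ctx context.Context, delay time.Duration) error {
	select {
	case <-time.After(delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// timedOut runs fetch under a timeout and reports whether the deadline
// was exceeded. Durations are given in milliseconds.
func timedOut(timeoutMs, delayMs int) bool {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	err := fetch(ctx, time.Duration(delayMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// JSON output; slog.NewTextHandler would give the human-readable form.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logger.Info("fetch finished",
		"base_url", getenvDefault("SKETCH_BASE_URL", "https://example.com"),
		"timed_out", timedOut(10, 500),
	)
}
```

Passing the context down to the lowest-level call is what lets a single timeout at the top bound the whole operation.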
e6977d3374 refactor(html_cleaner): adopt robust HTML parsing for content cleaning
Replaces the fragile regex-based HTML cleaning logic with a proper HTML parser using `golang.org/x/net/html`. The previous implementation was unreliable and could not correctly handle malformed tags, script content, or a wide range of HTML entities.

This new approach provides several key improvements:
- Skips the content of `
2025-11-06 04:26:51 +01:00
2790064ad5 refactor: Standardize method names and introduce context propagation
Removes the `Get` prefix from exporter methods (e.g., GetSupportedFormat -> SupportedFormat) to better align with Go conventions for simple accessors.

Introduces `context.Context` propagation through the application, starting from `ProcessCourseFromURI` down to the HTTP request in the parser. This makes network operations cancellable and allows for setting deadlines, improving application robustness.

Additionally, optimizes the HTML cleaner by pre-compiling regular expressions for a minor performance gain.
2025-11-06 04:26:41 +01:00
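The regular-expression optimization mentioned in this commit amounts to compiling a pattern once at package level instead of on every call. A minimal sketch, assuming a simple tag-stripping pattern that is not taken from the repository (and noting that commit e6977d3374 in this list replaces the regex approach with a real HTML parser):

```go
package main

import (
	"fmt"
	"regexp"
)

// tagRe is compiled once when the package is initialised. The pattern is
// an assumption for this sketch, not the project's actual expression.
var tagRe = regexp.MustCompile(`<[^>]*>`)

// stripTags removes anything that looks like an HTML tag. It reuses the
// pre-compiled pattern rather than calling regexp.MustCompile per call.
func stripTags(s string) string {
	return tagRe.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(stripTags("<p>Hello <b>world</b></p>"))
}
```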
65469ea52e chore: Improve code quality and address linter feedback
This commit introduces several improvements across the codebase, primarily focused on enhancing performance, robustness, and developer experience based on static analysis feedback.

- Replaces `WriteString(fmt.Sprintf())` with the more performant `fmt.Fprintf` in the HTML and Markdown exporters.
- Enhances deferred `Close()` operations to log warnings on failure instead of silently ignoring potential I/O issues.
- Explicitly discards non-critical errors in test suites, particularly during file cleanup, to satisfy linters and clarify intent.
- Suppresses command echoing in `Taskfile.yml` for cleaner output during development tasks.
2025-11-06 04:17:00 +01:00
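Two of the changes above can be sketched in a few lines: writing formatted text straight into a builder with `fmt.Fprintf`, and logging a warning when a deferred `Close()` fails. The function names `heading` and `writeFile` are hypothetical, and the standard `log` package is used here only to keep the sketch short.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// heading formats a Markdown heading directly into the builder.
// Before: sb.WriteString(fmt.Sprintf("%s %s\n", marks, title))
func heading(level int, title string) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "%s %s\n", strings.Repeat("#", level), title)
	return sb.String()
}

// writeFile writes content to path. A failing Close is logged as a
// warning instead of being silently dropped.
func writeFile(path, content string) error {
	f, err := os.Create(path)
	if err != nil {
		return fmt.Errorf("create %s: %w", path, err)
	}
	defer func() {
		if cerr := f.Close(); cerr != nil {
			log.Printf("warning: closing %s: %v", path, cerr)
		}
	}()
	if _, err := f.WriteString(content); err != nil {
		return fmt.Errorf("write %s: %w", path, err)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	if err := writeFile(filepath.Join(dir, "out.md"), heading(2, "Lesson")); err != nil {
		panic(err)
	}
	fmt.Print(heading(1, "Course"))
}
```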
2db2e0b1a3 feat(task): add golangci-lint tasks and fix Windows checks
Introduces new tasks for running `golangci-lint`, a powerful and fast Go linter. This includes `lint:golangci` for checking the codebase and `lint:golangci:fix` for automatically applying fixes.

Additionally, this commit corrects the command used for checking the existence of executables on Windows. The change from `where` to `where.exe` ensures better cross-platform compatibility and reliability within the Taskfile.
2025-11-06 04:00:53 +01:00
6317ce268b refactor(exporters): replace deprecated strings.Title with cases.Title
The `strings.Title` function is deprecated because it does not handle Unicode punctuation correctly.

This change replaces its usage in the DOCX, HTML, and Markdown exporters with the recommended `golang.org/x/text/cases` package. This ensures more robust and accurate title-casing for item headings.
2025-11-06 03:55:07 +01:00
59f2de9d22 chore(build): introduce go-task for project automation
Adds a comprehensive Taskfile.yml to centralize all project scripts for building, testing, linting, and Docker image management.

The GitHub Actions CI workflow is refactored to utilize these `task` commands, resulting in a cleaner, more readable, and maintainable configuration. This approach ensures consistency between local development and CI environments.
2025-11-06 03:52:21 +01:00
f8fecc3967 chore: clean up CI workflows by removing unused release job and updating permissions 2025-11-05 22:38:56 +01:00
af15bcccd4 chore: update CI actions, Go 1.25, Alpine 3.22
Updates CI to latest major actions (checkout v5, setup-go v6, upload-artifact v5, CodeQL v4) for security and compatibility.
Uses stable major tag for autofix action.
Updates Docker images to Go 1.25 and Alpine 3.22 to leverage newer toolchain and patched bases.

Updates `open-pull-requests-limit` to 2 in dependabot.yml and upgrades the CodeQL action to v4.
2025-11-05 22:38:28 +01:00
422b56aa86 deps: Bump Go to 1.24 and x/image to v0.32 2025-11-05 22:22:17 +01:00
903ee92e4c Update ci.yml
- Added docker hub to the login.
- Removed some cache BS.
2025-05-29 00:19:25 +02:00
9c51c0d9e3 Reorganizes badges in README for clarity
Switches CI and Docker badges to clarify workflow separation.
Promotes Docker image visibility by rearranging badge positions.
2025-05-28 23:50:54 +02:00
ec5c8c099c Update labels and bump version to 0.4.0
Standardizes dependabot labels to include 'dependencies/' prefix
for better organization and clarity.

Bumps application version to 0.4.0 to reflect recent changes
and improvements.
2025-05-28 23:31:16 +02:00
9eaf7dfcf2 docker(deps): bump golang in the docker-images group (#4)
Bumps the docker-images group with 1 update: golang.


Updates `golang` from 1.23-alpine to 1.24-alpine

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24-alpine
  dependency-type: direct:production
  dependency-group: docker-images
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 23:09:09 +02:00
b7f23b2387 Add Docker support and GitHub Container Registry CI workflow (#3)
* Add comprehensive Docker support with multi-stage builds
* Set up GitHub Container Registry integration
* Enhance CI/CD workflows with Docker build and push capabilities
* Add --help and --version flags to main application
* Update documentation with Docker usage examples
* Implement security best practices for container deployment
2025-05-28 23:04:43 +02:00
a0003983c4 [autofix.ci] apply automated fixes 2025-05-28 12:24:31 +00:00
1c1460ff04 Refactors main function and enhances test suite
Refactors the main function for improved testability by extracting
the core logic into a new run function. Updates argument handling
and error reporting to use return codes instead of os.Exit.

Adds comprehensive test coverage for main functionality,
including integration tests and validation against edge cases.

Enhances README with updated code coverage and feature improvement lists.

Improves maintainability and testability of the application.

Bumps version to 0.3.1
2025-05-28 14:23:56 +02:00
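The refactoring described here, moving logic out of `main` into a `run` function that returns an exit code, can be sketched as below. The argument rules, messages, and the `runCaptured` helper are invented for illustration and are not the project's actual behaviour.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// run holds the program logic. It takes its inputs explicitly and returns
// an exit code instead of calling os.Exit, so tests can call it directly.
// The argument handling here is hypothetical.
func run(args []string, stdout, stderr io.Writer) int {
	switch len(args) {
	case 0, 1:
		fmt.Fprintln(stdout, "usage: tool <input>")
		return 0
	case 2:
		fmt.Fprintf(stdout, "processing %s\n", args[1])
		return 0
	default:
		fmt.Fprintln(stderr, "error: too many arguments")
		return 1
	}
}

// runCaptured shows how a test might drive run: it passes in-memory
// writers and returns the exit code plus everything that was written.
func runCaptured(args ...string) (code int, out, errOut string) {
	var stdout, stderr strings.Builder
	code = run(args, &stdout, &stderr)
	return code, stdout.String(), stderr.String()
}

func main() {
	// main is reduced to wiring: real arguments and streams in, exit code out.
	os.Exit(run(os.Args, os.Stdout, os.Stderr))
}
```

Keeping `os.Exit` in `main` alone means deferred cleanup inside `run` still executes, and the exit code becomes an ordinary return value that can be asserted on.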
63 changed files with 5606 additions and 1828 deletions

68
.dockerignore Normal file

@@ -0,0 +1,68 @@
# Git
.git
.gitignore
.gitattributes
# CI/CD
.github
.codecov.yml
# Documentation
README.md
*.md
docs/
# Build artifacts
build/
dist/
*.exe
*.tar.gz
*.zip
# Test files
*_test.go
test_*.go
test/
coverage.out
coverage.html
*.log
# Development
.vscode/
.idea/
*.swp
*.swo
*~
# OS specific
.DS_Store
Thumbs.db
# Output and temporary files
output/
tmp/
temp/
# Node.js (if any)
node_modules/
npm-debug.log
yarn-error.log
# Python (if any)
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
# Scripts (build scripts not needed in container)
scripts/
# Sample files
articulate-sample.json
test_input.json
# License
LICENSE


@@ -8,14 +8,60 @@ updates:
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 10
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'github-actions'
- 'dependencies/github-actions'
commit-message:
prefix: 'ci'
include: 'scope'
# Check for updates to Docker
- package-ecosystem: 'docker'
directory: '/'
schedule:
interval: 'weekly'
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'dependencies/docker'
commit-message:
prefix: 'docker'
include: 'scope'
groups:
docker:
patterns:
- '*'
update-types:
- 'minor'
- 'patch'
# Check for updates to Docker Compose
- package-ecosystem: 'docker-compose'
directory: '/'
schedule:
interval: 'weekly'
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'dependencies/docker-compose'
commit-message:
prefix: 'docker'
include: 'scope'
groups:
docker:
patterns:
- '*'
update-types:
- 'minor'
- 'patch'
# Check for updates to Go modules
- package-ecosystem: 'gomod'
directory: '/'
@@ -24,10 +70,10 @@ updates:
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 10
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'go'
- 'dependencies/go'
commit-message:
prefix: 'deps'
include: 'scope'


@@ -10,16 +10,36 @@ jobs:
autofix:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version-file: 'go.mod'
- name: Checkout code
uses: actions/checkout@v5
# goimports works like gofmt, but also fixes imports.
# see https://pkg.go.dev/golang.org/x/tools/cmd/goimports
- run: go install golang.org/x/tools/cmd/goimports@latest
- run: goimports -w .
# of course we can also do just this instead:
# - run: gofmt -w .
- name: Install Task
uses: go-task/setup-task@v1
- uses: autofix-ci/action@551dded8c6cc8a1054039c8bc0b8b48c51dfc6ef
- uses: actions/setup-go@v6
with: { go-version-file: 'go.mod' }
- name: Setup go deps
run: |
# Install golangci-lint
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/HEAD/install.sh | sh -s -- -b "$(go env GOPATH)/bin"
# Install go-task dependencies
go install golang.org/x/tools/cmd/goimports@latest
- name: Run goimports
run: goimports -w .
- name: Run golangci-lint autofix
run: golangci-lint run --fix
- name: Run golangci-lint format
run: golangci-lint fmt
- name: Run go mod tidy
run: go mod tidy
- name: Run gopls modernize
run: task modernize
- uses: autofix-ci/action@v1


@@ -2,213 +2,289 @@ name: CI
on:
push:
branches: [ "master", "develop" ]
tags:
- "v*.*.*"
branches: ['master', 'develop']
pull_request:
branches: [ "master", "develop" ]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
golangci:
name: lint
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
steps:
- uses: actions/checkout@v5
- uses: actions/setup-go@v6
with:
go-version: stable
- name: golangci-lint
uses: golangci/golangci-lint-action@v8
with: { version: latest }
test:
name: Test
needs: [golangci]
runs-on: ubuntu-latest
permissions:
contents: write
strategy:
matrix:
go:
- 1.21.x
- 1.22.x
- 1.23.x
- 1.24.x
- 1.25.x
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: Set up Go ${{ matrix.go }}
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
go-version: ${{ matrix.go }}
check-latest: true
cache-dependency-path: "**/*.sum"
- name: Install Task
uses: go-task/setup-task@v1
- name: Show build info
run: task info
- name: Download dependencies
run: go mod download && echo "Download successful" || go mod tidy && echo "Tidy successful" || return 1
- name: Verify dependencies
run: go mod verify
run: task deps
- name: Build
run: go build -v ./...
run: task build
- name: Run tests with enhanced reporting
id: test
env:
CGO_ENABLED: 1
run: |
echo "## 🔧 Test Environment" >> $GITHUB_STEP_SUMMARY
echo "- **Go Version:** ${{ matrix.go }}" >> $GITHUB_STEP_SUMMARY
echo "- **OS:** ubuntu-latest" >> $GITHUB_STEP_SUMMARY
echo "- **Timestamp:** $(date -u)" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
{
cat << EOF
## 🔧 Test Environment
- **Go Version:** ${{ matrix.go }}
- **OS:** ubuntu-latest
- **Timestamp:** $(date -u)
EOF
} >> "$GITHUB_STEP_SUMMARY"
echo "Running tests with coverage..."
go test -v -race -coverprofile=coverage.out ./... 2>&1 | tee test-output.log
task test:coverage 2>&1 | tee test-output.log
# Extract test results for summary
TEST_STATUS=$?
TOTAL_TESTS=$(grep -c "=== RUN" test-output.log || echo "0")
PASSED_TESTS=$(grep -c "--- PASS:" test-output.log || echo "0")
FAILED_TESTS=$(grep -c "--- FAIL:" test-output.log || echo "0")
SKIPPED_TESTS=$(grep -c "--- SKIP:" test-output.log || echo "0")
PASSED_TESTS=$(grep -c -- "--- PASS:" test-output.log || echo "0")
FAILED_TESTS=$(grep -c -- "--- FAIL:" test-output.log || echo "0")
SKIPPED_TESTS=$(grep -c -- "--- SKIP:" test-output.log || echo "0")
# Generate test summary
echo "## 🧪 Test Results (Go ${{ matrix.go }})" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
echo "| Total Tests | $TOTAL_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Passed | ✅ $PASSED_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Failed | ❌ $FAILED_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Skipped | ⏭️ $SKIPPED_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Status | $([ $TEST_STATUS -eq 0 ] && echo "✅ PASSED" || echo "❌ FAILED") |" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
{
cat << EOF
## 🧪 Test Results (Go ${{ matrix.go }})
# Add package breakdown
echo "### 📦 Package Test Results" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Package | Status |" >> $GITHUB_STEP_SUMMARY
echo "|---------|--------|" >> $GITHUB_STEP_SUMMARY
| Metric | Value |
| ----------- | ------------------------------------------------------------- |
| Total Tests | $TOTAL_TESTS |
| Passed | $PASSED_TESTS |
| Failed | $FAILED_TESTS |
| Skipped | $SKIPPED_TESTS |
| Status | $([ "$TEST_STATUS" -eq 0 ] && echo "PASSED" || echo "FAILED") |
### 📦 Package Test Results
| Package | Status |
|---------|--------|
EOF
# Extract package results
grep "^ok\|^FAIL" test-output.log | while read line; do
grep "^ok\|^FAIL" test-output.log | while read -r line; do
if [[ $line == ok* ]]; then
pkg=$(echo $line | awk '{print $2}')
echo "| $pkg | ✅ PASS |" >> $GITHUB_STEP_SUMMARY
pkg=$(echo "$line" | awk '{print $2}')
echo "| $pkg | ✅ PASS |"
elif [[ $line == FAIL* ]]; then
pkg=$(echo $line | awk '{print $2}')
echo "| $pkg | ❌ FAIL |" >> $GITHUB_STEP_SUMMARY
pkg=$(echo "$line" | awk '{print $2}')
echo "| $pkg | ❌ FAIL |"
fi
done
echo "" >> $GITHUB_STEP_SUMMARY
echo ""
# Add detailed results if tests failed
if [ $TEST_STATUS -ne 0 ]; then
echo "### ❌ Failed Tests Details" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
grep -A 10 "--- FAIL:" test-output.log | head -100 >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
if [ "$TEST_STATUS" -ne 0 ]; then
cat << 'EOF'
### ❌ Failed Tests Details
```
EOF
grep -A 10 -- "--- FAIL:" test-output.log | head -100
cat << 'EOF'
```
EOF
fi
} >> "$GITHUB_STEP_SUMMARY"
# Set outputs for other steps
echo "test-status=$TEST_STATUS" >> $GITHUB_OUTPUT
echo "total-tests=$TOTAL_TESTS" >> $GITHUB_OUTPUT
echo "passed-tests=$PASSED_TESTS" >> $GITHUB_OUTPUT
echo "failed-tests=$FAILED_TESTS" >> $GITHUB_OUTPUT
{
echo "test-status=$TEST_STATUS"
echo "total-tests=$TOTAL_TESTS"
echo "passed-tests=$PASSED_TESTS"
echo "failed-tests=$FAILED_TESTS"
} >> "$GITHUB_OUTPUT"
# Exit with the original test status
exit $TEST_STATUS
exit "$TEST_STATUS"
- name: Generate coverage report
if: always()
run: |
if [ -f coverage.out ]; then
go tool cover -html=coverage.out -o coverage.html
COVERAGE=$(go tool cover -func=coverage.out | grep total | awk '{print $3}')
if [ -f coverage/coverage.out ]; then
COVERAGE=$(go tool cover -func=coverage/coverage.out | grep total | awk '{print $3}')
echo "## 📊 Code Coverage (Go ${{ matrix.go }})" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "**Total Coverage: $COVERAGE**" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
{
cat << EOF
## 📊 Code Coverage (Go ${{ matrix.go }})
# Add coverage by package
echo "### 📋 Coverage by Package" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Package | Coverage |" >> $GITHUB_STEP_SUMMARY
echo "|---------|----------|" >> $GITHUB_STEP_SUMMARY
**Total Coverage: $COVERAGE**
go tool cover -func=coverage.out | grep -v total | while read line; do
<details>
<summary>Click to expand 📋 Coverage by Package details</summary>
| Package | Coverage |
| ------- | -------- |
EOF
# Create temporary file for package coverage aggregation
temp_coverage=$(mktemp)
# Extract package-level coverage data
go tool cover -func=coverage/coverage.out | grep -v total | while read -r line; do
if [[ $line == *".go:"* ]]; then
pkg=$(echo $line | awk '{print $1}' | cut -d'/' -f1-3)
coverage=$(echo $line | awk '{print $3}')
echo "| $pkg | $coverage |" >> $GITHUB_STEP_SUMMARY
fi
done | sort -u
# Extract package path from file path (everything before the filename)
filepath=$(echo "$line" | awk '{print $1}')
pkg_path=$(dirname "$filepath" | sed 's|github.com/kjanat/articulate-parser/||; s|^\./||')
coverage=$(echo "$line" | awk '{print $3}' | sed 's/%//')
echo "" >> $GITHUB_STEP_SUMMARY
# Use root package if no subdirectory
[[ "$pkg_path" == "." || "$pkg_path" == "" ]] && pkg_path="root"
echo "$pkg_path $coverage" >> "$temp_coverage"
fi
done
# Aggregate coverage by package (average)
awk '{
packages[$1] += $2
counts[$1]++
}
END {
for (pkg in packages) {
avg = packages[pkg] / counts[pkg]
printf "| %s | %.1f%% |\n", pkg, avg
}
}' "$temp_coverage" | sort
rm -f "$temp_coverage"
cat << 'EOF'
</details>
EOF
} >> "$GITHUB_STEP_SUMMARY"
else
echo "## ⚠️ Coverage Report" >> $GITHUB_STEP_SUMMARY
echo "No coverage file generated" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
cat >> "$GITHUB_STEP_SUMMARY" << 'EOF'
## ⚠️ Coverage Report
No coverage file generated
EOF
fi
- name: Upload test artifacts
if: failure()
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v5
with:
name: test-results-go-${{ matrix.go }}
path: |
test-output.log
coverage.out
coverage.html
coverage/
retention-days: 7
- name: Run go vet
- name: Run linters
run: |
echo "## 🔍 Static Analysis (Go ${{ matrix.go }})" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Initialize summary
{
cat << EOF
## 🔍 Static Analysis (Go ${{ matrix.go }})
VET_OUTPUT=$(go vet ./... 2>&1 || echo "")
EOF
# Run go vet
VET_OUTPUT=$(task lint:vet 2>&1 || echo "")
VET_STATUS=$?
if [ $VET_STATUS -eq 0 ]; then
echo "✅ **go vet:** No issues found" >> $GITHUB_STEP_SUMMARY
if [ "$VET_STATUS" -eq 0 ]; then
echo "✅ **go vet:** No issues found"
else
echo "❌ **go vet:** Issues found" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "$VET_OUTPUT" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
cat << 'EOF'
❌ **go vet:** Issues found
```
EOF
echo "$VET_OUTPUT"
echo '```'
fi
echo "" >> $GITHUB_STEP_SUMMARY
echo ""
exit $VET_STATUS
# Run go fmt check
FMT_OUTPUT=$(task lint:fmt 2>&1 || echo "")
FMT_STATUS=$?
- name: Run go fmt
run: |
FMT_OUTPUT=$(gofmt -s -l . 2>&1 || echo "")
if [ -z "$FMT_OUTPUT" ]; then
echo "✅ **go fmt:** All files properly formatted" >> $GITHUB_STEP_SUMMARY
if [ "$FMT_STATUS" -eq 0 ]; then
echo "✅ **go fmt:** All files properly formatted"
else
echo "❌ **go fmt:** Files need formatting" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "$FMT_OUTPUT" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
exit 1
cat << 'EOF'
❌ **go fmt:** Files need formatting
```
EOF
echo "$FMT_OUTPUT"
echo '```'
fi
} >> "$GITHUB_STEP_SUMMARY"
# Exit with error if any linter failed
[ "$VET_STATUS" -eq 0 ] && [ "$FMT_STATUS" -eq 0 ] || exit 1
- name: Job Summary
if: always()
run: |
echo "## 📋 Job Summary (Go ${{ matrix.go }})" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Step | Status |" >> $GITHUB_STEP_SUMMARY
echo "|------|--------|" >> $GITHUB_STEP_SUMMARY
echo "| Dependencies | ✅ Success |" >> $GITHUB_STEP_SUMMARY
echo "| Build | ✅ Success |" >> $GITHUB_STEP_SUMMARY
echo "| Tests | ${{ steps.test.outcome == 'success' && '✅ Success' || '❌ Failed' }} |" >> $GITHUB_STEP_SUMMARY
echo "| Coverage | ${{ job.status == 'success' && '✅ Generated' || '⚠️ Partial' }} |" >> $GITHUB_STEP_SUMMARY
echo "| Static Analysis | ${{ job.status == 'success' && '✅ Clean' || '❌ Issues' }} |" >> $GITHUB_STEP_SUMMARY
echo "| Code Formatting | ${{ job.status == 'success' && 'Clean' || 'Issues' }} |" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
cat >> "$GITHUB_STEP_SUMMARY" << 'EOF'
## 📋 Job Summary (Go ${{ matrix.go }})
| Step | Status |
| --------------- | --------------------------------------------------------------- |
| Dependencies | Success |
| Build | Success |
| Tests | ${{ steps.test.outcome == 'success' && 'Success' || 'Failed' }} |
| Coverage | ${{ job.status == 'success' && 'Generated' || 'Partial' }} |
| Static Analysis | ${{ job.status == 'success' && 'Clean' || 'Issues' }} |
| Code Formatting | ${{ job.status == 'success' && 'Clean' || 'Issues' }} |
EOF
- name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v5
with:
files: ./coverage/coverage.out
flags: Go ${{ matrix.go }}
slug: kjanat/articulate-parser
token: ${{ secrets.CODECOV_TOKEN }}
@@ -220,6 +296,54 @@ jobs:
flags: Go ${{ matrix.go }}
token: ${{ secrets.CODECOV_TOKEN }}
docker-test:
name: Docker Build Test
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: go.mod
check-latest: true
- name: Install Task
uses: go-task/setup-task@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build Docker image using Task
run: task docker:build
- name: Test Docker image using Task
run: |
{
cat << 'EOF'
## 🧪 Docker Image Tests
EOF
# Run Task docker test
task docker:test
echo "**Testing help command:**"
echo '```terminaloutput'
docker run --rm articulate-parser:latest --help
echo '```'
echo ""
# Test image size
IMAGE_SIZE=$(docker image inspect articulate-parser:latest --format='{{.Size}}' | numfmt --to=iec-i --suffix=B)
echo "**Image size:** $IMAGE_SIZE"
echo ""
} >> "$GITHUB_STEP_SUMMARY"
dependency-review:
name: Dependency Review
runs-on: ubuntu-latest
@@ -228,7 +352,7 @@ jobs:
if: github.event_name == 'pull_request'
steps:
- name: 'Checkout Repository'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: 'Dependency Review'
uses: actions/dependency-review-action@v4
@@ -236,126 +360,131 @@ jobs:
fail-on-severity: moderate
comment-summary-in-pr: always
release:
name: Release
docker:
name: Docker Build & Push
runs-on: ubuntu-latest
if: github.ref_type == 'tag'
permissions:
contents: write
needs: [ "test" ]
contents: read
packages: write
needs: [test]
if: |
github.event_name == 'push' && (github.ref == 'refs/heads/master' ||
github.ref == 'refs/heads/develop' ||
startsWith(github.ref, 'refs/heads/feature/docker'))
steps:
- uses: actions/checkout@v4
- name: Checkout repository
uses: actions/checkout@v5
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
fetch-depth: 0
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Set up Go
uses: actions/setup-go@v5
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
go-version-file: 'go.mod'
check-latest: true
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Run tests
run: |
echo "## 🚀 Release Tests" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
go test -v ./... 2>&1 | tee release-test-output.log
TEST_STATUS=$?
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
TOTAL_TESTS=$(grep -c "=== RUN" release-test-output.log || echo "0")
PASSED_TESTS=$(grep -c "--- PASS:" release-test-output.log || echo "0")
FAILED_TESTS=$(grep -c "--- FAIL:" release-test-output.log || echo "0")
echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
echo "| Total Tests | $TOTAL_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Passed | ✅ $PASSED_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Failed | ❌ $FAILED_TESTS |" >> $GITHUB_STEP_SUMMARY
echo "| Status | $([ $TEST_STATUS -eq 0 ] && echo "✅ PASSED" || echo "❌ FAILED") |" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
exit $TEST_STATUS
- name: Install UPX
run: |
sudo apt-get update
sudo apt-get install -y upx
- name: Build binaries
run: |
echo "## 🔨 Build Process" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Set the build time environment variable
BUILD_TIME=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
# Add run permissions to the build script
chmod +x ./scripts/build.sh
# Display help information for the build script
./scripts/build.sh --help
echo "**Build Configuration:**" >> $GITHUB_STEP_SUMMARY
echo "- Version: ${{ github.ref_name }}" >> $GITHUB_STEP_SUMMARY
echo "- Build Time: $BUILD_TIME" >> $GITHUB_STEP_SUMMARY
echo "- Git Commit: ${{ github.sha }}" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Build for all platforms
./scripts/build.sh \
--verbose \
-ldflags "-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${{ github.ref_name }} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=$BUILD_TIME -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${{ github.sha }}"
- name: Compress binaries with UPX
run: |
echo "## 📦 Binary Compression" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "Compressing binaries with UPX..."
cd build/
# Get original sizes
echo "**Original sizes:**" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
ls -lah >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Compress all binaries except Darwin (macOS) binaries as UPX doesn't work well with recent macOS versions
for binary in articulate-parser-*; do
if [[ "$binary" == *"darwin"* ]]; then
echo "Skipping UPX compression for $binary (macOS compatibility)"
else
echo "Compressing $binary..."
upx --best --lzma "$binary" || {
echo "Warning: UPX compression failed for $binary, keeping original"
}
fi
done
echo "**Final sizes:**" >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
ls -lah >> $GITHUB_STEP_SUMMARY
echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
- name: Upload a Build Artifact
uses: actions/upload-artifact@v4.6.2
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
name: build-artifacts
path: build/
if-no-files-found: ignore
retention-days: 1
compression-level: 9
overwrite: true
include-hidden-files: true
images: |
${{ env.IMAGE_NAME }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
type=raw,value=latest,enable={{is_default_branch}}
labels: |
org.opencontainers.image.title=Articulate Parser
org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
org.opencontainers.image.vendor=kjanat
org.opencontainers.image.licenses=MIT
org.opencontainers.image.url=https://github.com/${{ github.repository }}
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.documentation=https://github.com/${{ github.repository }}/blob/master/DOCKER.md
- name: Create Release
uses: softprops/action-gh-release@v2
- name: Build and push Docker image
uses: docker/build-push-action@v6
with:
files: build/*
generate_release_notes: true
draft: false
prerelease: ${{ startsWith(github.ref, 'refs/tags/v0.') }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
context: .
# Multi-architecture build - Docker automatically provides TARGETOS, TARGETARCH, etc.
# Based on Go's supported platforms from 'go tool dist list'
platforms: |
linux/amd64
linux/arm64
linux/arm/v7
linux/386
linux/ppc64le
linux/s390x
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
annotations: ${{ steps.meta.outputs.labels }}
build-args: |
VERSION=${{ github.ref_type == 'tag' && github.ref_name || github.sha }}
BUILD_TIME=${{ github.event.head_commit.timestamp }}
GIT_COMMIT=${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
outputs: type=image,name=target,annotation-index.org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
sbom: true
provenance: true
- name: Generate Docker summary
run: |
cat >> "$GITHUB_STEP_SUMMARY" << 'EOF'
## 🐳 Docker Build Summary
**Image:** `ghcr.io/${{ github.repository }}`
**Tags built:**
```text
${{ steps.meta.outputs.tags }}
```
**Features:**
- **Platforms:** linux/amd64, linux/arm64, linux/arm/v7, linux/386, linux/ppc64le, linux/s390x
- **Architecture optimization:** Native compilation for each platform
- **Multi-arch image description:** Enabled
- **SBOM (Software Bill of Materials):** Generated
- **Provenance attestation:** Generated
- **Security scanning:** Ready for vulnerability analysis
**Usage:**
```bash
# Pull the image
docker pull ghcr.io/${{ github.repository }}:latest
# Run with help
docker run --rm ghcr.io/${{ github.repository }}:latest --help
# Process a local file (mount current directory)
docker run --rm -v $(pwd):/workspace ghcr.io/${{ github.repository }}:latest /workspace/input.json markdown /workspace/output.md
```
EOF
# Security and quality analysis workflows
codeql-analysis:
name: CodeQL Analysis
uses: ./.github/workflows/codeql.yml
permissions:
security-events: write
packages: read
actions: read
contents: read


@ -11,13 +11,17 @@
#
name: "CodeQL"
# This workflow is configured to be called by other workflows for more controlled execution
# This allows integration with the main CI pipeline and avoids redundant runs
# To enable automatic execution, uncomment the triggers below:
on:
push:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
workflow_call:
schedule:
- cron: '44 16 * * 6'
# push:
# branches: [ "master" ]
# pull_request:
# branches: [ "master" ]
jobs:
analyze:
@ -57,7 +61,7 @@ jobs:
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v5
# Add any setup steps before running the `github/codeql-action/init` action.
# This includes steps like installing compilers or runtimes (`actions/setup-node`
@ -67,7 +71,7 @@ jobs:
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
@ -95,6 +99,6 @@ jobs:
exit 1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: "/language:${{matrix.language}}"

26
.github/workflows/dependency-review.yml vendored Normal file

@ -0,0 +1,26 @@
name: Dependency Review
# This workflow is designed to be called by other workflows rather than triggered automatically
# This allows for more controlled execution and integration with other CI/CD processes
# To enable automatic execution on pull requests, uncomment the line below:
# on: [pull_request]
on: [workflow_call]
permissions:
contents: read
# Required to post security advisories
security-events: write
pull-requests: write
jobs:
dependency-review:
runs-on: ubuntu-latest
steps:
- name: 'Checkout Repository'
uses: actions/checkout@v5
- name: 'Dependency Review'
uses: actions/dependency-review-action@v4
with:
fail-on-severity: moderate
comment-summary-in-pr: always

156
.github/workflows/release.yml vendored Normal file

@ -0,0 +1,156 @@
name: Release
on:
push:
tags:
- "v*.*.*"
workflow_call:
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
permissions:
contents: write
packages: write
jobs:
release:
name: Create Release
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: "go.mod"
check-latest: true
- name: Run tests
run: go test -v ./...
- name: Install UPX
run: |
sudo apt-get update
sudo apt-get install -y upx
- name: Build binaries
run: |
# Set the build time environment variable using git commit timestamp
BUILD_TIME=$(git log -1 --format=%cd --date=iso-strict)
# Add run permissions to the build script
chmod +x ./scripts/build.sh
# Build for all platforms
./scripts/build.sh \
--verbose \
-ldflags "-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${{ github.ref_name }} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=$BUILD_TIME -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${{ github.sha }}"
- name: Compress binaries
run: |
cd build/
for binary in articulate-parser-*; do
echo "Compressing $binary..."
upx --best "$binary" || {
echo "Warning: UPX compression failed for $binary, keeping original"
}
done
- name: Create Release
uses: softprops/action-gh-release@v2
with:
files: |
build/articulate-parser-linux-amd64
build/articulate-parser-linux-arm64
build/articulate-parser-windows-amd64.exe
build/articulate-parser-windows-arm64.exe
build/articulate-parser-darwin-amd64
build/articulate-parser-darwin-arm64
generate_release_notes: true
draft: false
# Mark pre-1.0 versions (v0.x.x) as prerelease since they are considered unstable
# This helps users understand that these releases may have breaking changes
prerelease: ${{ startsWith(github.ref, 'refs/tags/v0.') }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
docker:
name: Docker Build & Push
runs-on: ubuntu-latest
needs: ['release']
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: |
${{ env.IMAGE_NAME }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
type=raw,value=latest,enable={{is_default_branch}}
labels: |
org.opencontainers.image.title=Articulate Parser
org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
org.opencontainers.image.vendor=kjanat
org.opencontainers.image.licenses=MIT
org.opencontainers.image.url=https://github.com/${{ github.repository }}
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.documentation=https://github.com/${{ github.repository }}/blob/master/DOCKER.md
- name: Build and push Docker image
uses: docker/build-push-action@v6
with:
context: .
platforms: |
linux/amd64
linux/arm64
linux/arm/v7
linux/386
linux/ppc64le
linux/s390x
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
annotations: ${{ steps.meta.outputs.labels }}
build-args: |
VERSION=${{ github.ref_name }}
BUILD_TIME=${{ github.event.head_commit.timestamp || github.event.repository.pushed_at }}
GIT_COMMIT=${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
outputs: type=image,name=target,annotation-index.org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
sbom: true
provenance: true

16
.gitignore vendored

@ -48,9 +48,12 @@ build/
# Test coverage files
coverage.out
coverage.txt
coverage.html
coverage.*
coverage
*.cover
*.coverprofile
main_coverage
# Other common exclusions
*.exe
@ -61,3 +64,16 @@ coverage
*.test
*.out
/tmp/
.github/copilot-instructions.md
# Editors
.vscode/
.idea/
.task/
**/*.local.*
.claude/
NUL

388
.golangci.yml Normal file

@ -0,0 +1,388 @@
# golangci-lint configuration for articulate-parser
# https://golangci-lint.run/usage/configuration/
version: "2"
# Options for analysis running
run:
# Timeout for total work
timeout: 5m
# Skip directories (not allowed in config v2, will use issues exclude instead)
# Go version
go: "1.24"
# Include test files
tests: true
# Use Go module mode
modules-download-mode: readonly
# Output configuration
output:
# Format of output
formats:
text:
print-issued-lines: true
print-linter-name: true
# Sort results
sort-order:
- linter
- severity
- file
# Show statistics
show-stats: true
# Issues configuration
issues:
# Maximum issues count per one linter
max-issues-per-linter: 0
# Maximum count of issues with the same text
max-same-issues: 3
# Show only new issues
new: false
# Fix found issues (if linter supports)
fix: false
# Formatters configuration
formatters:
enable:
- gofmt
- goimports
- gofumpt
settings:
# gofmt settings
gofmt:
simplify: true
# goimports settings
goimports:
local-prefixes:
- github.com/kjanat/articulate-parser
# gofumpt settings
gofumpt:
module-path: github.com/kjanat/articulate-parser
extra-rules: true
# Linters configuration
linters:
# Default set of linters
default: none
# Enable specific linters
enable:
# Default/standard linters
- errcheck # Check for unchecked errors
- govet # Go vet
- ineffassign # Detect ineffectual assignments
- staticcheck # Staticcheck
- unused # Find unused code
# Code quality
- revive # Fast, configurable linter
- gocritic # Opinionated Go source code linter
- godot # Check comment periods
- godox # Detect TODO/FIXME comments
- gocognit # Cognitive complexity
- gocyclo # Cyclomatic complexity
- funlen # Function length
- maintidx # Maintainability index
# Security
- gosec # Security problems
# Performance
- prealloc # Find slice preallocation opportunities
- bodyclose # Check HTTP response body closed
# Style and formatting
- goconst # Find repeated strings
- misspell # Find misspellings
- whitespace # Find unnecessary blank lines
- unconvert # Remove unnecessary type conversions
- dupword # Check for duplicate words
# Error handling
- errorlint # Error handling improvements
- wrapcheck # Check error wrapping
# Testing
- testifylint # Testify usage
- tparallel # Detect improper t.Parallel() usage
- thelper # Detect test helpers without t.Helper()
# Best practices
- exhaustive # Check exhaustiveness of enum switches
- nolintlint # Check nolint directives
- nakedret # Find naked returns
- nilnil # Check for simultaneous return of a nil error and a nil value
- noctx # Check sending HTTP requests without context
- contextcheck # Check context propagation
- asciicheck # Check for non-ASCII identifiers
- bidichk # Check for dangerous unicode sequences
- durationcheck # Check for multiplied durations
- makezero # Find slice declarations with non-zero length
- nilerr # Find code returning nil with non-nil error
- predeclared # Find code shadowing predeclared identifiers
- promlinter # Check Prometheus metrics naming
- reassign # Check reassignment of package variables
- usestdlibvars # Use variables from stdlib
- wastedassign # Find wasted assignments
# Documentation
- godoclint # Check godoc comments
# New
- modernize # Suggest simplifications using new Go features
# Exclusion rules for linters
exclusions:
rules:
# Exclude some linters from test files
- path: _test\.go
linters:
- gosec
- dupl
- errcheck
- goconst
- funlen
- goerr113
- gocognit
# Exclude benchmarks from some linters
- path: _bench_test\.go
linters:
- gosec
- dupl
- errcheck
- goconst
- funlen
- goerr113
- wrapcheck
# Exclude example tests
- path: _example_test\.go
linters:
- gosec
- errcheck
- funlen
- goerr113
- wrapcheck
- revive
# Exclude linters for main.go
- path: ^main\.go$
linters:
- forbidigo
# Exclude certain linters for generated files
- path: internal/version/version.go
linters:
- gochecknoglobals
- gochecknoinits
# Exclude var-naming for interfaces package (standard Go pattern for interface definitions)
- path: internal/interfaces/
text: "var-naming: avoid meaningless package names"
linters:
- revive
# Allow fmt.Print* in main package
- path: ^main\.go$
text: "use of fmt.Print"
linters:
- forbidigo
# Exclude common false positives
- text: "Error return value of .((os\\.)?std(out|err)\\..*|.*Close|.*Flush|os\\.Remove(All)?|.*print(f|ln)?|os\\.(Un)?Setenv). is not checked"
linters:
- errcheck
# Exclude error wrapping suggestions for well-known errors
- text: "non-wrapping format verb for fmt.Errorf"
linters:
- errorlint
# Linters settings
settings:
# errcheck settings
errcheck:
check-type-assertions: true
check-blank: false
# govet settings
govet:
enable-all: true
disable:
- fieldalignment # Too many false positives
- shadow # Can be noisy
# goconst settings
goconst:
min-len: 3
min-occurrences: 3
# godot settings
godot:
scope: toplevel
exclude:
- "^fixme:"
- "^todo:"
capital: true
period: true
# godox settings
godox:
keywords:
- TODO
- FIXME
- HACK
- BUG
- XXX
# misspell settings
misspell:
locale: US
# funlen settings
funlen:
lines: 100
statements: 50
# gocognit settings
gocognit:
min-complexity: 20
# gocyclo settings
gocyclo:
min-complexity: 15
# gocritic settings
gocritic:
enabled-tags:
- diagnostic
- style
- performance
- experimental
disabled-checks:
- ifElseChain
- singleCaseSwitch
- commentedOutCode
settings:
hugeParam:
sizeThreshold: 512
rangeValCopy:
sizeThreshold: 512
# gosec settings
gosec:
severity: medium
confidence: medium
excludes:
- G104 # Handled by errcheck
- G304 # File path provided as taint input
# revive settings
revive:
severity: warning
rules:
- name: blank-imports
- name: context-as-argument
- name: context-keys-type
- name: dot-imports
- name: empty-block
- name: error-naming
- name: error-return
- name: error-strings
- name: errorf
- name: exported
- name: if-return
- name: increment-decrement
- name: indent-error-flow
- name: package-comments
- name: range
- name: receiver-naming
- name: time-naming
- name: unexported-return
- name: var-declaration
- name: var-naming
# errorlint settings
errorlint:
errorf: true
errorf-multi: true
asserts: true
comparison: true
# wrapcheck settings
wrapcheck:
ignore-sigs:
- .Errorf(
- errors.New(
- errors.Unwrap(
- errors.Join(
- .WithMessage(
- .WithMessagef(
- .WithStack(
ignore-package-globs:
- github.com/kjanat/articulate-parser/*
# exhaustive settings
exhaustive:
check:
- switch
- map
default-signifies-exhaustive: true
# nolintlint settings
nolintlint:
allow-unused: false
require-explanation: true
require-specific: true
# staticcheck settings (the ST* stylecheck checks are configured here)
staticcheck:
checks: ["all", "-ST1000", "-ST1003", "-ST1016", "-ST1020", "-ST1021", "-ST1022"]
# maintidx settings
maintidx:
under: 20
# testifylint settings
testifylint:
enable-all: true
disable:
- float-compare
# thelper settings
thelper:
test:
first: true
name: true
begin: true
benchmark:
first: true
name: true
begin: true
# Severity rules
severity:
default: warning
rules:
- linters:
- gosec
severity: error
- linters:
- errcheck
- staticcheck
severity: error
- linters:
- godox
severity: info

75
.pre-commit-config.yaml Normal file

@ -0,0 +1,75 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
# File quality
- id: trailing-whitespace
exclude: '^\.github/ISSUE_TEMPLATE/.*\.yml$'
- id: end-of-file-fixer
- id: mixed-line-ending
args: ['--fix=lf']
# File validation
- id: check-yaml
- id: check-json
- id: check-toml
# Security
- id: detect-private-key
# Git safety
- id: check-merge-conflict
- id: check-case-conflict
- id: no-commit-to-branch
args: ['--branch=master', '--branch=main']
# File structure
- id: check-added-large-files
- id: check-symlinks
- id: check-executables-have-shebangs
- repo: local
hooks:
- id: actionlint
name: Lint GitHub Actions workflow files
description: Runs actionlint to lint GitHub Actions workflow files
language: golang
types: ["yaml"]
files: ^\.github/workflows/
entry: actionlint
minimum_pre_commit_version: 3.0.0
- repo: https://github.com/golangci/golangci-lint
rev: v2.6.1
hooks:
- id: golangci-lint
name: golangci-lint
description: Fast linters runner for Go. Note that only modified files are linted, so linters like 'unused' that need to scan all files won't work as expected.
entry: golangci-lint run --new-from-rev HEAD --fix
types: [go]
language: golang
require_serial: true
pass_filenames: false
# - id: golangci-lint-full
# name: golangci-lint-full
# description: Fast linters runner for Go. Runs on all files in the module. Use this hook if you use pre-commit in CI.
# entry: golangci-lint run --fix
# types: [go]
# language: golang
# require_serial: true
# pass_filenames: false
- id: golangci-lint-fmt
name: golangci-lint-fmt
description: Fast linters runner for Go. Formats all files in the repo.
entry: golangci-lint fmt
types: [go]
language: golang
require_serial: true
pass_filenames: false
- id: golangci-lint-config-verify
name: golangci-lint-config-verify
description: Verifies the configuration file
entry: golangci-lint config verify
files: '\.golangci\.(?:yml|yaml|toml|json)'
language: golang
pass_filenames: false

427
AGENTS.md Normal file

@ -0,0 +1,427 @@
# Agent Guidelines for articulate-parser
## Build/Test Commands
- **Build**: `task build` or `go build -o bin/articulate-parser main.go`
- **Run tests**: `task test` or `go test -race -timeout 5m ./...`
- **Run single test**: `go test -v -race -run ^TestName$ ./path/to/package`
- **Test with coverage**:
- `task test:coverage` or
- `go test -race -coverprofile=coverage/coverage.out -covermode=atomic ./...`
- **Lint**: `task lint` (runs vet, fmt check, staticcheck, golangci-lint)
- **Format**: `task fmt` or `gofmt -s -w .`
- **CI checks**: `task ci` (deps, lint, test with coverage, build)
## Code Style Guidelines
### Imports
- Use `goimports` with local prefix: `github.com/kjanat/articulate-parser`
- Order: stdlib, external, internal packages
- Group related imports together
### Formatting
- Use `gofmt -s` (simplify) and `gofumpt` with extra rules
- Function length: max 100 lines, 50 statements
- Cyclomatic complexity: max 15
- Cognitive complexity: max 20
### Types & Naming
- Use interface-based design (see `internal/interfaces/`)
- Export types/functions with clear godoc comments ending with a period
- Use descriptive names: `ArticulateParser`, `MarkdownExporter`
- Receiver names: short (1-2 chars), consistent per type
### Error Handling
- Always wrap errors with context: `fmt.Errorf("operation failed: %w", err)`
- Use `%w` verb for error wrapping to preserve error chain
- Check all error returns (enforced by `errcheck`)
- Document error handling rationale in defer blocks when ignoring close errors
### Comments
- All exported types/functions require godoc comments
- End sentences with periods (`godot` linter enforced)
- Mark known issues with TODO/FIXME/HACK/BUG/XXX
### Security
- Use `#nosec` with justification for deliberate security exceptions (G304 for CLI file paths, G306 for export file permissions)
- Run `gosec` and `govulncheck` for security audits
### Testing
- Enable race detection: `-race` flag
- Use table-driven tests where applicable
- Mark test helpers with `t.Helper()`
- Benchmarks in `*_bench_test.go`, examples in `*_example_test.go`
### Dependencies
- Minimal external dependencies (currently: go-docx, golang.org/x/net, golang.org/x/text)
- Run `task deps:tidy` after adding/removing dependencies
---
## Go 1.24 & 1.25 New Features Reference
This project uses Go 1.24+. Below is a comprehensive summary of new features and changes in Go 1.24 and 1.25 that may be relevant for development and maintenance.
### Go 1.24 Major Changes (Released February 2025)
#### Language Features
**Generic Type Aliases**
- Type aliases can now be parameterized with type parameters
- Example: `type List[T any] = []T`
- Can be disabled via `GOEXPERIMENT=noaliastypeparams` (removed in 1.25)
#### Tooling Enhancements
**Module Tool Dependencies**
- New `tool` directive in go.mod tracks executable dependencies
- Use `go get -tool <package>` to add tool dependencies
- Use `go install tool` and `go get tool` to manage them
- Eliminates need for blank imports in `tools.go` files
**Build Output Formatting**
- Both `go build` and `go test` support `-json` flag for structured JSON output
- New action types distinguish build output from test results
**Authentication**
- New `GOAUTH` environment variable provides flexible authentication for private modules
**Automatic Version Tracking**
- `go build` automatically sets main module version in binaries based on VCS tags
- Adds `+dirty` suffix for uncommitted changes
**Cgo Performance Improvements**
- New `#cgo noescape` annotation: Prevents escape analysis overhead for C function calls
- New `#cgo nocallback` annotation: Indicates C function won't call back to Go
**Toolchain Tracing**
- `GODEBUG=toolchaintrace=1` enables tracing of toolchain selection
#### Runtime & Performance
**Performance Improvements**
- **2-3% CPU overhead reduction** across benchmark suites
- New Swiss Tables-based map implementation (faster lookups)
- Disable via `GOEXPERIMENT=noswissmap`
- More efficient small object allocation
- Redesigned runtime-internal mutexes
- Disable via `GOEXPERIMENT=nospinbitmutex`
#### Compiler & Linker
**Method Receiver Restrictions**
- Methods on cgo-generated types now prevented (both directly and through aliases)
**Build IDs**
- Linkers generate GNU build IDs (ELF) and UUIDs (macOS) by default
- Disable via `-B none` flag
#### Standard Library Additions
**File System Safety - `os.Root`**
- New `os.Root` type enables directory-limited operations
- Prevents path escape and symlink breakouts
- Essential for sandboxed file operations
**Cryptography Expansion**
- `crypto/mlkem`: ML-KEM-768/1024 post-quantum key exchange (FIPS 203)
- `crypto/hkdf`: HMAC-based Extract-and-Expand KDF (RFC 5869)
- `crypto/pbkdf2`: Password-based key derivation (RFC 8018)
- `crypto/sha3`: SHA-3 and SHAKE functions (FIPS 202)
**FIPS 140-3 Support**
- New `GOFIPS140` environment variable selects the Go Cryptographic Module version at build time
- New `fips140` GODEBUG setting enables FIPS 140-3 mode at runtime
**Weak References - `weak` Package**
- New `weak` package provides low-level weak pointers
- Enables memory-efficient structures like weak maps and caches
- Useful for preventing memory leaks in cache implementations
**Testing Improvements**
- `testing.B.Loop()`: Cleaner syntax replacing manual `b.N` iteration
- Prevents compiler from optimizing away benchmarked code
- New `testing/synctest` package (experimental) for testing concurrent code with fake clocks
**Iterator Support**
- Multiple packages now offer iterator-returning variants:
- `bytes`: Iterator-based functions
- `strings`: Iterator-based functions
- `go/types`: Iterator support
#### Security Enhancements
**TLS Post-Quantum Cryptography**
- `X25519MLKEM768` hybrid key exchange enabled by default in TLS
- Provides quantum-resistant security
**Encrypted Client Hello (ECH)**
- TLS servers can enable ECH via `Config.EncryptedClientHelloKeys`
- Protects client identity during TLS handshake
**RSA Key Validation**
- Keys smaller than 1024 bits now rejected by default
- Use `GODEBUG=rsa1024min=0` to revert (testing only)
**Constant-Time Execution**
- New `crypto/subtle.WithDataIndependentTiming()` enables architecture-specific timing guarantees
- Helps prevent timing attacks
#### Deprecations & Removals
- `runtime.GOROOT()`: Deprecated; locate the `go` binary on the system path and run `go env GOROOT` instead
- `crypto/cipher` OFB/CFB modes: Deprecated (unauthenticated encryption)
- `x509sha1` GODEBUG: Removed; SHA-1 certificates no longer verified
- Experimental `X25519Kyber768Draft00`: Removed
#### Platform Changes
- **Linux**: Now requires kernel 3.2+ (enforced)
- **macOS**: Go 1.24 is final release supporting Big Sur
- **Windows/ARM 32-bit**: Marked broken
- **WebAssembly**:
- New `go:wasmexport` directive
- Reactor/library builds supported via `-buildmode=c-shared`
#### Bootstrap Requirements
- Go 1.24 requires Go 1.22.6+ for bootstrapping
- Go 1.26 will require Go 1.24+
---
### Go 1.25 Major Changes (Released August 2025)
#### Language Changes
- No breaking language changes
- "Core types" concept removed from specification (replaced with clearer prose)
#### Tooling Improvements
**Go Command Enhancements**
- `go build -asan`: Now defaults to leak detection at program exit
- New `go.mod ignore` directive: Specify directories for go command to ignore
- `go doc -http`: Starts documentation server and opens in browser
- `go version -m -json`: Prints JSON-encoded BuildInfo structures
- Module path resolution now supports subdirectories using `<meta>` syntax
- New `work` package pattern matches all packages in work/workspace modules
- Removed automatic toolchain line additions when updating `go` version
**Vet Analyzers**
- **"waitgroup"**: Detects misplaced `sync.WaitGroup.Add` calls
- **"hostport"**: Warns against using `fmt.Sprintf` for constructing addresses
- Recommends `net.JoinHostPort` instead
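
The reason behind the "hostport" check can be sketched as follows: formatting `host:port` by hand produces an ambiguous address for IPv6 literals, while `net.JoinHostPort` adds the required brackets. `addr` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// addr builds a dialable address that is also valid for IPv6 hosts.
func addr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(addr("example.com", 443)) // example.com:443
	fmt.Println(addr("::1", 8080))        // [::1]:8080
}
```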
#### Runtime Enhancements
**Container-Aware GOMAXPROCS**
- Linux now respects cgroup CPU bandwidth limits
- All OSes periodically update GOMAXPROCS if CPU availability changes
- Disable via environment variables or GODEBUG settings
- Critical for containerized applications
**New Garbage Collector - "Green Tea GC"**
- Experimental `GOEXPERIMENT=greenteagc` enables new GC
- **10-40% reduction in garbage collection overhead**
- Significant for GC-sensitive applications
**Trace Flight Recorder**
- New `runtime/trace.FlightRecorder` API
- Captures execution traces in in-memory ring buffer
- Essential for debugging rare events and production issues
**Other Runtime Changes**
- Simplified unhandled panic output
- VMA names on Linux identify memory purpose (debugging aid)
- New `SetDefaultGOMAXPROCS` function resets GOMAXPROCS to defaults
#### Compiler Fixes & Improvements
**Critical Nil Pointer Bug Fix**
- Fixed Go 1.21 regression where nil pointer checks were incorrectly delayed
- ⚠️ **May cause previously passing code to now panic** (correct behavior)
- Review code for assumptions about delayed nil checks
**DWARF5 Support**
- Debug information now uses DWARF version 5
- Reduces binary size and linking time
- Better debugging experience
**Faster Slices**
- Expanded stack allocation for slice backing stores
- Improved slice performance
#### Linker
- New `-funcalign=N` option specifies function entry alignment
#### Standard Library Highlights
**New Packages**
1. **`testing/synctest`** (Promoted from Experimental)
- Concurrent code testing with virtualized time
- Control time progression in tests
- Essential for testing time-dependent concurrent code
2. **`encoding/json/v2`** (Experimental)
- **Substantially better decoding performance**
- Improved API design
- Backward compatible with v1
**Major Package Updates**
| Package | Key Changes |
|---------|------------|
| `crypto` | New `MessageSigner` interface and `SignMessage` function |
| `crypto/ecdsa` | New raw key parsing/serialization functions |
| `crypto/rsa` | **Key generation now 3x faster** |
| `crypto/sha1` | **Hashing 2x faster on amd64 with SHA-NI** |
| `crypto/tls` | New `CurveID` field; SHA-1 algorithms disallowed in TLS 1.2 |
| `net` | Windows now supports file-to-connection conversion; IPv6 multicast improvements |
| `net/http` | **New `CrossOriginProtection` middleware for CSRF defense** |
| `os` | Windows async I/O support; `Root` type expanded with 12 new methods |
| `sync` | **New `WaitGroup.Go` method for convenient goroutine creation** |
| `testing` | New `Attr`, `Output` methods; `AllocsPerRun` panics with parallel tests |
| `unique` | More eager and parallel reclamation of interned values |
#### Performance Notes
**Performance Improvements**
- ECDSA and Ed25519 signing **4x faster** in FIPS 140-3 mode
- SHA3 hashing **2x faster** on Apple M processors
- AMD64 fused multiply-add instructions in v3+ mode
- ⚠️ **Changes floating-point results** (within IEEE 754 spec)
**Performance Regressions**
- SHA-1, SHA-256, SHA-512 slower without AVX2
- Most servers post-2015 support AVX2
#### Platform Changes
- **macOS**: Requires version 12 Monterey or later
- **Windows**: 32-bit windows/arm port marked for removal in Go 1.26
- **Loong64**: Race detector now supported
- **RISC-V**:
- Plugin build mode support
- New `GORISCV64=rva23u64` environment variable value
#### Deprecations
- `go/ast` functions: `FilterPackage`, `PackageExports`, `MergePackageFiles`
- `go/parser.ParseDir` function
- Old `testing/synctest` API (when `GOEXPERIMENT=synctest` set)
---
### Actionable Recommendations for This Project
#### Immediate Opportunities
1. **Replace `sync.WaitGroup` patterns with `WaitGroup.Go()`** (Go 1.25)
```go
// Old pattern
wg.Add(1)
go func() {
	defer wg.Done()
	// work
}()

// New pattern (Go 1.25)
wg.Go(func() {
	// work
})
```
2. **Use `testing.B.Loop()` in benchmarks** (Go 1.24)
```go
// Old pattern
for i := 0; i < b.N; i++ {
	// benchmark code
}

// New pattern (Go 1.24)
for b.Loop() {
	// benchmark code
}
```
3. **Consider `os.Root` for file operations** (Go 1.24)
- Prevents path traversal vulnerabilities
- Safer for user-provided file paths
4. **Enable Green Tea GC for testing** (Go 1.25)
- Test with `GOEXPERIMENT=greenteagc`
- May reduce GC overhead by 10-40%
5. **Leverage container-aware GOMAXPROCS** (Go 1.25)
- No changes needed; automatic in containers
- Improves resource utilization
6. **Review floating-point operations** (Go 1.25)
- AMD64 v3+ uses FMA instructions
- May change floating-point results (within spec)
7. **Watch nil pointer checks** (Go 1.25)
- Compiler bug fix may expose latent nil pointer bugs
- Review crash reports carefully
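The `os.Root` recommendation in item 3 can be sketched as follows, assuming Go 1.24 or later. The helper names `readWithin` and `demo`, the directory layout, and the file names are hypothetical; this is not how the project currently reads files.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readWithin reads name relative to dir through an os.Root, which rejects
// paths that would resolve outside dir.
func readWithin(dir, name string) ([]byte, error) {
	root, err := os.OpenRoot(dir)
	if err != nil {
		return nil, err
	}
	defer root.Close()

	f, err := root.Open(name)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

// demo builds a throwaway directory tree and reports what was read from
// inside the root and whether an escape attempt was refused.
func demo() (content string, escapeRejected bool, err error) {
	base, err := os.MkdirTemp("", "osroot-demo")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(base)

	inside := filepath.Join(base, "courses")
	if err := os.Mkdir(inside, 0o700); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(filepath.Join(inside, "course.json"), []byte(`{"title":"demo"}`), 0o600); err != nil {
		return "", false, err
	}
	// A file that exists on disk but sits outside the root directory.
	if err := os.WriteFile(filepath.Join(base, "secret.txt"), []byte("secret"), 0o600); err != nil {
		return "", false, err
	}

	data, err := readWithin(inside, "course.json")
	if err != nil {
		return "", false, err
	}
	_, escapeErr := readWithin(inside, "../secret.txt")
	return string(data), escapeErr != nil, nil
}

func main() {
	content, rejected, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("read inside root:", content)
	fmt.Println("escape attempt rejected:", rejected)
}
```

The `../secret.txt` file exists, yet opening it through the root fails, which is the property that makes `os.Root` attractive for user-supplied paths.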
#### Future Considerations
1. **Evaluate `encoding/json/v2`** when stable
- Better performance for JSON operations
- Currently experimental in Go 1.25
2. **Adopt tool directives** in go.mod
- Cleaner dependency management for build tools
- Remove `tools.go` if present
3. **Enable FIPS mode if required**
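- Select the cryptographic module at build time with `GOFIPS140=latest` (or a pinned module version), or enable the mode at run time with `GODEBUG=fips140=on`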
- Performance improvements in Go 1.25 (4x faster signing)
4. **Use `runtime/trace.FlightRecorder`** for production debugging
- Capture traces of rare events
- Minimal overhead when not triggered

82
DOCKER.md Normal file

@ -0,0 +1,82 @@
# Articulate Parser - Docker
A command-line tool that parses Articulate Rise courses and exports them to Markdown, HTML, or DOCX, packaged as a lightweight Docker image.
## Quick Start
### Pull from GitHub Container Registry
```bash
docker pull ghcr.io/kjanat/articulate-parser:latest
```
### Run with Articulate Rise URL
```bash
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/ markdown /data/output.md
```
### Run with local files
```bash
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/input.json markdown /data/output.md
```
## Usage
### Basic File Processing
```bash
# Process from Articulate Rise URL
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/ markdown /data/output.md
# Process a local JSON file
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/document.json markdown /data/output.md
# Process with specific format and output
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/input.json docx /data/output.docx
```
### Display Help and Version
```bash
# Show help information
docker run --rm ghcr.io/kjanat/articulate-parser:latest --help
# Show version
docker run --rm ghcr.io/kjanat/articulate-parser:latest --version
```
## Available Tags
- `latest` - Latest stable release
- `v1.x.x` - Specific version tags
- `master` - Latest build from the `master` branch
## Image Details
- **Base Image**: `scratch` (minimal attack surface)
- **Architecture**: Multi-arch support (amd64, arm64)
- **Size**: < 10MB (optimized binary)
- **Security**: Runs as non-root user
- **Features**: SBOM and provenance attestation included
## Development
### Local Build
```bash
docker build -t articulate-parser .
```
### Docker Compose
```bash
docker-compose up --build
```
## Repository
- **Source**: [github.com/kjanat/articulate-parser](https://github.com/kjanat/articulate-parser)
- **Issues**: [Report bugs or request features](https://github.com/kjanat/articulate-parser/issues)
- **License**: See repository for license details

78
Dockerfile Normal file

@ -0,0 +1,78 @@
# Build stage
FROM golang:1.25-alpine AS builder
# Install build-stage tools: git (module fetching), ca-certificates (HTTPS), tzdata (time zone data), file (binary check below)
RUN apk add --no-cache git ca-certificates tzdata file
# Create a non-root user for the final stage
RUN adduser -D -u 1000 appuser
# Set the working directory
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build the application
# Disable CGO for a fully static binary
# Use linker flags to reduce binary size and embed version info
ARG VERSION=dev
ARG BUILD_TIME
ARG GIT_COMMIT
# Docker buildx automatically provides these for multi-platform builds
ARG BUILDPLATFORM
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT
# Debug: Show build information
RUN echo "Building for platform: $TARGETPLATFORM (OS: $TARGETOS, Arch: $TARGETARCH, Variant: $TARGETVARIANT)" \
&& echo "Build platform: $BUILDPLATFORM" \
&& echo "Go version: $(go version)"
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build \
-ldflags="-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${VERSION} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=${BUILD_TIME} -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${GIT_COMMIT}" \
-o articulate-parser \
./main.go
# Verify the binary architecture
RUN file /app/articulate-parser || echo "file command not available"
# Final stage - minimal runtime image
FROM scratch
# Copy CA certificates for HTTPS requests
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# Copy timezone data
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
# Add a minimal /etc/passwd file to support non-root user
COPY --from=builder /etc/passwd /etc/passwd
# Copy the binary
COPY --from=builder /app/articulate-parser /articulate-parser
# Switch to non-root user (appuser with UID 1000)
USER appuser
# Set the binary as entrypoint
ENTRYPOINT ["/articulate-parser"]
# Default command shows help
CMD ["--help"]
# Add labels for metadata
LABEL org.opencontainers.image.title="Articulate Parser"
LABEL org.opencontainers.image.description="A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats (Markdown, HTML, DOCX). Supports media extraction, content cleaning, and batch processing for educational content conversion."
LABEL org.opencontainers.image.vendor="kjanat"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.source="https://github.com/kjanat/articulate-parser"
LABEL org.opencontainers.image.documentation="https://github.com/kjanat/articulate-parser/blob/master/DOCKER.md"

78
Dockerfile.dev Normal file

@ -0,0 +1,78 @@
# Development Dockerfile with shell access
# Uses Alpine instead of scratch for debugging
# Build stage - same as production
FROM golang:1.25-alpine AS builder
# Install build-stage tools: git (module fetching), ca-certificates (HTTPS), tzdata (time zone data), file (binary check below)
RUN apk add --no-cache git ca-certificates tzdata file
# Create a non-root user
RUN adduser -D -u 1000 appuser
# Set the working directory
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build the application
# Disable CGO for a fully static binary
# Use linker flags to reduce binary size and embed version info
ARG VERSION=dev
ARG BUILD_TIME
ARG GIT_COMMIT
# Docker buildx automatically provides these for multi-platform builds
ARG BUILDPLATFORM
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT
# Debug: Show build information
RUN echo "Building for platform: $TARGETPLATFORM (OS: $TARGETOS, Arch: $TARGETARCH, Variant: $TARGETVARIANT)" \
&& echo "Build platform: $BUILDPLATFORM" \
&& echo "Go version: $(go version)"
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build \
-ldflags="-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${VERSION} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=${BUILD_TIME} -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${GIT_COMMIT}" \
-o articulate-parser \
./main.go
# Verify the binary architecture
RUN file /app/articulate-parser || echo "file command not available"
# Development stage - uses Alpine for shell access
FROM alpine:3
# Install minimal dependencies
RUN apk add --no-cache ca-certificates tzdata
# Copy the binary
COPY --from=builder /app/articulate-parser /articulate-parser
# Copy the non-root user configuration
COPY --from=builder /etc/passwd /etc/passwd
# Switch to non-root user
USER appuser
# Set the binary as entrypoint
ENTRYPOINT ["/articulate-parser"]
# Default command shows help
CMD ["--help"]
# Add labels for metadata
LABEL org.opencontainers.image.title="Articulate Parser (Dev)"
LABEL org.opencontainers.image.description="Development version of Articulate Parser with shell access"
LABEL org.opencontainers.image.vendor="kjanat"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.source="https://github.com/kjanat/articulate-parser"
LABEL org.opencontainers.image.documentation="https://github.com/kjanat/articulate-parser/blob/master/DOCKER.md"

164
README.md

@ -2,6 +2,18 @@
A Go-based parser that converts Articulate Rise e-learning content to various formats including Markdown, HTML, and Word documents.
[![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?logo=Go&logoColor=white)][gomod]
[![Go Doc](https://godoc.org/github.com/kjanat/articulate-parser?status.svg)][Package documentation]
[![Go Report Card](https://goreportcard.com/badge/github.com/kjanat/articulate-parser)][Go report]
[![Tag](https://img.shields.io/github/v/tag/kjanat/articulate-parser?sort=semver&label=Tag)][Tags] <!-- [![Release Date](https://img.shields.io/github/release-date/kjanat/articulate-parser?label=Release%20date)][Latest release] -->
[![License](https://img.shields.io/github/license/kjanat/articulate-parser?label=License)][MIT License] <!-- [![Commit activity](https://img.shields.io/github/commit-activity/m/kjanat/articulate-parser?label=Commit%20activity)][Commits] -->
[![Last commit](https://img.shields.io/github/last-commit/kjanat/articulate-parser?label=Last%20commit)][Commits]
[![GitHub Issues or Pull Requests](https://img.shields.io/github/issues/kjanat/articulate-parser?label=Issues)][Issues]
[![Docker Image](https://img.shields.io/badge/docker-ghcr.io-blue?logo=docker&logoColor=white)][Docker image] <!-- [![Docker Size](https://img.shields.io/docker/image-size/kjanat/articulate-parser?logo=docker&label=Image%20Size)][Docker image] -->
[![Docker](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/docker.yml?logo=docker&label=Docker)][Docker workflow]
[![CI](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/ci.yml?logo=github&label=CI)][Build]
[![Codecov](https://img.shields.io/codecov/c/gh/kjanat/articulate-parser?token=eHhaHY8nut&logo=codecov&logoColor=%23F01F7A&label=Codecov)][Codecov]
## System Architecture
```mermaid
@ -73,16 +85,6 @@ The system follows **Clean Architecture** principles with clear separation of co
- **📤 Export Layer**: Factory pattern for format-specific exporters
- **📊 Data Layer**: Domain models representing course structure
[![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?logo=Go&logoColor=white)][gomod]
[![Go Doc](https://godoc.org/github.com/kjanat/articulate-parser?status.svg)][Package documentation]
[![Go Report Card](https://goreportcard.com/badge/github.com/kjanat/articulate-parser)][Go report]
[![Tag](https://img.shields.io/github/v/tag/kjanat/articulate-parser?sort=semver&label=Tag)][Tags] <!-- [![Release Date](https://img.shields.io/github/release-date/kjanat/articulate-parser?label=Release%20date)][Latest release] -->
[![License](https://img.shields.io/github/license/kjanat/articulate-parser?label=License)][MIT License] <!-- [![Commit activity](https://img.shields.io/github/commit-activity/m/kjanat/articulate-parser?label=Commit%20activity)][Commits] -->
[![Last commit](https://img.shields.io/github/last-commit/kjanat/articulate-parser?label=Last%20commit)][Commits]
[![GitHub Issues or Pull Requests](https://img.shields.io/github/issues/kjanat/articulate-parser?label=Issues)][Issues]
[![CI](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/ci.yml?logo=github&label=CI)][Build]
[![Codecov](https://img.shields.io/codecov/c/gh/kjanat/articulate-parser?token=eHhaHY8nut&logo=codecov&logoColor=%23F01F7A&label=Codecov)][Codecov]
## Features
- Parse Articulate Rise JSON data from URLs or local files
@ -101,7 +103,7 @@ The system follows **Clean Architecture** principles with clear separation of co
### Prerequisites
- Go, I don't know the version, but I use go1.24.2 right now, and it works, see the [CI][Build] workflow where it is tested.
- Go, at least the version declared in [go.mod][gomod] (currently [![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?label=)][gomod]); the [CI][Build] workflow tests against it.
### Install from source
@ -200,6 +202,130 @@ Then run:
./articulate-parser input.json md output.md
```
## Docker
The application is available as a Docker image from GitHub Container Registry.
### 🐳 Docker Image Information
- **Registry**: `ghcr.io/kjanat/articulate-parser`
- **Platforms**: linux/amd64, linux/arm64
- **Base Image**: Scratch (minimal footprint)
- **Size**: ~15-20MB compressed
### Quick Start
```bash
# Pull the latest image
docker pull ghcr.io/kjanat/articulate-parser:latest
# Show help
docker run --rm ghcr.io/kjanat/articulate-parser:latest --help
```
### Available Tags
| Tag | Description | Use Case |
|-----|-------------|----------|
| `latest` | Latest stable release from master branch | Production use |
| `edge` | Latest development build from master branch | Testing new features |
| `v1.x.x` | Specific version releases | Production pinning |
| `develop` | Development branch builds | Development/testing |
| `feature/docker-ghcr` | Feature branch builds | Feature testing |
| `master` | Latest master branch build | Continuous integration |
### Usage Examples
#### Process a local file
```bash
# Mount current directory and process a local JSON file
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/input.json markdown /workspace/output.md
```
#### Process from URL
```bash
# Mount output directory and process from Articulate Rise URL
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
"https://rise.articulate.com/share/xyz" docx /workspace/output.docx
```
#### Export to different formats
```bash
# Export to HTML
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json html /workspace/course.html
# Export to Word Document
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json docx /workspace/course.docx
# Export to Markdown
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json md /workspace/course.md
```
#### Batch Processing
```bash
# Process multiple files in a directory.
# The image has no shell (scratch base), so loop on the host and run one container per file.
for file in *.json; do
  docker run --rm -v "$(pwd)":/workspace \
    ghcr.io/kjanat/articulate-parser:latest \
    "/workspace/$file" md "/workspace/${file%.json}.md"
done
```
### Docker Compose
For local development, you can use the provided `docker-compose.yml`:
```bash
# Build and run with default help command
docker-compose up articulate-parser
# Process files using mounted volumes
docker-compose up parser-with-files
```
### Building Locally
```bash
# Build the Docker image locally
docker build -t articulate-parser:local .
# Run the local image
docker run --rm articulate-parser:local --help
# Build with specific version
docker build --build-arg VERSION=local --build-arg BUILD_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) -t articulate-parser:local .
```
### Build Arguments
The Dockerfile accepts the following build-time arguments (set with `--build-arg`):
| Argument | Description | Default |
|----------|-------------|---------|
| `VERSION` | Version string embedded in the binary | `dev` |
| `BUILD_TIME` | Build timestamp | empty unless passed |
| `GIT_COMMIT` | Git commit hash | empty unless passed |
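These arguments reach the binary through `-X` linker flags, which overwrite package-level string variables at link time. A minimal sketch of what such variables could look like follows. The default values and the `versionString` helper are assumptions for illustration; the project's real variables live in `internal/version`, which is not shown here.

```go
package main

import "fmt"

// Placeholders that the linker can overwrite with -X. The names mirror the
// flags used in the Dockerfile; the defaults here are assumed, not the
// project's actual values.
var (
	Version   = "dev"
	BuildTime = "unknown"
	GitCommit = "unknown"
)

// versionString is a hypothetical helper that formats the embedded metadata.
func versionString() string {
	return fmt.Sprintf("articulate-parser %s (commit %s, built %s)", Version, GitCommit, BuildTime)
}

func main() {
	fmt.Println(versionString())
}
```

For this single-file sketch the variables sit in package `main`, so a build such as `go build -ldflags "-X main.Version=1.2.3" .` would replace the default. Note that `-X` only applies to string variables.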
### Docker Security
- **Non-root execution**: The application runs as a non-privileged user
- **Minimal attack surface**: Built from scratch base image
- **No shell access**: Only the application binary is available
- **Read-only filesystem (opt-in)**: The image does not enforce this by itself; add `--read-only` to `docker run` to make the container filesystem read-only except for mounted volumes
## Development
### Code Quality
@ -291,7 +417,7 @@ The parser includes error handling for:
<!-- ## Code coverage
![Sunburst](https://codecov.io/gh/kjanat/articulate-parser/graphs/tree.svg?token=eHhaHY8nut)
![Sunburst](https://codecov.io/gh/kjanat/articulate-parser/graphs/sunburst.svg?token=eHhaHY8nut)
![Grid](https://codecov.io/gh/kjanat/articulate-parser/graphs/tree.svg?token=eHhaHY8nut)
@ -315,12 +441,12 @@ The parser includes error handling for:
Potential improvements could include:
- PDF export support
- Media file downloading
- ~~HTML export with preserved styling~~**Completed**
- SCORM package support
- Batch processing capabilities
- Custom template support
- [ ] PDF export support
- [ ] Media file downloading
- [x] ~~HTML export with preserved styling~~
- [ ] SCORM package support
- [ ] Batch processing capabilities
- [ ] Custom template support
## License
@ -329,6 +455,8 @@ This is a utility tool for educational content conversion. Please ensure you hav
[Build]: https://github.com/kjanat/articulate-parser/actions/workflows/ci.yml
[Codecov]: https://codecov.io/gh/kjanat/articulate-parser
[Commits]: https://github.com/kjanat/articulate-parser/commits/master/
[Docker workflow]: https://github.com/kjanat/articulate-parser/actions/workflows/docker.yml
[Docker image]: https://github.com/kjanat/articulate-parser/pkgs/container/articulate-parser
[Go report]: https://goreportcard.com/report/github.com/kjanat/articulate-parser
[gomod]: go.mod
[Issues]: https://github.com/kjanat/articulate-parser/issues

602
Taskfile.yml Normal file

@ -0,0 +1,602 @@
# yaml-language-server: $schema=https://taskfile.dev/schema.json
# Articulate Parser - Task Automation
# https://taskfile.dev
version: '3'
# Global output settings
output: prefixed
# Shell settings for Task's built-in sh interpreter (mvdan/sh)
# Note: Task runs commands through this interpreter on every platform, including Windows
set: [errexit, pipefail]
shopt: [globstar]
# Watch mode interval
interval: 500ms
# Global variables
vars:
APP_NAME: articulate-parser
MAIN_FILE: main.go
OUTPUT_DIR: bin
COVERAGE_DIR: coverage
TEST_TIMEOUT: 5m
# Version info
VERSION:
sh: git describe --tags --always --dirty 2>/dev/null || echo "dev"
GIT_COMMIT:
sh: git rev-parse --short HEAD 2>/dev/null || echo "unknown"
BUILD_TIME: '{{now | date "2006-01-02T15:04:05Z07:00"}}'
# Go settings
CGO_ENABLED: 0
GO_FLAGS: -v
LDFLAGS: >-
-s -w
-X github.com/kjanat/articulate-parser/internal/version.Version={{.VERSION}}
-X github.com/kjanat/articulate-parser/internal/version.BuildTime={{.BUILD_TIME}}
-X github.com/kjanat/articulate-parser/internal/version.GitCommit={{.GIT_COMMIT}}
# Platform detection (GOOS/GOARCH from `go env`; EXE_EXT from Task's built-in OS function)
GOOS:
sh: go env GOOS
GOARCH:
sh: go env GOARCH
EXE_EXT: '{{if eq OS "windows"}}.exe{{end}}'
# Environment variables
env:
CGO_ENABLED: '{{.CGO_ENABLED}}'
GO111MODULE: on
# Load .env files if present
dotenv: ['.env', '.env.local']
# Task definitions
tasks:
# Default task - show help
default:
desc: Show available tasks
cmds:
- task --list
silent: true
# Development tasks
dev:
desc: Run the application in development mode (with hot reload)
aliases: [run, start]
interactive: true
watch: true
sources:
- '**/*.go'
- go.mod
- go.sum
cmds:
- task: build
- '{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} --help'
# Build tasks
build:
desc: Build the application binary
aliases: [b]
deps: [clean-bin]
sources:
- '**/*.go'
- go.mod
- go.sum
generates:
- '{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}}'
cmds:
- task: mkdir
vars: { DIR: '{{.OUTPUT_DIR}}' }
- go build {{.GO_FLAGS}} -ldflags="{{.LDFLAGS}}" -o {{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} {{.MAIN_FILE}}
method: checksum
build:all:
desc: Build binaries for all major platforms
aliases: [build-all, cross-compile]
deps: [clean-bin]
cmds:
- task: mkdir
vars: { DIR: '{{.OUTPUT_DIR}}' }
- for:
matrix:
GOOS: [linux, darwin, windows]
GOARCH: [amd64, arm64]
task: build:platform
vars:
TARGET_GOOS: '{{.ITEM.GOOS}}'
TARGET_GOARCH: '{{.ITEM.GOARCH}}'
- echo "Built binaries for all platforms in {{.OUTPUT_DIR}}/"
build:platform:
internal: true
vars:
TARGET_EXT: '{{if eq .TARGET_GOOS "windows"}}.exe{{end}}'
OUTPUT_FILE: '{{.OUTPUT_DIR}}/{{.APP_NAME}}-{{.TARGET_GOOS}}-{{.TARGET_GOARCH}}{{.TARGET_EXT}}'
env:
GOOS: '{{.TARGET_GOOS}}'
GOARCH: '{{.TARGET_GOARCH}}'
cmds:
- echo "Building {{.OUTPUT_FILE}}..."
- go build {{.GO_FLAGS}} -ldflags="{{.LDFLAGS}}" -o "{{.OUTPUT_FILE}}" {{.MAIN_FILE}}
# Install task
install:
desc: Install the binary to $GOPATH/bin
deps: [test]
cmds:
- go install -ldflags="{{.LDFLAGS}}" .
- echo "Installed {{.APP_NAME}} to $(go env GOPATH)/bin"
# Testing tasks
test:
desc: Run all tests
aliases: [t]
cmds:
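# The race detector needs cgo on most platforms, so override the global CGO_ENABLED=0 here
- CGO_ENABLED=1 go test {{.GO_FLAGS}} -race -timeout {{.TEST_TIMEOUT}} ./...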
test:coverage:
desc: Run tests with coverage report
aliases: [cover, cov]
deps: [clean-coverage]
cmds:
- task: mkdir
vars: { DIR: '{{.COVERAGE_DIR}}' }
- CGO_ENABLED=1 go test {{.GO_FLAGS}} -race -coverprofile={{.COVERAGE_DIR}}/coverage.out -covermode=atomic -timeout {{.TEST_TIMEOUT}} ./...
- go tool cover -html={{.COVERAGE_DIR}}/coverage.out -o {{.COVERAGE_DIR}}/coverage.html
- go tool cover -func={{.COVERAGE_DIR}}/coverage.out
- echo "Coverage report generated at {{.COVERAGE_DIR}}/coverage.html"
test:verbose:
desc: Run tests with verbose output
aliases: [tv]
cmds:
- CGO_ENABLED=1 go test -v -race -timeout {{.TEST_TIMEOUT}} ./...
test:watch:
desc: Run tests in watch mode
aliases: [tw]
watch: true
sources:
- '**/*.go'
cmds:
- task: test
test:bench:
desc: Run benchmark tests
aliases: [bench]
cmds:
- go test -bench=. -benchmem -timeout {{.TEST_TIMEOUT}} ./...
test:integration:
desc: Run integration tests
status:
- '{{if eq OS "windows"}}if not exist "main_test.go" exit 1{{else}}test ! -f "main_test.go"{{end}}'
cmds:
- CGO_ENABLED=1 go test -v -race -tags=integration -timeout {{.TEST_TIMEOUT}} ./...
# Code quality tasks
lint:
desc: Run all linters
silent: true
aliases: [l]
cmds:
- task: lint:vet
- task: lint:fmt
- task: lint:staticcheck
- task: lint:golangci
lint:vet:
desc: Run go vet
silent: true
cmds:
- go vet ./...
lint:fmt:
desc: Check code formatting
silent: true
vars:
UNFORMATTED:
sh: gofmt -s -l .
cmds:
- |
{{if ne .UNFORMATTED ""}}
echo "❌ The following files need formatting:"
echo "{{.UNFORMATTED}}"
exit 1
{{else}}
echo "All files are properly formatted"
{{end}}
lint:staticcheck:
desc: Run staticcheck (install if needed)
silent: true
vars:
HAS_STATICCHECK:
sh: '{{if eq OS "windows"}}where.exe staticcheck 2>NUL{{else}}command -v staticcheck 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_STATICCHECK ""}}echo "Installing staticcheck..." && go install honnef.co/go/tools/cmd/staticcheck@latest{{end}}'
- staticcheck ./...
ignore_error: true
lint:golangci:
desc: Run golangci-lint (install if needed)
silent: true
aliases: [golangci, golangci-lint]
vars:
HAS_GOLANGCI:
sh: '{{if eq OS "windows"}}where.exe golangci-lint 2>NUL{{else}}command -v golangci-lint 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOLANGCI ""}}echo "Installing golangci-lint..." && go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@latest{{end}}'
- golangci-lint run ./...
- echo "✅ golangci-lint passed"
lint:golangci:fix:
desc: Run golangci-lint with auto-fix
silent: true
aliases: [golangci-fix]
vars:
HAS_GOLANGCI:
sh: '{{if eq OS "windows"}}where.exe golangci-lint 2>NUL{{else}}command -v golangci-lint 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOLANGCI ""}}echo "Installing golangci-lint..." && go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@latest{{end}}'
- golangci-lint run --fix ./...
- echo "golangci-lint fixes applied"
fmt:
desc: Format all Go files
silent: true
aliases: [format]
cmds:
- gofmt -s -w .
- echo "Formatted all Go files"
modernize:
desc: Modernize Go code to use modern idioms
silent: true
aliases: [modern]
cmds:
- go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...
- echo "Code modernized"
# Dependency management
deps:
desc: Download and verify dependencies
aliases: [mod]
cmds:
- go mod download
- go mod verify
- echo "Dependencies downloaded and verified"
deps:tidy:
desc: Tidy go.mod and go.sum
aliases: [tidy]
cmds:
- go mod tidy
- echo "Dependencies tidied"
deps:update:
desc: Update all dependencies to latest versions
aliases: [update]
cmds:
- go get -u ./...
- go mod tidy
- echo "Dependencies updated"
deps:graph:
desc: Display dependency graph
cmds:
- go mod graph
# Docker tasks
docker:build:
desc: Build Docker image
aliases: [db]
cmds:
- |
docker build \
--build-arg VERSION={{.VERSION}} \
--build-arg BUILD_TIME={{.BUILD_TIME}} \
--build-arg GIT_COMMIT={{.GIT_COMMIT}} \
-t {{.APP_NAME}}:{{.VERSION}} \
-t {{.APP_NAME}}:latest \
.
- >
echo "Docker image built: {{.APP_NAME}}:{{.VERSION}}"
docker:build:dev:
desc: Build development Docker image
cmds:
- docker build -f Dockerfile.dev -t {{.APP_NAME}}:dev .
- >
echo "Development Docker image built: {{.APP_NAME}}:dev"
docker:run:
desc: Run Docker container
aliases: [dr]
deps: ['docker:build']
cmds:
- docker run --rm {{.APP_NAME}}:{{.VERSION}} --help
docker:test:
desc: Test Docker image
deps: ['docker:build']
cmds:
- docker run --rm {{.APP_NAME}}:{{.VERSION}} --version
- echo "Docker image tested successfully"
docker:compose:up:
desc: Start services with docker-compose
cmds:
- docker-compose up -d
docker:compose:down:
desc: Stop services with docker-compose
cmds:
- docker-compose down
# Cleanup tasks
clean:
desc: Clean all generated files
aliases: [c]
cmds:
- task: clean-bin
- task: clean-coverage
- task: clean-cache
- echo "All generated files cleaned"
clean-bin:
desc: Remove built binaries
internal: true
cmds:
- task: rmdir
vars: { DIR: '{{.OUTPUT_DIR}}' }
clean-coverage:
desc: Remove coverage files
internal: true
cmds:
- task: rmdir
vars: { DIR: '{{.COVERAGE_DIR}}' }
clean-cache:
desc: Clean Go build, test, and module caches
cmds:
- go clean -cache -testcache -modcache
- echo "Go caches cleaned"
# CI/CD tasks
ci:
desc: Run all CI checks (test, lint, build)
cmds:
- task: deps
- task: lint
- task: test:coverage
- task: build
- echo "All CI checks passed"
ci:local:
desc: Run CI checks locally with detailed output
cmds:
- echo "🔍 Running local CI checks..."
- echo ""
- echo "📦 Checking dependencies..."
- task: deps
- echo ""
- echo "🔧 Running linters..."
- task: lint
- echo ""
- echo "🧪 Running tests with coverage..."
- task: test:coverage
- echo ""
- echo "🏗️ Building application..."
- task: build:all
- echo ""
- echo "All CI checks completed successfully!"
# Release tasks
release:check:
desc: Check if ready for release
cmds:
- task: ci
- git diff --exit-code
- git diff --cached --exit-code
- echo "Ready for release"
release:tag:
desc: Tag a new release (requires VERSION env var)
requires:
vars: [VERSION]
preconditions:
- sh: 'git diff --exit-code'
msg: 'Working directory is not clean'
- sh: 'git diff --cached --exit-code'
msg: 'Staging area is not clean'
cmds:
- git tag -a v{{.VERSION}} -m "Release v{{.VERSION}}"
- echo "Tagged v{{.VERSION}}"
- >
echo "Push with: git push origin v{{.VERSION}}"
# Documentation tasks
docs:serve:
desc: Serve documentation locally
vars:
HAS_GODOC:
sh: '{{if eq OS "windows"}}where.exe godoc 2>NUL{{else}}command -v godoc 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GODOC ""}}echo "Installing godoc..." && go install golang.org/x/tools/cmd/godoc@latest{{end}}'
- echo "📚 Serving documentation at http://localhost:6060"
- godoc -http=:6060
interactive: true
docs:coverage:
desc: Open coverage report in browser
deps: ['test:coverage']
cmds:
- '{{if eq OS "darwin"}}open {{.COVERAGE_DIR}}/coverage.html{{else if eq OS "windows"}}start {{.COVERAGE_DIR}}/coverage.html{{else}}xdg-open {{.COVERAGE_DIR}}/coverage.html 2>/dev/null || echo "Please open {{.COVERAGE_DIR}}/coverage.html in your browser"{{end}}'
# Info tasks
info:
desc: Display build information
vars:
GO_VERSION:
sh: go version
cmds:
- task: info:print
silent: true
info:print:
internal: true
silent: true
vars:
GO_VERSION:
sh: go version
cmds:
- echo "Application Info:"
- echo " Name{{":"}} {{.APP_NAME}}"
- echo " Version{{":"}} {{.VERSION}}"
- echo " Git Commit{{":"}} {{.GIT_COMMIT}}"
- echo " Build Time{{":"}} {{.BUILD_TIME}}"
- echo ""
- echo "Go Environment{{":"}}"
- echo " Go Version{{":"}} {{.GO_VERSION}}"
- echo " GOOS{{":"}} {{.GOOS}}"
- echo " GOARCH{{":"}} {{.GOARCH}}"
- echo " CGO{{":"}} {{.CGO_ENABLED}}"
- echo ""
- echo "Paths{{":"}}"
- echo " Output Dir{{":"}} {{.OUTPUT_DIR}}"
- echo " Coverage{{":"}} {{.COVERAGE_DIR}}"
# Security tasks
security:check:
desc: Run security checks with gosec
vars:
HAS_GOSEC:
sh: '{{if eq OS "windows"}}where.exe gosec 2>NUL{{else}}command -v gosec 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOSEC ""}}echo "Installing gosec..." && go install github.com/securego/gosec/v2/cmd/gosec@latest{{end}}'
- gosec ./...
ignore_error: true
security:audit:
desc: Audit dependencies for vulnerabilities
vars:
HAS_GOVULNCHECK:
sh: '{{if eq OS "windows"}}where.exe govulncheck 2>NUL{{else}}command -v govulncheck 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOVULNCHECK ""}}echo "Installing govulncheck..." && go install golang.org/x/vuln/cmd/govulncheck@latest{{end}}'
- govulncheck ./...
# Example/Demo tasks
demo:markdown:
desc: Demo - Convert sample to Markdown
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- '{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json md output-demo.md'
- echo "Demo Markdown created{{":"}} output-demo.md"
- defer:
task: rmfile
vars: { FILE: 'output-demo.md' }
demo:html:
desc: Demo - Convert sample to HTML
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- '{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json html output-demo.html'
- echo "Demo HTML created{{":"}} output-demo.html"
- defer:
task: rmfile
vars: { FILE: 'output-demo.html' }
demo:docx:
desc: Demo - Convert sample to DOCX
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- '{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json docx output-demo.docx'
- echo "Demo DOCX created{{":"}} output-demo.docx"
- defer:
task: rmfile
vars: { FILE: 'output-demo.docx' }
# Performance profiling
profile:cpu:
desc: Run CPU profiling
cmds:
- go test -cpuprofile=cpu.prof -bench=. ./...
- go tool pprof -http=:8080 cpu.prof
- defer:
task: rmfile
vars: { FILE: 'cpu.prof' }
profile:mem:
desc: Run memory profiling
cmds:
- go test -memprofile=mem.prof -bench=. ./...
- go tool pprof -http=:8080 mem.prof
- defer:
task: rmfile
vars: { FILE: 'mem.prof' }
# Git hooks
hooks:install:
desc: Install git hooks
cmds:
- task: mkdir
vars: { DIR: '.git/hooks' }
- '{{if eq OS "windows"}}echo "#!/bin/sh" > .git/hooks/pre-commit && echo "task lint:fmt" >> .git/hooks/pre-commit{{else}}cat > .git/hooks/pre-commit << ''EOF''{{printf "\n"}}#!/bin/sh{{printf "\n"}}task lint:fmt{{printf "\n"}}EOF{{printf "\n"}}chmod +x .git/hooks/pre-commit{{end}}'
- echo "Git hooks installed"
# Quick shortcuts
qa:
desc: Quick quality assurance (fmt + lint + test)
aliases: [q, quick]
cmds:
- task: fmt
- task: lint
- task: test
- echo "Quick QA passed"
all:
desc: Build everything (clean + deps + test + build:all + docker:build)
cmds:
- task: clean
- task: deps:tidy
- task: test:coverage
- task: build:all
- task: docker:build
- echo "Full build completed!"
# Cross-platform helper tasks
mkdir:
internal: true
requires:
vars: [DIR]
cmds:
- '{{if eq OS "windows"}}powershell -Command "New-Item -ItemType Directory -Force -Path ''{{.DIR}}'' | Out-Null"{{else}}mkdir -p "{{.DIR}}"{{end}}'
silent: true
rmdir:
internal: true
requires:
vars: [DIR]
cmds:
- '{{if eq OS "windows"}}powershell -Command "if (Test-Path ''{{.DIR}}'') { Remove-Item -Recurse -Force ''{{.DIR}}'' }"{{else}}rm -rf "{{.DIR}}" 2>/dev/null || true{{end}}'
silent: true
rmfile:
internal: true
requires:
vars: [FILE]
cmds:
- '{{if eq OS "windows"}}powershell -Command "if (Test-Path ''{{.FILE}}'') { Remove-Item -Force ''{{.FILE}}'' }"{{else}}rm -f "{{.FILE}}"{{end}}'
silent: true

docker-compose.yml (new file, 39 lines changed)

@ -0,0 +1,39 @@
services:
articulate-parser: &articulate-parser
build:
context: .
dockerfile: Dockerfile
args:
VERSION: "dev"
BUILD_TIME: "2024-01-01T00:00:00Z"
GIT_COMMIT: "dev"
image: articulate-parser:local
volumes:
# Mount current directory to /workspace for file access
- .:/workspace
working_dir: /workspace
# Override entrypoint for interactive use
entrypoint: ["/articulate-parser"]
# Default to showing help
command: ["--help"]
# Service for processing files with volume mounts
parser-with-files:
<<: *articulate-parser
volumes:
- ./input:/input:ro
- ./output:/output
command: ["/input/sample.json", "markdown", "/output/result.md"]
# Service for development - with shell access
parser-dev:
build:
context: .
dockerfile: Dockerfile.dev
image: articulate-parser:dev
volumes:
- .:/workspace
working_dir: /workspace
entrypoint: ["/bin/sh"]
command: ["-c", "while true; do sleep 30; done"]
# Uses Dockerfile.dev with Alpine base instead of scratch for shell access

go.mod (10 lines changed)

@ -1,10 +1,14 @@
module github.com/kjanat/articulate-parser
go 1.23.0
go 1.24.0
require github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b
require (
github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b
golang.org/x/net v0.46.0
golang.org/x/text v0.30.0
)
require (
github.com/fumiama/imgsz v0.0.4 // indirect
golang.org/x/image v0.27.0 // indirect
golang.org/x/image v0.32.0 // indirect
)

go.sum (8 lines changed)

@ -2,5 +2,9 @@ github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b h1:/mxSugRc4SgN7Xg
github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b/go.mod h1:ssRF0IaB1hCcKIObp3FkZOsjTcAHpgii70JelNb4H8M=
github.com/fumiama/imgsz v0.0.4 h1:Lsasu2hdSSFS+vnD+nvR1UkiRMK7hcpyYCC0FzgSMFI=
github.com/fumiama/imgsz v0.0.4/go.mod h1:bISOQVTlw9sRytPwe8ir7tAaEmyz9hSNj9n8mXMBG0E=
golang.org/x/image v0.27.0 h1:C8gA4oWU/tKkdCfYT6T2u4faJu3MeNS5O8UPWlPF61w=
golang.org/x/image v0.27.0/go.mod h1:xbdrClrAUway1MUTEZDq9mz/UpRwYAkFFNUslZtcB+g=
golang.org/x/image v0.32.0 h1:6lZQWq75h7L5IWNk0r+SCpUJ6tUVd3v4ZHnbRKLkUDQ=
golang.org/x/image v0.32.0/go.mod h1:/R37rrQmKXtO6tYXAjtDLwQgFLHmhW+V6ayXlxzP2Pc=
golang.org/x/net v0.46.0 h1:giFlY12I07fugqwPuWJi68oOnpfqFnJIJzaIIm2JVV4=
golang.org/x/net v0.46.0/go.mod h1:Q9BGdFy1y4nkUwiLvT5qtyhAnEHgnQ/zd8PfU6nc210=
golang.org/x/text v0.30.0 h1:yznKA/E9zq54KzlzBEAWn1NXSQ8DIp/NYMy88xJjl4k=
golang.org/x/text v0.30.0/go.mod h1:yDdHFIX9t+tORqspjENWgzaCVXgk0yYnYuSZ8UzzBVM=

internal/config/config.go (new file, 77 lines changed)

@ -0,0 +1,77 @@
// Package config provides configuration management for the articulate-parser application.
// It supports loading configuration from environment variables and command-line flags.
package config
import (
"log/slog"
"os"
"strconv"
"time"
)
// Config holds all configuration values for the application.
type Config struct {
// Parser configuration
BaseURL string
RequestTimeout time.Duration
// Logging configuration
LogLevel slog.Level
LogFormat string // "json" or "text"
}
// Default configuration values.
const (
DefaultBaseURL = "https://rise.articulate.com"
DefaultRequestTimeout = 30 * time.Second
DefaultLogLevel = slog.LevelInfo
DefaultLogFormat = "text"
)
// Load creates a new Config with values from environment variables.
// Falls back to defaults if environment variables are not set.
func Load() *Config {
return &Config{
BaseURL: getEnv("ARTICULATE_BASE_URL", DefaultBaseURL),
RequestTimeout: getDurationEnv("ARTICULATE_REQUEST_TIMEOUT", DefaultRequestTimeout),
LogLevel: getLogLevelEnv("LOG_LEVEL", DefaultLogLevel),
LogFormat: getEnv("LOG_FORMAT", DefaultLogFormat),
}
}
// getEnv retrieves an environment variable or returns the default value.
func getEnv(key, defaultValue string) string {
if value := os.Getenv(key); value != "" {
return value
}
return defaultValue
}
// getDurationEnv retrieves a duration from environment variable or returns default.
// The environment variable should be in seconds (e.g., "30" for 30 seconds).
func getDurationEnv(key string, defaultValue time.Duration) time.Duration {
if value := os.Getenv(key); value != "" {
if seconds, err := strconv.Atoi(value); err == nil {
return time.Duration(seconds) * time.Second
}
}
return defaultValue
}
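// getLogLevelEnv retrieves a log level from environment variable or returns default.
// Accepts "debug", "info", "warn" (or "warning"), and "error", in all-lowercase or
// all-uppercase form only; any other value, including mixed case, returns the default.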
func getLogLevelEnv(key string, defaultValue slog.Level) slog.Level {
value := os.Getenv(key)
switch value {
case "debug", "DEBUG":
return slog.LevelDebug
case "info", "INFO":
return slog.LevelInfo
case "warn", "WARN", "warning", "WARNING":
return slog.LevelWarn
case "error", "ERROR":
return slog.LevelError
default:
return defaultValue
}
}


@ -0,0 +1,116 @@
package config
import (
"log/slog"
"os"
"testing"
"time"
)
func TestLoad(t *testing.T) {
// Clear environment
os.Clearenv()
cfg := Load()
if cfg.BaseURL != DefaultBaseURL {
t.Errorf("Expected BaseURL '%s', got '%s'", DefaultBaseURL, cfg.BaseURL)
}
if cfg.RequestTimeout != DefaultRequestTimeout {
t.Errorf("Expected timeout %v, got %v", DefaultRequestTimeout, cfg.RequestTimeout)
}
if cfg.LogLevel != DefaultLogLevel {
t.Errorf("Expected log level %v, got %v", DefaultLogLevel, cfg.LogLevel)
}
if cfg.LogFormat != DefaultLogFormat {
t.Errorf("Expected log format '%s', got '%s'", DefaultLogFormat, cfg.LogFormat)
}
}
func TestLoad_WithEnvironmentVariables(t *testing.T) {
// Set environment variables
t.Setenv("ARTICULATE_BASE_URL", "https://test.example.com")
t.Setenv("ARTICULATE_REQUEST_TIMEOUT", "60")
t.Setenv("LOG_LEVEL", "debug")
t.Setenv("LOG_FORMAT", "json")
cfg := Load()
if cfg.BaseURL != "https://test.example.com" {
t.Errorf("Expected BaseURL 'https://test.example.com', got '%s'", cfg.BaseURL)
}
if cfg.RequestTimeout != 60*time.Second {
t.Errorf("Expected timeout 60s, got %v", cfg.RequestTimeout)
}
if cfg.LogLevel != slog.LevelDebug {
t.Errorf("Expected log level Debug, got %v", cfg.LogLevel)
}
if cfg.LogFormat != "json" {
t.Errorf("Expected log format 'json', got '%s'", cfg.LogFormat)
}
}
func TestGetLogLevelEnv(t *testing.T) {
tests := []struct {
name string
value string
expected slog.Level
}{
{"debug lowercase", "debug", slog.LevelDebug},
{"debug uppercase", "DEBUG", slog.LevelDebug},
{"info lowercase", "info", slog.LevelInfo},
{"info uppercase", "INFO", slog.LevelInfo},
{"warn lowercase", "warn", slog.LevelWarn},
{"warn uppercase", "WARN", slog.LevelWarn},
{"warning lowercase", "warning", slog.LevelWarn},
{"error lowercase", "error", slog.LevelError},
{"error uppercase", "ERROR", slog.LevelError},
{"invalid value", "invalid", slog.LevelInfo},
{"empty value", "", slog.LevelInfo},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
os.Clearenv()
if tt.value != "" {
t.Setenv("TEST_LOG_LEVEL", tt.value)
}
result := getLogLevelEnv("TEST_LOG_LEVEL", slog.LevelInfo)
if result != tt.expected {
t.Errorf("Expected %v, got %v", tt.expected, result)
}
})
}
}
func TestGetDurationEnv(t *testing.T) {
tests := []struct {
name string
value string
expected time.Duration
}{
{"valid duration", "45", 45 * time.Second},
{"zero duration", "0", 0},
{"invalid duration", "invalid", 30 * time.Second},
{"empty value", "", 30 * time.Second},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
os.Clearenv()
if tt.value != "" {
t.Setenv("TEST_DURATION", tt.value)
}
result := getDurationEnv("TEST_DURATION", 30*time.Second)
if result != tt.expected {
t.Errorf("Expected %v, got %v", tt.expected, result)
}
})
}
}


@ -0,0 +1,200 @@
package exporters
import (
"path/filepath"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// BenchmarkFactory_CreateExporter_Markdown benchmarks markdown exporter creation.
func BenchmarkFactory_CreateExporter_Markdown(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
b.ResetTimer()
for b.Loop() {
_, _ = factory.CreateExporter("markdown")
}
}
// BenchmarkFactory_CreateExporter_All benchmarks creating all exporter types.
func BenchmarkFactory_CreateExporter_All(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
formats := []string{"markdown", "docx", "html"}
b.ResetTimer()
for b.Loop() {
for _, format := range formats {
_, _ = factory.CreateExporter(format)
}
}
}
// BenchmarkAllExporters_Export benchmarks all exporters with the same course.
func BenchmarkAllExporters_Export(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
course := createBenchmarkCourse()
exporters := map[string]struct {
exporter any
ext string
}{
"Markdown": {NewMarkdownExporter(htmlCleaner), ".md"},
"Docx": {NewDocxExporter(htmlCleaner), ".docx"},
"HTML": {NewHTMLExporter(htmlCleaner), ".html"},
}
for name, exp := range exporters {
b.Run(name, func(b *testing.B) {
tempDir := b.TempDir()
exporter := exp.exporter.(interface {
Export(*models.Course, string) error
})
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark"+exp.ext)
_ = exporter.Export(course, outputPath)
}
})
}
}
// BenchmarkExporters_LargeCourse benchmarks exporters with large course data.
func BenchmarkExporters_LargeCourse(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
course := createLargeBenchmarkCourse()
b.Run("Markdown_Large", func(b *testing.B) {
exporter := NewMarkdownExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.md")
_ = exporter.Export(course, outputPath)
}
})
b.Run("Docx_Large", func(b *testing.B) {
exporter := NewDocxExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.docx")
_ = exporter.Export(course, outputPath)
}
})
b.Run("HTML_Large", func(b *testing.B) {
exporter := NewHTMLExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.html")
_ = exporter.Export(course, outputPath)
}
})
}
// createBenchmarkCourse creates a standard-sized course for benchmarking.
func createBenchmarkCourse() *models.Course {
return &models.Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "bench-course",
Title: "Benchmark Course",
Description: "Performance testing course",
NavigationMode: "menu",
Lessons: []models.Lesson{
{
ID: "lesson1",
Title: "Introduction",
Type: "lesson",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Welcome",
Paragraph: "<p>This is a test paragraph with <strong>HTML</strong> content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "Item 1"},
{Paragraph: "Item 2"},
{Paragraph: "Item 3"},
},
},
},
},
},
},
}
}
// createLargeBenchmarkCourse creates a large course for stress testing.
func createLargeBenchmarkCourse() *models.Course {
lessons := make([]models.Lesson, 50)
for i := range 50 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Description: "<p>This is lesson description with <em>formatting</em>.</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Section Heading",
Paragraph: "<p>Content with <strong>bold</strong> and <em>italic</em> text.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "Point 1"},
{Paragraph: "Point 2"},
{Paragraph: "Point 3"},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "Quiz Question",
Answers: []models.Answer{
{Title: "Answer A", Correct: false},
{Title: "Answer B", Correct: true},
{Title: "Answer C", Correct: false},
},
Feedback: "Good job!",
},
},
},
},
}
}
return &models.Course{
ShareID: "large-benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "large-bench-course",
Title: "Large Benchmark Course",
Description: "Large performance testing course",
Lessons: lessons,
},
}
}


@ -8,6 +8,9 @@ import (
"strings"
"github.com/fumiama/go-docx"
"golang.org/x/text/cases"
"golang.org/x/text/language"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
@ -66,15 +69,24 @@ func (e *DocxExporter) Export(course *models.Course, outputPath string) error {
// Ensure output directory exists and add .docx extension
if !strings.HasSuffix(strings.ToLower(outputPath), ".docx") {
outputPath = outputPath + ".docx"
outputPath += ".docx"
}
// Create the file
// #nosec G304 - Output path is provided by user via CLI argument, which is expected behavior
file, err := os.Create(outputPath)
if err != nil {
return fmt.Errorf("failed to create output file: %w", err)
}
defer file.Close()
// Ensure file is closed even if WriteTo fails. Close errors are logged but not
// fatal since the document content has already been written to disk. A close
// error typically indicates a filesystem synchronization issue that doesn't
// affect the validity of the exported file.
defer func() {
if err := file.Close(); err != nil {
fmt.Fprintf(os.Stderr, "warning: failed to close output file: %v\n", err)
}
}()
// Save the document
_, err = doc.WriteTo(file)
@ -119,7 +131,8 @@ func (e *DocxExporter) exportItem(doc *docx.Docx, item *models.Item) {
// Add item type as heading
if item.Type != "" {
itemPara := doc.AddParagraph()
itemPara.AddText(strings.Title(item.Type)).Size("24").Bold()
caser := cases.Title(language.English)
itemPara.AddText(caser.String(item.Type)).Size("24").Bold()
}
// Add sub-items
@ -180,10 +193,10 @@ func (e *DocxExporter) exportSubItem(doc *docx.Docx, subItem *models.SubItem) {
}
}
// GetSupportedFormat returns the format name this exporter supports.
// SupportedFormat returns the format name this exporter supports.
//
// Returns:
// - A string representing the supported format ("docx")
func (e *DocxExporter) GetSupportedFormat() string {
return "docx"
func (e *DocxExporter) SupportedFormat() string {
return FormatDocx
}


@ -1,4 +1,3 @@
// Package exporters_test provides tests for the docx exporter.
package exporters
import (
@ -30,13 +29,13 @@ func TestNewDocxExporter(t *testing.T) {
}
}
// TestDocxExporter_GetSupportedFormat tests the GetSupportedFormat method.
func TestDocxExporter_GetSupportedFormat(t *testing.T) {
// TestDocxExporter_SupportedFormat tests the SupportedFormat method.
func TestDocxExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
expected := "docx"
result := exporter.GetSupportedFormat()
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
@ -90,7 +89,6 @@ func TestDocxExporter_Export_AddDocxExtension(t *testing.T) {
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -155,7 +153,6 @@ func TestDocxExporter_ExportLesson(t *testing.T) {
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -222,7 +219,6 @@ func TestDocxExporter_ExportItem(t *testing.T) {
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -276,7 +272,6 @@ func TestDocxExporter_ExportSubItem(t *testing.T) {
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -335,7 +330,7 @@ func TestDocxExporter_ComplexCourse(t *testing.T) {
Caption: "<p>Watch this introductory video</p>",
Media: &models.Media{
Video: &models.VideoMedia{
OriginalUrl: "https://example.com/intro.mp4",
OriginalURL: "https://example.com/intro.mp4",
Duration: 300,
},
},
@ -363,7 +358,7 @@ func TestDocxExporter_ComplexCourse(t *testing.T) {
Caption: "<p>Course overview diagram</p>",
Media: &models.Media{
Image: &models.ImageMedia{
OriginalUrl: "https://example.com/overview.png",
OriginalURL: "https://example.com/overview.png",
},
},
},
@ -409,7 +404,6 @@ func TestDocxExporter_ComplexCourse(t *testing.T) {
// Export course
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -444,7 +438,6 @@ func TestDocxExporter_EmptyCourse(t *testing.T) {
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -493,7 +486,6 @@ func TestDocxExporter_HTMLCleaning(t *testing.T) {
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -516,7 +508,6 @@ func TestDocxExporter_ExistingDocxExtension(t *testing.T) {
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
@ -552,7 +543,6 @@ func TestDocxExporter_CaseInsensitiveExtension(t *testing.T) {
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed for case %d (%s): %v", i, testCase, err)
}
@ -615,12 +605,13 @@ func BenchmarkDocxExporter_Export(b *testing.B) {
// Create temporary directory
tempDir := b.TempDir()
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-course.docx")
_ = exporter.Export(course, outputPath)
// Clean up for next iteration
os.Remove(outputPath)
// Clean up for next iteration. Remove errors are ignored because we've already
// benchmarked the export operation; cleanup failures don't affect the benchmark
// measurements or the validity of the next iteration's export.
_ = os.Remove(outputPath)
}
}
@ -641,7 +632,7 @@ func BenchmarkDocxExporter_ComplexCourse(b *testing.B) {
}
// Fill with test data
for i := 0; i < 10; i++ {
for i := range 10 {
lesson := models.Lesson{
ID: "lesson-" + string(rune(i)),
Title: "Lesson " + string(rune(i)),
@ -649,13 +640,13 @@ func BenchmarkDocxExporter_ComplexCourse(b *testing.B) {
Items: make([]models.Item, 5), // 5 items per lesson
}
for j := 0; j < 5; j++ {
for j := range 5 {
item := models.Item{
Type: "text",
Items: make([]models.SubItem, 3), // 3 sub-items per item
}
for k := 0; k < 3; k++ {
for k := range 3 {
item.Items[k] = models.SubItem{
Heading: "<h3>Heading " + string(rune(k)) + "</h3>",
Paragraph: "<p>Paragraph content with <strong>formatting</strong> for performance testing.</p>",
@ -670,10 +661,11 @@ func BenchmarkDocxExporter_ComplexCourse(b *testing.B) {
tempDir := b.TempDir()
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-complex.docx")
_ = exporter.Export(course, outputPath)
os.Remove(outputPath)
// Remove errors are ignored because we're only benchmarking the export
// operation itself; cleanup failures don't affect the benchmark metrics.
_ = os.Remove(outputPath)
}
}
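
Review note: these hunks replace the classic `for i := 0; i < b.N; i++` loop with `for b.Loop()`, which is available from Go 1.24. As documented for testing.B.Loop, the first call resets the benchmark timer, so the b.ResetTimer() calls kept in front of the loops are harmless but no longer required. Below is a minimal sketch of the pattern, run through testing.Benchmark so it works outside `go test`. The workload buildTitle and the wrapper runDemo are hypothetical stand-ins for exporter.Export.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// buildTitle is a hypothetical workload standing in for exporter.Export.
func buildTitle(n int) string {
	return strings.Repeat("x", n)
}

// runDemo runs one benchmark using the b.Loop pattern and returns its result.
func runDemo() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		size := 64 // setup placed before the loop is not timed
		for b.Loop() {
			_ = buildTitle(size)
		}
	})
}

func main() {
	result := runDemo()
	fmt.Println("iterations:", result.N)
	fmt.Println(result.String())
}
```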


@ -0,0 +1,101 @@
// Package exporters_test provides examples for the exporters package.
package exporters_test
import (
"fmt"
"log"
"github.com/kjanat/articulate-parser/internal/exporters"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// ExampleNewFactory demonstrates creating an exporter factory.
func ExampleNewFactory() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// Get supported formats
formats := factory.SupportedFormats()
fmt.Printf("Supported formats: %d\n", len(formats))
// Output: Supported formats: 6
}
// ExampleFactory_CreateExporter demonstrates creating exporters.
func ExampleFactory_CreateExporter() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// Create a markdown exporter
exporter, err := factory.CreateExporter("markdown")
if err != nil {
log.Fatal(err)
}
fmt.Printf("Created: %s exporter\n", exporter.SupportedFormat())
// Output: Created: markdown exporter
}
// ExampleFactory_CreateExporter_caseInsensitive demonstrates case-insensitive format names.
func ExampleFactory_CreateExporter_caseInsensitive() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// All these work (case-insensitive)
formats := []string{"MARKDOWN", "Markdown", "markdown", "MD"}
for _, format := range formats {
exporter, _ := factory.CreateExporter(format)
fmt.Printf("%s -> %s\n", format, exporter.SupportedFormat())
}
// Output:
// MARKDOWN -> markdown
// Markdown -> markdown
// markdown -> markdown
// MD -> markdown
}
// ExampleMarkdownExporter_Export demonstrates exporting to Markdown.
func ExampleMarkdownExporter_Export() {
htmlCleaner := services.NewHTMLCleaner()
exporter := exporters.NewMarkdownExporter(htmlCleaner)
course := &models.Course{
ShareID: "example-id",
Course: models.CourseInfo{
Title: "Example Course",
Description: "<p>Course description</p>",
},
}
// Export to markdown file
err := exporter.Export(course, "output.md")
if err != nil {
log.Fatal(err)
}
fmt.Println("Export complete")
// Output: Export complete
}
// ExampleDocxExporter_Export demonstrates exporting to DOCX.
func ExampleDocxExporter_Export() {
htmlCleaner := services.NewHTMLCleaner()
exporter := exporters.NewDocxExporter(htmlCleaner)
course := &models.Course{
ShareID: "example-id",
Course: models.CourseInfo{
Title: "Example Course",
},
}
// Export to Word document
err := exporter.Export(course, "output.docx")
if err != nil {
log.Fatal(err)
}
fmt.Println("DOCX export complete")
// Output: DOCX export complete
}


@ -1,5 +1,3 @@
// Package exporters provides implementations of the Exporter interface
// for converting Articulate Rise courses into various file formats.
package exporters
import (
@ -10,6 +8,13 @@ import (
"github.com/kjanat/articulate-parser/internal/services"
)
// Format constants for supported export formats.
const (
FormatMarkdown = "markdown"
FormatDocx = "docx"
FormatHTML = "html"
)
// Factory implements the ExporterFactory interface.
// It creates appropriate exporter instances based on the requested format.
type Factory struct {
@ -33,33 +38,22 @@ func NewFactory(htmlCleaner *services.HTMLCleaner) interfaces.ExporterFactory {
}
// CreateExporter creates an exporter for the specified format.
// It returns an appropriate exporter implementation based on the format string.
// Format strings are case-insensitive.
//
// Parameters:
// - format: The desired export format (e.g., "markdown", "docx")
//
// Returns:
// - An implementation of the Exporter interface if the format is supported
// - An error if the format is not supported
// Format strings are case-insensitive (e.g., "markdown", "DOCX").
func (f *Factory) CreateExporter(format string) (interfaces.Exporter, error) {
switch strings.ToLower(format) {
case "markdown", "md":
case FormatMarkdown, "md":
return NewMarkdownExporter(f.htmlCleaner), nil
case "docx", "word":
case FormatDocx, "word":
return NewDocxExporter(f.htmlCleaner), nil
case "html", "htm":
case FormatHTML, "htm":
return NewHTMLExporter(f.htmlCleaner), nil
default:
return nil, fmt.Errorf("unsupported export format: %s", format)
}
}
// GetSupportedFormats returns a list of all supported export formats.
// This includes both primary format names and their aliases.
//
// Returns:
// - A string slice containing all supported format names
func (f *Factory) GetSupportedFormats() []string {
return []string{"markdown", "md", "docx", "word", "html", "htm"}
// SupportedFormats returns a list of all supported export formats,
// including both primary format names and their aliases.
func (f *Factory) SupportedFormats() []string {
return []string{FormatMarkdown, "md", FormatDocx, "word", FormatHTML, "htm"}
}
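
Review note: the factory now switches on named constants for the primary format names while the aliases stay as string literals. The self-contained sketch below shows the same lookup shape without the exporter types. It reuses the constant names from the diff, but canonicalFormat is a hypothetical helper, not a function in this package.

```go
package main

import (
	"fmt"
	"strings"
)

// Format constants, named as in the diff above.
const (
	FormatMarkdown = "markdown"
	FormatDocx     = "docx"
	FormatHTML     = "html"
)

// canonicalFormat is a hypothetical helper that maps a format name or alias,
// in any letter case, to its primary constant. It mirrors the switch in
// CreateExporter but returns the name instead of an exporter.
func canonicalFormat(format string) (string, error) {
	switch strings.ToLower(format) {
	case FormatMarkdown, "md":
		return FormatMarkdown, nil
	case FormatDocx, "word":
		return FormatDocx, nil
	case FormatHTML, "htm":
		return FormatHTML, nil
	default:
		return "", fmt.Errorf("unsupported export format: %s", format)
	}
}

func main() {
	for _, in := range []string{"MD", "Word", "htm", "pdf"} {
		name, err := canonicalFormat(in)
		fmt.Println(in, name, err)
	}
}
```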


@ -1,4 +1,3 @@
// Package exporters_test provides tests for the exporter factory.
package exporters
import (
@ -125,7 +124,7 @@ func TestFactory_CreateExporter(t *testing.T) {
}
// Check supported format
supportedFormat := exporter.GetSupportedFormat()
supportedFormat := exporter.SupportedFormat()
if supportedFormat != tc.expectedFormat {
t.Errorf("Expected supported format '%s' for format '%s', got '%s'", tc.expectedFormat, tc.format, supportedFormat)
}
@ -164,7 +163,6 @@ func TestFactory_CreateExporter_CaseInsensitive(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.format, func(t *testing.T) {
exporter, err := factory.CreateExporter(tc.format)
if err != nil {
t.Fatalf("Unexpected error for format '%s': %v", tc.format, err)
}
@ -173,7 +171,7 @@ func TestFactory_CreateExporter_CaseInsensitive(t *testing.T) {
t.Fatalf("CreateExporter returned nil for format '%s'", tc.format)
}
supportedFormat := exporter.GetSupportedFormat()
supportedFormat := exporter.SupportedFormat()
if supportedFormat != tc.expectedFormat {
t.Errorf("Expected supported format '%s' for format '%s', got '%s'", tc.expectedFormat, tc.format, supportedFormat)
}
@ -221,15 +219,15 @@ func TestFactory_CreateExporter_ErrorMessages(t *testing.T) {
}
}
// TestFactory_GetSupportedFormats tests the GetSupportedFormats method.
func TestFactory_GetSupportedFormats(t *testing.T) {
// TestFactory_SupportedFormats tests the SupportedFormats method.
func TestFactory_SupportedFormats(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
formats := factory.GetSupportedFormats()
formats := factory.SupportedFormats()
if formats == nil {
t.Fatal("GetSupportedFormats() returned nil")
t.Fatal("SupportedFormats() returned nil")
}
expected := []string{"markdown", "md", "docx", "word", "html", "htm"}
@ -246,22 +244,22 @@ func TestFactory_GetSupportedFormats(t *testing.T) {
for _, format := range formats {
exporter, err := factory.CreateExporter(format)
if err != nil {
t.Errorf("Format '%s' from GetSupportedFormats() should be creatable, got error: %v", format, err)
t.Errorf("Format '%s' from SupportedFormats() should be creatable, got error: %v", format, err)
}
if exporter == nil {
t.Errorf("Format '%s' from GetSupportedFormats() should create non-nil exporter", format)
t.Errorf("Format '%s' from SupportedFormats() should create non-nil exporter", format)
}
}
}
// TestFactory_GetSupportedFormats_Immutable tests that the returned slice is safe to modify.
func TestFactory_GetSupportedFormats_Immutable(t *testing.T) {
// TestFactory_SupportedFormats_Immutable tests that the returned slice is safe to modify.
func TestFactory_SupportedFormats_Immutable(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Get formats twice
formats1 := factory.GetSupportedFormats()
formats2 := factory.GetSupportedFormats()
formats1 := factory.SupportedFormats()
formats2 := factory.SupportedFormats()
// Modify first slice
if len(formats1) > 0 {
@ -270,13 +268,13 @@ func TestFactory_GetSupportedFormats_Immutable(t *testing.T) {
// Check that second call returns unmodified data
if len(formats2) > 0 && formats2[0] == "modified" {
t.Error("GetSupportedFormats() should return independent slices")
t.Error("SupportedFormats() should return independent slices")
}
// Verify original functionality still works
formats3 := factory.GetSupportedFormats()
formats3 := factory.SupportedFormats()
if len(formats3) == 0 {
t.Error("GetSupportedFormats() should still return formats after modification")
t.Error("SupportedFormats() should still return formats after modification")
}
}
@ -436,7 +434,7 @@ func TestFactory_FormatNormalization(t *testing.T) {
t.Fatalf("Failed to create exporter for '%s': %v", tc.input, err)
}
format := exporter.GetSupportedFormat()
format := exporter.SupportedFormat()
if format != tc.expected {
t.Errorf("Expected format '%s' for input '%s', got '%s'", tc.expected, tc.input, format)
}
@ -449,8 +447,7 @@ func BenchmarkFactory_CreateExporter(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
_, _ = factory.CreateExporter("markdown")
}
}
@ -460,19 +457,17 @@ func BenchmarkFactory_CreateExporter_Docx(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
_, _ = factory.CreateExporter("docx")
}
}
// BenchmarkFactory_GetSupportedFormats benchmarks the GetSupportedFormats method.
func BenchmarkFactory_GetSupportedFormats(b *testing.B) {
// BenchmarkFactory_SupportedFormats benchmarks the SupportedFormats method.
func BenchmarkFactory_SupportedFormats(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
b.ResetTimer()
for i := 0; i < b.N; i++ {
_ = factory.GetSupportedFormats()
for b.Loop() {
_ = factory.SupportedFormats()
}
}
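
Review note: TestFactory_SupportedFormats_Immutable passes because SupportedFormats evaluates a slice literal on every call, so each caller gets its own backing array. The sketch below contrasts that with returning a package-level slice, which would let one caller's modification leak to the next. Both function names and the shared variable are hypothetical.

```go
package main

import "fmt"

// shared is a package-level slice; every caller of sharedFormats sees the
// same backing array.
var shared = []string{"markdown", "md"}

// freshFormats evaluates a slice literal on each call, like SupportedFormats
// in the diff, so the result is independent of earlier calls.
func freshFormats() []string {
	return []string{"markdown", "md"}
}

// sharedFormats returns the package-level slice directly.
func sharedFormats() []string {
	return shared
}

func main() {
	a := freshFormats()
	a[0] = "modified"
	fmt.Println(freshFormats()[0]) // still "markdown": each call builds a new slice

	b := sharedFormats()
	b[0] = "modified"
	fmt.Println(sharedFormats()[0]) // "modified": both calls share one backing array
}
```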


@ -1,24 +1,30 @@
// Package exporters provides implementations of the Exporter interface
// for converting Articulate Rise courses into various file formats.
package exporters
import (
"bytes"
_ "embed"
"fmt"
"html"
"html/template"
"io"
"os"
"strings"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
//go:embed html_styles.css
var defaultCSS string
//go:embed html_template.gohtml
var htmlTemplate string
// HTMLExporter implements the Exporter interface for HTML format.
// It converts Articulate Rise course data into a structured HTML document.
// It converts Articulate Rise course data into a structured HTML document using templates.
type HTMLExporter struct {
// htmlCleaner is used to convert HTML content to plain text when needed
htmlCleaner *services.HTMLCleaner
// tmpl holds the parsed HTML template
tmpl *template.Template
}
// NewHTMLExporter creates a new HTMLExporter instance.
@ -30,8 +36,21 @@ type HTMLExporter struct {
// Returns:
// - An implementation of the Exporter interface for HTML format
func NewHTMLExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter {
// Parse the template with custom functions
funcMap := template.FuncMap{
"safeHTML": func(s string) template.HTML {
return template.HTML(s) // #nosec G203 - HTML content is from trusted course data
},
"safeCSS": func(s string) template.CSS {
return template.CSS(s) // #nosec G203 - CSS content is from trusted embedded file
},
}
tmpl := template.Must(template.New("html").Funcs(funcMap).Parse(htmlTemplate))
return &HTMLExporter{
htmlCleaner: htmlCleaner,
tmpl: tmpl,
}
}
@ -46,431 +65,41 @@ func NewHTMLExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter {
// Returns:
// - An error if writing to the output file fails
func (e *HTMLExporter) Export(course *models.Course, outputPath string) error {
var buf bytes.Buffer
// Write HTML document structure
buf.WriteString("<!DOCTYPE html>\n")
buf.WriteString("<html lang=\"en\">\n")
buf.WriteString("<head>\n")
buf.WriteString(" <meta charset=\"UTF-8\">\n")
buf.WriteString(" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n")
buf.WriteString(fmt.Sprintf(" <title>%s</title>\n", html.EscapeString(course.Course.Title)))
buf.WriteString(" <style>\n")
buf.WriteString(e.getDefaultCSS())
buf.WriteString(" </style>\n")
buf.WriteString("</head>\n")
buf.WriteString("<body>\n")
// Write course header
buf.WriteString(fmt.Sprintf(" <header>\n <h1>%s</h1>\n", html.EscapeString(course.Course.Title)))
if course.Course.Description != "" {
buf.WriteString(fmt.Sprintf(" <div class=\"course-description\">%s</div>\n", course.Course.Description))
f, err := os.Create(outputPath)
if err != nil {
return fmt.Errorf("failed to create file: %w", err)
}
buf.WriteString(" </header>\n\n")
defer f.Close()
// Add metadata section
buf.WriteString(" <section class=\"course-info\">\n")
buf.WriteString(" <h2>Course Information</h2>\n")
buf.WriteString(" <ul>\n")
buf.WriteString(fmt.Sprintf(" <li><strong>Course ID:</strong> %s</li>\n", html.EscapeString(course.Course.ID)))
buf.WriteString(fmt.Sprintf(" <li><strong>Share ID:</strong> %s</li>\n", html.EscapeString(course.ShareID)))
buf.WriteString(fmt.Sprintf(" <li><strong>Navigation Mode:</strong> %s</li>\n", html.EscapeString(course.Course.NavigationMode)))
if course.Course.ExportSettings != nil {
buf.WriteString(fmt.Sprintf(" <li><strong>Export Format:</strong> %s</li>\n", html.EscapeString(course.Course.ExportSettings.Format)))
}
buf.WriteString(" </ul>\n")
buf.WriteString(" </section>\n\n")
// Process lessons
lessonCounter := 0
for _, lesson := range course.Course.Lessons {
if lesson.Type == "section" {
buf.WriteString(fmt.Sprintf(" <section class=\"course-section\">\n <h2>%s</h2>\n </section>\n\n", html.EscapeString(lesson.Title)))
continue
}
lessonCounter++
buf.WriteString(fmt.Sprintf(" <section class=\"lesson\">\n <h3>Lesson %d: %s</h3>\n", lessonCounter, html.EscapeString(lesson.Title)))
if lesson.Description != "" {
buf.WriteString(fmt.Sprintf(" <div class=\"lesson-description\">%s</div>\n", lesson.Description))
}
// Process lesson items
for _, item := range lesson.Items {
e.processItemToHTML(&buf, item)
}
buf.WriteString(" </section>\n\n")
}
buf.WriteString("</body>\n")
buf.WriteString("</html>\n")
return os.WriteFile(outputPath, buf.Bytes(), 0644)
return e.WriteHTML(f, course)
}
// GetSupportedFormat returns the format name this exporter supports
// WriteHTML writes the HTML content to an io.Writer.
// This allows for better testability and flexibility in output destinations.
//
// Parameters:
// - w: The writer to output HTML content to
// - course: The course data model to export
//
// Returns:
// - An error if writing fails
func (e *HTMLExporter) WriteHTML(w io.Writer, course *models.Course) error {
// Prepare template data
data := prepareTemplateData(course, e.htmlCleaner)
// Execute template
if err := e.tmpl.Execute(w, data); err != nil {
return fmt.Errorf("failed to execute template: %w", err)
}
return nil
}
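Note on `WriteHTML`: accepting an `io.Writer` means the rendering path can be exercised without touching the file system. The sketch below illustrates the idea with a hypothetical `writeDoc` function and a deliberately failing writer; none of these names come from the repository.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"html/template"
	"io"
)

// docTmpl is a hypothetical one-line template used only for this sketch.
var docTmpl = template.Must(template.New("doc").Parse("<h1>{{.}}</h1>"))

// writeDoc mirrors the shape of WriteHTML: render into any io.Writer and
// wrap execution errors with context.
func writeDoc(w io.Writer, title string) error {
	if err := docTmpl.Execute(w, title); err != nil {
		return fmt.Errorf("failed to execute template: %w", err)
	}
	return nil
}

// errDiskFull is a sentinel error returned by failingWriter.
var errDiskFull = errors.New("disk full")

// failingWriter simulates an output destination that always fails.
type failingWriter struct{}

func (failingWriter) Write([]byte) (int, error) { return 0, errDiskFull }

// renderToString renders into an in-memory buffer instead of a file.
func renderToString(title string) (string, error) {
	var buf bytes.Buffer
	err := writeDoc(&buf, title)
	return buf.String(), err
}

// writeErrorIsWrapped reports whether the writer's error is still
// reachable through errors.Is after wrapping.
func writeErrorIsWrapped() bool {
	return errors.Is(writeDoc(failingWriter{}, "x"), errDiskFull)
}

func main() {
	out, err := renderToString("Go")
	fmt.Println(out, err)
	fmt.Println(writeErrorIsWrapped())
}
```

A test can therefore assert on the buffer contents directly, and can check error handling by passing a writer that fails, without creating temporary files.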
// SupportedFormat returns the format name this exporter supports
// It indicates the file format that the HTMLExporter can generate.
//
// Returns:
// - A string representing the supported format ("html")
func (e *HTMLExporter) GetSupportedFormat() string {
return "html"
}
// getDefaultCSS returns basic CSS styling for the HTML document
func (e *HTMLExporter) getDefaultCSS() string {
return `
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
color: #333;
max-width: 800px;
margin: 0 auto;
padding: 20px;
background-color: #f9f9f9;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 2rem;
border-radius: 10px;
margin-bottom: 2rem;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
header h1 {
margin: 0;
font-size: 2.5rem;
font-weight: 300;
}
.course-description {
margin-top: 1rem;
font-size: 1.1rem;
opacity: 0.9;
}
.course-info {
background: white;
padding: 1.5rem;
border-radius: 8px;
margin-bottom: 2rem;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-info h2 {
margin-top: 0;
color: #4a5568;
border-bottom: 2px solid #e2e8f0;
padding-bottom: 0.5rem;
}
.course-info ul {
list-style: none;
padding: 0;
}
.course-info li {
margin: 0.5rem 0;
padding: 0.5rem;
background: #f7fafc;
border-radius: 4px;
}
.course-section {
background: #4299e1;
color: white;
padding: 1.5rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-section h2 {
margin: 0;
font-weight: 400;
}
.lesson {
background: white;
padding: 2rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
border-left: 4px solid #4299e1;
}
.lesson h3 {
margin-top: 0;
color: #2d3748;
font-size: 1.5rem;
}
.lesson-description {
margin: 1rem 0;
padding: 1rem;
background: #f7fafc;
border-radius: 4px;
border-left: 3px solid #4299e1;
}
.item {
margin: 1.5rem 0;
padding: 1rem;
border-radius: 6px;
background: #fafafa;
border: 1px solid #e2e8f0;
}
.item h4 {
margin-top: 0;
color: #4a5568;
font-size: 1.2rem;
text-transform: capitalize;
}
.text-item {
background: #f0fff4;
border-left: 3px solid #48bb78;
}
.list-item {
background: #fffaf0;
border-left: 3px solid #ed8936;
}
.knowledge-check {
background: #e6fffa;
border-left: 3px solid #38b2ac;
}
.multimedia-item {
background: #faf5ff;
border-left: 3px solid #9f7aea;
}
.interactive-item {
background: #fff5f5;
border-left: 3px solid #f56565;
}
.unknown-item {
background: #f7fafc;
border-left: 3px solid #a0aec0;
}
.answers {
margin: 1rem 0;
}
.answers h5 {
margin: 0.5rem 0;
color: #4a5568;
}
.answers ol {
margin: 0.5rem 0;
padding-left: 1.5rem;
}
.answers li {
margin: 0.3rem 0;
padding: 0.3rem;
}
.correct-answer {
background: #c6f6d5;
border-radius: 3px;
font-weight: bold;
}
.correct-answer::after {
content: " ✓";
color: #38a169;
}
.feedback {
margin: 1rem 0;
padding: 1rem;
background: #edf2f7;
border-radius: 4px;
border-left: 3px solid #4299e1;
font-style: italic;
}
.media-info {
background: #edf2f7;
padding: 1rem;
border-radius: 4px;
margin: 0.5rem 0;
}
.media-info strong {
color: #4a5568;
}
hr {
border: none;
height: 2px;
background: linear-gradient(to right, #667eea, #764ba2);
margin: 2rem 0;
border-radius: 1px;
}
ul {
padding-left: 1.5rem;
}
li {
margin: 0.5rem 0;
}
`
}
// processItemToHTML converts a course item into HTML format
// and appends it to the provided buffer. It handles different item types
// with appropriate HTML formatting.
//
// Parameters:
// - buf: The buffer to write the HTML content to
// - item: The course item to process
func (e *HTMLExporter) processItemToHTML(buf *bytes.Buffer, item models.Item) {
switch strings.ToLower(item.Type) {
case "text":
e.processTextItem(buf, item)
case "list":
e.processListItem(buf, item)
case "knowledgecheck":
e.processKnowledgeCheckItem(buf, item)
case "multimedia":
e.processMultimediaItem(buf, item)
case "image":
e.processImageItem(buf, item)
case "interactive":
e.processInteractiveItem(buf, item)
case "divider":
e.processDividerItem(buf)
default:
e.processUnknownItem(buf, item)
}
}
// processTextItem handles text content with headings and paragraphs
func (e *HTMLExporter) processTextItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item text-item\">\n")
buf.WriteString(" <h4>Text Content</h4>\n")
for _, subItem := range item.Items {
if subItem.Heading != "" {
buf.WriteString(fmt.Sprintf(" <h5>%s</h5>\n", subItem.Heading))
}
if subItem.Paragraph != "" {
buf.WriteString(fmt.Sprintf(" <div>%s</div>\n", subItem.Paragraph))
}
}
buf.WriteString(" </div>\n\n")
}
// processListItem handles list content
func (e *HTMLExporter) processListItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item list-item\">\n")
buf.WriteString(" <h4>List</h4>\n")
buf.WriteString(" <ul>\n")
for _, subItem := range item.Items {
if subItem.Paragraph != "" {
cleanText := e.htmlCleaner.CleanHTML(subItem.Paragraph)
buf.WriteString(fmt.Sprintf(" <li>%s</li>\n", html.EscapeString(cleanText)))
}
}
buf.WriteString(" </ul>\n")
buf.WriteString(" </div>\n\n")
}
// processKnowledgeCheckItem handles quiz questions and answers
func (e *HTMLExporter) processKnowledgeCheckItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item knowledge-check\">\n")
buf.WriteString(" <h4>Knowledge Check</h4>\n")
for _, subItem := range item.Items {
if subItem.Title != "" {
buf.WriteString(fmt.Sprintf(" <p><strong>Question:</strong> %s</p>\n", subItem.Title))
}
if len(subItem.Answers) > 0 {
e.processAnswers(buf, subItem.Answers)
}
if subItem.Feedback != "" {
buf.WriteString(fmt.Sprintf(" <div class=\"feedback\"><strong>Feedback:</strong> %s</div>\n", subItem.Feedback))
}
}
buf.WriteString(" </div>\n\n")
}
// processMultimediaItem handles multimedia content like videos
func (e *HTMLExporter) processMultimediaItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item multimedia-item\">\n")
buf.WriteString(" <h4>Media Content</h4>\n")
for _, subItem := range item.Items {
if subItem.Title != "" {
buf.WriteString(fmt.Sprintf(" <h5>%s</h5>\n", subItem.Title))
}
if subItem.Media != nil {
if subItem.Media.Video != nil {
buf.WriteString(" <div class=\"media-info\">\n")
buf.WriteString(fmt.Sprintf(" <p><strong>Video:</strong> %s</p>\n", html.EscapeString(subItem.Media.Video.OriginalUrl)))
if subItem.Media.Video.Duration > 0 {
buf.WriteString(fmt.Sprintf(" <p><strong>Duration:</strong> %d seconds</p>\n", subItem.Media.Video.Duration))
}
buf.WriteString(" </div>\n")
}
}
if subItem.Caption != "" {
buf.WriteString(fmt.Sprintf(" <div><em>%s</em></div>\n", subItem.Caption))
}
}
buf.WriteString(" </div>\n\n")
}
// processImageItem handles image content
func (e *HTMLExporter) processImageItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item multimedia-item\">\n")
buf.WriteString(" <h4>Image</h4>\n")
for _, subItem := range item.Items {
if subItem.Media != nil && subItem.Media.Image != nil {
buf.WriteString(" <div class=\"media-info\">\n")
buf.WriteString(fmt.Sprintf(" <p><strong>Image:</strong> %s</p>\n", html.EscapeString(subItem.Media.Image.OriginalUrl)))
buf.WriteString(" </div>\n")
}
if subItem.Caption != "" {
buf.WriteString(fmt.Sprintf(" <div><em>%s</em></div>\n", subItem.Caption))
}
}
buf.WriteString(" </div>\n\n")
}
// processInteractiveItem handles interactive content
func (e *HTMLExporter) processInteractiveItem(buf *bytes.Buffer, item models.Item) {
buf.WriteString(" <div class=\"item interactive-item\">\n")
buf.WriteString(" <h4>Interactive Content</h4>\n")
for _, subItem := range item.Items {
if subItem.Title != "" {
buf.WriteString(fmt.Sprintf(" <p><strong>%s</strong></p>\n", subItem.Title))
}
if subItem.Paragraph != "" {
buf.WriteString(fmt.Sprintf(" <div>%s</div>\n", subItem.Paragraph))
}
}
buf.WriteString(" </div>\n\n")
}
// processDividerItem handles divider elements
func (e *HTMLExporter) processDividerItem(buf *bytes.Buffer) {
buf.WriteString(" <hr>\n\n")
}
// processUnknownItem handles unknown or unsupported item types
func (e *HTMLExporter) processUnknownItem(buf *bytes.Buffer, item models.Item) {
if len(item.Items) > 0 {
buf.WriteString(" <div class=\"item unknown-item\">\n")
buf.WriteString(fmt.Sprintf(" <h4>%s Content</h4>\n", strings.Title(item.Type)))
for _, subItem := range item.Items {
e.processGenericSubItem(buf, subItem)
}
buf.WriteString(" </div>\n\n")
}
}
// processGenericSubItem processes sub-items for unknown types
func (e *HTMLExporter) processGenericSubItem(buf *bytes.Buffer, subItem models.SubItem) {
if subItem.Title != "" {
buf.WriteString(fmt.Sprintf(" <p><strong>%s</strong></p>\n", subItem.Title))
}
if subItem.Paragraph != "" {
buf.WriteString(fmt.Sprintf(" <div>%s</div>\n", subItem.Paragraph))
}
}
// processAnswers processes answer choices for quiz questions
func (e *HTMLExporter) processAnswers(buf *bytes.Buffer, answers []models.Answer) {
buf.WriteString(" <div class=\"answers\">\n")
buf.WriteString(" <h5>Answers:</h5>\n")
buf.WriteString(" <ol>\n")
for _, answer := range answers {
cssClass := ""
if answer.Correct {
cssClass = " class=\"correct-answer\""
}
buf.WriteString(fmt.Sprintf(" <li%s>%s</li>\n", cssClass, html.EscapeString(answer.Title)))
}
buf.WriteString(" </ol>\n")
buf.WriteString(" </div>\n")
func (e *HTMLExporter) SupportedFormat() string {
return FormatHTML
}


@ -0,0 +1,173 @@
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
color: #333;
max-width: 800px;
margin: 0 auto;
padding: 20px;
background-color: #f9f9f9;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 2rem;
border-radius: 10px;
margin-bottom: 2rem;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
header h1 {
margin: 0;
font-size: 2.5rem;
font-weight: 300;
}
.course-description {
margin-top: 1rem;
font-size: 1.1rem;
opacity: 0.9;
}
.course-info {
background: white;
padding: 1.5rem;
border-radius: 8px;
margin-bottom: 2rem;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-info h2 {
margin-top: 0;
color: #4a5568;
border-bottom: 2px solid #e2e8f0;
padding-bottom: 0.5rem;
}
.course-info ul {
list-style: none;
padding: 0;
}
.course-info li {
margin: 0.5rem 0;
padding: 0.5rem;
background: #f7fafc;
border-radius: 4px;
}
.course-section {
background: #4299e1;
color: white;
padding: 1.5rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-section h2 {
margin: 0;
font-weight: 400;
}
.lesson {
background: white;
padding: 2rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
border-left: 4px solid #4299e1;
}
.lesson h3 {
margin-top: 0;
color: #2d3748;
font-size: 1.5rem;
}
.lesson-description {
margin: 1rem 0;
padding: 1rem;
background: #f7fafc;
border-radius: 4px;
border-left: 3px solid #4299e1;
}
.item {
margin: 1.5rem 0;
padding: 1rem;
border-radius: 6px;
background: #fafafa;
border: 1px solid #e2e8f0;
}
.item h4 {
margin-top: 0;
color: #4a5568;
font-size: 1.2rem;
text-transform: capitalize;
}
.text-item {
background: #f0fff4;
border-left: 3px solid #48bb78;
}
.list-item {
background: #fffaf0;
border-left: 3px solid #ed8936;
}
.knowledge-check {
background: #e6fffa;
border-left: 3px solid #38b2ac;
}
.multimedia-item {
background: #faf5ff;
border-left: 3px solid #9f7aea;
}
.interactive-item {
background: #fff5f5;
border-left: 3px solid #f56565;
}
.unknown-item {
background: #f7fafc;
border-left: 3px solid #a0aec0;
}
.answers {
margin: 1rem 0;
}
.answers h5 {
margin: 0.5rem 0;
color: #4a5568;
}
.answers ol {
margin: 0.5rem 0;
padding-left: 1.5rem;
}
.answers li {
margin: 0.3rem 0;
padding: 0.3rem;
}
.correct-answer {
background: #c6f6d5;
border-radius: 3px;
font-weight: bold;
}
.correct-answer::after {
content: " ✓";
color: #38a169;
}
.feedback {
margin: 1rem 0;
padding: 1rem;
background: #edf2f7;
border-radius: 4px;
border-left: 3px solid #4299e1;
font-style: italic;
}
.media-info {
background: #edf2f7;
padding: 1rem;
border-radius: 4px;
margin: 0.5rem 0;
}
.media-info strong {
color: #4a5568;
}
hr {
border: none;
height: 2px;
background: linear-gradient(to right, #667eea, #764ba2);
margin: 2rem 0;
border-radius: 1px;
}
ul {
padding-left: 1.5rem;
}
li {
margin: 0.5rem 0;
}


@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{.Course.Title}}</title>
<style>
{{safeCSS .CSS}}
</style>
</head>
<body>
<header>
<h1>{{.Course.Title}}</h1>
{{if .Course.Description}}
<div class="course-description">{{safeHTML .Course.Description}}</div>
{{end}}
</header>
<section class="course-info">
<h2>Course Information</h2>
<ul>
<li><strong>Course ID:</strong> {{.Course.ID}}</li>
<li><strong>Share ID:</strong> {{.ShareID}}</li>
<li><strong>Navigation Mode:</strong> {{.Course.NavigationMode}}</li>
{{if .Course.ExportSettings}}
<li><strong>Export Format:</strong> {{.Course.ExportSettings.Format}}</li>
{{end}}
</ul>
</section>
{{range .Sections}}
{{if eq .Type "section"}}
<section class="course-section">
<h2>{{.Title}}</h2>
</section>
{{else}}
<section class="lesson">
<h3>Lesson {{.Number}}: {{.Title}}</h3>
{{if .Description}}
<div class="lesson-description">{{safeHTML .Description}}</div>
{{end}}
{{range .Items}}
{{template "item" .}}
{{end}}
</section>
{{end}}
{{end}}
</body>
</html>
{{define "item"}}
{{if eq .Type "text"}}{{template "textItem" .}}
{{else if eq .Type "list"}}{{template "listItem" .}}
{{else if eq .Type "knowledgecheck"}}{{template "knowledgeCheckItem" .}}
{{else if eq .Type "multimedia"}}{{template "multimediaItem" .}}
{{else if eq .Type "image"}}{{template "imageItem" .}}
{{else if eq .Type "interactive"}}{{template "interactiveItem" .}}
{{else if eq .Type "divider"}}{{template "dividerItem" .}}
{{else}}{{template "unknownItem" .}}
{{end}}
{{end}}
{{define "textItem"}}
<div class="item text-item">
<h4>Text Content</h4>
{{range .Items}}
{{if .Heading}}
{{safeHTML .Heading}}
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "listItem"}}
<div class="item list-item">
<h4>List</h4>
<ul>
{{range .Items}}
{{if .Paragraph}}
<li>{{.CleanText}}</li>
{{end}}
{{end}}
</ul>
</div>
{{end}}
{{define "knowledgeCheckItem"}}
<div class="item knowledge-check">
<h4>Knowledge Check</h4>
{{range .Items}}
{{if .Title}}
<p><strong>Question:</strong> {{safeHTML .Title}}</p>
{{end}}
{{if .Answers}}
<div class="answers">
<h5>Answers:</h5>
<ol>
{{range .Answers}}
<li{{if .Correct}} class="correct-answer"{{end}}>{{.Title}}</li>
{{end}}
</ol>
</div>
{{end}}
{{if .Feedback}}
<div class="feedback"><strong>Feedback:</strong> {{safeHTML .Feedback}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "multimediaItem"}}
<div class="item multimedia-item">
<h4>Media Content</h4>
{{range .Items}}
{{if .Title}}
<h5>{{.Title}}</h5>
{{end}}
{{if .Media}}
{{if .Media.Video}}
<div class="media-info">
<p><strong>Video:</strong> {{.Media.Video.OriginalURL}}</p>
{{if gt .Media.Video.Duration 0}}
<p><strong>Duration:</strong> {{.Media.Video.Duration}} seconds</p>
{{end}}
</div>
{{end}}
{{end}}
{{if .Caption}}
<div><em>{{.Caption}}</em></div>
{{end}}
{{end}}
</div>
{{end}}
{{define "imageItem"}}
<div class="item multimedia-item">
<h4>Image</h4>
{{range .Items}}
{{if and .Media .Media.Image}}
<div class="media-info">
<p><strong>Image:</strong> {{.Media.Image.OriginalURL}}</p>
</div>
{{end}}
{{if .Caption}}
<div><em>{{.Caption}}</em></div>
{{end}}
{{end}}
</div>
{{end}}
{{define "interactiveItem"}}
<div class="item interactive-item">
<h4>Interactive Content</h4>
{{range .Items}}
{{if .Title}}
<p><strong>{{.Title}}</strong></p>
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "dividerItem"}}
<hr>
{{end}}
{{define "unknownItem"}}
<div class="item unknown-item">
<h4>{{.TypeTitle}} Content</h4>
{{range .Items}}
{{if .Title}}
<p><strong>{{.Title}}</strong></p>
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
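Note on the template structure above: the `item` template acts as a dispatcher that selects a named sub-template based on `.Type`, which replaces the `switch` in the removed `processItemToHTML`. A reduced sketch of the same pattern follows; the `item` struct, the `renderItems` function, and the three-way dispatch are simplifications assumed for illustration, not the project's actual template.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// item is a hypothetical, reduced version of the template item data.
type item struct {
	Type string
	Body string
}

// src defines a dispatcher template ("item") plus one named template per
// item type. Templates may be referenced before they are defined as long
// as they are all parsed before execution.
const src = `{{range .}}{{template "item" .}}{{end}}` +
	`{{define "item"}}{{if eq .Type "text"}}{{template "textItem" .}}` +
	`{{else if eq .Type "divider"}}{{template "dividerItem" .}}` +
	`{{else}}{{template "unknownItem" .}}{{end}}{{end}}` +
	`{{define "textItem"}}<p>{{.Body}}</p>{{end}}` +
	`{{define "dividerItem"}}<hr>{{end}}` +
	`{{define "unknownItem"}}<div class="unknown">{{.Type}}</div>{{end}}`

// renderItems executes the dispatcher over a slice of items.
func renderItems(items []item) (string, error) {
	tmpl, err := template.New("root").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, items); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderItems([]item{
		{Type: "text", Body: "hi"},
		{Type: "divider"},
		{Type: "quiz"}, // not handled explicitly, falls through to unknownItem
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```

The comparison with `eq` is case-sensitive, which is why the Go side of the diff lowercases `item.Type` before handing it to the template.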


@ -0,0 +1,131 @@
package exporters
import (
"strings"
"golang.org/x/text/cases"
"golang.org/x/text/language"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// Item type constants.
const (
itemTypeText = "text"
itemTypeList = "list"
itemTypeKnowledgeCheck = "knowledgecheck"
itemTypeMultimedia = "multimedia"
itemTypeImage = "image"
itemTypeInteractive = "interactive"
itemTypeDivider = "divider"
)
// templateData represents the data structure passed to the HTML template.
type templateData struct {
Course models.CourseInfo
ShareID string
Sections []templateSection
CSS string
}
// templateSection represents a course section or lesson.
type templateSection struct {
Type string
Title string
Number int
Description string
Items []templateItem
}
// templateItem represents a course item with preprocessed data.
type templateItem struct {
Type string
TypeTitle string
Items []templateSubItem
}
// templateSubItem represents a sub-item with preprocessed data.
type templateSubItem struct {
Heading string
Paragraph string
Title string
Caption string
CleanText string
Answers []models.Answer
Feedback string
Media *models.Media
}
// prepareTemplateData converts a Course model into template-friendly data.
func prepareTemplateData(course *models.Course, htmlCleaner *services.HTMLCleaner) *templateData {
data := &templateData{
Course: course.Course,
ShareID: course.ShareID,
Sections: make([]templateSection, 0, len(course.Course.Lessons)),
CSS: defaultCSS,
}
lessonCounter := 0
for _, lesson := range course.Course.Lessons {
section := templateSection{
Type: lesson.Type,
Title: lesson.Title,
Description: lesson.Description,
}
if lesson.Type != "section" {
lessonCounter++
section.Number = lessonCounter
section.Items = prepareItems(lesson.Items, htmlCleaner)
}
data.Sections = append(data.Sections, section)
}
return data
}
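Note on `prepareTemplateData`: lesson numbers are assigned before rendering so the template never has to keep a counter, and entries of type `section` are passed through unnumbered. The following sketch isolates that numbering rule; the `lesson` and `section` types and the `numberLessons` function are hypothetical reductions of the real models.

```go
package main

import "fmt"

// lesson is a hypothetical, reduced version of the course lesson model.
type lesson struct {
	Type  string
	Title string
}

// section is a hypothetical, reduced version of templateSection.
type section struct {
	Type   string
	Title  string
	Number int // 0 for section headers, 1-based for real lessons
}

// numberLessons copies lessons into sections and numbers only the entries
// that are not section headers.
func numberLessons(lessons []lesson) []section {
	out := make([]section, 0, len(lessons))
	counter := 0
	for _, l := range lessons {
		s := section{Type: l.Type, Title: l.Title}
		if l.Type != "section" {
			counter++
			s.Number = counter
		}
		out = append(out, s)
	}
	return out
}

func main() {
	for _, s := range numberLessons([]lesson{
		{Type: "section", Title: "Intro"},
		{Type: "lesson", Title: "A"},
		{Type: "section", Title: "Part 2"},
		{Type: "lesson", Title: "B"},
	}) {
		fmt.Printf("%-8s %-7s %d\n", s.Type, s.Title, s.Number)
	}
}
```

Precomputing the number keeps the template declarative: it only prints `.Number` and never mutates state while ranging.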
// prepareItems converts model Items to template Items.
func prepareItems(items []models.Item, htmlCleaner *services.HTMLCleaner) []templateItem {
result := make([]templateItem, 0, len(items))
for _, item := range items {
tItem := templateItem{
Type: strings.ToLower(item.Type),
Items: make([]templateSubItem, 0, len(item.Items)),
}
// Set type title for unknown items
if tItem.Type != itemTypeText && tItem.Type != itemTypeList && tItem.Type != itemTypeKnowledgeCheck &&
tItem.Type != itemTypeMultimedia && tItem.Type != itemTypeImage && tItem.Type != itemTypeInteractive &&
tItem.Type != itemTypeDivider {
caser := cases.Title(language.English)
tItem.TypeTitle = caser.String(item.Type)
}
// Process sub-items
for _, subItem := range item.Items {
tSubItem := templateSubItem{
Heading: subItem.Heading,
Paragraph: subItem.Paragraph,
Title: subItem.Title,
Caption: subItem.Caption,
Answers: subItem.Answers,
Feedback: subItem.Feedback,
Media: subItem.Media,
}
// Clean HTML for list items
if tItem.Type == itemTypeList && subItem.Paragraph != "" {
tSubItem.CleanText = htmlCleaner.CleanHTML(subItem.Paragraph)
}
tItem.Items = append(tItem.Items, tSubItem)
}
result = append(result, tItem)
}
return result
}
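Note on the unknown-type check in `prepareItems`: the chain of `!=` comparisons could also be expressed as a set lookup, which may be easier to extend when new item types appear. The sketch below is one possible shape under stated assumptions: `knownItemTypes` and `typeTitle` are hypothetical names, and the title-casing here is an ASCII-only stand-in for the `cases.Title(language.English)` call used in the diff, not a drop-in replacement.

```go
package main

import (
	"fmt"
	"strings"
)

// knownItemTypes lists the item types that have a dedicated template.
// The values mirror the itemType* constants in the diff.
var knownItemTypes = map[string]struct{}{
	"text":           {},
	"list":           {},
	"knowledgecheck": {},
	"multimedia":     {},
	"image":          {},
	"interactive":    {},
	"divider":        {},
}

// typeTitle returns a display title for unknown item types and an empty
// string for known ones. Assumption: type names are ASCII, so slicing
// the first byte is safe; the real code uses golang.org/x/text/cases.
func typeTitle(itemType string) string {
	t := strings.ToLower(itemType)
	if _, known := knownItemTypes[t]; known || t == "" {
		return ""
	}
	return strings.ToUpper(t[:1]) + t[1:]
}

func main() {
	for _, in := range []string{"text", "knowledgeCheck", "timeline", "FLASHCARD"} {
		fmt.Printf("%q -> %q\n", in, typeTitle(in))
	}
}
```

With this layout, supporting a new item type means adding one map entry and one named template, rather than extending a boolean expression.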


@ -1,8 +1,6 @@
// Package exporters provides tests for the HTML exporter.
package exporters
import (
"bytes"
"os"
"path/filepath"
"strings"
@ -30,15 +28,19 @@ func TestNewHTMLExporter(t *testing.T) {
if htmlExporter.htmlCleaner == nil {
t.Error("htmlCleaner should not be nil")
}
if htmlExporter.tmpl == nil {
t.Error("template should not be nil")
}
}
// TestHTMLExporter_GetSupportedFormat tests the GetSupportedFormat method.
func TestHTMLExporter_GetSupportedFormat(t *testing.T) {
// TestHTMLExporter_SupportedFormat tests the SupportedFormat method.
func TestHTMLExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
expected := "html"
result := exporter.GetSupportedFormat()
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
@ -119,6 +121,7 @@ func TestHTMLExporter_Export(t *testing.T) {
}
if !strings.Contains(contentStr, "font-family") {
t.Logf("Generated HTML (first 500 chars):\n%s", contentStr[:min(500, len(contentStr))])
t.Error("Output should contain CSS font-family")
}
}
@ -139,409 +142,7 @@ func TestHTMLExporter_Export_InvalidPath(t *testing.T) {
}
}
// TestHTMLExporter_ProcessTextItem tests the processTextItem method.
func TestHTMLExporter_ProcessTextItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Test Heading</h1>",
Paragraph: "<p>Test paragraph with <strong>bold</strong> text.</p>",
},
{
Paragraph: "<p>Another paragraph.</p>",
},
},
}
exporter.processTextItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "text-item") {
t.Error("Should contain text-item CSS class")
}
if !strings.Contains(result, "Text Content") {
t.Error("Should contain text content heading")
}
if !strings.Contains(result, "<h1>Test Heading</h1>") {
t.Error("Should preserve HTML heading")
}
if !strings.Contains(result, "<strong>bold</strong>") {
t.Error("Should preserve HTML formatting in paragraph")
}
}
// TestHTMLExporter_ProcessListItem tests the processListItem method.
func TestHTMLExporter_ProcessListItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First item</p>"},
{Paragraph: "<p>Second item with <em>emphasis</em></p>"},
{Paragraph: "<p>Third item</p>"},
},
}
exporter.processListItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "list-item") {
t.Error("Should contain list-item CSS class")
}
if !strings.Contains(result, "<ul>") {
t.Error("Should contain unordered list")
}
if !strings.Contains(result, "<li>First item</li>") {
t.Error("Should contain first list item")
}
if !strings.Contains(result, "<li>Second item with emphasis</li>") {
t.Error("Should contain second list item with cleaned HTML")
}
if !strings.Contains(result, "<li>Third item</li>") {
t.Error("Should contain third list item")
}
}
// TestHTMLExporter_ProcessKnowledgeCheckItem tests the processKnowledgeCheckItem method.
func TestHTMLExporter_ProcessKnowledgeCheckItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What is the correct answer?</p>",
Answers: []models.Answer{
{Title: "Wrong answer", Correct: false},
{Title: "Correct answer", Correct: true},
{Title: "Another wrong answer", Correct: false},
},
Feedback: "<p>Great job! This is the feedback.</p>",
},
},
}
exporter.processKnowledgeCheckItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "knowledge-check") {
t.Error("Should contain knowledge-check CSS class")
}
if !strings.Contains(result, "Knowledge Check") {
t.Error("Should contain knowledge check heading")
}
if !strings.Contains(result, "What is the correct answer?") {
t.Error("Should contain question text")
}
if !strings.Contains(result, "Wrong answer") {
t.Error("Should contain first answer")
}
if !strings.Contains(result, "correct-answer") {
t.Error("Should mark correct answer with CSS class")
}
if !strings.Contains(result, "Feedback") {
t.Error("Should contain feedback section")
}
}
// TestHTMLExporter_ProcessMultimediaItem tests the processMultimediaItem method.
func TestHTMLExporter_ProcessMultimediaItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "multimedia",
Items: []models.SubItem{
{
Title: "<p>Video Title</p>",
Media: &models.Media{
Video: &models.VideoMedia{
OriginalUrl: "https://example.com/video.mp4",
Duration: 120,
},
},
Caption: "<p>Video caption</p>",
},
},
}
exporter.processMultimediaItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "multimedia-item") {
t.Error("Should contain multimedia-item CSS class")
}
if !strings.Contains(result, "Media Content") {
t.Error("Should contain media content heading")
}
if !strings.Contains(result, "Video Title") {
t.Error("Should contain video title")
}
if !strings.Contains(result, "https://example.com/video.mp4") {
t.Error("Should contain video URL")
}
if !strings.Contains(result, "120 seconds") {
t.Error("Should contain video duration")
}
if !strings.Contains(result, "Video caption") {
t.Error("Should contain video caption")
}
}
// TestHTMLExporter_ProcessImageItem tests the processImageItem method.
func TestHTMLExporter_ProcessImageItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "image",
Items: []models.SubItem{
{
Media: &models.Media{
Image: &models.ImageMedia{
OriginalUrl: "https://example.com/image.png",
},
},
Caption: "<p>Image caption</p>",
},
},
}
exporter.processImageItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "multimedia-item") {
t.Error("Should contain multimedia-item CSS class")
}
if !strings.Contains(result, "Image") {
t.Error("Should contain image heading")
}
if !strings.Contains(result, "https://example.com/image.png") {
t.Error("Should contain image URL")
}
if !strings.Contains(result, "Image caption") {
t.Error("Should contain image caption")
}
}
// TestHTMLExporter_ProcessInteractiveItem tests the processInteractiveItem method.
func TestHTMLExporter_ProcessInteractiveItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "interactive",
Items: []models.SubItem{
{
Title: "<p>Interactive element title</p>",
Paragraph: "<p>Interactive content description</p>",
},
},
}
exporter.processInteractiveItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "interactive-item") {
t.Error("Should contain interactive-item CSS class")
}
if !strings.Contains(result, "Interactive Content") {
t.Error("Should contain interactive content heading")
}
if !strings.Contains(result, "Interactive element title") {
t.Error("Should contain interactive element title")
}
if !strings.Contains(result, "Interactive content description") {
t.Error("Should contain interactive content description")
}
}
// TestHTMLExporter_ProcessDividerItem tests the processDividerItem method.
func TestHTMLExporter_ProcessDividerItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
exporter.processDividerItem(&buf)
result := buf.String()
expected := " <hr>\n\n"
if result != expected {
t.Errorf("Expected %q, got %q", expected, result)
}
}
// TestHTMLExporter_ProcessUnknownItem tests the processUnknownItem method.
func TestHTMLExporter_ProcessUnknownItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "unknown",
Items: []models.SubItem{
{
Title: "<p>Unknown item title</p>",
Paragraph: "<p>Unknown item content</p>",
},
},
}
exporter.processUnknownItem(&buf, item)
result := buf.String()
if !strings.Contains(result, "unknown-item") {
t.Error("Should contain unknown-item CSS class")
}
if !strings.Contains(result, "Unknown Content") {
t.Error("Should contain unknown content heading")
}
if !strings.Contains(result, "Unknown item title") {
t.Error("Should contain unknown item title")
}
if !strings.Contains(result, "Unknown item content") {
t.Error("Should contain unknown item content")
}
}
// TestHTMLExporter_ProcessAnswers tests the processAnswers method.
func TestHTMLExporter_ProcessAnswers(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
answers := []models.Answer{
{Title: "Answer 1", Correct: false},
{Title: "Answer 2", Correct: true},
{Title: "Answer 3", Correct: false},
}
exporter.processAnswers(&buf, answers)
result := buf.String()
if !strings.Contains(result, "answers") {
t.Error("Should contain answers CSS class")
}
if !strings.Contains(result, "<h5>Answers:</h5>") {
t.Error("Should contain answers heading")
}
if !strings.Contains(result, "<ol>") {
t.Error("Should contain ordered list")
}
if !strings.Contains(result, "<li>Answer 1</li>") {
t.Error("Should contain first answer")
}
if !strings.Contains(result, "correct-answer") {
t.Error("Should mark correct answer with CSS class")
}
if !strings.Contains(result, "<li class=\"correct-answer\">Answer 2</li>") {
t.Error("Should mark correct answer properly")
}
if !strings.Contains(result, "<li>Answer 3</li>") {
t.Error("Should contain third answer")
}
}
// TestHTMLExporter_ProcessItemToHTML_AllTypes tests all item types.
func TestHTMLExporter_ProcessItemToHTML_AllTypes(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
tests := []struct {
name string
itemType string
expectedText string
}{
{
name: "text item",
itemType: "text",
expectedText: "Text Content",
},
{
name: "list item",
itemType: "list",
expectedText: "List",
},
{
name: "knowledge check item",
itemType: "knowledgeCheck",
expectedText: "Knowledge Check",
},
{
name: "multimedia item",
itemType: "multimedia",
expectedText: "Media Content",
},
{
name: "image item",
itemType: "image",
expectedText: "Image",
},
{
name: "interactive item",
itemType: "interactive",
expectedText: "Interactive Content",
},
{
name: "divider item",
itemType: "divider",
expectedText: "<hr>",
},
{
name: "unknown item",
itemType: "unknown",
expectedText: "Unknown Content",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
var buf bytes.Buffer
item := models.Item{
Type: tt.itemType,
Items: []models.SubItem{
{Title: "Test title", Paragraph: "Test content"},
},
}
// Handle empty unknown items
if tt.itemType == "unknown" && tt.expectedText == "" {
item.Items = []models.SubItem{}
}
exporter.processItemToHTML(&buf, item)
result := buf.String()
if tt.expectedText != "" && !strings.Contains(result, tt.expectedText) {
t.Errorf("Expected content to contain: %q\nGot: %q", tt.expectedText, result)
}
})
}
}
// TestHTMLExporter_ComplexCourse tests export of a complex course structure.
// TestHTMLExporter_ComplexCourse tests export of a course with complex content.
func TestHTMLExporter_ComplexCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
@ -743,11 +344,17 @@ func TestHTMLExporter_HTMLCleaning(t *testing.T) {
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Heading with <em>emphasis</em> and &amp; entities</h1>",
Paragraph: "<p>Paragraph with &lt;code&gt; entities and <strong>formatting</strong>.</p>",
Heading: "<h2>HTML Heading</h2>",
Paragraph: "<p>Content with <em>emphasis</em> and <strong>strong</strong> text.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>List item with <b>bold</b> text</p>"},
},
},
},
},
},
@ -762,13 +369,6 @@ func TestHTMLExporter_HTMLCleaning(t *testing.T) {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created (basic check that HTML handling didn't break export)
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
// Read content and verify some HTML is preserved (descriptions, headings, paragraphs)
// while list items are cleaned for safety
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
@ -776,19 +376,23 @@ func TestHTMLExporter_HTMLCleaning(t *testing.T) {
contentStr := string(content)
// HTML should be preserved in some places
// HTML content in descriptions should be preserved
if !strings.Contains(contentStr, "<b>bold</b>") {
t.Error("Should preserve HTML formatting in descriptions")
}
if !strings.Contains(contentStr, "<h1>Heading with <em>emphasis</em>") {
// HTML content in headings should be preserved
if !strings.Contains(contentStr, "<h2>HTML Heading</h2>") {
t.Error("Should preserve HTML in headings")
}
if !strings.Contains(contentStr, "<strong>formatting</strong>") {
t.Error("Should preserve HTML in paragraphs")
// List items should have HTML tags stripped (cleaned)
if !strings.Contains(contentStr, "List item with bold text") {
t.Error("Should clean HTML from list items")
}
}
// createTestCourseForHTML creates a test course for HTML export testing.
// createTestCourseForHTML creates a test course for HTML export tests.
func createTestCourseForHTML() *models.Course {
return &models.Course{
ShareID: "test-share-id",
@ -838,37 +442,13 @@ func BenchmarkHTMLExporter_Export(b *testing.B) {
exporter := NewHTMLExporter(htmlCleaner)
course := createTestCourseForHTML()
// Create temporary directory
tempDir := b.TempDir()
b.ResetTimer()
for i := 0; i < b.N; i++ {
outputPath := filepath.Join(tempDir, "benchmark-course.html")
_ = exporter.Export(course, outputPath)
// Clean up for next iteration
os.Remove(outputPath)
for i := range b.N {
outputPath := filepath.Join(tempDir, "bench-course-"+string(rune(i))+".html")
if err := exporter.Export(course, outputPath); err != nil {
b.Fatalf("Export failed: %v", err)
}
}
// BenchmarkHTMLExporter_ProcessTextItem benchmarks text item processing.
func BenchmarkHTMLExporter_ProcessTextItem(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &HTMLExporter{htmlCleaner: htmlCleaner}
item := models.Item{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Benchmark Heading</h1>",
Paragraph: "<p>Benchmark paragraph with <strong>formatting</strong>.</p>",
},
},
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
var buf bytes.Buffer
exporter.processTextItem(&buf, item)
}
}
@ -889,39 +469,40 @@ func BenchmarkHTMLExporter_ComplexCourse(b *testing.B) {
}
// Fill with test data
for i := 0; i < 10; i++ {
for i := range 10 {
lesson := models.Lesson{
ID: "lesson-" + string(rune(i)),
Title: "Lesson " + string(rune(i)),
Title: "Benchmark Lesson " + string(rune(i)),
Type: "lesson",
Items: make([]models.Item, 5), // 5 items per lesson
}
for j := 0; j < 5; j++ {
item := models.Item{
Description: "<p>Lesson description</p>",
Items: []models.Item{
{
Type: "text",
Items: make([]models.SubItem, 3), // 3 sub-items per item
Items: []models.SubItem{
{
Heading: "<h2>Heading</h2>",
Paragraph: "<p>Paragraph with content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>Item 1</p>"},
{Paragraph: "<p>Item 2</p>"},
},
},
},
}
for k := 0; k < 3; k++ {
item.Items[k] = models.SubItem{
Heading: "<h3>Heading " + string(rune(k)) + "</h3>",
Paragraph: "<p>Paragraph content with <strong>formatting</strong> for performance testing.</p>",
}
}
lesson.Items[j] = item
}
course.Course.Lessons[i] = lesson
}
tempDir := b.TempDir()
b.ResetTimer()
for i := 0; i < b.N; i++ {
outputPath := filepath.Join(tempDir, "benchmark-complex.html")
_ = exporter.Export(course, outputPath)
os.Remove(outputPath)
for i := range b.N {
outputPath := filepath.Join(tempDir, "bench-complex-"+string(rune(i))+".html")
if err := exporter.Export(course, outputPath); err != nil {
b.Fatalf("Export failed: %v", err)
}
}
}
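
Note on the names built in these benchmarks: in Go, `string(rune(i))` yields the Unicode code point with value `i`, not its decimal digits, so `string(rune(3))` is the control character U+0003 rather than "3". If decimal suffixes were intended (an assumption; the diff does not say), `strconv.Itoa` produces them. A minimal sketch, where `benchFileName` is a hypothetical helper that does not appear in the diff:

```go
package main

import (
	"fmt"
	"strconv"
)

// benchFileName is a hypothetical helper (not part of the diff) that builds
// a per-iteration file name with a decimal suffix.
func benchFileName(i int) string {
	return "bench-course-" + strconv.Itoa(i) + ".html"
}

func main() {
	// string(rune(3)) is the code point U+0003, not the digit "3".
	fmt.Printf("%q\n", "bench-course-"+string(rune(3))+".html")
	// strconv.Itoa gives the decimal representation.
	fmt.Printf("%q\n", benchFileName(3))
}
```

The same consideration would apply to the `"lesson-" + string(rune(i))` IDs in the complex-course benchmark.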


@ -1,5 +1,3 @@
// Package exporters provides implementations of the Exporter interface
// for converting Articulate Rise courses into various file formats.
package exporters
import (
@ -8,6 +6,9 @@ import (
"os"
"strings"
"golang.org/x/text/cases"
"golang.org/x/text/language"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
@ -34,16 +35,7 @@ func NewMarkdownExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter
}
}
// Export exports a course to Markdown format.
// It generates a structured Markdown document from the course data
// and writes it to the specified output path.
//
// Parameters:
// - course: The course data model to export
// - outputPath: The file path where the Markdown content will be written
//
// Returns:
// - An error if writing to the output file fails
// Export converts the course to Markdown format and writes it to the output path.
func (e *MarkdownExporter) Export(course *models.Course, outputPath string) error {
var buf bytes.Buffer
@ -87,26 +79,20 @@ func (e *MarkdownExporter) Export(course *models.Course, outputPath string) erro
buf.WriteString("\n---\n\n")
}
return os.WriteFile(outputPath, buf.Bytes(), 0644)
// #nosec G306 - 0644 is appropriate for export files that should be readable by others
if err := os.WriteFile(outputPath, buf.Bytes(), 0o644); err != nil {
return fmt.Errorf("failed to write markdown file: %w", err)
}
return nil
}
// GetSupportedFormat returns the format name this exporter supports
// It indicates the file format that the MarkdownExporter can generate.
//
// Returns:
// - A string representing the supported format ("markdown")
func (e *MarkdownExporter) GetSupportedFormat() string {
return "markdown"
// SupportedFormat returns "markdown".
func (e *MarkdownExporter) SupportedFormat() string {
return FormatMarkdown
}
// processItemToMarkdown converts a course item into Markdown format
// and appends it to the provided buffer. It handles different item types
// with appropriate Markdown formatting.
//
// Parameters:
// - buf: The buffer to write the Markdown content to
// - item: The course item to process
// - level: The heading level for the item (determines the number of # characters)
// processItemToMarkdown converts a course item into Markdown format.
// The level parameter determines the heading level (number of # characters).
func (e *MarkdownExporter) processItemToMarkdown(buf *bytes.Buffer, item models.Item, level int) {
headingPrefix := strings.Repeat("#", level)
@ -130,47 +116,47 @@ func (e *MarkdownExporter) processItemToMarkdown(buf *bytes.Buffer, item models.
}
}
// processTextItem handles text content with headings and paragraphs
// processTextItem handles text content with headings and paragraphs.
func (e *MarkdownExporter) processTextItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
for _, subItem := range item.Items {
if subItem.Heading != "" {
heading := e.htmlCleaner.CleanHTML(subItem.Heading)
if heading != "" {
buf.WriteString(fmt.Sprintf("%s %s\n\n", headingPrefix, heading))
fmt.Fprintf(buf, "%s %s\n\n", headingPrefix, heading)
}
}
if subItem.Paragraph != "" {
paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
if paragraph != "" {
buf.WriteString(fmt.Sprintf("%s\n\n", paragraph))
fmt.Fprintf(buf, "%s\n\n", paragraph)
}
}
}
}
// processListItem handles list items with bullet points
// processListItem handles list items with bullet points.
func (e *MarkdownExporter) processListItem(buf *bytes.Buffer, item models.Item) {
for _, subItem := range item.Items {
if subItem.Paragraph != "" {
paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
if paragraph != "" {
buf.WriteString(fmt.Sprintf("- %s\n", paragraph))
fmt.Fprintf(buf, "- %s\n", paragraph)
}
}
}
buf.WriteString("\n")
}
// processMultimediaItem handles multimedia content including videos and images
// processMultimediaItem handles multimedia content including videos and images.
func (e *MarkdownExporter) processMultimediaItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
buf.WriteString(fmt.Sprintf("%s Media Content\n\n", headingPrefix))
fmt.Fprintf(buf, "%s Media Content\n\n", headingPrefix)
for _, subItem := range item.Items {
e.processMediaSubItem(buf, subItem)
}
buf.WriteString("\n")
}
// processMediaSubItem processes individual media items (video/image)
// processMediaSubItem processes individual media items (video/image).
func (e *MarkdownExporter) processMediaSubItem(buf *bytes.Buffer, subItem models.SubItem) {
if subItem.Media != nil {
e.processVideoMedia(buf, subItem.Media)
@ -178,67 +164,67 @@ func (e *MarkdownExporter) processMediaSubItem(buf *bytes.Buffer, subItem models
}
if subItem.Caption != "" {
caption := e.htmlCleaner.CleanHTML(subItem.Caption)
buf.WriteString(fmt.Sprintf("*%s*\n", caption))
fmt.Fprintf(buf, "*%s*\n", caption)
}
}
// processVideoMedia processes video media content
// processVideoMedia processes video media content.
func (e *MarkdownExporter) processVideoMedia(buf *bytes.Buffer, media *models.Media) {
if media.Video != nil {
buf.WriteString(fmt.Sprintf("**Video**: %s\n", media.Video.OriginalUrl))
fmt.Fprintf(buf, "**Video**: %s\n", media.Video.OriginalURL)
if media.Video.Duration > 0 {
buf.WriteString(fmt.Sprintf("**Duration**: %d seconds\n", media.Video.Duration))
fmt.Fprintf(buf, "**Duration**: %d seconds\n", media.Video.Duration)
}
}
}
// processImageMedia processes image media content
// processImageMedia processes image media content.
func (e *MarkdownExporter) processImageMedia(buf *bytes.Buffer, media *models.Media) {
if media.Image != nil {
buf.WriteString(fmt.Sprintf("**Image**: %s\n", media.Image.OriginalUrl))
fmt.Fprintf(buf, "**Image**: %s\n", media.Image.OriginalURL)
}
}
// processImageItem handles standalone image items
// processImageItem handles standalone image items.
func (e *MarkdownExporter) processImageItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
buf.WriteString(fmt.Sprintf("%s Image\n\n", headingPrefix))
fmt.Fprintf(buf, "%s Image\n\n", headingPrefix)
for _, subItem := range item.Items {
if subItem.Media != nil && subItem.Media.Image != nil {
buf.WriteString(fmt.Sprintf("**Image**: %s\n", subItem.Media.Image.OriginalUrl))
fmt.Fprintf(buf, "**Image**: %s\n", subItem.Media.Image.OriginalURL)
}
if subItem.Caption != "" {
caption := e.htmlCleaner.CleanHTML(subItem.Caption)
buf.WriteString(fmt.Sprintf("*%s*\n", caption))
fmt.Fprintf(buf, "*%s*\n", caption)
}
}
buf.WriteString("\n")
}
// processKnowledgeCheckItem handles quiz questions and knowledge checks
// processKnowledgeCheckItem handles quiz questions and knowledge checks.
func (e *MarkdownExporter) processKnowledgeCheckItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
buf.WriteString(fmt.Sprintf("%s Knowledge Check\n\n", headingPrefix))
fmt.Fprintf(buf, "%s Knowledge Check\n\n", headingPrefix)
for _, subItem := range item.Items {
e.processQuestionSubItem(buf, subItem)
}
buf.WriteString("\n")
}
// processQuestionSubItem processes individual question items
// processQuestionSubItem processes individual question items.
func (e *MarkdownExporter) processQuestionSubItem(buf *bytes.Buffer, subItem models.SubItem) {
if subItem.Title != "" {
title := e.htmlCleaner.CleanHTML(subItem.Title)
buf.WriteString(fmt.Sprintf("**Question**: %s\n\n", title))
fmt.Fprintf(buf, "**Question**: %s\n\n", title)
}
e.processAnswers(buf, subItem.Answers)
if subItem.Feedback != "" {
feedback := e.htmlCleaner.CleanHTML(subItem.Feedback)
buf.WriteString(fmt.Sprintf("\n**Feedback**: %s\n", feedback))
fmt.Fprintf(buf, "\n**Feedback**: %s\n", feedback)
}
}
// processAnswers processes answer choices for quiz questions
// processAnswers processes answer choices for quiz questions.
func (e *MarkdownExporter) processAnswers(buf *bytes.Buffer, answers []models.Answer) {
buf.WriteString("**Answers**:\n")
for i, answer := range answers {
@ -246,44 +232,45 @@ func (e *MarkdownExporter) processAnswers(buf *bytes.Buffer, answers []models.An
if answer.Correct {
correctMark = " ✓"
}
buf.WriteString(fmt.Sprintf("%d. %s%s\n", i+1, answer.Title, correctMark))
fmt.Fprintf(buf, "%d. %s%s\n", i+1, answer.Title, correctMark)
}
}
// processInteractiveItem handles interactive content
// processInteractiveItem handles interactive content.
func (e *MarkdownExporter) processInteractiveItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
buf.WriteString(fmt.Sprintf("%s Interactive Content\n\n", headingPrefix))
fmt.Fprintf(buf, "%s Interactive Content\n\n", headingPrefix)
for _, subItem := range item.Items {
if subItem.Title != "" {
title := e.htmlCleaner.CleanHTML(subItem.Title)
buf.WriteString(fmt.Sprintf("**%s**\n\n", title))
fmt.Fprintf(buf, "**%s**\n\n", title)
}
}
}
// processDividerItem handles divider elements
// processDividerItem handles divider elements.
func (e *MarkdownExporter) processDividerItem(buf *bytes.Buffer) {
buf.WriteString("---\n\n")
}
// processUnknownItem handles unknown or unsupported item types
// processUnknownItem handles unknown or unsupported item types.
func (e *MarkdownExporter) processUnknownItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
if len(item.Items) > 0 {
buf.WriteString(fmt.Sprintf("%s %s Content\n\n", headingPrefix, strings.Title(item.Type)))
caser := cases.Title(language.English)
fmt.Fprintf(buf, "%s %s Content\n\n", headingPrefix, caser.String(item.Type))
for _, subItem := range item.Items {
e.processGenericSubItem(buf, subItem)
}
}
}
// processGenericSubItem processes sub-items for unknown types
// processGenericSubItem processes sub-items for unknown types.
func (e *MarkdownExporter) processGenericSubItem(buf *bytes.Buffer, subItem models.SubItem) {
if subItem.Title != "" {
title := e.htmlCleaner.CleanHTML(subItem.Title)
buf.WriteString(fmt.Sprintf("**%s**\n\n", title))
fmt.Fprintf(buf, "**%s**\n\n", title)
}
if subItem.Paragraph != "" {
paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
buf.WriteString(fmt.Sprintf("%s\n\n", paragraph))
fmt.Fprintf(buf, "%s\n\n", paragraph)
}
}


@ -1,4 +1,3 @@
// Package exporters_test provides tests for the markdown exporter.
package exporters
import (
@ -32,13 +31,13 @@ func TestNewMarkdownExporter(t *testing.T) {
}
}
// TestMarkdownExporter_GetSupportedFormat tests the GetSupportedFormat method.
func TestMarkdownExporter_GetSupportedFormat(t *testing.T) {
// TestMarkdownExporter_SupportedFormat tests the SupportedFormat method.
func TestMarkdownExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
expected := "markdown"
result := exporter.GetSupportedFormat()
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
@ -188,7 +187,7 @@ func TestMarkdownExporter_ProcessMultimediaItem(t *testing.T) {
{
Media: &models.Media{
Video: &models.VideoMedia{
OriginalUrl: "https://example.com/video.mp4",
OriginalURL: "https://example.com/video.mp4",
Duration: 120,
},
},
@ -227,7 +226,7 @@ func TestMarkdownExporter_ProcessImageItem(t *testing.T) {
{
Media: &models.Media{
Image: &models.ImageMedia{
OriginalUrl: "https://example.com/image.jpg",
OriginalURL: "https://example.com/image.jpg",
},
},
Caption: "<p>Image caption</p>",
@ -372,7 +371,7 @@ func TestMarkdownExporter_ProcessVideoMedia(t *testing.T) {
var buf bytes.Buffer
media := &models.Media{
Video: &models.VideoMedia{
OriginalUrl: "https://example.com/video.mp4",
OriginalURL: "https://example.com/video.mp4",
Duration: 300,
},
}
@ -397,7 +396,7 @@ func TestMarkdownExporter_ProcessImageMedia(t *testing.T) {
var buf bytes.Buffer
media := &models.Media{
Image: &models.ImageMedia{
OriginalUrl: "https://example.com/image.jpg",
OriginalURL: "https://example.com/image.jpg",
},
}
@ -661,12 +660,13 @@ func BenchmarkMarkdownExporter_Export(b *testing.B) {
// Create temporary directory
tempDir := b.TempDir()
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-course.md")
_ = exporter.Export(course, outputPath)
// Clean up for next iteration
os.Remove(outputPath)
// Clean up for next iteration. Remove errors are ignored because we've already
// benchmarked the export operation; cleanup failures don't affect the benchmark
// measurements or the validity of the next iteration's export.
_ = os.Remove(outputPath)
}
}
@ -685,8 +685,7 @@ func BenchmarkMarkdownExporter_ProcessTextItem(b *testing.B) {
},
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
var buf bytes.Buffer
exporter.processTextItem(&buf, item, "###")
}
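
The benchmarks above move from `for i := 0; i < b.N; i++` to `for b.Loop()`, which is available from Go 1.24. `b.Loop` resets the benchmark timer on its first call, so setup done before the loop is not measured; the `b.ResetTimer()` calls kept in the diff are therefore redundant but harmless. A minimal sketch, with `renderParagraph` and `BenchmarkRenderParagraph` as hypothetical names, run through `testing.Benchmark` so it works outside `go test`:

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// renderParagraph is a hypothetical stand-in for the work being measured.
func renderParagraph(buf *bytes.Buffer, text string) {
	fmt.Fprintf(buf, "%s\n\n", text)
}

// BenchmarkRenderParagraph uses b.Loop (Go 1.24+) instead of counting to b.N.
func BenchmarkRenderParagraph(b *testing.B) {
	text := "Benchmark paragraph" // setup: excluded from timing by b.Loop
	for b.Loop() {
		var buf bytes.Buffer
		renderParagraph(&buf, text)
	}
}

func main() {
	// testing.Benchmark runs a single benchmark function without `go test`.
	res := testing.Benchmark(BenchmarkRenderParagraph)
	fmt.Println("iterations:", res.N)
}
```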

Binary file not shown.


@ -0,0 +1,11 @@
# Example Course
Course description
## Course Information
- **Course ID**:
- **Share ID**: example-id
- **Navigation Mode**:
---


@ -1,5 +1,3 @@
// Package interfaces provides the core contracts for the articulate-parser application.
// It defines interfaces for parsing and exporting Articulate Rise courses.
package interfaces
import "github.com/kjanat/articulate-parser/internal/models"
@ -12,9 +10,9 @@ type Exporter interface {
// specified output path. It returns an error if the export operation fails.
Export(course *models.Course, outputPath string) error
// GetSupportedFormat returns the name of the format this exporter supports.
// SupportedFormat returns the name of the format this exporter supports.
// This is used to identify which exporter to use for a given format.
GetSupportedFormat() string
SupportedFormat() string
}
// ExporterFactory creates exporters for different formats.
@ -25,7 +23,7 @@ type ExporterFactory interface {
// It returns the appropriate exporter or an error if the format is not supported.
CreateExporter(format string) (Exporter, error)
// GetSupportedFormats returns a list of all export formats supported by this factory.
// SupportedFormats returns a list of all export formats supported by this factory.
// This is used to inform users of available export options.
GetSupportedFormats() []string
SupportedFormats() []string
}


@ -0,0 +1,25 @@
package interfaces
import "context"
// Logger defines the interface for structured logging.
// Implementations should provide leveled, structured logging capabilities.
type Logger interface {
// Debug logs a debug-level message with optional key-value pairs.
Debug(msg string, keysAndValues ...any)
// Info logs an info-level message with optional key-value pairs.
Info(msg string, keysAndValues ...any)
// Warn logs a warning-level message with optional key-value pairs.
Warn(msg string, keysAndValues ...any)
// Error logs an error-level message with optional key-value pairs.
Error(msg string, keysAndValues ...any)
// With returns a new logger with the given key-value pairs added as context.
With(keysAndValues ...any) Logger
// WithContext returns a new logger with context information.
WithContext(ctx context.Context) Logger
}


@ -2,7 +2,11 @@
// It defines interfaces for parsing and exporting Articulate Rise courses.
package interfaces
import "github.com/kjanat/articulate-parser/internal/models"
import (
"context"
"github.com/kjanat/articulate-parser/internal/models"
)
// CourseParser defines the interface for loading course data.
// It provides methods to fetch course content either from a remote URI
@ -10,8 +14,9 @@ import "github.com/kjanat/articulate-parser/internal/models"
type CourseParser interface {
// FetchCourse loads a course from a URI (typically an Articulate Rise share URL).
// It retrieves the course data from the remote location and returns a parsed Course model.
// The context can be used for cancellation and timeout control.
// Returns an error if the fetch operation fails or if the data cannot be parsed.
FetchCourse(uri string) (*models.Course, error)
FetchCourse(ctx context.Context, uri string) (*models.Course, error)
// LoadCourseFromFile loads a course from a local file.
// It reads and parses the course data from the specified file path.


@ -1,5 +1,3 @@
// Package models defines the data structures representing Articulate Rise courses.
// These structures closely match the JSON format used by Articulate Rise.
package models
// Lesson represents a single lesson or section within an Articulate Rise course.
@ -18,7 +16,7 @@ type Lesson struct {
// Items is an ordered array of content items within the lesson
Items []Item `json:"items"`
// Position stores the ordering information for the lesson
Position interface{} `json:"position"`
Position any `json:"position"`
// Ready indicates whether the lesson is marked as complete
Ready bool `json:"ready"`
// CreatedAt is the timestamp when the lesson was created
@ -41,9 +39,9 @@ type Item struct {
// Items contains the actual content elements (sub-items) of this item
Items []SubItem `json:"items"`
// Settings contains configuration options specific to this item type
Settings interface{} `json:"settings"`
Settings any `json:"settings"`
// Data contains additional structured data for the item
Data interface{} `json:"data"`
Data any `json:"data"`
// Media contains any associated media for the item
Media *Media `json:"media,omitempty"`
}
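
Since `any` is an alias for `interface{}` (Go 1.18+), the rename above does not change behavior. One property of these fields worth keeping in mind is that `encoding/json` decodes JSON numbers into `float64` when the target is `any`, which is presumably why the model tests compare maps with helpers that account for JSON type conversion. A minimal sketch, with `decodePosition` as a hypothetical helper:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodePosition is a hypothetical helper that decodes raw JSON into an
// untyped value, as happens for fields declared as `any`.
func decodePosition(raw string) (any, error) {
	var v any
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return nil, fmt.Errorf("failed to decode position: %w", err)
	}
	return v, nil
}

func main() {
	v, err := decodePosition(`{"x": 1, "y": 2}`)
	if err != nil {
		fmt.Println(err)
		return
	}
	// JSON objects become map[string]any; JSON numbers become float64.
	if m, ok := v.(map[string]any); ok {
		fmt.Printf("%T\n", m["x"])
	}
}
```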


@ -1,5 +1,3 @@
// Package models defines the data structures representing Articulate Rise courses.
// These structures closely match the JSON format used by Articulate Rise.
package models
// Media represents a media element that can be either an image or a video.
@ -23,8 +21,8 @@ type ImageMedia struct {
Height int `json:"height,omitempty"`
// CrushedKey is the identifier for a compressed version of the image
CrushedKey string `json:"crushedKey,omitempty"`
// OriginalUrl is the URL to the full-resolution image
OriginalUrl string `json:"originalUrl"`
// OriginalURL is the URL to the full-resolution image
OriginalURL string `json:"originalUrl"`
// UseCrushedKey indicates whether to use the compressed version
UseCrushedKey bool `json:"useCrushedKey,omitempty"`
}
@ -45,6 +43,6 @@ type VideoMedia struct {
InputKey string `json:"inputKey,omitempty"`
// Thumbnail is the URL to a smaller preview image
Thumbnail string `json:"thumbnail,omitempty"`
// OriginalUrl is the URL to the source video file
OriginalUrl string `json:"originalUrl"`
// OriginalURL is the URL to the source video file
OriginalURL string `json:"originalUrl"`
}


@ -1,4 +1,3 @@
// Package models_test provides tests for the data models.
package models
import (
@ -98,7 +97,7 @@ func TestCourseInfo_JSONMarshalUnmarshal(t *testing.T) {
Type: "jpg",
Width: 800,
Height: 600,
OriginalUrl: "https://example.com/image.jpg",
OriginalURL: "https://example.com/image.jpg",
},
},
}
@ -133,7 +132,7 @@ func TestLesson_JSONMarshalUnmarshal(t *testing.T) {
Ready: true,
CreatedAt: "2023-06-01T12:00:00Z",
UpdatedAt: "2023-06-01T13:00:00Z",
Position: map[string]interface{}{"x": 1, "y": 2},
Position: map[string]any{"x": 1, "y": 2},
Items: []Item{
{
ID: "item-test",
@ -149,13 +148,13 @@ func TestLesson_JSONMarshalUnmarshal(t *testing.T) {
URL: "https://example.com/video.mp4",
Type: "mp4",
Duration: 120,
OriginalUrl: "https://example.com/video.mp4",
OriginalURL: "https://example.com/video.mp4",
},
},
},
},
Settings: map[string]interface{}{"autoplay": false},
Data: map[string]interface{}{"metadata": "test"},
Settings: map[string]any{"autoplay": false},
Data: map[string]any{"metadata": "test"},
},
},
}
@ -197,11 +196,11 @@ func TestItem_JSONMarshalUnmarshal(t *testing.T) {
Feedback: "Well done!",
},
},
Settings: map[string]interface{}{
Settings: map[string]any{
"allowRetry": true,
"showAnswer": true,
},
Data: map[string]interface{}{
Data: map[string]any{
"points": 10,
"weight": 1.5,
},
@ -244,7 +243,7 @@ func TestSubItem_JSONMarshalUnmarshal(t *testing.T) {
Type: "png",
Width: 400,
Height: 300,
OriginalUrl: "https://example.com/subitem.png",
OriginalURL: "https://example.com/subitem.png",
CrushedKey: "crushed-123",
UseCrushedKey: true,
},
@ -305,7 +304,7 @@ func TestMedia_JSONMarshalUnmarshal(t *testing.T) {
Type: "jpeg",
Width: 1200,
Height: 800,
OriginalUrl: "https://example.com/media.jpg",
OriginalURL: "https://example.com/media.jpg",
CrushedKey: "crushed-media",
UseCrushedKey: false,
},
@ -336,7 +335,7 @@ func TestMedia_JSONMarshalUnmarshal(t *testing.T) {
Poster: "https://example.com/poster.jpg",
Thumbnail: "https://example.com/thumb.jpg",
InputKey: "input-123",
OriginalUrl: "https://example.com/original.mp4",
OriginalURL: "https://example.com/original.mp4",
},
}
@ -363,7 +362,7 @@ func TestImageMedia_JSONMarshalUnmarshal(t *testing.T) {
Type: "gif",
Width: 640,
Height: 480,
OriginalUrl: "https://example.com/image.gif",
OriginalURL: "https://example.com/image.gif",
CrushedKey: "crushed-gif",
UseCrushedKey: true,
}
@ -397,7 +396,7 @@ func TestVideoMedia_JSONMarshalUnmarshal(t *testing.T) {
Poster: "https://example.com/poster.jpg",
Thumbnail: "https://example.com/thumbnail.jpg",
InputKey: "upload-456",
OriginalUrl: "https://example.com/original.webm",
OriginalURL: "https://example.com/original.webm",
}
// Marshal to JSON
@ -475,7 +474,7 @@ func TestLabelSet_JSONMarshalUnmarshal(t *testing.T) {
func TestEmptyStructures(t *testing.T) {
testCases := []struct {
name string
data interface{}
data any
}{
{"Empty Course", Course{}},
{"Empty CourseInfo", CourseInfo{}},
@ -569,7 +568,7 @@ func TestNilPointerSafety(t *testing.T) {
// TestJSONTagsPresence tests that JSON tags are properly defined.
func TestJSONTagsPresence(t *testing.T) {
// Test that important fields have JSON tags
courseType := reflect.TypeOf(Course{})
courseType := reflect.TypeFor[Course]()
if courseType.Kind() == reflect.Struct {
field, found := courseType.FieldByName("ShareID")
if !found {
@ -586,7 +585,7 @@ func TestJSONTagsPresence(t *testing.T) {
}
// Test CourseInfo
courseInfoType := reflect.TypeOf(CourseInfo{})
courseInfoType := reflect.TypeFor[CourseInfo]()
if courseInfoType.Kind() == reflect.Struct {
field, found := courseInfoType.FieldByName("NavigationMode")
if !found {
@ -626,8 +625,7 @@ func BenchmarkCourse_JSONMarshal(b *testing.B) {
},
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
_, _ = json.Marshal(course)
}
}
@ -660,17 +658,16 @@ func BenchmarkCourse_JSONUnmarshal(b *testing.B) {
jsonData, _ := json.Marshal(course)
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
var result Course
_ = json.Unmarshal(jsonData, &result)
}
}
// compareMaps compares two interface{} values that should be maps
func compareMaps(original, unmarshaled interface{}) bool {
origMap, origOk := original.(map[string]interface{})
unMap, unOk := unmarshaled.(map[string]interface{})
// compareMaps compares two any values that should be maps.
func compareMaps(original, unmarshaled any) bool {
origMap, origOk := original.(map[string]any)
unMap, unOk := unmarshaled.(map[string]any)
if !origOk || !unOk {
// If not maps, use deep equal
@ -714,7 +711,7 @@ func compareMaps(original, unmarshaled interface{}) bool {
return true
}
// compareLessons compares two Lesson structs accounting for JSON type conversion
// compareLessons compares two Lesson structs accounting for JSON type conversion.
func compareLessons(original, unmarshaled Lesson) bool {
// Compare all fields except Position and Items
if original.ID != unmarshaled.ID ||
@ -737,7 +734,7 @@ func compareLessons(original, unmarshaled Lesson) bool {
return compareItems(original.Items, unmarshaled.Items)
}
// compareItems compares two Item slices accounting for JSON type conversion
// compareItems compares two Item slices accounting for JSON type conversion.
func compareItems(original, unmarshaled []Item) bool {
if len(original) != len(unmarshaled) {
return false
@ -751,7 +748,7 @@ func compareItems(original, unmarshaled []Item) bool {
return true
}
// compareItem compares two Item structs accounting for JSON type conversion
// compareItem compares two Item structs accounting for JSON type conversion.
func compareItem(original, unmarshaled Item) bool {
// Compare basic fields
if original.ID != unmarshaled.ID ||


@ -3,6 +3,7 @@
package services
import (
"context"
"fmt"
"github.com/kjanat/articulate-parser/internal/interfaces"
@ -44,8 +45,8 @@ func (a *App) ProcessCourseFromFile(filePath, format, outputPath string) error {
// ProcessCourseFromURI fetches a course from the provided URI and exports it to the specified format.
// It takes the URI to fetch the course from, the desired export format, and the output file path.
// Returns an error if fetching or exporting fails.
func (a *App) ProcessCourseFromURI(uri, format, outputPath string) error {
course, err := a.parser.FetchCourse(uri)
func (a *App) ProcessCourseFromURI(ctx context.Context, uri, format, outputPath string) error {
course, err := a.parser.FetchCourse(ctx, uri)
if err != nil {
return fmt.Errorf("failed to fetch course: %w", err)
}
@ -69,8 +70,8 @@ func (a *App) exportCourse(course *models.Course, format, outputPath string) err
return nil
}
// GetSupportedFormats returns a list of all export formats supported by the application.
// SupportedFormats returns a list of all export formats supported by the application.
// This information is provided by the ExporterFactory.
func (a *App) GetSupportedFormats() []string {
return a.exporterFactory.GetSupportedFormats()
func (a *App) SupportedFormats() []string {
return a.exporterFactory.SupportedFormats()
}


@ -1,7 +1,7 @@
// Package services_test provides tests for the services package.
package services
import (
"context"
"errors"
"testing"
@ -11,13 +11,13 @@ import (
// MockCourseParser is a mock implementation of interfaces.CourseParser for testing.
type MockCourseParser struct {
mockFetchCourse func(uri string) (*models.Course, error)
mockFetchCourse func(ctx context.Context, uri string) (*models.Course, error)
mockLoadCourseFromFile func(filePath string) (*models.Course, error)
}
func (m *MockCourseParser) FetchCourse(uri string) (*models.Course, error) {
func (m *MockCourseParser) FetchCourse(ctx context.Context, uri string) (*models.Course, error) {
if m.mockFetchCourse != nil {
return m.mockFetchCourse(uri)
return m.mockFetchCourse(ctx, uri)
}
return nil, errors.New("not implemented")
}
@ -32,7 +32,7 @@ func (m *MockCourseParser) LoadCourseFromFile(filePath string) (*models.Course,
// MockExporter is a mock implementation of interfaces.Exporter for testing.
type MockExporter struct {
mockExport func(course *models.Course, outputPath string) error
mockGetSupportedFormat func() string
mockSupportedFormat func() string
}
func (m *MockExporter) Export(course *models.Course, outputPath string) error {
@ -42,9 +42,9 @@ func (m *MockExporter) Export(course *models.Course, outputPath string) error {
return nil
}
func (m *MockExporter) GetSupportedFormat() string {
if m.mockGetSupportedFormat != nil {
return m.mockGetSupportedFormat()
func (m *MockExporter) SupportedFormat() string {
if m.mockSupportedFormat != nil {
return m.mockSupportedFormat()
}
return "mock"
}
@ -52,7 +52,7 @@ func (m *MockExporter) GetSupportedFormat() string {
// MockExporterFactory is a mock implementation of interfaces.ExporterFactory for testing.
type MockExporterFactory struct {
mockCreateExporter func(format string) (*MockExporter, error)
mockGetSupportedFormats func() []string
mockSupportedFormats func() []string
}
func (m *MockExporterFactory) CreateExporter(format string) (interfaces.Exporter, error) {
@ -63,9 +63,9 @@ func (m *MockExporterFactory) CreateExporter(format string) (interfaces.Exporter
return &MockExporter{}, nil
}
func (m *MockExporterFactory) GetSupportedFormats() []string {
if m.mockGetSupportedFormats != nil {
return m.mockGetSupportedFormats()
func (m *MockExporterFactory) SupportedFormats() []string {
if m.mockSupportedFormats != nil {
return m.mockSupportedFormats()
}
return []string{"mock"}
}
@ -119,7 +119,7 @@ func TestNewApp(t *testing.T) {
}
// Test that the factory is set (we can't directly compare interface values)
formats := app.GetSupportedFormats()
formats := app.SupportedFormats()
if len(formats) == 0 {
t.Error("App exporterFactory was not set correctly - no supported formats")
}
@ -216,11 +216,9 @@ func TestApp_ProcessCourseFromFile(t *testing.T) {
if !contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else {
if err != nil {
} else if err != nil {
t.Errorf("Expected no error, got: %v", err)
}
}
})
}
}
@ -243,7 +241,7 @@ func TestApp_ProcessCourseFromURI(t *testing.T) {
format: "docx",
outputPath: "output.docx",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockFetchCourse = func(uri string) (*models.Course, error) {
parser.mockFetchCourse = func(ctx context.Context, uri string) (*models.Course, error) {
if uri != "https://rise.articulate.com/share/test123" {
t.Errorf("Expected uri 'https://rise.articulate.com/share/test123', got '%s'", uri)
}
@ -271,7 +269,7 @@ func TestApp_ProcessCourseFromURI(t *testing.T) {
format: "docx",
outputPath: "output.docx",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockFetchCourse = func(uri string) (*models.Course, error) {
parser.mockFetchCourse = func(ctx context.Context, uri string) (*models.Course, error) {
return nil, errors.New("network error")
}
},
@ -288,7 +286,7 @@ func TestApp_ProcessCourseFromURI(t *testing.T) {
tt.setupMocks(parser, factory, exporter)
app := NewApp(parser, factory)
err := app.ProcessCourseFromURI(tt.uri, tt.format, tt.outputPath)
err := app.ProcessCourseFromURI(context.Background(), tt.uri, tt.format, tt.outputPath)
if tt.expectedError != "" {
if err == nil {
@ -297,28 +295,26 @@ func TestApp_ProcessCourseFromURI(t *testing.T) {
if !contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else {
if err != nil {
} else if err != nil {
t.Errorf("Expected no error, got: %v", err)
}
}
})
}
}
// TestApp_GetSupportedFormats tests the GetSupportedFormats method.
func TestApp_GetSupportedFormats(t *testing.T) {
// TestApp_SupportedFormats tests the SupportedFormats method.
func TestApp_SupportedFormats(t *testing.T) {
expectedFormats := []string{"markdown", "docx", "pdf"}
parser := &MockCourseParser{}
factory := &MockExporterFactory{
mockGetSupportedFormats: func() []string {
mockSupportedFormats: func() []string {
return expectedFormats
},
}
app := NewApp(parser, factory)
formats := app.GetSupportedFormats()
formats := app.SupportedFormats()
if len(formats) != len(expectedFormats) {
t.Errorf("Expected %d formats, got %d", len(expectedFormats), len(formats))
@ -334,7 +330,7 @@ func TestApp_GetSupportedFormats(t *testing.T) {
// contains checks if a string contains a substring.
func contains(s, substr string) bool {
return len(s) >= len(substr) &&
(len(substr) == 0 ||
(substr == "" ||
s == substr ||
(len(s) > len(substr) &&
(s[:len(substr)] == substr ||


@ -0,0 +1,96 @@
// Package services_test provides examples for the services package.
package services_test
import (
"context"
"fmt"
"log"
"github.com/kjanat/articulate-parser/internal/services"
)
// ExampleNewArticulateParser demonstrates creating a new parser.
func ExampleNewArticulateParser() {
// Create a no-op logger for this example
logger := services.NewNoOpLogger()
// Create parser with defaults
parser := services.NewArticulateParser(logger, "", 0)
fmt.Printf("Parser created: %T\n", parser)
// Output: Parser created: *services.ArticulateParser
}
// ExampleNewArticulateParser_custom demonstrates creating a parser with custom configuration.
func ExampleNewArticulateParser_custom() {
logger := services.NewNoOpLogger()
// Create parser with custom base URL and timeout
parser := services.NewArticulateParser(
logger,
"https://custom.articulate.com",
60_000_000_000, // 60 seconds in nanoseconds
)
fmt.Printf("Parser configured: %T\n", parser)
// Output: Parser configured: *services.ArticulateParser
}
// ExampleArticulateParser_LoadCourseFromFile demonstrates loading a course from a file.
func ExampleArticulateParser_LoadCourseFromFile() {
logger := services.NewNoOpLogger()
parser := services.NewArticulateParser(logger, "", 0)
// In a real scenario, you'd have an actual file
// This example shows the API usage
_, err := parser.LoadCourseFromFile("course.json")
if err != nil {
log.Printf("Failed to load course: %v", err)
}
}
// ExampleArticulateParser_FetchCourse demonstrates fetching a course from a URI.
func ExampleArticulateParser_FetchCourse() {
logger := services.NewNoOpLogger()
parser := services.NewArticulateParser(logger, "", 0)
// Use a background context here; real callers would typically derive one with context.WithTimeout
ctx := context.Background()
// In a real scenario, you'd use an actual share URL
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/YOUR_SHARE_ID")
if err != nil {
log.Printf("Failed to fetch course: %v", err)
}
}
// ExampleHTMLCleaner demonstrates cleaning HTML content.
func ExampleHTMLCleaner() {
cleaner := services.NewHTMLCleaner()
html := "<p>This is <strong>bold</strong> text with entities.</p>"
clean := cleaner.CleanHTML(html)
fmt.Println(clean)
// Output: This is bold text with entities.
}
// ExampleHTMLCleaner_CleanHTML demonstrates complex HTML cleaning.
func ExampleHTMLCleaner_CleanHTML() {
cleaner := services.NewHTMLCleaner()
html := `
<div>
<h1>Title</h1>
<p>Paragraph with <a href="#">link</a> and &amp; entity.</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
</div>
`
clean := cleaner.CleanHTML(html)
fmt.Println(clean)
// Output: Title Paragraph with link and & entity. Item 1 Item 2
}


@ -1,15 +1,17 @@
// Package services provides the core functionality for the articulate-parser application.
// It implements the interfaces defined in the interfaces package.
package services
import (
"regexp"
"bytes"
stdhtml "html"
"io"
"strings"
"golang.org/x/net/html"
)
// HTMLCleaner provides utilities for converting HTML content to plain text.
// It removes HTML tags while preserving their content and converts HTML entities
// to their plain text equivalents.
// to their plain text equivalents using proper HTML parsing instead of regex.
type HTMLCleaner struct{}
// NewHTMLCleaner creates a new HTML cleaner instance.
@ -20,34 +22,47 @@ func NewHTMLCleaner() *HTMLCleaner {
}
// CleanHTML removes HTML tags and converts entities, returning clean plain text.
// The function preserves the textual content of the HTML while removing markup.
// It handles common HTML entities like &nbsp;, &amp;, etc., and normalizes whitespace.
//
// Parameters:
// - html: The HTML content to clean
//
// Returns:
// - A plain text string with all HTML elements and entities removed/converted
func (h *HTMLCleaner) CleanHTML(html string) string {
// Remove HTML tags but preserve content
re := regexp.MustCompile(`<[^>]*>`)
cleaned := re.ReplaceAllString(html, "")
// It parses the HTML into a node tree and extracts only text content,
// skipping script and style tags. HTML entities are automatically handled
// by the parser, and whitespace is normalized.
func (h *HTMLCleaner) CleanHTML(htmlStr string) string {
// Parse the HTML into a node tree
doc, err := html.Parse(strings.NewReader(htmlStr))
if err != nil {
// If parsing fails, return empty string
// This maintains backward compatibility with the test expectations
return ""
}
// Replace common HTML entities with their character equivalents
cleaned = strings.ReplaceAll(cleaned, "&nbsp;", " ")
cleaned = strings.ReplaceAll(cleaned, "&amp;", "&")
cleaned = strings.ReplaceAll(cleaned, "&lt;", "<")
cleaned = strings.ReplaceAll(cleaned, "&gt;", ">")
cleaned = strings.ReplaceAll(cleaned, "&quot;", "\"")
cleaned = strings.ReplaceAll(cleaned, "&#39;", "'")
cleaned = strings.ReplaceAll(cleaned, "&iuml;", "ï")
cleaned = strings.ReplaceAll(cleaned, "&euml;", "ë")
cleaned = strings.ReplaceAll(cleaned, "&eacute;", "é")
// Extract text content from the node tree
var buf bytes.Buffer
extractText(&buf, doc)
// Clean up extra whitespace by replacing multiple spaces, tabs, and newlines
// with a single space, then trim any leading/trailing whitespace
cleaned = regexp.MustCompile(`\s+`).ReplaceAllString(cleaned, " ")
cleaned = strings.TrimSpace(cleaned)
// Unescape any remaining HTML entities
unescaped := stdhtml.UnescapeString(buf.String())
return cleaned
// Normalize whitespace: replace multiple spaces, tabs, and newlines with a single space
cleaned := strings.Join(strings.Fields(unescaped), " ")
return strings.TrimSpace(cleaned)
}
// extractText recursively traverses the HTML node tree and extracts text content.
// It skips script and style tags to avoid including their content in the output.
func extractText(w io.Writer, n *html.Node) {
// Skip script and style tags entirely
if n.Type == html.ElementNode && (n.Data == "script" || n.Data == "style") {
return
}
// If this is a text node, write its content
if n.Type == html.TextNode {
// Write errors are ignored because we're writing to an in-memory buffer
// which cannot fail in normal circumstances
_, _ = w.Write([]byte(n.Data))
}
// Recursively process all child nodes
for c := n.FirstChild; c != nil; c = c.NextSibling {
extractText(w, c)
}
}
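
The last two steps of the new `CleanHTML` (entity unescaping, then whitespace normalization) can be tried in isolation with the standard library alone. This is a sketch under stated assumptions, not the project's code: `normalizeText` is a hypothetical helper, and it skips the `golang.org/x/net/html` tree walk entirely.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// normalizeText is a hypothetical helper mirroring the tail of CleanHTML:
// unescape HTML entities, then collapse runs of whitespace.
func normalizeText(s string) string {
	unescaped := html.UnescapeString(s)
	// strings.Fields splits on any run of whitespace and discards leading
	// and trailing whitespace, so the joined result is already trimmed.
	return strings.Join(strings.Fields(unescaped), " ")
}

func main() {
	fmt.Printf("%q\n", normalizeText("  Tom &amp; Jerry\n\t&lt;3  "))
	// prints "Tom & Jerry <3"
}
```

Because `strings.Fields` already drops leading and trailing whitespace, a further `strings.TrimSpace` on the joined string is harmless but has nothing left to trim.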


@ -1,4 +1,3 @@
// Package services_test provides tests for the HTML cleaner service.
package services
import (
@ -112,7 +111,7 @@ func TestHTMLCleaner_CleanHTML(t *testing.T) {
{
name: "script and style tags content",
input: "<script>alert('test');</script>Content<style>body{color:red;}</style>",
expected: "alert('test');Contentbody{color:red;}",
expected: "Content", // Script and style tags are correctly skipped
},
{
name: "line breaks and formatting",
@ -147,7 +146,7 @@ func TestHTMLCleaner_CleanHTML(t *testing.T) {
{
name: "special HTML5 entities",
input: "Left arrow &larr; Right arrow &rarr;",
expected: "Left arrow &larr; Right arrow &rarr;", // These are not handled by the cleaner
expected: "Left arrow ← Right arrow →", // HTML5 entities are properly handled by the parser
},
}
@ -168,7 +167,7 @@ func TestHTMLCleaner_CleanHTML_LargeContent(t *testing.T) {
// Create a large HTML string
var builder strings.Builder
builder.WriteString("<html><body>")
for i := 0; i < 1000; i++ {
for i := range 1000 {
builder.WriteString("<p>Paragraph ")
builder.WriteString(string(rune('0' + i%10)))
builder.WriteString(" with some content &amp; entities.</p>")
@ -217,9 +216,9 @@ func TestHTMLCleaner_CleanHTML_EdgeCases(t *testing.T) {
expected: "&&&",
},
{
name: "entities without semicolon (should not be converted)",
name: "entities without semicolon (properly converted)",
input: "&amp test &lt test",
expected: "&amp test &lt test",
expected: "& test < test", // Parser handles entities even without semicolons in some cases
},
{
name: "mixed valid and invalid entities",
@ -234,7 +233,7 @@ func TestHTMLCleaner_CleanHTML_EdgeCases(t *testing.T) {
{
name: "tag with no closing bracket",
input: "Content <p class='test' with no closing bracket",
expected: "Content <p class='test' with no closing bracket",
expected: "Content", // Parser handles malformed HTML gracefully
},
{
name: "extremely nested tags",
@ -299,8 +298,7 @@ func BenchmarkHTMLCleaner_CleanHTML(b *testing.B) {
cleaner := NewHTMLCleaner()
input := "<div class=\"content\"><h1>Course Title</h1><p>This is a <em>great</em> course about &amp; HTML entities like &nbsp; and &quot;quotes&quot;.</p><ul><li>Item 1</li><li>Item 2</li></ul></div>"
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
cleaner.CleanHTML(input)
}
}
@ -311,15 +309,14 @@ func BenchmarkHTMLCleaner_CleanHTML_Large(b *testing.B) {
// Create a large HTML string
var builder strings.Builder
for i := 0; i < 100; i++ {
for i := range 100 {
builder.WriteString("<p>Paragraph ")
builder.WriteString(string(rune('0' + i%10)))
builder.WriteString(" with some content &amp; entities &lt;test&gt;.</p>")
}
input := builder.String()
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
cleaner.CleanHTML(input)
}
}

104
internal/services/logger.go Normal file

@ -0,0 +1,104 @@
package services
import (
"context"
"log/slog"
"os"
"github.com/kjanat/articulate-parser/internal/interfaces"
)
// SlogLogger implements the Logger interface using the standard library's slog package.
type SlogLogger struct {
logger *slog.Logger
}
// NewSlogLogger creates a new structured logger using slog.
// The level parameter controls the minimum log level (debug, info, warn, error).
func NewSlogLogger(level slog.Level) interfaces.Logger {
opts := &slog.HandlerOptions{
Level: level,
}
handler := slog.NewJSONHandler(os.Stdout, opts)
return &SlogLogger{
logger: slog.New(handler),
}
}
// NewTextLogger creates a new structured logger with human-readable text output.
// Useful for development and debugging.
func NewTextLogger(level slog.Level) interfaces.Logger {
opts := &slog.HandlerOptions{
Level: level,
}
handler := slog.NewTextHandler(os.Stdout, opts)
return &SlogLogger{
logger: slog.New(handler),
}
}
// Debug logs a debug-level message with optional key-value pairs.
func (l *SlogLogger) Debug(msg string, keysAndValues ...any) {
l.logger.Debug(msg, keysAndValues...)
}
// Info logs an info-level message with optional key-value pairs.
func (l *SlogLogger) Info(msg string, keysAndValues ...any) {
l.logger.Info(msg, keysAndValues...)
}
// Warn logs a warning-level message with optional key-value pairs.
func (l *SlogLogger) Warn(msg string, keysAndValues ...any) {
l.logger.Warn(msg, keysAndValues...)
}
// Error logs an error-level message with optional key-value pairs.
func (l *SlogLogger) Error(msg string, keysAndValues ...any) {
l.logger.Error(msg, keysAndValues...)
}
// With returns a new logger with the given key-value pairs added as context.
func (l *SlogLogger) With(keysAndValues ...any) interfaces.Logger {
return &SlogLogger{
logger: l.logger.With(keysAndValues...),
}
}
// WithContext returns a new logger with context information.
// Currently preserves the logger as-is, but can be extended to extract
// trace IDs or other context values in the future.
func (l *SlogLogger) WithContext(ctx context.Context) interfaces.Logger {
// Can be extended to extract trace IDs, request IDs, etc. from context
return l
}
// NoOpLogger is a logger that discards all log messages.
// Useful for testing or when logging should be disabled.
type NoOpLogger struct{}
// NewNoOpLogger creates a logger that discards all messages.
func NewNoOpLogger() interfaces.Logger {
return &NoOpLogger{}
}
// Debug does nothing.
func (l *NoOpLogger) Debug(msg string, keysAndValues ...any) {}
// Info does nothing.
func (l *NoOpLogger) Info(msg string, keysAndValues ...any) {}
// Warn does nothing.
func (l *NoOpLogger) Warn(msg string, keysAndValues ...any) {}
// Error does nothing.
func (l *NoOpLogger) Error(msg string, keysAndValues ...any) {}
// With returns the same no-op logger.
func (l *NoOpLogger) With(keysAndValues ...any) interfaces.Logger {
return l
}
// WithContext returns the same no-op logger.
func (l *NoOpLogger) WithContext(ctx context.Context) interfaces.Logger {
return l
}


@ -0,0 +1,95 @@
package services
import (
"context"
"io"
"log/slog"
"testing"
)
// BenchmarkSlogLogger_Info benchmarks structured JSON logging.
func BenchmarkSlogLogger_Info(b *testing.B) {
// Create logger that writes to io.Discard to avoid benchmark noise
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42, "key3", true)
}
}
// BenchmarkSlogLogger_Debug benchmarks debug level logging.
func BenchmarkSlogLogger_Debug(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelDebug}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Debug("debug message", "operation", "test", "duration", 123)
}
}
// BenchmarkSlogLogger_Error benchmarks error logging.
func BenchmarkSlogLogger_Error(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelError}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Error("error occurred", "error", "test error", "code", 500)
}
}
// BenchmarkTextLogger_Info benchmarks text logging.
func BenchmarkTextLogger_Info(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewTextHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42)
}
}
// BenchmarkNoOpLogger benchmarks the no-op logger.
func BenchmarkNoOpLogger(b *testing.B) {
logger := NewNoOpLogger()
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42)
logger.Error("error message", "error", "test")
}
}
// BenchmarkLogger_With benchmarks logger with context.
func BenchmarkLogger_With(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
contextLogger := logger.With("request_id", "123", "user_id", "456")
contextLogger.Info("operation completed")
}
}
// BenchmarkLogger_WithContext benchmarks logger with Go context.
func BenchmarkLogger_WithContext(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
ctx := context.Background()
b.ResetTimer()
for b.Loop() {
contextLogger := logger.WithContext(ctx)
contextLogger.Info("context operation")
}
}


@ -1,8 +1,7 @@
// Package services provides the core functionality for the articulate-parser application.
// It implements the interfaces defined in the interfaces package.
package services
import (
"context"
"encoding/json"
"fmt"
"io"
@ -23,32 +22,36 @@ type ArticulateParser struct {
BaseURL string
// Client is the HTTP client used to make requests to the API
Client *http.Client
// Logger for structured logging
Logger interfaces.Logger
}
// NewArticulateParser creates a new ArticulateParser instance with default settings.
// The default configuration uses the standard Articulate Rise API URL and a
// HTTP client with a 30-second timeout.
func NewArticulateParser() interfaces.CourseParser {
// NewArticulateParser creates a new ArticulateParser instance.
// If baseURL is empty, uses the default Articulate Rise API URL.
// If timeout is zero, uses a 30-second timeout.
func NewArticulateParser(logger interfaces.Logger, baseURL string, timeout time.Duration) interfaces.CourseParser {
if logger == nil {
logger = NewNoOpLogger()
}
if baseURL == "" {
baseURL = "https://rise.articulate.com"
}
if timeout == 0 {
timeout = 30 * time.Second
}
return &ArticulateParser{
BaseURL: "https://rise.articulate.com",
BaseURL: baseURL,
Client: &http.Client{
Timeout: 30 * time.Second,
Timeout: timeout,
},
Logger: logger,
}
}
// FetchCourse fetches a course from the given URI.
// It extracts the share ID from the URI, constructs an API URL, and fetches the course data.
// The course data is then unmarshalled into a Course model.
//
// Parameters:
// - uri: The Articulate Rise share URL (e.g., https://rise.articulate.com/share/SHARE_ID)
//
// Returns:
// - A parsed Course model if successful
// - An error if the fetch fails, if the share ID can't be extracted,
// or if the response can't be parsed
func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
// FetchCourse fetches a course from the given URI and returns the parsed course data.
// The URI should be an Articulate Rise share URL (e.g., https://rise.articulate.com/share/SHARE_ID).
// The context can be used for cancellation and timeout control.
func (p *ArticulateParser) FetchCourse(ctx context.Context, uri string) (*models.Course, error) {
shareID, err := p.extractShareID(uri)
if err != nil {
return nil, err
@ -56,11 +59,24 @@ func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
apiURL := p.buildAPIURL(shareID)
resp, err := p.Client.Get(apiURL)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, apiURL, http.NoBody)
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
resp, err := p.Client.Do(req)
if err != nil {
return nil, fmt.Errorf("failed to fetch course data: %w", err)
}
defer resp.Body.Close()
// Ensure response body is closed even if ReadAll fails. Close errors are logged
// but not fatal since the body content has already been read and parsed. In the
// context of HTTP responses, the body must be closed to release the underlying
// connection, but a close error doesn't invalidate the data already consumed.
defer func() {
if err := resp.Body.Close(); err != nil {
p.Logger.Warn("failed to close response body", "error", err, "url", apiURL)
}
}()
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("API returned status %d", resp.StatusCode)
@ -80,15 +96,8 @@ func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
}
// LoadCourseFromFile loads an Articulate Rise course from a local JSON file.
// The file should contain a valid JSON representation of an Articulate Rise course.
//
// Parameters:
// - filePath: The path to the JSON file containing the course data
//
// Returns:
// - A parsed Course model if successful
// - An error if the file can't be read or the JSON can't be parsed
func (p *ArticulateParser) LoadCourseFromFile(filePath string) (*models.Course, error) {
// #nosec G304 - File path is provided by user via CLI argument, which is expected behavior
data, err := os.ReadFile(filePath)
if err != nil {
return nil, fmt.Errorf("failed to read file: %w", err)
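
The switch from `Client.Get` to `http.NewRequestWithContext` plus `Client.Do` is what lets callers cancel an in-flight fetch. A self-contained sketch of that pattern against a local `httptest` server follows. `fetch` and `fetchCanceled` are hypothetical stand-ins rather than the project's `FetchCourse`, and the server's response body is a placeholder.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch is a hypothetical stand-in for a context-aware GET request.
func fetch(ctx context.Context, client *http.Client, url string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, http.NoBody)
	if err != nil {
		return nil, fmt.Errorf("failed to create request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		// %w keeps the underlying error (e.g. context.Canceled) inspectable.
		return nil, fmt.Errorf("failed to fetch: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("API returned status %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

// fetchCanceled reports whether a fetch made with an already-canceled
// context fails with an error that wraps context.Canceled.
func fetchCanceled() bool {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok") // placeholder body
	}))
	defer server.Close()

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel before the request is sent
	_, err := fetch(ctx, server.Client(), server.URL)
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println(fetchCanceled()) // prints true
}
```

Since the error is wrapped with `%w`, callers and tests can check it with `errors.Is(err, context.Canceled)` or `errors.Is(err, context.DeadlineExceeded)` instead of matching on the error string.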


@ -0,0 +1,219 @@
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
)
// BenchmarkArticulateParser_FetchCourse benchmarks the FetchCourse method.
func BenchmarkArticulateParser_FetchCourse(b *testing.B) {
testCourse := &models.Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "bench-course",
Title: "Benchmark Course",
Description: "Testing performance",
Lessons: []models.Lesson{
{
ID: "lesson1",
Title: "Lesson 1",
Type: "lesson",
},
},
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Encode errors are ignored in benchmarks; the test server's ResponseWriter
// writes are reliable and any encoding error would be a test setup issue
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{},
Logger: NewNoOpLogger(),
}
b.ResetTimer()
for b.Loop() {
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/benchmark-id")
if err != nil {
b.Fatalf("FetchCourse failed: %v", err)
}
}
}
// BenchmarkArticulateParser_FetchCourse_LargeCourse benchmarks with a large course.
func BenchmarkArticulateParser_FetchCourse_LargeCourse(b *testing.B) {
// Create a large course with many lessons
lessons := make([]models.Lesson, 100)
for i := range 100 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Description: "This is a test lesson with some description",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Test Heading",
Paragraph: "Test paragraph content with some text",
},
},
},
},
}
}
testCourse := &models.Course{
ShareID: "large-course-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "large-course",
Title: "Large Benchmark Course",
Description: "Testing performance with large course",
Lessons: lessons,
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Encode errors are ignored in benchmarks; the test server's ResponseWriter
// writes are reliable and any encoding error would be a test setup issue
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{},
Logger: NewNoOpLogger(),
}
b.ResetTimer()
for b.Loop() {
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/large-course-id")
if err != nil {
b.Fatalf("FetchCourse failed: %v", err)
}
}
}
// BenchmarkArticulateParser_LoadCourseFromFile benchmarks loading from file.
func BenchmarkArticulateParser_LoadCourseFromFile(b *testing.B) {
testCourse := &models.Course{
ShareID: "file-test-id",
Course: models.CourseInfo{
Title: "File Test Course",
},
}
tempDir := b.TempDir()
tempFile := filepath.Join(tempDir, "benchmark.json")
data, err := json.Marshal(testCourse)
if err != nil {
b.Fatalf("Failed to marshal: %v", err)
}
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
b.Fatalf("Failed to write file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
b.ResetTimer()
for b.Loop() {
_, err := parser.LoadCourseFromFile(tempFile)
if err != nil {
b.Fatalf("LoadCourseFromFile failed: %v", err)
}
}
}
// BenchmarkArticulateParser_LoadCourseFromFile_Large benchmarks with large file.
func BenchmarkArticulateParser_LoadCourseFromFile_Large(b *testing.B) {
// Create a large course
lessons := make([]models.Lesson, 200)
for i := range 200 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Items: []models.Item{
{Type: "text", Items: []models.SubItem{{Heading: "H", Paragraph: "P"}}},
{Type: "list", Items: []models.SubItem{{Paragraph: "Item 1"}, {Paragraph: "Item 2"}}},
},
}
}
testCourse := &models.Course{
ShareID: "large-file-id",
Course: models.CourseInfo{
Title: "Large File Course",
Lessons: lessons,
},
}
tempDir := b.TempDir()
tempFile := filepath.Join(tempDir, "large-benchmark.json")
data, err := json.Marshal(testCourse)
if err != nil {
b.Fatalf("Failed to marshal: %v", err)
}
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
b.Fatalf("Failed to write file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
b.ResetTimer()
for b.Loop() {
_, err := parser.LoadCourseFromFile(tempFile)
if err != nil {
b.Fatalf("LoadCourseFromFile failed: %v", err)
}
}
}
// BenchmarkArticulateParser_ExtractShareID benchmarks share ID extraction.
func BenchmarkArticulateParser_ExtractShareID(b *testing.B) {
parser := &ArticulateParser{}
uri := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
b.ResetTimer()
for b.Loop() {
_, err := parser.extractShareID(uri)
if err != nil {
b.Fatalf("extractShareID failed: %v", err)
}
}
}
// BenchmarkArticulateParser_BuildAPIURL benchmarks API URL building.
func BenchmarkArticulateParser_BuildAPIURL(b *testing.B) {
parser := &ArticulateParser{
BaseURL: "https://rise.articulate.com",
}
shareID := "test-share-id-12345"
b.ResetTimer()
for b.Loop() {
_ = parser.buildAPIURL(shareID)
}
}


@ -0,0 +1,289 @@
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"github.com/kjanat/articulate-parser/internal/models"
)
// TestArticulateParser_FetchCourse_ContextCancellation tests that FetchCourse
// respects context cancellation.
func TestArticulateParser_FetchCourse_ContextCancellation(t *testing.T) {
// Create a server that delays response
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Sleep to give time for context cancellation
time.Sleep(100 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context that we'll cancel immediately
ctx, cancel := context.WithCancel(context.Background())
cancel() // Cancel immediately
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a context cancellation error
if err == nil {
t.Fatal("Expected error due to context cancellation, got nil")
}
if !strings.Contains(err.Error(), "context canceled") {
t.Errorf("Expected context cancellation error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextTimeout tests that FetchCourse
// respects context timeout.
func TestArticulateParser_FetchCourse_ContextTimeout(t *testing.T) {
// Create a server that delays response longer than timeout
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Sleep longer than the context timeout
time.Sleep(200 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context with a very short timeout
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a context deadline exceeded error
if err == nil {
t.Fatal("Expected error due to context timeout, got nil")
}
if !strings.Contains(err.Error(), "deadline exceeded") {
t.Errorf("Expected context timeout error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextDeadline tests that FetchCourse
// respects context deadline.
func TestArticulateParser_FetchCourse_ContextDeadline(t *testing.T) {
// Create a server that delays response
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(150 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context whose deadline (10ms from now) expires before the server responds
ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(10*time.Millisecond))
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a deadline exceeded error
if err == nil {
t.Fatal("Expected error due to context deadline, got nil")
}
if !strings.Contains(err.Error(), "deadline exceeded") {
t.Errorf("Expected deadline exceeded error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextSuccess tests that FetchCourse
// succeeds when context is not canceled.
func TestArticulateParser_FetchCourse_ContextSuccess(t *testing.T) {
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Respond quickly
// Encode errors are ignored in test setup; a failed write would surface as a client-side error
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context with generous timeout
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
course, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
if err != nil {
t.Fatalf("Expected no error, got: %v", err)
}
if course == nil {
t.Fatal("Expected course, got nil")
}
if course.Course.Title != testCourse.Course.Title {
t.Errorf("Expected title '%s', got '%s'", testCourse.Course.Title, course.Course.Title)
}
}
// TestArticulateParser_FetchCourse_CancellationDuringRequest tests cancellation
// during an in-flight request.
func TestArticulateParser_FetchCourse_CancellationDuringRequest(t *testing.T) {
requestStarted := make(chan bool)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
requestStarted <- true
// Keep the handler running to simulate slow response
time.Sleep(300 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
}
// Encode errors are ignored in test setup; the client has already canceled the request
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
ctx, cancel := context.WithCancel(context.Background())
// Start the request in a goroutine
errChan := make(chan error, 1)
go func() {
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
errChan <- err
}()
// Wait for request to start
<-requestStarted
// Cancel after request has started
cancel()
// Get the error
err := <-errChan
if err == nil {
t.Fatal("Expected error due to context cancellation, got nil")
}
// Should contain context canceled somewhere in the error chain
if !strings.Contains(err.Error(), "context canceled") {
t.Errorf("Expected context canceled error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_MultipleTimeouts tests behavior across a
// range of timeouts, issuing one request per subtest.
func TestArticulateParser_FetchCourse_MultipleTimeouts(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(100 * time.Millisecond)
testCourse := &models.Course{ShareID: "test"}
// Encode errors are ignored in test setup; some clients will already have timed out
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Launch multiple requests with different timeouts
tests := []struct {
name string
timeout time.Duration
shouldSucceed bool
}{
{"very short timeout", 10 * time.Millisecond, false},
{"short timeout", 50 * time.Millisecond, false},
{"adequate timeout", 500 * time.Millisecond, true},
{"long timeout", 2 * time.Second, true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), tt.timeout)
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
if tt.shouldSucceed && err != nil {
t.Errorf("Expected success with timeout %v, got error: %v", tt.timeout, err)
}
if !tt.shouldSucceed && err == nil {
t.Errorf("Expected timeout error with timeout %v, got success", tt.timeout)
}
})
}
}


@ -1,7 +1,7 @@
// Package services_test provides tests for the parser service.
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@ -16,7 +16,7 @@ import (
// TestNewArticulateParser tests the NewArticulateParser constructor.
func TestNewArticulateParser(t *testing.T) {
parser := NewArticulateParser()
parser := NewArticulateParser(nil, "", 0)
if parser == nil {
t.Fatal("NewArticulateParser() returned nil")
@ -112,7 +112,7 @@ func TestArticulateParser_FetchCourse(t *testing.T) {
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
course, err := parser.FetchCourse(tt.uri)
course, err := parser.FetchCourse(context.Background(), tt.uri)
if tt.expectedError != "" {
if err == nil {
@ -146,7 +146,7 @@ func TestArticulateParser_FetchCourse_NetworkError(t *testing.T) {
},
}
_, err := parser.FetchCourse("https://rise.articulate.com/share/test-share-id")
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/test-share-id")
if err == nil {
t.Fatal("Expected network error, got nil")
}
@ -161,7 +161,10 @@ func TestArticulateParser_FetchCourse_InvalidJSON(t *testing.T) {
// Create test server that returns invalid JSON
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.Write([]byte("invalid json"))
// Write is used for its side effect; the test verifies error handling on
// the client side, not whether the write succeeds. The error is ignored because
// writes to the test server's http.ResponseWriter rarely fail in test contexts.
_, _ = w.Write([]byte("invalid json"))
}))
defer server.Close()
@ -172,7 +175,7 @@ func TestArticulateParser_FetchCourse_InvalidJSON(t *testing.T) {
},
}
_, err := parser.FetchCourse("https://rise.articulate.com/share/test-share-id")
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/test-share-id")
if err == nil {
t.Fatal("Expected JSON parsing error, got nil")
}
@ -205,11 +208,11 @@ func TestArticulateParser_LoadCourseFromFile(t *testing.T) {
t.Fatalf("Failed to marshal test course: %v", err)
}
if err := os.WriteFile(tempFile, data, 0644); err != nil {
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
t.Fatalf("Failed to write test file: %v", err)
}
parser := NewArticulateParser()
parser := NewArticulateParser(nil, "", 0)
tests := []struct {
name string
@ -264,11 +267,11 @@ func TestArticulateParser_LoadCourseFromFile_InvalidJSON(t *testing.T) {
tempDir := t.TempDir()
tempFile := filepath.Join(tempDir, "invalid.json")
if err := os.WriteFile(tempFile, []byte("invalid json content"), 0644); err != nil {
if err := os.WriteFile(tempFile, []byte("invalid json content"), 0o644); err != nil {
t.Fatalf("Failed to write test file: %v", err)
}
parser := NewArticulateParser()
parser := NewArticulateParser(nil, "", 0)
_, err := parser.LoadCourseFromFile(tempFile)
if err == nil {
@ -420,8 +423,7 @@ func BenchmarkExtractShareID(b *testing.B) {
parser := &ArticulateParser{}
uri := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
_, _ = parser.extractShareID(uri)
}
}
@ -433,8 +435,7 @@ func BenchmarkBuildAPIURL(b *testing.B) {
}
shareID := "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO"
b.ResetTimer()
for i := 0; i < b.N; i++ {
for b.Loop() {
_ = parser.buildAPIURL(shareID)
}
}


@ -5,7 +5,17 @@ package version
// Version information.
var (
// Version is the current version of the application.
Version = "0.3.0"
// Breaking changes from 0.4.x:
// - Renamed GetSupportedFormat() -> SupportedFormat()
// - Renamed GetSupportedFormats() -> SupportedFormats()
// - FetchCourse now requires context.Context parameter
// - NewArticulateParser now accepts logger, baseURL, timeout
// New features:
// - Structured logging with slog
// - Configuration via environment variables
// - Context-aware HTTP requests
// - Comprehensive benchmarks and examples.
Version = "1.0.0"
// BuildTime is the time the binary was built.
BuildTime = "unknown"

main.go

@ -4,54 +4,84 @@
package main
import (
"context"
"fmt"
"log"
"os"
"strings"
"github.com/kjanat/articulate-parser/internal/config"
"github.com/kjanat/articulate-parser/internal/exporters"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/services"
"github.com/kjanat/articulate-parser/internal/version"
)
// main is the entry point of the application.
// It handles command-line arguments, sets up dependencies,
// and coordinates the parsing and exporting of courses.
func main() {
// Dependency injection setup
os.Exit(run(os.Args))
}
// run contains the main application logic and returns an exit code.
// This function is testable as it doesn't call os.Exit directly.
func run(args []string) int {
// Load configuration
cfg := config.Load()
// Dependency injection setup with configuration
var logger interfaces.Logger
if cfg.LogFormat == "json" {
logger = services.NewSlogLogger(cfg.LogLevel)
} else {
logger = services.NewTextLogger(cfg.LogLevel)
}
htmlCleaner := services.NewHTMLCleaner()
parser := services.NewArticulateParser()
parser := services.NewArticulateParser(logger, cfg.BaseURL, cfg.RequestTimeout)
exporterFactory := exporters.NewFactory(htmlCleaner)
app := services.NewApp(parser, exporterFactory)
// Check for required command-line arguments
if len(os.Args) < 4 {
fmt.Printf("Usage: %s <source> <format> <output>\n", os.Args[0])
fmt.Printf(" source: URI or file path to the course\n")
fmt.Printf(" format: export format (%s)\n", joinStrings(app.GetSupportedFormats(), ", "))
fmt.Printf(" output: output file path\n")
fmt.Println("\nExample:")
fmt.Printf(" %s articulate-sample.json markdown output.md\n", os.Args[0])
fmt.Printf(" %s https://rise.articulate.com/share/xyz docx output.docx\n", os.Args[0])
os.Exit(1)
// Check for version flag
if len(args) > 1 && (args[1] == "--version" || args[1] == "-v") {
fmt.Printf("%s version %s\n", args[0], version.Version)
fmt.Printf("Build time: %s\n", version.BuildTime)
fmt.Printf("Git commit: %s\n", version.GitCommit)
return 0
}
source := os.Args[1]
format := os.Args[2]
output := os.Args[3]
// Check for help flag
if len(args) > 1 && (args[1] == "--help" || args[1] == "-h" || args[1] == "help") {
printUsage(args[0], app.SupportedFormats())
return 0
}
// Check for required command-line arguments
if len(args) < 4 {
printUsage(args[0], app.SupportedFormats())
return 1
}
source := args[1]
format := args[2]
output := args[3]
var err error
// Determine if source is a URI or file path
if isURI(source) {
err = app.ProcessCourseFromURI(source, format, output)
err = app.ProcessCourseFromURI(context.Background(), source, format, output)
} else {
err = app.ProcessCourseFromFile(source, format, output)
}
if err != nil {
log.Fatalf("Error processing course: %v", err)
logger.Error("failed to process course", "error", err, "source", source)
return 1
}
fmt.Printf("Successfully exported course to %s\n", output)
logger.Info("successfully exported course", "output", output, "format", format)
return 0
}
// isURI checks if a string is a URI by looking for http:// or https:// prefixes.
@ -65,25 +95,17 @@ func isURI(str string) bool {
return len(str) > 7 && (str[:7] == "http://" || str[:8] == "https://")
}
// joinStrings concatenates a slice of strings using the specified separator.
// printUsage prints the command-line usage information.
//
// Parameters:
// - strs: The slice of strings to join
// - sep: The separator to insert between each string
//
// Returns:
// - A single string with all elements joined by the separator
func joinStrings(strs []string, sep string) string {
if len(strs) == 0 {
return ""
}
if len(strs) == 1 {
return strs[0]
}
result := strs[0]
for i := 1; i < len(strs); i++ {
result += sep + strs[i]
}
return result
// - programName: The name of the program (args[0])
// - supportedFormats: Slice of supported export formats
func printUsage(programName string, supportedFormats []string) {
fmt.Printf("Usage: %s <source> <format> <output>\n", programName)
fmt.Printf(" source: URI or file path to the course\n")
fmt.Printf(" format: export format (%s)\n", strings.Join(supportedFormats, ", "))
fmt.Printf(" output: output file path\n")
fmt.Println("\nExample:")
fmt.Printf(" %s articulate-sample.json markdown output.md\n", programName)
fmt.Printf(" %s https://rise.articulate.com/share/xyz docx output.docx\n", programName)
}


@ -1,7 +1,11 @@
// Package main_test provides tests for the main package utility functions.
package main
import (
"bytes"
"io"
"log"
"os"
"strings"
"testing"
)
@ -79,97 +83,444 @@ func TestIsURI(t *testing.T) {
}
}
// TestJoinStrings tests the joinStrings function with various input scenarios.
func TestJoinStrings(t *testing.T) {
// BenchmarkIsURI benchmarks the isURI function performance.
func BenchmarkIsURI(b *testing.B) {
testStr := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
for b.Loop() {
isURI(testStr)
}
}
// TestRunWithInsufficientArgs tests the run function with insufficient command-line arguments.
func TestRunWithInsufficientArgs(t *testing.T) {
tests := []struct {
name string
strs []string
separator string
expected string
args []string
}{
{
name: "empty slice",
strs: []string{},
separator: ", ",
expected: "",
name: "no arguments",
args: []string{"articulate-parser"},
},
{
name: "single string",
strs: []string{"hello"},
separator: ", ",
expected: "hello",
name: "one argument",
args: []string{"articulate-parser", "source"},
},
{
name: "two strings with comma separator",
strs: []string{"markdown", "docx"},
separator: ", ",
expected: "markdown, docx",
},
{
name: "three strings with comma separator",
strs: []string{"markdown", "md", "docx"},
separator: ", ",
expected: "markdown, md, docx",
},
{
name: "multiple strings with pipe separator",
strs: []string{"option1", "option2", "option3"},
separator: " | ",
expected: "option1 | option2 | option3",
},
{
name: "strings with no separator",
strs: []string{"a", "b", "c"},
separator: "",
expected: "abc",
},
{
name: "strings with newline separator",
strs: []string{"line1", "line2", "line3"},
separator: "\n",
expected: "line1\nline2\nline3",
},
{
name: "empty strings in slice",
strs: []string{"", "middle", ""},
separator: "-",
expected: "-middle-",
},
{
name: "nil slice",
strs: nil,
separator: ", ",
expected: "",
name: "two arguments",
args: []string{"articulate-parser", "source", "format"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := joinStrings(tt.strs, tt.separator)
if result != tt.expected {
t.Errorf("joinStrings(%v, %q) = %q, want %q", tt.strs, tt.separator, result, tt.expected)
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run the function
exitCode := run(tt.args)
// Restore stdout. Close errors are ignored: run has already written its
// output to the pipe, and a close error doesn't affect test validity.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: in this test context,
// reading from a pipe that was just closed is not expected to fail, and
// we're verifying the captured output regardless.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code
if exitCode != 1 {
t.Errorf("Expected exit code 1, got %d", exitCode)
}
// Verify usage message is displayed
if !strings.Contains(output, "Usage:") {
t.Errorf("Expected usage message in output, got: %s", output)
}
if !strings.Contains(output, "export format") {
t.Errorf("Expected format information in output, got: %s", output)
}
})
}
}
// BenchmarkIsURI benchmarks the isURI function performance.
func BenchmarkIsURI(b *testing.B) {
testStr := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
// TestRunWithHelpFlags tests the run function with help flag arguments.
func TestRunWithHelpFlags(t *testing.T) {
helpFlags := []string{"--help", "-h", "help"}
b.ResetTimer()
for i := 0; i < b.N; i++ {
isURI(testStr)
for _, flag := range helpFlags {
t.Run("help_flag_"+flag, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run with help flag
args := []string{"articulate-parser", flag}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the help output has already been
// written to the pipe, and a close error doesn't affect the test.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: we successfully wrote
// the help output to the pipe and can verify it regardless of close semantics.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code is 0 (success)
if exitCode != 0 {
t.Errorf("Expected exit code 0 for help flag %s, got %d", flag, exitCode)
}
// Verify help content is displayed
expectedContent := []string{
"Usage:",
"source: URI or file path to the course",
"format: export format",
"output: output file path",
"Example:",
"articulate-sample.json markdown output.md",
"https://rise.articulate.com/share/xyz docx output.docx",
}
for _, expected := range expectedContent {
if !strings.Contains(output, expected) {
t.Errorf("Expected help output to contain %q when using flag %s, got: %s", expected, flag, output)
}
}
})
}
}
// BenchmarkJoinStrings benchmarks the joinStrings function performance.
func BenchmarkJoinStrings(b *testing.B) {
strs := []string{"markdown", "md", "docx", "word", "pdf", "html"}
separator := ", "
// TestRunWithVersionFlags tests the run function with version flag arguments.
func TestRunWithVersionFlags(t *testing.T) {
versionFlags := []string{"--version", "-v"}
b.ResetTimer()
for i := 0; i < b.N; i++ {
joinStrings(strs, separator)
for _, flag := range versionFlags {
t.Run("version_flag_"+flag, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run with version flag
args := []string{"articulate-parser", flag}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the version output has already
// been written and we're about to read it; close semantics don't affect correctness.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: the output was successfully
// produced and we can verify its contents regardless of any I/O edge cases.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code is 0 (success)
if exitCode != 0 {
t.Errorf("Expected exit code 0 for version flag %s, got %d", flag, exitCode)
}
// Verify version content is displayed
expectedContent := []string{
"articulate-parser version",
"Build time:",
"Git commit:",
}
for _, expected := range expectedContent {
if !strings.Contains(output, expected) {
t.Errorf("Expected version output to contain %q when using flag %s, got: %s", expected, flag, output)
}
}
})
}
}
// TestRunWithInvalidFile tests the run function with a non-existent file.
func TestRunWithInvalidFile(t *testing.T) {
// Capture stdout and stderr
oldStdout := os.Stdout
oldStderr := os.Stderr
stdoutR, stdoutW, _ := os.Pipe()
stderrR, stderrW, _ := os.Pipe()
os.Stdout = stdoutW
os.Stderr = stderrW
// Also need to redirect log output
oldLogOutput := log.Writer()
log.SetOutput(stderrW)
// Run with non-existent file
args := []string{"articulate-parser", "nonexistent-file.json", "markdown", "output.md"}
exitCode := run(args)
// Restore stdout/stderr and log output. Close errors are ignored: we've already
// written all error messages to these pipes before closing them, and the test
// only cares about verifying the captured output.
_ = stdoutW.Close()
_ = stderrW.Close()
os.Stdout = oldStdout
os.Stderr = oldStderr
log.SetOutput(oldLogOutput)
// Read captured output. Copy errors are ignored: the error messages have been
// successfully written to the pipes, and we can verify the output content
// regardless of any edge cases in pipe closure or I/O completion.
var stdoutBuf, stderrBuf bytes.Buffer
_, _ = io.Copy(&stdoutBuf, stdoutR)
_, _ = io.Copy(&stderrBuf, stderrR)
// Close read ends of pipes. Errors ignored: we've already consumed all data
// from these pipes, and close errors don't affect test assertions.
_ = stdoutR.Close()
_ = stderrR.Close()
// Verify exit code
if exitCode != 1 {
t.Errorf("Expected exit code 1 for non-existent file, got %d", exitCode)
}
// Should have error output in structured log format
output := stdoutBuf.String()
if !strings.Contains(output, "level=ERROR") && !strings.Contains(output, "failed to process course") {
t.Errorf("Expected error message about processing course, got: %s", output)
}
}
// TestRunWithInvalidURI tests the run function with an invalid URI.
func TestRunWithInvalidURI(t *testing.T) {
// Capture stdout and stderr
oldStdout := os.Stdout
oldStderr := os.Stderr
stdoutR, stdoutW, _ := os.Pipe()
stderrR, stderrW, _ := os.Pipe()
os.Stdout = stdoutW
os.Stderr = stderrW
// Also need to redirect log output
oldLogOutput := log.Writer()
log.SetOutput(stderrW)
// Run with invalid URI (will fail because we can't actually fetch)
args := []string{"articulate-parser", "https://example.com/invalid", "markdown", "output.md"}
exitCode := run(args)
// Restore stdout/stderr and log output. Close errors are ignored: we've already
// written all error messages about the invalid URI to these pipes before closing,
// and test correctness only depends on verifying the captured error output.
_ = stdoutW.Close()
_ = stderrW.Close()
os.Stdout = oldStdout
os.Stderr = oldStderr
log.SetOutput(oldLogOutput)
// Read captured output. Copy errors are ignored: the error messages have been
// successfully written and we can verify the failure output content regardless
// of any edge cases in pipe lifecycle or I/O synchronization.
var stdoutBuf, stderrBuf bytes.Buffer
_, _ = io.Copy(&stdoutBuf, stdoutR)
_, _ = io.Copy(&stderrBuf, stderrR)
// Close read ends of pipes. Errors ignored: we've already consumed all data
// and close errors don't affect the validation of the error output.
_ = stdoutR.Close()
_ = stderrR.Close()
// Should fail because the URI is invalid/unreachable
if exitCode != 1 {
t.Errorf("Expected failure (exit code 1) for invalid URI, got %d", exitCode)
}
// Should have error output in structured log format
output := stdoutBuf.String()
if !strings.Contains(output, "level=ERROR") && !strings.Contains(output, "failed to process course") {
t.Errorf("Expected error message about processing course, got: %s", output)
}
}
// TestRunWithValidJSONFile tests the run function with a valid JSON file.
func TestRunWithValidJSONFile(t *testing.T) {
// Create a temporary test JSON file
testContent := `{
"title": "Test Course",
"lessons": [
{
"id": "lesson1",
"title": "Test Lesson",
"blocks": [
{
"type": "text",
"id": "block1",
"data": {
"text": "Test content"
}
}
]
}
]
}`
tmpFile, err := os.CreateTemp("", "test-course-*.json")
if err != nil {
t.Fatalf("Failed to create temp file: %v", err)
}
// Ensure temporary test file is cleaned up. Remove errors are ignored because
// the test has already used the file for its purpose, and cleanup failures don't
// invalidate the test results (the OS will eventually clean up temp files).
defer func() {
_ = os.Remove(tmpFile.Name())
}()
if _, err := tmpFile.WriteString(testContent); err != nil {
t.Fatalf("Failed to write test content: %v", err)
}
// Close the temporary file. Errors are ignored because we've already written
// the test content and the main test logic (loading the file) doesn't depend
// on the success of closing this file descriptor.
_ = tmpFile.Close()
// Test successful run with valid file
outputFile := "test-output.md"
// Ensure test output file is cleaned up. Remove errors are ignored because the
// test has already verified the export succeeded; cleanup failures don't affect
// the test assertions.
defer func() {
_ = os.Remove(outputFile)
}()
// Save original stdout
originalStdout := os.Stdout
defer func() { os.Stdout = originalStdout }()
// Capture stdout
r, w, _ := os.Pipe()
os.Stdout = w
args := []string{"articulate-parser", tmpFile.Name(), "markdown", outputFile}
exitCode := run(args)
// Close write end and restore stdout. Close errors are ignored: we've already
// written the success message before closing, and any close error doesn't affect
// the validity of the captured output or the test assertions.
_ = w.Close()
os.Stdout = originalStdout
// Read captured output. Copy errors are ignored: the success message was
// successfully written to the pipe, and we can verify it regardless of any
// edge cases in pipe closure or I/O synchronization.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify successful execution
if exitCode != 0 {
t.Errorf("Expected successful execution (exit code 0), got %d", exitCode)
}
// Verify success message in structured log format
if !strings.Contains(output, "level=INFO") || !strings.Contains(output, "successfully exported course") {
t.Errorf("Expected success message in output, got: %s", output)
}
// Verify output file was created
if _, err := os.Stat(outputFile); os.IsNotExist(err) {
t.Errorf("Expected output file %s to be created", outputFile)
}
}
// TestRunIntegration tests the run function with different output formats using sample file.
func TestRunIntegration(t *testing.T) {
// Skip if sample file doesn't exist
if _, err := os.Stat("articulate-sample.json"); os.IsNotExist(err) {
t.Skip("Skipping integration test: articulate-sample.json not found")
}
formats := []struct {
format string
output string
}{
{"markdown", "test-output.md"},
{"html", "test-output.html"},
{"docx", "test-output.docx"},
}
for _, format := range formats {
t.Run("format_"+format.format, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run the function
args := []string{"articulate-parser", "articulate-sample.json", format.format, format.output}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the export success message
// has already been written and we're about to read it; close semantics
// don't affect the validity of the captured output.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: the output was successfully
// produced and we can verify its contents regardless of any I/O edge cases.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Clean up test file. Remove errors are ignored because the test has
// already verified the export succeeded; cleanup failures don't affect
// the test assertions.
defer func() {
_ = os.Remove(format.output)
}()
// Verify successful execution
if exitCode != 0 {
t.Errorf("Expected successful execution (exit code 0), got %d", exitCode)
}
// Verify success message in structured log format
if !strings.Contains(output, "level=INFO") || !strings.Contains(output, "successfully exported course") {
t.Errorf("Expected success message in output, got: %s", output)
}
// Verify output file was created
if _, err := os.Stat(format.output); os.IsNotExist(err) {
t.Errorf("Expected output file %s to be created", format.output)
}
})
}
}
// TestMainFunction tests that the main function exists and is properly structured.
// We can't test os.Exit behavior directly, but we can verify the main function
// calls the run function correctly by testing run function behavior.
func TestMainFunction(t *testing.T) {
// Test that insufficient args return exit code 1
exitCode := run([]string{"program"})
if exitCode != 1 {
t.Errorf("Expected run to return exit code 1 for insufficient args, got %d", exitCode)
}
// Test that main function exists (this is mainly for coverage)
// The main function just calls os.Exit(run(os.Args)), which we can't test directly
// but we've tested the run function thoroughly above.
}