32 Commits

Author SHA1 Message Date
33673d661b fix: set go 1.24.0 minimum with toolchain 1.25.5 2026-01-05 03:31:09 +01:00
41f3f5c4e2 [autofix.ci] apply automated fixes 2026-01-05 02:26:28 +00:00
d644094999 chore: enable CGO for race detection, update deps, drop old Go versions 2026-01-05 03:24:49 +01:00
71d1429048 chore: update actions/checkout to v6, improve AGENTS.md 2026-01-05 03:17:26 +01:00
bd308e4dfc refactor(exporter): rewrite HTML exporter to use Go templates
Replaces the manual string-building implementation of the HTML exporter with a more robust and maintainable solution using Go's `html/template` package. This improves readability, security, and separation of concerns.

- HTML structure and CSS styles are moved into their own files and embedded into the binary using `go:embed`.
- A new data preparation layer adapts the course model for the template, simplifying rendering logic.
- Tests are updated to reflect the new implementation, removing obsolete test cases for the old string-building methods.

Additionally, this commit:
- Adds an `AGENTS.md` file with development and contribution guidelines.
- Updates `.golangci.yml` to allow standard Go patterns for interface package naming.
2025-11-07 06:33:38 +01:00
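
A minimal sketch of the template-based exporter described in the commit above, assuming hypothetical `templates/course.html.tmpl` and `templates/styles.css` files and a simplified view model; the project's actual template layout and data preparation layer differ:

```go
package exporters

import (
	"embed"
	"html/template"
	"io"
)

// templateFS embeds the exporter's HTML template and stylesheet.
// The paths are illustrative; the real files live alongside the exporter.
//
//go:embed templates/course.html.tmpl templates/styles.css
var templateFS embed.FS

// courseView is a hypothetical, template-friendly view of the course model.
type courseView struct {
	Title   string
	Lessons []string
}

// renderCourse parses the embedded template and writes the rendered page,
// relying on html/template's contextual auto-escaping for safety.
func renderCourse(w io.Writer, view courseView) error {
	tmpl, err := template.ParseFS(templateFS, "templates/course.html.tmpl")
	if err != nil {
		return err
	}
	return tmpl.Execute(w, view)
}
```
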
227f88cb9b chore(lint): fix golangci-lint issues
- Remove duplicate package comments (godoclint)
- Improve code style (gocritic: assignOp, elseif, emptyStringTest)
- Extract repeated format strings to constants (goconst)
- Fix naming conventions: OriginalUrl -> OriginalURL (revive)
- Wrap external errors with context (wrapcheck)
- Disable gocognit for test files in .golangci.yml

Remaining issues by design:
- funlen: getDefaultCSS (CSS content)
- revive: interfaces package name (meaningful in context)
2025-11-06 16:50:44 +01:00
b56c9fa29f refactor: Align with Go conventions and improve maintainability
Renames the `OriginalUrl` field to `OriginalURL` across media models to adhere to Go's common initialisms convention. The `json` tag is unchanged to maintain API compatibility.

Introduces constants for exporter formats (e.g., `FormatMarkdown`, `FormatDocx`) to eliminate the use of magic strings, enhancing type safety and making the code easier to maintain.

Additionally, this commit includes several minor code quality improvements:
- Wraps file-writing errors in exporters to provide more context.
- Removes redundant package-level comments from test files.
- Applies various minor linting fixes throughout the codebase.
2025-11-06 16:48:00 +01:00
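
A small sketch of the two changes above; the json tag value and the constant values are assumptions, not the project's actual definitions:

```go
package models

// Media illustrates the rename: the Go field follows the URL initialism
// convention while the json tag (exact value assumed here) stays the same,
// so serialized output is unchanged.
type Media struct {
	OriginalURL string `json:"originalUrl"`
}

// Hypothetical exporter format constants replacing magic strings; the names
// FormatMarkdown and FormatDocx come from the commit, the values are assumed.
const (
	FormatMarkdown = "markdown"
	FormatHTML     = "html"
	FormatDocx     = "docx"
)
```
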
d8e4d97841 chore: Apply modern Go idioms and perform code cleanup
This commit introduces a series of small refactorings and style fixes across the codebase to improve consistency and leverage modern Go features.

Key changes include:
- Adopting the Go 1.22 `reflect.TypeFor` generic function.
- Replacing `interface{}` with the `any` type alias for better readability.
- Using the explicit `http.NoBody` constant for HTTP requests.
- Updating octal literals for file permissions to the `0o` prefix syntax.
- Standardizing comment formatting and fixing minor typos.
- Removing redundant blank lines and organizing imports.
2025-11-06 15:59:11 +01:00
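
The listed idioms look roughly like this in practice (a standalone illustration, not code from the repository):

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"reflect"
)

func examples() error {
	// reflect.TypeFor (Go 1.22+) instead of reflect.TypeOf((*T)(nil)).Elem().
	t := reflect.TypeFor[map[string]any]()
	fmt.Println(t)

	// any instead of interface{}.
	var payload any = "hello"
	_ = payload

	// Explicit http.NoBody for requests that carry no body.
	req, err := http.NewRequest(http.MethodGet, "https://example.com", http.NoBody)
	if err != nil {
		return err
	}
	_ = req

	// 0o-prefixed octal literals for file permissions.
	return os.WriteFile("example.txt", []byte("data"), 0o644)
}

func main() {
	if err := examples(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```
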
fe588dadda chore(ci): add linting and refine workflow dependencies
Adds a golangci-lint job to the CI pipeline to enforce code quality and style. The test job is now dependent on the new linting job.

The final image build job is also updated to depend on the successful completion of the test, docker-test, and dependency-review jobs, ensuring more checks pass before publishing.

Additionally, Go 1.25 is added to the testing matrix.
2025-11-06 15:56:29 +01:00
68c6f4e408 chore!: prepare for v1.0.0 release
Bumps the application version to 1.0.0, signaling the first stable release. This version consolidates several new features and breaking API changes.

This commit also includes various code quality improvements:
- Modernizes tests to use t.Setenv for safer environment variable handling.
- Addresses various linter warnings (gosec, errcheck).
- Updates loop syntax to use Go 1.22's range-over-integer feature.

BREAKING CHANGE: The public API has been updated for consistency and to introduce new features like context support and structured logging.
- `GetSupportedFormat()` is renamed to `SupportedFormat()`.
- `GetSupportedFormats()` is renamed to `SupportedFormats()`.
- `FetchCourse()` now requires a `context.Context` parameter.
- `NewArticulateParser()` constructor signature has been updated.
2025-11-06 05:59:52 +01:00
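
A hedged sketch of how calling code adapts to the breaking changes listed above; the interface shapes and return types are stand-ins, not the project's definitions in internal/interfaces:

```go
package app

import (
	"context"
	"time"
)

// Stand-in interfaces: the real ones return concrete model types, but the
// renamed methods and the new context parameter are the point here.
type Exporter interface {
	SupportedFormat() string    // was GetSupportedFormat()
	SupportedFormats() []string // was GetSupportedFormats()
}

type CourseParser interface {
	FetchCourse(ctx context.Context, uri string) (any, error) // now takes a context
}

// fetchWithTimeout shows the caller-side change: a context with a deadline is
// created and passed down so the network fetch can be cancelled.
func fetchWithTimeout(p CourseParser, uri string) (any, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	return p.FetchCourse(ctx, uri)
}
```
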
37927a36b6 refactor(core)!: Add context, config, and structured logging
Introduces `context.Context` to the `FetchCourse` method and its call chain, allowing for cancellable network requests and timeouts. This improves application robustness when fetching remote course data.

A new configuration package centralizes application settings, loading them from environment variables with sensible defaults for base URL, request timeout, and logging.

Standard `log` and `fmt` calls are replaced with a structured logging system built on `slog`, supporting both JSON and human-readable text formats.

This change also includes:
- Extensive benchmarks and example tests.
- Simplified Go doc comments across several packages.

BREAKING CHANGE: The `NewArticulateParser` constructor signature has been updated to accept a logger, base URL, and timeout, which are now supplied via the new configuration system.
2025-11-06 05:14:14 +01:00
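
A compact sketch of the configuration-plus-slog setup described above; the environment variable names and defaults are assumptions, not the project's documented settings:

```go
package config

import (
	"log/slog"
	"os"
	"time"
)

// Config holds the illustrative settings named in the commit message.
type Config struct {
	BaseURL   string
	Timeout   time.Duration
	LogFormat string // "json" or "text"
}

// Load reads settings from (assumed) environment variables with defaults.
func Load() Config {
	cfg := Config{
		BaseURL:   "https://rise.articulate.com",
		Timeout:   30 * time.Second,
		LogFormat: "text",
	}
	if v := os.Getenv("PARSER_BASE_URL"); v != "" {
		cfg.BaseURL = v
	}
	if v := os.Getenv("PARSER_TIMEOUT"); v != "" {
		if d, err := time.ParseDuration(v); err == nil {
			cfg.Timeout = d
		}
	}
	if v := os.Getenv("PARSER_LOG_FORMAT"); v != "" {
		cfg.LogFormat = v
	}
	return cfg
}

// NewLogger builds a structured logger in either JSON or text format.
func NewLogger(cfg Config) *slog.Logger {
	if cfg.LogFormat == "json" {
		return slog.New(slog.NewJSONHandler(os.Stderr, nil))
	}
	return slog.New(slog.NewTextHandler(os.Stderr, nil))
}
```
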
e6977d3374 refactor(html_cleaner): adopt robust HTML parsing for content cleaning
Replaces the fragile regex-based HTML cleaning logic with a proper HTML parser using `golang.org/x/net/html`. The previous implementation was unreliable and could not correctly handle malformed tags, script content, or a wide range of HTML entities.

This new approach provides several key improvements:
- Skips the content of `<script>` tags so embedded script code is not extracted as text.
2025-11-06 04:26:51 +01:00
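
A simplified sketch of the parser-based cleaning approach, using golang.org/x/net/html as the commit describes; the project's cleaner handles entities, whitespace, and more element types than this:

```go
package services

import (
	"strings"

	"golang.org/x/net/html"
)

// cleanHTML extracts plain text from an HTML fragment, skipping <script>
// subtrees so embedded code never ends up in the exported text.
func cleanHTML(fragment string) (string, error) {
	doc, err := html.Parse(strings.NewReader(fragment))
	if err != nil {
		return "", err
	}
	var b strings.Builder
	var walk func(n *html.Node)
	walk = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "script" {
			return // skip the whole subtree
		}
		if n.Type == html.TextNode {
			b.WriteString(n.Data)
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			walk(c)
		}
	}
	walk(doc)
	return strings.TrimSpace(b.String()), nil
}
```
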
2790064ad5 refactor: Standardize method names and introduce context propagation
Removes the `Get` prefix from exporter methods (e.g., GetSupportedFormat -> SupportedFormat) to better align with Go conventions for simple accessors.

Introduces `context.Context` propagation through the application, starting from `ProcessCourseFromURI` down to the HTTP request in the parser. This makes network operations cancellable and allows for setting deadlines, improving application robustness.

Additionally, optimizes the HTML cleaner by pre-compiling regular expressions for a minor performance gain.
2025-11-06 04:26:41 +01:00
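
A sketch of the two mechanics mentioned above: passing the context all the way into the HTTP request, and compiling regular expressions once at package scope. Names and the regex pattern are illustrative:

```go
package services

import (
	"context"
	"fmt"
	"net/http"
	"regexp"
)

// Compiled once at package init instead of on every call.
var whitespaceRE = regexp.MustCompile(`\s+`)

// fetchJSON builds a cancellable, deadline-aware request from the caller's context.
func fetchJSON(ctx context.Context, client *http.Client, url string) (*http.Response, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, http.NoBody)
	if err != nil {
		return nil, fmt.Errorf("building request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("fetching %s: %w", url, err)
	}
	return resp, nil
}
```
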
65469ea52e chore: Improve code quality and address linter feedback
This commit introduces several improvements across the codebase, primarily focused on enhancing performance, robustness, and developer experience based on static analysis feedback.

- Replaces `WriteString(fmt.Sprintf())` with the more performant `fmt.Fprintf` in the HTML and Markdown exporters.
- Enhances deferred `Close()` operations to log warnings on failure instead of silently ignoring potential I/O issues.
- Explicitly discards non-critical errors in test suites, particularly during file cleanup, to satisfy linters and clarify intent.
- Suppresses command echoing in `Taskfile.yml` for cleaner output during development tasks.
2025-11-06 04:17:00 +01:00
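
A small illustration of both patterns from this commit; the function and field names are invented for the example:

```go
package exporters

import (
	"fmt"
	"log/slog"
	"os"
)

// writeHeading writes directly with fmt.Fprintf instead of
// WriteString(fmt.Sprintf(...)), and logs a warning if the deferred Close fails.
func writeHeading(path, title string, logger *slog.Logger) error {
	f, err := os.Create(path)
	if err != nil {
		return fmt.Errorf("creating %s: %w", path, err)
	}
	defer func() {
		if cerr := f.Close(); cerr != nil {
			logger.Warn("failed to close output file", "path", path, "error", cerr)
		}
	}()

	if _, err := fmt.Fprintf(f, "# %s\n", title); err != nil {
		return fmt.Errorf("writing heading: %w", err)
	}
	return nil
}
```
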
2db2e0b1a3 feat(task): add golangci-lint tasks and fix Windows checks
Introduces new tasks for running `golangci-lint`, a powerful and fast Go linter. This includes `lint:golangci` for checking the codebase and `lint:golangci:fix` for automatically applying fixes.

Additionally, this commit corrects the command used for checking the existence of executables on Windows. The change from `where` to `where.exe` ensures better cross-platform compatibility and reliability within the Taskfile.
2025-11-06 04:00:53 +01:00
6317ce268b refactor(exporters): replace deprecated strings.Title with cases.Title
The `strings.Title` function is deprecated because it does not handle Unicode punctuation correctly.

This change replaces its usage in the DOCX, HTML, and Markdown exporters with the recommended `golang.org/x/text/cases` package. This ensures more robust and accurate title-casing for item headings.
2025-11-06 03:55:07 +01:00
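
The replacement looks roughly like this; the language tag choice is an assumption:

```go
package exporters

import (
	"golang.org/x/text/cases"
	"golang.org/x/text/language"
)

// headingFor title-cases an item heading without the deprecated strings.Title,
// e.g. "knowledge check" -> "Knowledge Check".
func headingFor(itemType string) string {
	return cases.Title(language.English).String(itemType)
}
```
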
59f2de9d22 chore(build): introduce go-task for project automation
Adds a comprehensive Taskfile.yml to centralize all project scripts for building, testing, linting, and Docker image management.

The GitHub Actions CI workflow is refactored to utilize these `task` commands, resulting in a cleaner, more readable, and maintainable configuration. This approach ensures consistency between local development and CI environments.
2025-11-06 03:52:21 +01:00
f8fecc3967 chore: clean up CI workflows by removing unused release job and updating permissions 2025-11-05 22:38:56 +01:00
af15bcccd4 chore: update CI actions, Go 1.25, Alpine 3.22
Updates CI to latest major actions (checkout v5, setup-go v6, upload-artifact v5, CodeQL v4) for security and compatibility.
Uses stable major tag for autofix action.
Updates Docker images to Go 1.25 and Alpine 3.22 to leverage newer toolchain and patched bases.

Updates open-pull-requests-limit to 2 in dependabot.yml and upgrades the CodeQL action to v4
2025-11-05 22:38:28 +01:00
422b56aa86 deps: Bump Go to 1.24 and x/image to v0.32 2025-11-05 22:22:17 +01:00
903ee92e4c Update ci.yml
- Added docker hub to the login.
- Removed some cache BS.
2025-05-29 00:19:25 +02:00
9c51c0d9e3 Reorganizes badges in README for clarity
Switches CI and Docker badges to clarify workflow separation.
Promotes Docker image visibility by rearranging badge positions.
2025-05-28 23:50:54 +02:00
ec5c8c099c Update labels and bump version to 0.4.0
Standardizes dependabot labels to include 'dependencies/' prefix
for better organization and clarity.

Bumps application version to 0.4.0 to reflect recent changes
and improvements.
2025-05-28 23:31:16 +02:00
9eaf7dfcf2 docker(deps): bump golang in the docker-images group (#4)
Bumps the docker-images group with 1 update: golang.


Updates `golang` from 1.23-alpine to 1.24-alpine

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24-alpine
  dependency-type: direct:production
  dependency-group: docker-images
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 23:09:09 +02:00
b7f23b2387 Add Docker support and GitHub Container Registry CI workflow (#3)
* Add comprehensive Docker support with multi-stage builds
* Set up GitHub Container Registry integration
* Enhance CI/CD workflows with Docker build and push capabilities
* Add --help and --version flags to main application
* Update documentation with Docker usage examples
* Implement security best practices for container deployment
2025-05-28 23:04:43 +02:00
a0003983c4 [autofix.ci] apply automated fixes 2025-05-28 12:24:31 +00:00
1c1460ff04 Refactors main function and enhances test suite
Refactors the main function for improved testability by extracting
the core logic into a new run function. Updates argument handling
and error reporting to use return codes instead of os.Exit.

Adds comprehensive test coverage for main functionality,
including integration tests and validation against edge cases.

Enhances README with updated code coverage and feature improvement lists.

Improves the maintainability and testability of the application.

Bumps version to 0.3.1
2025-05-28 14:23:56 +02:00
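
A hedged sketch of the main/run split described above; the usage string and argument handling are placeholders, not the project's actual CLI:

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// run holds the program logic and reports an exit code instead of calling
// os.Exit directly, which keeps it testable.
func run(args []string, stdout, stderr io.Writer) int {
	if len(args) < 2 {
		fmt.Fprintln(stderr, "usage: articulate-parser <input> [format] [output]")
		return 1
	}
	fmt.Fprintf(stdout, "processing %s\n", args[1])
	return 0
}

func main() {
	os.Exit(run(os.Args, os.Stdout, os.Stderr))
}
```
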
1b945ca2bc Updates version and improves diagram styling
Bumps application version to 0.3.0 for new features or fixes.
Enhances README diagram styling for better readability in both
light and dark GitHub themes by adjusting colors and adding
text color contrast.
2025-05-28 13:26:33 +02:00
fb343f9a23 Merge pull request #2
Merge pull request #2: Add HTML export functionality and fix build system cross-compilation issues
2025-05-28 13:23:45 +02:00
ce5b5c20bb Enhances README and script improvements
Updates README to include HTML format, architecture overview,
and comprehensive feature descriptions. Improves build scripts
by adding cleanup of environment variables and validating Go
installation, reducing cross-platform build issues.

Fixes README inconsistencies and improves script reliability.
2025-05-28 13:17:08 +02:00
cc11d2fd84 feat: Add HTML export functionality and GitHub workflow
- Implement HTMLExporter with professional styling and embedded CSS
- Add comprehensive test suite for HTML export functionality
- Update factory to support HTML format ('html' and 'htm')
- Add autofix.ci GitHub workflow for code formatting
- Support all content types: text, lists, quizzes, multimedia, etc.
- Include proper HTML escaping for security
- Add benchmark tests for performance validation
2025-05-28 13:00:27 +02:00
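
The "proper HTML escaping" item boils down to something like the standard-library call below; a stand-in, not the exporter's actual code:

```go
package exporters

import "html"

// escapeText is the kind of escaping step applied to untrusted course content
// before it is embedded in HTML output.
func escapeText(s string) string {
	return html.EscapeString(s) // "<script>" becomes "&lt;script&gt;"
}
```
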
b01260e765 Add comprehensive unit tests for services and main package
- Implement tests for the app service, including course processing from file and URI.
- Create mock implementations for CourseParser and Exporter to facilitate testing.
- Add tests for HTML cleaner service to validate HTML content cleaning functionality.
- Develop tests for the parser service, covering course fetching and loading from files.
- Introduce tests for utility functions in the main package, ensuring URI validation and string joining.
- Include benchmarks for performance evaluation of key functions.
2025-05-25 15:46:10 +02:00
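
The mock implementations mentioned above are typically plain structs that satisfy the interfaces; a hedged sketch with stand-in signatures (the real interfaces use concrete model types):

```go
package services_test

// mockParser returns a canned course or error for parser-dependent tests.
type mockParser struct {
	course any
	err    error
}

func (m *mockParser) FetchCourse(uri string) (any, error) {
	return m.course, m.err
}

// mockExporter records which output paths it was asked to write.
type mockExporter struct {
	exportedPaths []string
}

func (m *mockExporter) Export(course any, outputPath string) error {
	m.exportedPaths = append(m.exportedPaths, outputPath)
	return nil
}
```
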
57 changed files with 9860 additions and 599 deletions

.dockerignore Normal file

@ -0,0 +1,68 @@
# Git
.git
.gitignore
.gitattributes
# CI/CD
.github
.codecov.yml
# Documentation
README.md
*.md
docs/
# Build artifacts
build/
dist/
*.exe
*.tar.gz
*.zip
# Test files
*_test.go
test_*.go
test/
coverage.out
coverage.html
*.log
# Development
.vscode/
.idea/
*.swp
*.swo
*~
# OS specific
.DS_Store
Thumbs.db
# Output and temporary files
output/
tmp/
temp/
# Node.js (if any)
node_modules/
npm-debug.log
yarn-error.log
# Python (if any)
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
# Scripts (build scripts not needed in container)
scripts/
# Sample files
articulate-sample.json
test_input.json
# License
LICENSE

.github/dependabot.yml

@ -8,14 +8,60 @@ updates:
day: 'monday' day: 'monday'
time: '07:00' time: '07:00'
timezone: 'Europe/Amsterdam' timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 10 open-pull-requests-limit: 2
labels: labels:
- 'dependencies' - 'dependencies'
- 'github-actions' - 'dependencies/github-actions'
commit-message: commit-message:
prefix: 'ci' prefix: 'ci'
include: 'scope' include: 'scope'
# Check for updates to Docker
- package-ecosystem: 'docker'
directory: '/'
schedule:
interval: 'weekly'
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'dependencies/docker'
commit-message:
prefix: 'docker'
include: 'scope'
groups:
docker:
patterns:
- '*'
update-types:
- 'minor'
- 'patch'
# Check for updates to Docker Compose
- package-ecosystem: 'docker-compose'
directory: '/'
schedule:
interval: 'weekly'
day: 'monday'
time: '07:00'
timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 2
labels:
- 'dependencies'
- 'dependencies/docker-compose'
commit-message:
prefix: 'docker'
include: 'scope'
groups:
docker:
patterns:
- '*'
update-types:
- 'minor'
- 'patch'
# Check for updates to Go modules # Check for updates to Go modules
- package-ecosystem: 'gomod' - package-ecosystem: 'gomod'
directory: '/' directory: '/'
@ -24,10 +70,10 @@ updates:
day: 'monday' day: 'monday'
time: '07:00' time: '07:00'
timezone: 'Europe/Amsterdam' timezone: 'Europe/Amsterdam'
open-pull-requests-limit: 10 open-pull-requests-limit: 2
labels: labels:
- 'dependencies' - 'dependencies'
- 'go' - 'dependencies/go'
commit-message: commit-message:
prefix: 'deps' prefix: 'deps'
include: 'scope' include: 'scope'

.github/workflows/autofix.yml vendored Normal file

@ -0,0 +1,45 @@
name: autofix.ci
on:
pull_request:
push:
branches: ["master"]
permissions:
contents: read
jobs:
autofix:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v6
- name: Install Task
uses: go-task/setup-task@v1
- uses: actions/setup-go@v6
with: { go-version-file: "go.mod" }
- name: Setup go deps
run: |
# Install golangci-lint
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/HEAD/install.sh | sh -s -- -b $(go env GOPATH)/bin
# Install go-task dependencies
go install golang.org/x/tools/cmd/goimports@latest
- name: Run goimports
run: goimports -w .
- name: Run golangci-lint autofix
run: golangci-lint run --fix
- name: Run golangci-lint format
run: golangci-lint fmt
- name: Run go mod tidy
run: go mod tidy
- name: Run gopls modernize
run: task modernize
- uses: autofix-ci/action@v1

.github/workflows/ci.yml

@ -2,58 +2,279 @@ name: CI
on: on:
push: push:
branches: [ "master", "develop" ] branches: ["master", "develop"]
tags:
- "v*.*.*"
pull_request: pull_request:
branches: [ "master", "develop" ]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs: jobs:
golangci:
name: lint
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-go@v6
with:
go-version: stable
- name: golangci-lint
uses: golangci/golangci-lint-action@v9
with: { version: latest }
test: test:
name: Test name: Test
needs: [golangci]
runs-on: ubuntu-latest runs-on: ubuntu-latest
permissions: permissions:
contents: write contents: write
strategy: strategy:
matrix: matrix:
go: [1.21.x, 1.22.x, 1.23.x, 1.24.x] go:
- 1.24.x
- 1.25.x
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v6
- name: Set up Go ${{ matrix.go }} - name: Set up Go ${{ matrix.go }}
uses: actions/setup-go@v5 uses: actions/setup-go@v6
with: with:
go-version: ${{ matrix.go }} go-version: ${{ matrix.go }}
check-latest: true check-latest: true
cache-dependency-path: "**/*.sum"
- name: Install Task
uses: go-task/setup-task@v1
- name: Show build info
run: task info
- name: Download dependencies - name: Download dependencies
run: go mod download && echo "Download successful" || go mod tidy && echo "Tidy successful" || return 1 run: task deps
- name: Verify dependencies
run: go mod verify
- name: Build - name: Build
run: go build -v ./... run: task build
- name: Run tests - name: Run tests with enhanced reporting
run: go test -v -race -coverprofile=coverage.out ./... id: test
- name: Run go vet
run: go vet ./...
- name: Run go fmt
run: | run: |
if [ "$(gofmt -s -l . | wc -l)" -gt 0 ]; then cat >> $GITHUB_STEP_SUMMARY << EOF
echo "The following files are not formatted:" ## 🔧 Test Environment
gofmt -s -l . - **Go Version:** ${{ matrix.go }}
exit 1 - **OS:** ubuntu-latest
- **Timestamp:** $(date -u)
EOF
echo "Running tests with coverage..."
task test:coverage 2>&1 | tee test-output.log
# Extract test results for summary
TEST_STATUS=$?
TOTAL_TESTS=$(grep -c "=== RUN" test-output.log || echo "0")
PASSED_TESTS=$(grep -c "--- PASS:" test-output.log || echo "0")
FAILED_TESTS=$(grep -c "--- FAIL:" test-output.log || echo "0")
SKIPPED_TESTS=$(grep -c "--- SKIP:" test-output.log || echo "0")
# Generate test summary
cat >> $GITHUB_STEP_SUMMARY << EOF
## 🧪 Test Results (Go ${{ matrix.go }})
| Metric | Value |
| ----------- | ----------------------------------------------------------- |
| Total Tests | $TOTAL_TESTS |
| Passed | $PASSED_TESTS |
| Failed | $FAILED_TESTS |
| Skipped | $SKIPPED_TESTS |
| Status | $([ $TEST_STATUS -eq 0 ] && echo "PASSED" || echo "FAILED") |
### 📦 Package Test Results
| Package | Status |
|---------|--------|
EOF
# Extract package results
grep "^ok\|^FAIL" test-output.log | while read -r line; do
if [[ $line == ok* ]]; then
pkg=$(echo "$line" | awk '{print $2}')
echo "| $pkg | ✅ PASS |" >> $GITHUB_STEP_SUMMARY
elif [[ $line == FAIL* ]]; then
pkg=$(echo "$line" | awk '{print $2}')
echo "| $pkg | ❌ FAIL |" >> $GITHUB_STEP_SUMMARY
fi
done
echo "" >> $GITHUB_STEP_SUMMARY
# Add detailed results if tests failed
if [ $TEST_STATUS -ne 0 ]; then
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
### ❌ Failed Tests Details
```
EOF
grep -A 10 "--- FAIL:" test-output.log | head -100 >> $GITHUB_STEP_SUMMARY
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
```
EOF
fi fi
# Set outputs for other steps
cat >> $GITHUB_OUTPUT << EOF
test-status=$TEST_STATUS
total-tests=$TOTAL_TESTS
passed-tests=$PASSED_TESTS
failed-tests=$FAILED_TESTS
EOF
# Exit with the original test status
exit $TEST_STATUS
- name: Generate coverage report
if: always()
run: |
if [ -f coverage/coverage.out ]; then
COVERAGE=$(go tool cover -func=coverage/coverage.out | grep total | awk '{print $3}')
cat >> $GITHUB_STEP_SUMMARY << EOF
## 📊 Code Coverage (Go ${{ matrix.go }})
**Total Coverage: $COVERAGE**
<details>
<summary>Click to expand 📋 Coverage by Package details</summary>
| Package | Coverage |
| ------- | -------- |
EOF
# Create temporary file for package coverage aggregation
temp_coverage=$(mktemp)
# Extract package-level coverage data
go tool cover -func=coverage/coverage.out | grep -v total | while read -r line; do
if [[ $line == *".go:"* ]]; then
# Extract package path from file path (everything before the filename)
filepath=$(echo "$line" | awk '{print $1}')
pkg_path=$(dirname "$filepath" | sed 's|github.com/kjanat/articulate-parser/||; s|^\./||')
coverage=$(echo "$line" | awk '{print $3}' | sed 's/%//')
# Use root package if no subdirectory
[[ "$pkg_path" == "." || "$pkg_path" == "" ]] && pkg_path="root"
echo "$pkg_path $coverage" >> "$temp_coverage"
fi
done
# Aggregate coverage by package (average)
awk '{
packages[$1] += $2
counts[$1]++
}
END {
for (pkg in packages) {
avg = packages[pkg] / counts[pkg]
printf "| %s | %.1f%% |\n", pkg, avg
}
}' "$temp_coverage" | sort >> $GITHUB_STEP_SUMMARY
rm -f "$temp_coverage"
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
</details>
EOF
else
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
## ⚠️ Coverage Report
No coverage file generated
EOF
fi
- name: Upload test artifacts
if: failure()
uses: actions/upload-artifact@v6
with:
name: test-results-go-${{ matrix.go }}
path: |
test-output.log
coverage/
retention-days: 7
- name: Run linters
run: |
# Initialize summary
cat >> $GITHUB_STEP_SUMMARY << EOF
## 🔍 Static Analysis (Go ${{ matrix.go }})
EOF
# Run go vet
VET_OUTPUT=$(task lint:vet 2>&1 || echo "")
VET_STATUS=$?
if [ $VET_STATUS -eq 0 ]; then
echo "✅ **go vet:** No issues found" >> $GITHUB_STEP_SUMMARY
else
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
❌ **go vet:** Issues found
```
EOF
echo "$VET_OUTPUT" >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
fi
echo "" >> $GITHUB_STEP_SUMMARY
# Run go fmt check
FMT_OUTPUT=$(task lint:fmt 2>&1 || echo "")
FMT_STATUS=$?
if [ $FMT_STATUS -eq 0 ]; then
echo "✅ **go fmt:** All files properly formatted" >> $GITHUB_STEP_SUMMARY
else
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
❌ **go fmt:** Files need formatting
```
EOF
echo "$FMT_OUTPUT" >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
fi
# Exit with error if any linter failed
[ $VET_STATUS -eq 0 ] && [ $FMT_STATUS -eq 0 ] || exit 1
- name: Job Summary
if: always()
run: |
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
## 📋 Job Summary (Go ${{ matrix.go }})
| Step | Status |
| --------------- | --------------------------------------------------------------- |
| Dependencies | Success |
| Build | Success |
| Tests | ${{ steps.test.outcome == 'success' && 'Success' || 'Failed' }} |
| Coverage | ${{ job.status == 'success' && 'Generated' || 'Partial' }} |
| Static Analysis | ${{ job.status == 'success' && 'Clean' || 'Issues' }} |
| Code Formatting | ${{ job.status == 'success' && 'Clean' || 'Issues' }} |
EOF
- name: Upload coverage reports to Codecov - name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v5 uses: codecov/codecov-action@v5
with: with:
files: ./coverage/coverage.out
flags: Go ${{ matrix.go }} flags: Go ${{ matrix.go }}
slug: kjanat/articulate-parser slug: kjanat/articulate-parser
token: ${{ secrets.CODECOV_TOKEN }} token: ${{ secrets.CODECOV_TOKEN }}
@ -65,6 +286,52 @@ jobs:
flags: Go ${{ matrix.go }} flags: Go ${{ matrix.go }}
token: ${{ secrets.CODECOV_TOKEN }} token: ${{ secrets.CODECOV_TOKEN }}
docker-test:
name: Docker Build Test
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: go.mod
check-latest: true
- name: Install Task
uses: go-task/setup-task@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build Docker image using Task
run: task docker:build
- name: Test Docker image using Task
run: |
cat >> $GITHUB_STEP_SUMMARY << 'EOF'
## 🧪 Docker Image Tests
EOF
# Run Task docker test
task docker:test
echo "**Testing help command:**" >> $GITHUB_STEP_SUMMARY
echo '```terminaloutput' >> $GITHUB_STEP_SUMMARY
docker run --rm articulate-parser:latest --help >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Test image size
IMAGE_SIZE=$(docker image inspect articulate-parser:latest --format='{{.Size}}' | numfmt --to=iec-i --suffix=B)
echo "**Image size:** $IMAGE_SIZE" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
dependency-review: dependency-review:
name: Dependency Review name: Dependency Review
runs-on: ubuntu-latest runs-on: ubuntu-latest
@ -72,105 +339,140 @@ jobs:
contents: read contents: read
if: github.event_name == 'pull_request' if: github.event_name == 'pull_request'
steps: steps:
- name: 'Checkout Repository' - name: "Checkout Repository"
uses: actions/checkout@v4 uses: actions/checkout@v6
- name: 'Dependency Review' - name: "Dependency Review"
uses: actions/dependency-review-action@v4 uses: actions/dependency-review-action@v4
with: with:
fail-on-severity: moderate fail-on-severity: moderate
comment-summary-in-pr: always comment-summary-in-pr: always
# # Use comma-separated names to pass list arguments: docker:
# deny-licenses: LGPL-2.0, BSD-2-Clause name: Docker Build & Push
release:
name: Release
runs-on: ubuntu-latest runs-on: ubuntu-latest
if: github.ref_type == 'tag'
permissions: permissions:
contents: write contents: read
needs: [ "test" ] packages: write
needs: [test, docker-test, dependency-review]
if: |
github.event_name == 'push' && (github.ref == 'refs/heads/master' ||
github.ref == 'refs/heads/develop' ||
startsWith(github.ref, 'refs/heads/feature/docker'))
steps: steps:
- uses: actions/checkout@v4 - name: Checkout repository
uses: actions/checkout@v6
- name: Login to Docker Hub
uses: docker/login-action@v3
with: with:
fetch-depth: 0 username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Set up Go - name: Log in to GitHub Container Registry
uses: actions/setup-go@v5 uses: docker/login-action@v3
with: with:
go-version-file: 'go.mod' registry: ${{ env.REGISTRY }}
check-latest: true username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Run tests - name: Set up QEMU
run: go test -v ./... uses: docker/setup-qemu-action@v3
- name: Install UPX - name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: |
${{ env.IMAGE_NAME }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
type=raw,value=latest,enable={{is_default_branch}}
labels: |
org.opencontainers.image.title=Articulate Parser
org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
org.opencontainers.image.vendor=kjanat
org.opencontainers.image.licenses=MIT
org.opencontainers.image.url=https://github.com/${{ github.repository }}
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.documentation=https://github.com/${{ github.repository }}/blob/master/DOCKER.md
- name: Build and push Docker image
uses: docker/build-push-action@v6
with:
context: .
# Multi-architecture build - Docker automatically provides TARGETOS, TARGETARCH, etc.
# Based on Go's supported platforms from 'go tool dist list'
platforms: |
linux/amd64
linux/arm64
linux/arm/v7
linux/386
linux/ppc64le
linux/s390x
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
annotations: ${{ steps.meta.outputs.labels }}
build-args: |
VERSION=${{ github.ref_type == 'tag' && github.ref_name || github.sha }}
BUILD_TIME=${{ github.event.head_commit.timestamp }}
GIT_COMMIT=${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
outputs: type=image,name=target,annotation-index.org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
sbom: true
provenance: true
- name: Generate Docker summary
run: | run: |
sudo apt-get update cat >> $GITHUB_STEP_SUMMARY << 'EOF'
sudo apt-get install -y upx ## 🐳 Docker Build Summary
- name: Build binaries **Image:** `ghcr.io/${{ github.repository }}`
run: |
# Set the build time environment variable
BUILD_TIME=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
# Add run permissions to the build script **Tags built:**
chmod +x ./scripts/build.sh
# Display help information for the build script ```text
./scripts/build.sh --help ${{ steps.meta.outputs.tags }}
```
# Build for all platforms **Features:**
./scripts/build.sh \ - **Platforms:** linux/amd64, linux/arm64, linux/arm/v7, linux/386, linux/ppc64le, linux/s390x
--verbose \ - **Architecture optimization:** Native compilation for each platform
-ldflags "-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${{ github.ref_name }} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=$BUILD_TIME -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${{ github.sha }}" - **Multi-arch image description:** Enabled
- **SBOM (Software Bill of Materials):** Generated
- **Provenance attestation:** Generated
- **Security scanning:** Ready for vulnerability analysis
- name: Compress binaries with UPX **Usage:**
run: |
echo "Compressing binaries with UPX..."
cd build/
# Get original sizes ```bash
echo "Original sizes:" # Pull the image
ls -lah docker pull ghcr.io/${{ github.repository }}:latest
echo ""
# Compress all binaries except Darwin (macOS) binaries as UPX doesn't work well with recent macOS versions # Run with help
for binary in articulate-parser-*; do docker run --rm ghcr.io/${{ github.repository }}:latest --help
if [[ "$binary" == *"darwin"* ]]; then
echo "Skipping UPX compression for $binary (macOS compatibility)"
else
echo "Compressing $binary..."
upx --best --lzma "$binary" || {
echo "Warning: UPX compression failed for $binary, keeping original"
}
fi
done
echo "" # Process a local file (mount current directory)
echo "Final sizes:" docker run --rm -v $(pwd):/workspace ghcr.io/${{ github.repository }}:latest /workspace/input.json markdown /workspace/output.md
ls -lah ```
- name: Upload a Build Artifact EOF
uses: actions/upload-artifact@v4.6.2
with:
# Artifact name
name: build-artifacts # optional, default is artifact
# A file, directory or wildcard pattern that describes what to upload
path: build/
if-no-files-found: ignore
retention-days: 1
compression-level: 9
overwrite: true
include-hidden-files: true
- name: Create Release # Security and quality analysis workflows
uses: softprops/action-gh-release@v2 codeql-analysis:
with: name: CodeQL Analysis
files: build/* uses: ./.github/workflows/codeql.yml
generate_release_notes: true permissions:
draft: false security-events: write
prerelease: ${{ startsWith(github.ref, 'refs/tags/v0.') }} packages: read
env: actions: read
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} contents: read

.github/workflows/codeql.yml

@ -11,13 +11,17 @@
# #
name: "CodeQL" name: "CodeQL"
# This workflow is configured to be called by other workflows for more controlled execution
# This allows integration with the main CI pipeline and avoids redundant runs
# To enable automatic execution, uncomment the triggers below:
on: on:
push: workflow_call:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
schedule: schedule:
- cron: '44 16 * * 6' - cron: '44 16 * * 6'
# push:
# branches: [ "master" ]
# pull_request:
# branches: [ "master" ]
jobs: jobs:
analyze: analyze:
@ -57,7 +61,7 @@ jobs:
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@v4 uses: actions/checkout@v6
# Add any setup steps before running the `github/codeql-action/init` action. # Add any setup steps before running the `github/codeql-action/init` action.
# This includes steps like installing compilers or runtimes (`actions/setup-node` # This includes steps like installing compilers or runtimes (`actions/setup-node`
@ -67,7 +71,7 @@ jobs:
# Initializes the CodeQL tools for scanning. # Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL - name: Initialize CodeQL
uses: github/codeql-action/init@v3 uses: github/codeql-action/init@v4
with: with:
languages: ${{ matrix.language }} languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }} build-mode: ${{ matrix.build-mode }}
@ -95,6 +99,6 @@ jobs:
exit 1 exit 1
- name: Perform CodeQL Analysis - name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3 uses: github/codeql-action/analyze@v4
with: with:
category: "/language:${{matrix.language}}" category: "/language:${{matrix.language}}"

.github/workflows/dependency-review.yml vendored Normal file

@ -0,0 +1,26 @@
name: Dependency Review
# This workflow is designed to be called by other workflows rather than triggered automatically
# This allows for more controlled execution and integration with other CI/CD processes
# To enable automatic execution on pull requests, uncomment the line below:
# on: [pull_request]
on: [workflow_call]
permissions:
contents: read
# Required to post security advisories
security-events: write
pull-requests: write
jobs:
dependency-review:
runs-on: ubuntu-latest
steps:
- name: 'Checkout Repository'
uses: actions/checkout@v6
- name: 'Dependency Review'
uses: actions/dependency-review-action@v4
with:
fail-on-severity: moderate
comment-summary-in-pr: always

.github/workflows/release.yml vendored Normal file

@ -0,0 +1,156 @@
name: Release
on:
push:
tags:
- "v*.*.*"
workflow_call:
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
permissions:
contents: write
packages: write
jobs:
release:
name: Create Release
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: "go.mod"
check-latest: true
- name: Run tests
run: go test -v ./...
- name: Install UPX
run: |
sudo apt-get update
sudo apt-get install -y upx
- name: Build binaries
run: |
# Set the build time environment variable using git commit timestamp
BUILD_TIME=$(git log -1 --format=%cd --date=iso-strict)
# Add run permissions to the build script
chmod +x ./scripts/build.sh
# Build for all platforms
./scripts/build.sh \
--verbose \
-ldflags "-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${{ github.ref_name }} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=$BUILD_TIME -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${{ github.sha }}"
- name: Compress binaries
run: |
cd build/
for binary in articulate-parser-*; do
echo "Compressing $binary..."
upx --best "$binary" || {
echo "Warning: UPX compression failed for $binary, keeping original"
}
done
- name: Create Release
uses: softprops/action-gh-release@v2
with:
files: |
build/articulate-parser-linux-amd64
build/articulate-parser-linux-arm64
build/articulate-parser-windows-amd64.exe
build/articulate-parser-windows-arm64.exe
build/articulate-parser-darwin-amd64
build/articulate-parser-darwin-arm64
generate_release_notes: true
draft: false
# Mark pre-1.0 versions (v0.x.x) as prerelease since they are considered unstable
# This helps users understand that these releases may have breaking changes
prerelease: ${{ startsWith(github.ref, 'refs/tags/v0.') }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
docker:
name: Docker Build & Push
runs-on: ubuntu-latest
needs: ['release']
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: |
${{ env.IMAGE_NAME }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
type=raw,value=latest,enable={{is_default_branch}}
labels: |
org.opencontainers.image.title=Articulate Parser
org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
org.opencontainers.image.vendor=kjanat
org.opencontainers.image.licenses=MIT
org.opencontainers.image.url=https://github.com/${{ github.repository }}
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.documentation=https://github.com/${{ github.repository }}/blob/master/DOCKER.md
- name: Build and push Docker image
uses: docker/build-push-action@v6
with:
context: .
platforms: |
linux/amd64
linux/arm64
linux/arm/v7
linux/386
linux/ppc64le
linux/s390x
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
annotations: ${{ steps.meta.outputs.labels }}
build-args: |
VERSION=${{ github.ref_name }}
BUILD_TIME=${{ github.event.head_commit.timestamp || github.event.repository.pushed_at }}
GIT_COMMIT=${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
outputs: type=image,name=target,annotation-index.org.opencontainers.image.description=A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats including Markdown HTML and DOCX. Supports media extraction content cleaning and batch processing for educational content conversion.
sbom: true
provenance: true

.gitignore vendored

@ -26,11 +26,54 @@ go.work
# End of https://www.toptal.com/developers/gitignore/api/go # End of https://www.toptal.com/developers/gitignore/api/go
# Shit
.github/TODO
# Local test files # Local test files
output/ output/
outputs/
articulate-sample.json articulate-sample.json
test-output.* test-output.*
go-os-arch-matrix.csv go-os-arch-matrix.csv
test_godocx.go
test_input.json
# Build artifacts # Build artifacts
build/ build/
# Old workflows
.github/workflows/ci-old.yml
.github/workflows/ci-enhanced.yml
# Test coverage files
coverage.out
coverage.txt
coverage.html
coverage.*
coverage
*.cover
*.coverprofile
main_coverage
# Other common exclusions
*.exe
*.exe~
*.dll
*.so
*.dylib
*.test
*.out
/tmp/
.github/copilot-instructions.md
# Editors
.vscode/
.idea/
.task/
**/*.local.*
.claude/
NUL

.golangci.yml Normal file

@ -0,0 +1,388 @@
# golangci-lint configuration for articulate-parser
# https://golangci-lint.run/usage/configuration/
version: "2"
# Options for analysis running
run:
# Timeout for total work
timeout: 5m
# Skip directories (not allowed in config v2, will use issues exclude instead)
# Go version
go: "1.24"
# Include test files
tests: true
# Use Go module mode
modules-download-mode: readonly
# Output configuration
output:
# Format of output
formats:
text:
print-issued-lines: true
print-linter-name: true
# Sort results
sort-order:
- linter
- severity
- file
# Show statistics
show-stats: true
# Issues configuration
issues:
# Maximum issues count per one linter
max-issues-per-linter: 0
# Maximum count of issues with the same text
max-same-issues: 3
# Show only new issues
new: false
# Fix found issues (if linter supports)
fix: false
# Formatters configuration
formatters:
enable:
- gofmt
- goimports
- gofumpt
settings:
# gofmt settings
gofmt:
simplify: true
# goimports settings
goimports:
local-prefixes:
- github.com/kjanat/articulate-parser
# gofumpt settings
gofumpt:
module-path: github.com/kjanat/articulate-parser
extra-rules: true
# Linters configuration
linters:
# Default set of linters
default: none
# Enable specific linters
enable:
# Default/standard linters
- errcheck # Check for unchecked errors
- govet # Go vet
- ineffassign # Detect ineffectual assignments
- staticcheck # Staticcheck
- unused # Find unused code
# Code quality
- revive # Fast, configurable linter
- gocritic # Opinionated Go source code linter
- godot # Check comment periods
- godox # Detect TODO/FIXME comments
- gocognit # Cognitive complexity
- gocyclo # Cyclomatic complexity
- funlen # Function length
- maintidx # Maintainability index
# Security
- gosec # Security problems
# Performance
- prealloc # Find slice preallocation opportunities
- bodyclose # Check HTTP response body closed
# Style and formatting
- goconst # Find repeated strings
- misspell # Find misspellings
- whitespace # Find unnecessary blank lines
- unconvert # Remove unnecessary type conversions
- dupword # Check for duplicate words
# Error handling
- errorlint # Error handling improvements
- wrapcheck # Check error wrapping
# Testing
- testifylint # Testify usage
- tparallel # Detect improper t.Parallel() usage
- thelper # Detect test helpers without t.Helper()
# Best practices
- exhaustive # Check exhaustiveness of enum switches
- nolintlint # Check nolint directives
- nakedret # Find naked returns
- nilnil # Check for redundant nil checks
- noctx # Check sending HTTP requests without context
- contextcheck # Check context propagation
- asciicheck # Check for non-ASCII identifiers
- bidichk # Check for dangerous unicode sequences
- durationcheck # Check for multiplied durations
- makezero # Find slice declarations with non-zero length
- nilerr # Find code returning nil with non-nil error
- predeclared # Find code shadowing predeclared identifiers
- promlinter # Check Prometheus metrics naming
- reassign # Check reassignment of package variables
- usestdlibvars # Use variables from stdlib
- wastedassign # Find wasted assignments
# Documentation
- godoclint # Check godoc comments
# New
- modernize # Suggest simplifications using new Go features
# Exclusion rules for linters
exclusions:
rules:
# Exclude some linters from test files
- path: _test\.go
linters:
- gosec
- dupl
- errcheck
- goconst
- funlen
- goerr113
- gocognit
# Exclude benchmarks from some linters
- path: _bench_test\.go
linters:
- gosec
- dupl
- errcheck
- goconst
- funlen
- goerr113
- wrapcheck
# Exclude example tests
- path: _example_test\.go
linters:
- gosec
- errcheck
- funlen
- goerr113
- wrapcheck
- revive
# Exclude linters for main.go
- path: ^main\.go$
linters:
- forbidigo
# Exclude certain linters for generated files
- path: internal/version/version.go
linters:
- gochecknoglobals
- gochecknoinits
# Exclude var-naming for interfaces package (standard Go pattern for interface definitions)
- path: internal/interfaces/
text: "var-naming: avoid meaningless package names"
linters:
- revive
# Allow fmt.Print* in main package
- path: ^main\.go$
text: "use of fmt.Print"
linters:
- forbidigo
# Exclude common false positives
- text: "Error return value of .((os\\.)?std(out|err)\\..*|.*Close|.*Flush|os\\.Remove(All)?|.*print(f|ln)?|os\\.(Un)?Setenv). is not checked"
linters:
- errcheck
# Exclude error wrapping suggestions for well-known errors
- text: "non-wrapping format verb for fmt.Errorf"
linters:
- errorlint
# Linters settings
settings:
# errcheck settings
errcheck:
check-type-assertions: true
check-blank: false
# govet settings
govet:
enable-all: true
disable:
- fieldalignment # Too many false positives
- shadow # Can be noisy
# goconst settings
goconst:
min-len: 3
min-occurrences: 3
# godot settings
godot:
scope: toplevel
exclude:
- "^fixme:"
- "^todo:"
capital: true
period: true
# godox settings
godox:
keywords:
- TODO
- FIXME
- HACK
- BUG
- XXX
# misspell settings
misspell:
locale: US
# funlen settings
funlen:
lines: 100
statements: 50
# gocognit settings
gocognit:
min-complexity: 20
# gocyclo settings
gocyclo:
min-complexity: 15
# gocritic settings
gocritic:
enabled-tags:
- diagnostic
- style
- performance
- experimental
disabled-checks:
- ifElseChain
- singleCaseSwitch
- commentedOutCode
settings:
hugeParam:
sizeThreshold: 512
rangeValCopy:
sizeThreshold: 512
# gosec settings
gosec:
severity: medium
confidence: medium
excludes:
- G104 # Handled by errcheck
- G304 # File path provided as taint input
# revive settings
revive:
severity: warning
rules:
- name: blank-imports
- name: context-as-argument
- name: context-keys-type
- name: dot-imports
- name: empty-block
- name: error-naming
- name: error-return
- name: error-strings
- name: errorf
- name: exported
- name: if-return
- name: increment-decrement
- name: indent-error-flow
- name: package-comments
- name: range
- name: receiver-naming
- name: time-naming
- name: unexported-return
- name: var-declaration
- name: var-naming
# errorlint settings
errorlint:
errorf: true
errorf-multi: true
asserts: true
comparison: true
# wrapcheck settings
wrapcheck:
ignore-sigs:
- .Errorf(
- errors.New(
- errors.Unwrap(
- errors.Join(
- .WithMessage(
- .WithMessagef(
- .WithStack(
ignore-package-globs:
- github.com/kjanat/articulate-parser/*
# exhaustive settings
exhaustive:
check:
- switch
- map
default-signifies-exhaustive: true
# nolintlint settings
nolintlint:
allow-unused: false
require-explanation: true
require-specific: true
# stylecheck settings
staticcheck:
checks: ["all", "-ST1000", "-ST1003", "-ST1016", "-ST1020", "-ST1021", "-ST1022"]
# maintidx settings
maintidx:
under: 20
# testifylint settings
testifylint:
enable-all: true
disable:
- float-compare
# thelper settings
thelper:
test:
first: true
name: true
begin: true
benchmark:
first: true
name: true
begin: true
# Severity rules
severity:
default: warning
rules:
- linters:
- gosec
severity: error
- linters:
- errcheck
- staticcheck
severity: error
- linters:
- godox
severity: info

AGENTS.md Normal file

@ -0,0 +1,178 @@
# Agent Guidelines for articulate-parser
A Go CLI tool that parses Articulate Rise courses from URLs or local JSON files and exports them to Markdown, HTML, or DOCX formats.
## Build/Test Commands
### Primary Commands (using Taskfile)
```bash
task build # Build binary to bin/articulate-parser
task test # Run all tests with race detection
task lint # Run all linters (vet, fmt, staticcheck, golangci-lint)
task fmt # Format all Go files
task ci # Full CI pipeline: deps, lint, test with coverage, build
task qa # Quick QA: fmt + lint + test
```
### Direct Go Commands
```bash
# Build
go build -o bin/articulate-parser main.go
# Run all tests
go test -race -timeout 5m ./...
# Run single test by name
go test -v -race -run ^TestMarkdownExporter_Export$ ./internal/exporters
# Run tests in specific package
go test -v -race ./internal/services
# Run tests matching pattern
go test -v -race -run "TestParser" ./...
# Test with coverage
go test -race -coverprofile=coverage/coverage.out -covermode=atomic ./...
go tool cover -html=coverage/coverage.out -o coverage/coverage.html
# Benchmarks
go test -bench=. -benchmem ./...
go test -bench=BenchmarkMarkdownExporter ./internal/exporters
```
### Security & Auditing
```bash
task security:check # Run gosec security scanner
task security:audit # Run govulncheck for vulnerabilities
```
## Code Style Guidelines
### Imports
- Use `goimports` with local prefix: `github.com/kjanat/articulate-parser`
- Order: stdlib, blank line, external packages, blank line, internal packages
```go
import (
"context"
"fmt"
"github.com/fumiama/go-docx"
"github.com/kjanat/articulate-parser/internal/interfaces"
)
```
### Formatting
- Use `gofmt -s` (simplify) and `gofumpt` with extra rules
- Function length: max 100 lines, 50 statements
- Cyclomatic complexity: max 15; Cognitive complexity: max 20
### Types & Naming
- Use interface-based design (see `internal/interfaces/`)
- Exported types/functions require godoc comments ending with period
- Use descriptive names: `ArticulateParser`, `MarkdownExporter`
- Receiver names: short (1-2 chars), consistent per type
### Error Handling
- Always wrap errors with context: `fmt.Errorf("operation failed: %w", err)`
- Use `%w` verb for error wrapping to preserve error chain
- Check all error returns (enforced by `errcheck`)
- Document error handling rationale in defer blocks when ignoring close errors
```go
// Good: Error wrapping with context
if err := json.Unmarshal(body, &course); err != nil {
return nil, fmt.Errorf("failed to unmarshal JSON: %w", err)
}
// Good: Documented defer with error handling
defer func() {
if err := resp.Body.Close(); err != nil {
p.Logger.Warn("failed to close response body", "error", err)
}
}()
```
### Comments
- All exported types/functions require godoc comments
- End sentences with periods (`godot` linter enforced)
- Mark known issues with TODO/FIXME/HACK/BUG/XXX
### Security
- Use `#nosec` with justification for deliberate security exceptions
- G304: File paths from CLI args; G306: Export file permissions
```go
// #nosec G304 - File path provided by user via CLI argument
data, err := os.ReadFile(filePath)
```
### Testing
- Enable race detection: `-race` flag always
- Use table-driven tests where applicable
- Mark test helpers with `t.Helper()`
- Use `t.TempDir()` for temporary files
- Benchmarks in `*_bench_test.go`, examples in `*_example_test.go`
- Test naming: `Test<Type>_<Method>` or `Test<Function>`
```go
func TestMarkdownExporter_ProcessItemToMarkdown_AllTypes(t *testing.T) {
tests := []struct {
name, itemType, expectedText string
}{
{"text item", "text", ""},
{"divider item", "divider", "---"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// test implementation
})
}
}
```
### Dependencies
- Minimal external dependencies (go-docx, golang.org/x/net, golang.org/x/text)
- Run `task deps:tidy` after adding/removing dependencies
- CGO disabled by default (`CGO_ENABLED=0`)
## Project Structure
```
articulate-parser/
internal/
config/ # Configuration loading
exporters/ # Export implementations (markdown, html, docx)
interfaces/ # Core interfaces (Exporter, CourseParser, Logger)
models/ # Data models (Course, Lesson, Item, Media)
services/ # Core services (parser, html cleaner, app, logger)
version/ # Version information
main.go # Application entry point
```
## Common Patterns
### Creating a new exporter
1. Implement `interfaces.Exporter` interface
2. Add factory method to `internal/exporters/factory.go`
3. Register format in `NewFactory()`
4. Add tests following existing patterns
### Adding configuration options
1. Add field to `Config` struct in `internal/config/config.go`
2. Load from environment variable with sensible default
3. Document in config struct comments

DOCKER.md Normal file

@ -0,0 +1,82 @@
# Articulate Parser - Docker
A powerful command-line tool for parsing and processing articulate data files, now available as a lightweight Docker container.
## Quick Start
### Pull from GitHub Container Registry
```bash
docker pull ghcr.io/kjanat/articulate-parser:latest
```
### Run with Articulate Rise URL
```bash
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/ markdown /data/output.md
```
### Run with local files
```bash
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/input.json markdown /data/output.md
```
## Usage
### Basic File Processing
```bash
# Process from Articulate Rise URL
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/ markdown /data/output.md
# Process a local JSON file
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/document.json markdown /data/output.md
# Process with specific format and output
docker run --rm -v $(pwd):/data ghcr.io/kjanat/articulate-parser:latest /data/input.json docx /data/output.docx
```
### Display Help and Version
```bash
# Show help information
docker run --rm ghcr.io/kjanat/articulate-parser:latest --help
# Show version
docker run --rm ghcr.io/kjanat/articulate-parser:latest --version
```
## Available Tags
- `latest` - Latest stable release
- `v1.x.x` - Specific version tags
- `main` - Latest development build
## Image Details
- **Base Image**: `scratch` (minimal attack surface)
- **Architecture**: Multi-arch support (amd64, arm64)
- **Size**: < 10MB (optimized binary)
- **Security**: Runs as non-root user
- **Features**: SBOM and provenance attestation included
## Development
### Local Build
```bash
docker build -t articulate-parser .
```
### Docker Compose
```bash
docker-compose up --build
```
## Repository
- **Source**: [github.com/kjanat/articulate-parser](https://github.com/kjanat/articulate-parser)
- **Issues**: [Report bugs or request features](https://github.com/kjanat/articulate-parser/issues)
- **License**: See repository for license details

Dockerfile Normal file

@ -0,0 +1,78 @@
# Build stage
FROM golang:1.25-alpine AS builder
# Install git and ca-certificates (needed for fetching dependencies and HTTPS)
RUN apk add --no-cache git ca-certificates tzdata file
# Create a non-root user for the final stage
RUN adduser -D -u 1000 appuser
# Set the working directory
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build the application
# Disable CGO for a fully static binary
# Use linker flags to reduce binary size and embed version info
ARG VERSION=dev
ARG BUILD_TIME
ARG GIT_COMMIT
# Docker buildx automatically provides these for multi-platform builds
ARG BUILDPLATFORM
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT
# Debug: Show build information
RUN echo "Building for platform: $TARGETPLATFORM (OS: $TARGETOS, Arch: $TARGETARCH, Variant: $TARGETVARIANT)" \
&& echo "Build platform: $BUILDPLATFORM" \
&& echo "Go version: $(go version)"
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build \
-ldflags="-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${VERSION} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=${BUILD_TIME} -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${GIT_COMMIT}" \
-o articulate-parser \
./main.go
# Verify the binary architecture
RUN file /app/articulate-parser || echo "file command not available"
# Final stage - minimal runtime image
FROM scratch
# Copy CA certificates for HTTPS requests
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# Copy timezone data
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
# Add a minimal /etc/passwd file to support non-root user
COPY --from=builder /etc/passwd /etc/passwd
# Copy the binary
COPY --from=builder /app/articulate-parser /articulate-parser
# Switch to non-root user (appuser with UID 1000)
USER appuser
# Set the binary as entrypoint
ENTRYPOINT ["/articulate-parser"]
# Default command shows help
CMD ["--help"]
# Add labels for metadata
LABEL org.opencontainers.image.title="Articulate Parser"
LABEL org.opencontainers.image.description="A powerful CLI tool to parse Articulate Rise courses and export them to multiple formats (Markdown, HTML, DOCX). Supports media extraction, content cleaning, and batch processing for educational content conversion."
LABEL org.opencontainers.image.vendor="kjanat"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.source="https://github.com/kjanat/articulate-parser"
LABEL org.opencontainers.image.documentation="https://github.com/kjanat/articulate-parser/blob/master/DOCKER.md"

Dockerfile.dev Normal file

@ -0,0 +1,78 @@
# Development Dockerfile with shell access
# Uses Alpine instead of scratch for debugging
# Build stage - same as production
FROM golang:1.25-alpine AS builder
# Install git and ca-certificates (needed for fetching dependencies and HTTPS)
RUN apk add --no-cache git ca-certificates tzdata file
# Create a non-root user
RUN adduser -D -u 1000 appuser
# Set the working directory
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build the application
# Disable CGO for a fully static binary
# Use linker flags to reduce binary size and embed version info
ARG VERSION=dev
ARG BUILD_TIME
ARG GIT_COMMIT
# Docker buildx automatically provides these for multi-platform builds
ARG BUILDPLATFORM
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG TARGETVARIANT
# Debug: Show build information
RUN echo "Building for platform: $TARGETPLATFORM (OS: $TARGETOS, Arch: $TARGETARCH, Variant: $TARGETVARIANT)" \
&& echo "Build platform: $BUILDPLATFORM" \
&& echo "Go version: $(go version)"
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build \
-ldflags="-s -w -X github.com/kjanat/articulate-parser/internal/version.Version=${VERSION} -X github.com/kjanat/articulate-parser/internal/version.BuildTime=${BUILD_TIME} -X github.com/kjanat/articulate-parser/internal/version.GitCommit=${GIT_COMMIT}" \
-o articulate-parser \
./main.go
# Verify the binary architecture
RUN file /app/articulate-parser || echo "file command not available"
# Development stage - uses Alpine for shell access
FROM alpine:3
# Install minimal dependencies
RUN apk add --no-cache ca-certificates tzdata
# Copy the binary
COPY --from=builder /app/articulate-parser /articulate-parser
# Copy the non-root user configuration
COPY --from=builder /etc/passwd /etc/passwd
# Switch to non-root user
USER appuser
# Set the binary as entrypoint
ENTRYPOINT ["/articulate-parser"]
# Default command shows help
CMD ["--help"]
# Add labels for metadata
LABEL org.opencontainers.image.title="Articulate Parser (Dev)"
LABEL org.opencontainers.image.description="Development version of Articulate Parser with shell access"
LABEL org.opencontainers.image.vendor="kjanat"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.source="https://github.com/kjanat/articulate-parser"
LABEL org.opencontainers.image.documentation="https://github.com/kjanat/articulate-parser/blob/master/DOCKER.md"

README.md

@ -1,23 +1,95 @@
# Articulate Rise Parser # Articulate Rise Parser
A Go-based parser that converts Articulate Rise e-learning content to various formats including Markdown and Word documents. A Go-based parser that converts Articulate Rise e-learning content to various formats including Markdown, HTML, and Word documents.
[![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?logo=Go&logoColor=white)][gomod] [![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?logo=Go&logoColor=white)][gomod]
[![Go Doc](https://godoc.org/github.com/kjanat/articulate-parser?status.svg)][Package documentation] [![Go Doc](https://godoc.org/github.com/kjanat/articulate-parser?status.svg)][Package documentation]
[![Go Report Card](https://goreportcard.com/badge/github.com/kjanat/articulate-parser)][Go report] [![Go Report Card](https://goreportcard.com/badge/github.com/kjanat/articulate-parser)][Go report]
[![Tag](https://img.shields.io/github/v/tag/kjanat/articulate-parser?sort=semver&label=Tag)][Tags] [![Tag](https://img.shields.io/github/v/tag/kjanat/articulate-parser?sort=semver&label=Tag)][Tags] <!-- [![Release Date](https://img.shields.io/github/release-date/kjanat/articulate-parser?label=Release%20date)][Latest release] -->
[![Release Date](https://img.shields.io/github/release-date/kjanat/articulate-parser?label=Release%20date)][Latest release] [![License](https://img.shields.io/github/license/kjanat/articulate-parser?label=License)][MIT License] <!-- [![Commit activity](https://img.shields.io/github/commit-activity/m/kjanat/articulate-parser?label=Commit%20activity)][Commits] -->
[![License](https://img.shields.io/github/license/kjanat/articulate-parser?label=License)](LICENSE)
[![Commit activity](https://img.shields.io/github/commit-activity/m/kjanat/articulate-parser?label=Commit%20activity)][Commits]
[![Last commit](https://img.shields.io/github/last-commit/kjanat/articulate-parser?label=Last%20commit)][Commits] [![Last commit](https://img.shields.io/github/last-commit/kjanat/articulate-parser?label=Last%20commit)][Commits]
[![GitHub Issues or Pull Requests](https://img.shields.io/github/issues/kjanat/articulate-parser?label=Issues)][Issues] [![GitHub Issues or Pull Requests](https://img.shields.io/github/issues/kjanat/articulate-parser?label=Issues)][Issues]
[![Docker Image](https://img.shields.io/badge/docker-ghcr.io-blue?logo=docker&logoColor=white)][Docker image] <!-- [![Docker Size](https://img.shields.io/docker/image-size/kjanat/articulate-parser?logo=docker&label=Image%20Size)][Docker image] -->
[![Docker](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/docker.yml?logo=docker&label=Docker)][Docker workflow]
[![CI](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/ci.yml?logo=github&label=CI)][Build] [![CI](https://img.shields.io/github/actions/workflow/status/kjanat/articulate-parser/ci.yml?logo=github&label=CI)][Build]
[![Codecov](https://img.shields.io/codecov/c/gh/kjanat/articulate-parser?token=eHhaHY8nut&logo=codecov&logoColor=%23F01F7A&label=Codecov)][Codecov] [![Codecov](https://img.shields.io/codecov/c/gh/kjanat/articulate-parser?token=eHhaHY8nut&logo=codecov&logoColor=%23F01F7A&label=Codecov)][Codecov]
## System Architecture
```mermaid
flowchart TD
%% User Input
CLI[Command Line Interface<br/>main.go] --> APP{App Service<br/>services/app.go}
%% Core Application Logic
APP --> |"ProcessCourseFromURI"| PARSER[Course Parser<br/>services/parser.go]
APP --> |"ProcessCourseFromFile"| PARSER
APP --> |"exportCourse"| FACTORY[Exporter Factory<br/>exporters/factory.go]
%% Data Sources
PARSER --> |"FetchCourse"| API[Articulate Rise API<br/>rise.articulate.com]
PARSER --> |"LoadCourseFromFile"| FILE[Local JSON File<br/>*.json]
%% Data Models
API --> MODELS[Data Models<br/>models/course.go]
FILE --> MODELS
MODELS --> |Course, Lesson, Item| APP
%% Export Factory Pattern
FACTORY --> |"CreateExporter"| MARKDOWN[Markdown Exporter<br/>exporters/markdown.go]
FACTORY --> |"CreateExporter"| HTML[HTML Exporter<br/>exporters/html.go]
FACTORY --> |"CreateExporter"| DOCX[DOCX Exporter<br/>exporters/docx.go]
%% HTML Cleaning Service
CLEANER[HTML Cleaner<br/>services/html_cleaner.go] --> MARKDOWN
CLEANER --> HTML
CLEANER --> DOCX
%% Output Files
MARKDOWN --> |"Export"| MD_OUT[Markdown Files<br/>*.md]
HTML --> |"Export"| HTML_OUT[HTML Files<br/>*.html]
DOCX --> |"Export"| DOCX_OUT[Word Documents<br/>*.docx]
%% Interfaces (Contracts)
IPARSER[CourseParser Interface<br/>interfaces/parser.go] -.-> PARSER
IEXPORTER[Exporter Interface<br/>interfaces/exporter.go] -.-> MARKDOWN
IEXPORTER -.-> HTML
IEXPORTER -.-> DOCX
IFACTORY[ExporterFactory Interface<br/>interfaces/exporter.go] -.-> FACTORY
%% Styling - Colors that work in both light and dark GitHub themes
classDef userInput fill:#dbeafe,stroke:#1e40af,stroke-width:2px,color:#1e40af
classDef coreLogic fill:#ede9fe,stroke:#6d28d9,stroke-width:2px,color:#6d28d9
classDef dataSource fill:#d1fae5,stroke:#059669,stroke-width:2px,color:#059669
classDef exporter fill:#fed7aa,stroke:#ea580c,stroke-width:2px,color:#ea580c
classDef output fill:#fce7f3,stroke:#be185d,stroke-width:2px,color:#be185d
classDef interface fill:#ecfdf5,stroke:#16a34a,stroke-width:1px,stroke-dasharray: 5 5,color:#16a34a
classDef service fill:#cffafe,stroke:#0891b2,stroke-width:2px,color:#0891b2
class CLI userInput
class APP,FACTORY coreLogic
class API,FILE,MODELS dataSource
class MARKDOWN,HTML,DOCX exporter
class MD_OUT,HTML_OUT,DOCX_OUT output
class IPARSER,IEXPORTER,IFACTORY interface
class PARSER,CLEANER service
```
### Architecture Overview
The system follows **Clean Architecture** principles with clear separation of concerns:
- **🎯 Entry Point**: Command-line interface handles user input and coordinates operations
- **🏗️ Application Layer**: Core business logic with dependency injection
- **📋 Interface Layer**: Contracts defining behavior without implementation details
- **🔧 Service Layer**: Concrete implementations of parsing and utility services
- **📤 Export Layer**: Factory pattern for format-specific exporters
- **📊 Data Layer**: Domain models representing course structure
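As a rough sketch of the interface and factory layers named in the diagram (signatures are inferred from the exporter code shown later in this change set, so treat them as illustrative rather than authoritative):
```go
package interfaces

import "github.com/kjanat/articulate-parser/internal/models"

// Exporter converts a parsed course into one output format.
type Exporter interface {
	Export(course *models.Course, outputPath string) error
	SupportedFormat() string
}

// ExporterFactory returns the exporter registered for a format
// such as "markdown", "html", or "docx".
type ExporterFactory interface {
	CreateExporter(format string) (Exporter, error)
}
```
Concrete exporters satisfy `Exporter`, and the factory hides which implementation is chosen, which is what keeps the application layer decoupled from the output formats.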
## Features ## Features
- Parse Articulate Rise JSON data from URLs or local files - Parse Articulate Rise JSON data from URLs or local files
- Export to Markdown (.md) format - Export to Markdown (.md) format
- Export to HTML (.html) format with professional styling
- Export to Word Document (.docx) format - Export to Word Document (.docx) format
- Support for various content types: - Support for various content types:
- Text content with headings and paragraphs - Text content with headings and paragraphs
@ -29,20 +101,50 @@ A Go-based parser that converts Articulate Rise e-learning content to various fo
## Installation ## Installation
1. Ensure you have Go 1.21 or later installed ### Prerequisites
2. Clone or download the parser code
3. Initialize the Go module: - Go (the required version is shown in the [![Go version](https://img.shields.io/github/go-mod/go-version/kjanat/articulate-parser?label=)][gomod] badge, taken from `go.mod`); supported versions are exercised in the [CI][Build] workflow.
### Install from source
```bash ```bash
go mod init articulate-parser git clone https://github.com/kjanat/articulate-parser.git
go mod tidy cd articulate-parser
go mod download
go build -o articulate-parser main.go
```
### Or install directly
```bash
go install github.com/kjanat/articulate-parser@latest
``` ```
## Dependencies ## Dependencies
The parser uses the following external library: The parser uses the following external library:
- `github.com/unidoc/unioffice` - For creating Word documents - `github.com/fumiama/go-docx` - For creating Word documents (MIT license)
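For orientation, the DOCX exporter drives go-docx roughly like this (condensed from the exporter changes shown later in this commit; a sketch, not the project's actual export code):
```go
package main

import (
	"os"

	"github.com/fumiama/go-docx"
)

func main() {
	doc := docx.New()
	// Chained helpers, as used by the exporter, style each paragraph
	doc.AddParagraph().AddText("My Course").Size("32").Bold()
	doc.AddParagraph().AddText("Lesson: Introduction").Size("28").Bold()

	f, err := os.Create("course.docx")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// WriteTo serializes the document into the file
	if _, err := doc.WriteTo(f); err != nil {
		panic(err)
	}
}
```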
## Testing
Run the test suite:
```bash
go test ./...
```
Run tests with coverage:
```bash
go test -v -race -coverprofile=coverage.out ./...
```
View coverage report:
```bash
go tool cover -html=coverage.out
```
## Usage ## Usage
@ -54,9 +156,11 @@ go run main.go <input_uri_or_file> <output_format> [output_path]
#### Parameters #### Parameters
- `input_uri_or_file`: Either an Articulate Rise share URL or path to a local JSON file | Parameter | Description | Default |
- `output_format`: `md` for Markdown or `docx` for Word Document | ------------------- | ---------------------------------------------------------------- | --------------- |
- `output_path`: Optional. If not provided, files are saved to `./output/` directory | `input_uri_or_file` | Either an Articulate Rise share URL or path to a local JSON file | None (required) |
| `output_format` | `md` for Markdown, `html` for HTML, or `docx` for Word Document | None (required) |
| `output_path` | Path where output file will be saved. | `./output/` |
#### Examples #### Examples
@ -72,10 +176,16 @@ go run main.go "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviD
go run main.go "articulate-sample.json" docx "my-course.docx" go run main.go "articulate-sample.json" docx "my-course.docx"
``` ```
3. **Parse from local file and export to Markdown:** 3. **Parse from local file and export to HTML:**
```bash ```bash
go run main.go "C:\Users\kjana\Projects\articulate-parser\articulate-sample.json" md go run main.go "articulate-sample.json" html "output.html"
```
4. **Parse from local file and export to Markdown:**
```bash
go run main.go "articulate-sample.json" md "output.md"
``` ```
### Building the Executable ### Building the Executable
@ -92,9 +202,153 @@ Then run:
./articulate-parser input.json md output.md ./articulate-parser input.json md output.md
``` ```
## Docker
The application is available as a Docker image from GitHub Container Registry.
### 🐳 Docker Image Information
- **Registry**: `ghcr.io/kjanat/articulate-parser`
- **Platforms**: linux/amd64, linux/arm64
- **Base Image**: Scratch (minimal footprint)
- **Size**: ~15-20MB compressed
### Quick Start
```bash
# Pull the latest image
docker pull ghcr.io/kjanat/articulate-parser:latest
# Show help
docker run --rm ghcr.io/kjanat/articulate-parser:latest --help
```
### Available Tags
| Tag | Description | Use Case |
|-----|-------------|----------|
| `latest` | Latest stable release from master branch | Production use |
| `edge` | Latest development build from master branch | Testing new features |
| `v1.x.x` | Specific version releases | Production pinning |
| `develop` | Development branch builds | Development/testing |
| `feature/docker-ghcr` | Feature branch builds | Feature testing |
| `master` | Latest master branch build | Continuous integration |
### Usage Examples
#### Process a local file
```bash
# Mount current directory and process a local JSON file
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/input.json markdown /workspace/output.md
```
#### Process from URL
```bash
# Mount output directory and process from Articulate Rise URL
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
"https://rise.articulate.com/share/xyz" docx /workspace/output.docx
```
#### Export to different formats
```bash
# Export to HTML
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json html /workspace/course.html
# Export to Word Document
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json docx /workspace/course.docx
# Export to Markdown
docker run --rm -v $(pwd):/workspace \
ghcr.io/kjanat/articulate-parser:latest \
/workspace/course.json md /workspace/course.md
```
#### Batch Processing
```bash
# The runtime image is built FROM scratch and has no shell, so loop on the
# host and start one container per file
for file in *.json; do
  docker run --rm -v $(pwd):/workspace \
    ghcr.io/kjanat/articulate-parser:latest \
    "/workspace/$file" md "/workspace/${file%.json}.md"
done
```
### Docker Compose
For local development, you can use the provided `docker-compose.yml`:
```bash
# Build and run with default help command
docker-compose up articulate-parser
# Process files using mounted volumes
docker-compose up parser-with-files
```
### Building Locally
```bash
# Build the Docker image locally
docker build -t articulate-parser:local .
# Run the local image
docker run --rm articulate-parser:local --help
# Build with specific version
docker build --build-arg VERSION=local --build-arg BUILD_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) -t articulate-parser:local .
```
### Environment Variables
The Docker image supports the following build-time arguments:
| Argument | Description | Default |
|----------|-------------|---------|
| `VERSION` | Version string embedded in the binary | `dev` |
| `BUILD_TIME` | Build timestamp | Current time |
| `GIT_COMMIT` | Git commit hash | Current commit |
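For example, to embed real values for all three arguments when building locally (mirroring what the Taskfile does; the exact values are up to you):
```bash
docker build \
  --build-arg VERSION=$(git describe --tags --always --dirty) \
  --build-arg BUILD_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) \
  -t articulate-parser:local .
```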
### Docker Security
- **Non-root execution**: The application runs as a non-privileged user
- **Minimal attack surface**: Built from scratch base image
- **No shell access**: Only the application binary is available
- **Read-only filesystem**: Container filesystem is read-only except for mounted volumes
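These properties can also be enforced or verified explicitly at run time; the flags below are standard Docker options rather than anything specific to this image:
```bash
# Run with a read-only root filesystem; only the mounted workspace is writable
docker run --rm --read-only \
  -v $(pwd):/workspace \
  ghcr.io/kjanat/articulate-parser:latest \
  /workspace/course.json md /workspace/course.md

# Verify the image is configured to run as the non-root user (UID 1000)
docker image inspect --format '{{.Config.User}}' ghcr.io/kjanat/articulate-parser:latest
```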
## Development
### Code Quality
The project maintains high code quality standards:
- Cyclomatic complexity ≤ 15 (checked with [gocyclo](https://github.com/fzipp/gocyclo))
- Race condition detection enabled
- Comprehensive test coverage
- Code formatting with `gofmt`
- Static analysis with `go vet`
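The same checks can be run locally with the tools' documented commands (adjust versions as needed):
```bash
go vet ./...
gofmt -s -l .
go install github.com/fzipp/gocyclo/cmd/gocyclo@latest
gocyclo -over 15 .
go test -race ./...
```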
### Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests: `go test ./...`
5. Submit a pull request
## Output Formats ## Output Formats
### Markdown (.md) ### Markdown (`.md`)
- Hierarchical structure with proper heading levels - Hierarchical structure with proper heading levels
- Clean text content with HTML tags removed - Clean text content with HTML tags removed
@ -103,7 +357,16 @@ Then run:
- Media references included - Media references included
- Course metadata at the top - Course metadata at the top
### Word Document (.docx) ### HTML (`.html`)
- Professional styling with embedded CSS
- Interactive and visually appealing layout
- Proper HTML structure with semantic elements
- Responsive design for different screen sizes
- All content types beautifully formatted
- Maintains course hierarchy and organization
### Word Document (`.docx`)
- Professional document formatting - Professional document formatting
- Bold headings and proper typography - Bold headings and proper typography
@ -154,7 +417,7 @@ The parser includes error handling for:
<!-- ## Code coverage <!-- ## Code coverage
![Sunburst](https://codecov.io/gh/kjanat/articulate-parser/graphs/tree.svg?token=eHhaHY8nut) ![Sunburst](https://codecov.io/gh/kjanat/articulate-parser/graphs/sunburst.svg?token=eHhaHY8nut)
![Grid](https://codecov.io/gh/kjanat/articulate-parser/graphs/tree.svg?token=eHhaHY8nut) ![Grid](https://codecov.io/gh/kjanat/articulate-parser/graphs/tree.svg?token=eHhaHY8nut)
@ -167,16 +430,23 @@ The parser includes error handling for:
- Styling and visual formatting is not preserved - Styling and visual formatting is not preserved
- Assessment logic and interactivity is lost in static exports - Assessment logic and interactivity is lost in static exports
## Performance
- Lightweight with minimal dependencies
- Fast JSON parsing and export
- Memory efficient processing
- No external license requirements
## Future Enhancements ## Future Enhancements
Potential improvements could include: Potential improvements could include:
- PDF export support - [ ] PDF export support
- Media file downloading - [ ] Media file downloading
- HTML export with preserved styling - [x] ~~HTML export with preserved styling~~
- SCORM package support - [ ] SCORM package support
- Batch processing capabilities - [ ] Batch processing capabilities
- Custom template support - [ ] Custom template support
## License ## License
@ -185,9 +455,12 @@ This is a utility tool for educational content conversion. Please ensure you hav
[Build]: https://github.com/kjanat/articulate-parser/actions/workflows/ci.yml [Build]: https://github.com/kjanat/articulate-parser/actions/workflows/ci.yml
[Codecov]: https://codecov.io/gh/kjanat/articulate-parser [Codecov]: https://codecov.io/gh/kjanat/articulate-parser
[Commits]: https://github.com/kjanat/articulate-parser/commits/master/ [Commits]: https://github.com/kjanat/articulate-parser/commits/master/
[Docker workflow]: https://github.com/kjanat/articulate-parser/actions/workflows/docker.yml
[Docker image]: https://github.com/kjanat/articulate-parser/pkgs/container/articulate-parser
[Go report]: https://goreportcard.com/report/github.com/kjanat/articulate-parser [Go report]: https://goreportcard.com/report/github.com/kjanat/articulate-parser
[gomod]: go.mod [gomod]: go.mod
[Issues]: https://github.com/kjanat/articulate-parser/issues [Issues]: https://github.com/kjanat/articulate-parser/issues
[Latest release]: https://github.com/kjanat/articulate-parser/releases/latest <!-- [Latest release]: https://github.com/kjanat/articulate-parser/releases/latest -->
[MIT License]: LICENSE
[Package documentation]: https://godoc.org/github.com/kjanat/articulate-parser [Package documentation]: https://godoc.org/github.com/kjanat/articulate-parser
[Tags]: https://github.com/kjanat/articulate-parser/tags [Tags]: https://github.com/kjanat/articulate-parser/tags

Taskfile.yml Normal file

@ -0,0 +1,610 @@
# yaml-language-server: $schema=https://taskfile.dev/schema.json
# Articulate Parser - Task Automation
# https://taskfile.dev
version: "3"
# Global output settings
output: prefixed
# Shell settings (only applied on Unix-like systems)
# Note: These are ignored on Windows where PowerShell/cmd is used
set: [errexit, pipefail]
shopt: [globstar]
# Watch mode interval
interval: 500ms
# Global variables
vars:
APP_NAME: articulate-parser
MAIN_FILE: main.go
OUTPUT_DIR: bin
COVERAGE_DIR: coverage
TEST_TIMEOUT: 5m
# Version info
VERSION:
sh: git describe --tags --always --dirty 2>/dev/null || echo "dev"
GIT_COMMIT:
sh: git rev-parse --short HEAD 2>/dev/null || echo "unknown"
BUILD_TIME: '{{now | date "2006-01-02T15:04:05Z07:00"}}'
# Go settings
CGO_ENABLED: 0
GO_FLAGS: -v
LDFLAGS: >-
-s -w
-X github.com/kjanat/articulate-parser/internal/version.Version={{.VERSION}}
-X github.com/kjanat/articulate-parser/internal/version.BuildTime={{.BUILD_TIME}}
-X github.com/kjanat/articulate-parser/internal/version.GitCommit={{.GIT_COMMIT}}
# Platform detection (using Task built-in variables)
GOOS:
sh: go env GOOS
GOARCH:
sh: go env GOARCH
EXE_EXT: '{{if eq OS "windows"}}.exe{{end}}'
# Environment variables
env:
CGO_ENABLED: "{{.CGO_ENABLED}}"
GO111MODULE: on
# Load .env files if present
dotenv: [".env", ".env.local"]
# Task definitions
tasks:
# Default task - show help
default:
desc: Show available tasks
cmds:
- task --list
silent: true
# Development tasks
dev:
desc: Run the application in development mode (with hot reload)
aliases: [run, start]
interactive: true
watch: true
sources:
- "**/*.go"
- go.mod
- go.sum
cmds:
- task: build
- "{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} --help"
# Build tasks
build:
desc: Build the application binary
aliases: [b]
deps: [clean-bin]
sources:
- "**/*.go"
- go.mod
- go.sum
generates:
- "{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}}"
cmds:
- task: mkdir
vars: { DIR: "{{.OUTPUT_DIR}}" }
- go build {{.GO_FLAGS}} -ldflags="{{.LDFLAGS}}" -o {{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} {{.MAIN_FILE}}
method: checksum
build:all:
desc: Build binaries for all major platforms
aliases: [build-all, cross-compile]
deps: [clean-bin]
cmds:
- task: mkdir
vars: { DIR: "{{.OUTPUT_DIR}}" }
- for:
matrix:
GOOS: [linux, darwin, windows]
GOARCH: [amd64, arm64]
task: build:platform
vars:
TARGET_GOOS: "{{.ITEM.GOOS}}"
TARGET_GOARCH: "{{.ITEM.GOARCH}}"
- echo "Built binaries for all platforms in {{.OUTPUT_DIR}}/"
build:platform:
internal: true
vars:
TARGET_EXT: '{{if eq .TARGET_GOOS "windows"}}.exe{{end}}'
OUTPUT_FILE: "{{.OUTPUT_DIR}}/{{.APP_NAME}}-{{.TARGET_GOOS}}-{{.TARGET_GOARCH}}{{.TARGET_EXT}}"
env:
GOOS: "{{.TARGET_GOOS}}"
GOARCH: "{{.TARGET_GOARCH}}"
cmds:
- echo "Building {{.OUTPUT_FILE}}..."
- go build {{.GO_FLAGS}} -ldflags="{{.LDFLAGS}}" -o "{{.OUTPUT_FILE}}" {{.MAIN_FILE}}
# Install task
install:
desc: Install the binary to $GOPATH/bin
deps: [test]
cmds:
- go install -ldflags="{{.LDFLAGS}}" {{.MAIN_FILE}}
- echo "Installed {{.APP_NAME}} to $(go env GOPATH)/bin"
# Testing tasks
test:
desc: Run all tests
aliases: [t]
env:
CGO_ENABLED: 1
cmds:
- go test {{.GO_FLAGS}} -race -timeout {{.TEST_TIMEOUT}} ./...
test:coverage:
desc: Run tests with coverage report
aliases: [cover, cov]
deps: [clean-coverage]
env:
CGO_ENABLED: 1
cmds:
- task: mkdir
vars: { DIR: "{{.COVERAGE_DIR}}" }
- go test {{.GO_FLAGS}} -race -coverprofile={{.COVERAGE_DIR}}/coverage.out -covermode=atomic -timeout {{.TEST_TIMEOUT}} ./...
- go tool cover -html={{.COVERAGE_DIR}}/coverage.out -o {{.COVERAGE_DIR}}/coverage.html
- go tool cover -func={{.COVERAGE_DIR}}/coverage.out
- echo "Coverage report generated at {{.COVERAGE_DIR}}/coverage.html"
test:verbose:
desc: Run tests with verbose output
aliases: [tv]
env:
CGO_ENABLED: 1
cmds:
- go test -v -race -timeout {{.TEST_TIMEOUT}} ./...
test:watch:
desc: Run tests in watch mode
aliases: [tw]
watch: true
sources:
- "**/*.go"
cmds:
- task: test
test:bench:
desc: Run benchmark tests
aliases: [bench]
cmds:
- go test -bench=. -benchmem -timeout {{.TEST_TIMEOUT}} ./...
test:integration:
desc: Run integration tests
env:
CGO_ENABLED: 1
status:
- '{{if eq OS "windows"}}if not exist "main_test.go" exit 1{{else}}test ! -f "main_test.go"{{end}}'
cmds:
- go test -v -race -tags=integration -timeout {{.TEST_TIMEOUT}} ./...
# Code quality tasks
lint:
desc: Run all linters
silent: true
aliases: [l]
cmds:
- task: lint:vet
- task: lint:fmt
- task: lint:staticcheck
- task: lint:golangci
lint:vet:
desc: Run go vet
silent: true
cmds:
- go vet ./...
lint:fmt:
desc: Check code formatting
silent: true
vars:
UNFORMATTED:
sh: gofmt -s -l .
cmds:
- |
{{if ne .UNFORMATTED ""}}
echo "❌ The following files need formatting:"
echo "{{.UNFORMATTED}}"
exit 1
{{else}}
echo "All files are properly formatted"
{{end}}
lint:staticcheck:
desc: Run staticcheck (install if needed)
silent: true
vars:
HAS_STATICCHECK:
sh: '{{if eq OS "windows"}}where.exe staticcheck 2>NUL{{else}}command -v staticcheck 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_STATICCHECK ""}}echo "Installing staticcheck..." && go install honnef.co/go/tools/cmd/staticcheck@latest{{end}}'
- staticcheck ./...
ignore_error: true
lint:golangci:
desc: Run golangci-lint (install if needed)
silent: true
aliases: [golangci, golangci-lint]
vars:
HAS_GOLANGCI:
sh: '{{if eq OS "windows"}}where.exe golangci-lint 2>NUL{{else}}command -v golangci-lint 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOLANGCI ""}}echo "Installing golangci-lint..." && go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest{{end}}'
- golangci-lint run ./...
- echo "✅ golangci-lint passed"
lint:golangci:fix:
desc: Run golangci-lint with auto-fix
silent: true
aliases: [golangci-fix]
vars:
HAS_GOLANGCI:
sh: '{{if eq OS "windows"}}where.exe golangci-lint 2>NUL{{else}}command -v golangci-lint 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOLANGCI ""}}echo "Installing golangci-lint..." && go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest{{end}}'
- golangci-lint run --fix ./...
- echo "golangci-lint fixes applied"
fmt:
desc: Format all Go files
silent: true
aliases: [format]
cmds:
- gofmt -s -w .
- echo "Formatted all Go files"
modernize:
desc: Modernize Go code to use modern idioms
silent: true
aliases: [modern]
cmds:
- go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...
- echo "Code modernized"
# Dependency management
deps:
desc: Download and verify dependencies
aliases: [mod]
cmds:
- go mod download
- go mod verify
- echo "Dependencies downloaded and verified"
deps:tidy:
desc: Tidy go.mod and go.sum
aliases: [tidy]
cmds:
- go mod tidy
- echo "Dependencies tidied"
deps:update:
desc: Update all dependencies to latest versions
aliases: [update]
cmds:
- go get -u ./...
- go mod tidy
- echo "Dependencies updated"
deps:graph:
desc: Display dependency graph
cmds:
- go mod graph
# Docker tasks
docker:build:
desc: Build Docker image
aliases: [db]
cmds:
- |
docker build \
--build-arg VERSION={{.VERSION}} \
--build-arg BUILD_TIME={{.BUILD_TIME}} \
--build-arg GIT_COMMIT={{.GIT_COMMIT}} \
-t {{.APP_NAME}}:{{.VERSION}} \
-t {{.APP_NAME}}:latest \
.
- >
echo "Docker image built: {{.APP_NAME}}:{{.VERSION}}"
docker:build:dev:
desc: Build development Docker image
cmds:
- docker build -f Dockerfile.dev -t {{.APP_NAME}}:dev .
- >
echo "Development Docker image built: {{.APP_NAME}}:dev"
docker:run:
desc: Run Docker container
aliases: [dr]
deps: [docker:build]
cmds:
- docker run --rm {{.APP_NAME}}:{{.VERSION}} --help
docker:test:
desc: Test Docker image
deps: [docker:build]
cmds:
- docker run --rm {{.APP_NAME}}:{{.VERSION}} --version
- echo "Docker image tested successfully"
docker:compose:up:
desc: Start services with docker-compose
cmds:
- docker-compose up -d
docker:compose:down:
desc: Stop services with docker-compose
cmds:
- docker-compose down
# Cleanup tasks
clean:
desc: Clean all generated files
aliases: [c]
cmds:
- task: clean-bin
- task: clean-coverage
- task: clean-cache
- echo "All generated files cleaned"
clean-bin:
desc: Remove built binaries
internal: true
cmds:
- task: rmdir
vars: { DIR: "{{.OUTPUT_DIR}}" }
clean-coverage:
desc: Remove coverage files
internal: true
cmds:
- task: rmdir
vars: { DIR: "{{.COVERAGE_DIR}}" }
clean-cache:
desc: Clean Go build and test cache
cmds:
- go clean -cache -testcache -modcache
- echo "Go caches cleaned"
# CI/CD tasks
ci:
desc: Run all CI checks (test, lint, build)
cmds:
- task: deps
- task: lint
- task: test:coverage
- task: build
- echo "All CI checks passed"
ci:local:
desc: Run CI checks locally with detailed output
cmds:
- echo "🔍 Running local CI checks..."
- echo ""
- echo "📦 Checking dependencies..."
- task: deps
- echo ""
- echo "🔧 Running linters..."
- task: lint
- echo ""
- echo "🧪 Running tests with coverage..."
- task: test:coverage
- echo ""
- echo "🏗️ Building application..."
- task: build:all
- echo ""
- echo "All CI checks completed successfully!"
# Release tasks
release:check:
desc: Check if ready for release
cmds:
- task: ci
- git diff --exit-code
- git diff --cached --exit-code
- echo "Ready for release"
release:tag:
desc: Tag a new release (requires VERSION env var)
requires:
vars: [VERSION]
preconditions:
- sh: "git diff --exit-code"
msg: "Working directory is not clean"
- sh: "git diff --cached --exit-code"
msg: "Staging area is not clean"
cmds:
- git tag -a v{{.VERSION}} -m "Release v{{.VERSION}}"
- echo "Tagged v{{.VERSION}}"
- >
echo "Push with: git push origin v{{.VERSION}}"
# Documentation tasks
docs:serve:
desc: Serve documentation locally
vars:
HAS_GODOC:
sh: '{{if eq OS "windows"}}where.exe godoc 2>NUL{{else}}command -v godoc 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GODOC ""}}echo "Installing godoc..." && go install golang.org/x/tools/cmd/godoc@latest{{end}}'
- echo "📚 Serving documentation at http://localhost:6060"
- godoc -http=:6060
interactive: true
docs:coverage:
desc: Open coverage report in browser
deps: [test:coverage]
cmds:
- '{{if eq OS "darwin"}}open {{.COVERAGE_DIR}}/coverage.html{{else if eq OS "windows"}}start {{.COVERAGE_DIR}}/coverage.html{{else}}xdg-open {{.COVERAGE_DIR}}/coverage.html 2>/dev/null || echo "Please open {{.COVERAGE_DIR}}/coverage.html in your browser"{{end}}'
# Info tasks
info:
desc: Display build information
vars:
GO_VERSION:
sh: go version
cmds:
- task: info:print
silent: true
info:print:
internal: true
silent: true
vars:
GO_VERSION:
sh: go version
cmds:
- echo "Application Info:"
- echo " Name{{":"}} {{.APP_NAME}}"
- echo " Version{{":"}} {{.VERSION}}"
- echo " Git Commit{{":"}} {{.GIT_COMMIT}}"
- echo " Build Time{{":"}} {{.BUILD_TIME}}"
- echo ""
- echo "Go Environment{{":"}}"
- echo " Go Version{{":"}} {{.GO_VERSION}}"
- echo " GOOS{{":"}} {{.GOOS}}"
- echo " GOARCH{{":"}} {{.GOARCH}}"
- echo " CGO{{":"}} {{.CGO_ENABLED}}"
- echo ""
- echo "Paths{{":"}}"
- echo " Output Dir{{":"}} {{.OUTPUT_DIR}}"
- echo " Coverage{{":"}} {{.COVERAGE_DIR}}"
# Security tasks
security:check:
desc: Run security checks with gosec
vars:
HAS_GOSEC:
sh: '{{if eq OS "windows"}}where.exe gosec 2>NUL{{else}}command -v gosec 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOSEC ""}}echo "Installing gosec..." && go install github.com/securego/gosec/v2/cmd/gosec@latest{{end}}'
- gosec ./...
ignore_error: true
security:audit:
desc: Audit dependencies for vulnerabilities
vars:
HAS_GOVULNCHECK:
sh: '{{if eq OS "windows"}}where.exe govulncheck 2>NUL{{else}}command -v govulncheck 2>/dev/null{{end}}'
cmds:
- '{{if eq .HAS_GOVULNCHECK ""}}echo "Installing govulncheck..." && go install golang.org/x/vuln/cmd/govulncheck@latest{{end}}'
- govulncheck ./...
# Example/Demo tasks
demo:markdown:
desc: Demo - Convert sample to Markdown
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- "{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json md output-demo.md"
- echo "Demo Markdown created{{:}} output-demo.md"
- defer:
task: rmfile
vars: { FILE: "output-demo.md" }
demo:html:
desc: Demo - Convert sample to HTML
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- "{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json html output-demo.html"
- echo "Demo HTML created{{:}} output-demo.html"
- defer:
task: rmfile
vars: { FILE: "output-demo.html" }
demo:docx:
desc: Demo - Convert sample to DOCX
status:
- '{{if eq OS "windows"}}if not exist "articulate-sample.json" exit 1{{else}}test ! -f "articulate-sample.json"{{end}}'
deps: [build]
cmds:
- "{{.OUTPUT_DIR}}/{{.APP_NAME}}{{.EXE_EXT}} articulate-sample.json docx output-demo.docx"
- echo "Demo DOCX created{{:}} output-demo.docx"
- defer:
task: rmfile
vars: { FILE: "output-demo.docx" }
# Performance profiling
profile:cpu:
desc: Run CPU profiling
cmds:
- go test -cpuprofile=cpu.prof -bench=. ./...
- go tool pprof -http=:8080 cpu.prof
- defer:
task: rmfile
vars: { FILE: "cpu.prof" }
profile:mem:
desc: Run memory profiling
cmds:
- go test -memprofile=mem.prof -bench=. ./...
- go tool pprof -http=:8080 mem.prof
- defer:
task: rmfile
vars: { FILE: "mem.prof" }
# Git hooks
hooks:install:
desc: Install git hooks
cmds:
- task: mkdir
vars: { DIR: ".git/hooks" }
- '{{if eq OS "windows"}}echo "#!/bin/sh" > .git/hooks/pre-commit && echo "task lint:fmt" >> .git/hooks/pre-commit{{else}}cat > .git/hooks/pre-commit << ''EOF''{{printf "\n"}}#!/bin/sh{{printf "\n"}}task lint:fmt{{printf "\n"}}EOF{{printf "\n"}}chmod +x .git/hooks/pre-commit{{end}}'
- echo "Git hooks installed"
# Quick shortcuts
qa:
desc: Quick quality assurance (fmt + lint + test)
aliases: [q, quick]
cmds:
- task: fmt
- task: lint
- task: test
- echo "Quick QA passed"
all:
desc: Build everything (clean + deps + test + build:all + docker:build)
cmds:
- task: clean
- task: deps:tidy
- task: test:coverage
- task: build:all
- task: docker:build
- echo "Full build completed!"
# Cross-platform helper tasks
mkdir:
internal: true
requires:
vars: [DIR]
cmds:
- '{{if eq OS "windows"}}powershell -Command "New-Item -ItemType Directory -Force -Path ''{{.DIR}}'' | Out-Null"{{else}}mkdir -p "{{.DIR}}"{{end}}'
silent: true
rmdir:
internal: true
requires:
vars: [DIR]
cmds:
- '{{if eq OS "windows"}}powershell -Command "if (Test-Path ''{{.DIR}}'') { Remove-Item -Recurse -Force ''{{.DIR}}'' }"{{else}}rm -rf "{{.DIR}}" 2>/dev/null || true{{end}}'
silent: true
rmfile:
internal: true
requires:
vars: [FILE]
cmds:
- '{{if eq OS "windows"}}powershell -Command "if (Test-Path ''{{.FILE}}'') { Remove-Item -Force ''{{.FILE}}'' }"{{else}}rm -f "{{.FILE}}"{{end}}'
silent: true

docker-compose.yml Normal file

@ -0,0 +1,39 @@
services:
articulate-parser: &articulate-parser
build:
context: .
dockerfile: Dockerfile
args:
VERSION: "dev"
BUILD_TIME: "2024-01-01T00:00:00Z"
GIT_COMMIT: "dev"
image: articulate-parser:local
volumes:
# Mount current directory to /workspace for file access
- .:/workspace
working_dir: /workspace
# Override entrypoint for interactive use
entrypoint: ["/articulate-parser"]
# Default to showing help
command: ["--help"]
# Service for processing files with volume mounts
parser-with-files:
<<: *articulate-parser
volumes:
- ./input:/input:ro
- ./output:/output
command: ["/input/sample.json", "markdown", "/output/result.md"]
# Service for development - with shell access
parser-dev:
build:
context: .
dockerfile: Dockerfile.dev
image: articulate-parser:dev
volumes:
- .:/workspace
working_dir: /workspace
entrypoint: ["/bin/sh"]
command: ["-c", "while true; do sleep 30; done"]
# Uses Dockerfile.dev with Alpine base instead of scratch for shell access

go.mod

@ -1,7 +1,16 @@
module github.com/kjanat/articulate-parser module github.com/kjanat/articulate-parser
go 1.21 go 1.24.0
require github.com/unidoc/unioffice v1.39.0 toolchain go1.25.5
require github.com/richardlehane/msoleps v1.0.4 // indirect require (
github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b
golang.org/x/net v0.48.0
golang.org/x/text v0.32.0
)
require (
github.com/fumiama/imgsz v0.0.4 // indirect
golang.org/x/image v0.34.0 // indirect
)

go.sum

@ -1,6 +1,10 @@
github.com/richardlehane/msoleps v1.0.3 h1:aznSZzrwYRl3rLKRT3gUk9am7T/mLNSnJINvN0AQoVM= github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b h1:/mxSugRc4SgN7XgBtT19dAJ7cAXLTbPmlJLJE4JjRkE=
github.com/richardlehane/msoleps v1.0.3/go.mod h1:BWev5JBpU9Ko2WAgmZEuiz4/u3ZYTKbjLycmwiWUfWg= github.com/fumiama/go-docx v0.0.0-20250506085032-0c30fd09304b/go.mod h1:ssRF0IaB1hCcKIObp3FkZOsjTcAHpgii70JelNb4H8M=
github.com/richardlehane/msoleps v1.0.4 h1:WuESlvhX3gH2IHcd8UqyCuFY5yiq/GR/yqaSM/9/g00= github.com/fumiama/imgsz v0.0.4 h1:Lsasu2hdSSFS+vnD+nvR1UkiRMK7hcpyYCC0FzgSMFI=
github.com/richardlehane/msoleps v1.0.4/go.mod h1:BWev5JBpU9Ko2WAgmZEuiz4/u3ZYTKbjLycmwiWUfWg= github.com/fumiama/imgsz v0.0.4/go.mod h1:bISOQVTlw9sRytPwe8ir7tAaEmyz9hSNj9n8mXMBG0E=
github.com/unidoc/unioffice v1.39.0 h1:Wo5zvrzCqhyK/1Zi5dg8a5F5+NRftIMZPnFPYwruLto= golang.org/x/image v0.34.0 h1:33gCkyw9hmwbZJeZkct8XyR11yH889EQt/QH4VmXMn8=
github.com/unidoc/unioffice v1.39.0/go.mod h1:Axz6ltIZZTUUyHoEnPe4Mb3VmsN4TRHT5iZCGZ1rgnU= golang.org/x/image v0.34.0/go.mod h1:2RNFBZRB+vnwwFil8GkMdRvrJOFd1AzdZI6vOY+eJVU=
golang.org/x/net v0.48.0 h1:zyQRTTrjc33Lhh0fBgT/H3oZq9WuvRR5gPC70xpDiQU=
golang.org/x/net v0.48.0/go.mod h1:+ndRgGjkh8FGtu1w1FGbEC31if4VrNVMuKTgcAAnQRY=
golang.org/x/text v0.32.0 h1:ZD01bjUt1FQ9WJ0ClOL5vxgxOI/sVCNgX1YtKwcY0mU=
golang.org/x/text v0.32.0/go.mod h1:o/rUWzghvpD5TXrTIBuJU77MTaN0ljMWE47kxGJQ7jY=

internal/config/config.go Normal file

@ -0,0 +1,77 @@
// Package config provides configuration management for the articulate-parser application.
// It supports loading configuration from environment variables and command-line flags.
package config
import (
"log/slog"
"os"
"strconv"
"time"
)
// Config holds all configuration values for the application.
type Config struct {
// Parser configuration
BaseURL string
RequestTimeout time.Duration
// Logging configuration
LogLevel slog.Level
LogFormat string // "json" or "text"
}
// Default configuration values.
const (
DefaultBaseURL = "https://rise.articulate.com"
DefaultRequestTimeout = 30 * time.Second
DefaultLogLevel = slog.LevelInfo
DefaultLogFormat = "text"
)
// Load creates a new Config with values from environment variables.
// Falls back to defaults if environment variables are not set.
func Load() *Config {
return &Config{
BaseURL: getEnv("ARTICULATE_BASE_URL", DefaultBaseURL),
RequestTimeout: getDurationEnv("ARTICULATE_REQUEST_TIMEOUT", DefaultRequestTimeout),
LogLevel: getLogLevelEnv("LOG_LEVEL", DefaultLogLevel),
LogFormat: getEnv("LOG_FORMAT", DefaultLogFormat),
}
}
// getEnv retrieves an environment variable or returns the default value.
func getEnv(key, defaultValue string) string {
if value := os.Getenv(key); value != "" {
return value
}
return defaultValue
}
// getDurationEnv retrieves a duration from environment variable or returns default.
// The environment variable should be in seconds (e.g., "30" for 30 seconds).
func getDurationEnv(key string, defaultValue time.Duration) time.Duration {
if value := os.Getenv(key); value != "" {
if seconds, err := strconv.Atoi(value); err == nil {
return time.Duration(seconds) * time.Second
}
}
return defaultValue
}
// getLogLevelEnv retrieves a log level from environment variable or returns default.
// Accepts: "debug", "info", "warn", "error" (case-insensitive).
func getLogLevelEnv(key string, defaultValue slog.Level) slog.Level {
value := os.Getenv(key)
switch value {
case "debug", "DEBUG":
return slog.LevelDebug
case "info", "INFO":
return slog.LevelInfo
case "warn", "WARN", "warning", "WARNING":
return slog.LevelWarn
case "error", "ERROR":
return slog.LevelError
default:
return defaultValue
}
}


@ -0,0 +1,116 @@
package config
import (
"log/slog"
"os"
"testing"
"time"
)
func TestLoad(t *testing.T) {
// Clear environment
os.Clearenv()
cfg := Load()
if cfg.BaseURL != DefaultBaseURL {
t.Errorf("Expected BaseURL '%s', got '%s'", DefaultBaseURL, cfg.BaseURL)
}
if cfg.RequestTimeout != DefaultRequestTimeout {
t.Errorf("Expected timeout %v, got %v", DefaultRequestTimeout, cfg.RequestTimeout)
}
if cfg.LogLevel != DefaultLogLevel {
t.Errorf("Expected log level %v, got %v", DefaultLogLevel, cfg.LogLevel)
}
if cfg.LogFormat != DefaultLogFormat {
t.Errorf("Expected log format '%s', got '%s'", DefaultLogFormat, cfg.LogFormat)
}
}
func TestLoad_WithEnvironmentVariables(t *testing.T) {
// Set environment variables
t.Setenv("ARTICULATE_BASE_URL", "https://test.example.com")
t.Setenv("ARTICULATE_REQUEST_TIMEOUT", "60")
t.Setenv("LOG_LEVEL", "debug")
t.Setenv("LOG_FORMAT", "json")
cfg := Load()
if cfg.BaseURL != "https://test.example.com" {
t.Errorf("Expected BaseURL 'https://test.example.com', got '%s'", cfg.BaseURL)
}
if cfg.RequestTimeout != 60*time.Second {
t.Errorf("Expected timeout 60s, got %v", cfg.RequestTimeout)
}
if cfg.LogLevel != slog.LevelDebug {
t.Errorf("Expected log level Debug, got %v", cfg.LogLevel)
}
if cfg.LogFormat != "json" {
t.Errorf("Expected log format 'json', got '%s'", cfg.LogFormat)
}
}
func TestGetLogLevelEnv(t *testing.T) {
tests := []struct {
name string
value string
expected slog.Level
}{
{"debug lowercase", "debug", slog.LevelDebug},
{"debug uppercase", "DEBUG", slog.LevelDebug},
{"info lowercase", "info", slog.LevelInfo},
{"info uppercase", "INFO", slog.LevelInfo},
{"warn lowercase", "warn", slog.LevelWarn},
{"warn uppercase", "WARN", slog.LevelWarn},
{"warning lowercase", "warning", slog.LevelWarn},
{"error lowercase", "error", slog.LevelError},
{"error uppercase", "ERROR", slog.LevelError},
{"invalid value", "invalid", slog.LevelInfo},
{"empty value", "", slog.LevelInfo},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
os.Clearenv()
if tt.value != "" {
t.Setenv("TEST_LOG_LEVEL", tt.value)
}
result := getLogLevelEnv("TEST_LOG_LEVEL", slog.LevelInfo)
if result != tt.expected {
t.Errorf("Expected %v, got %v", tt.expected, result)
}
})
}
}
func TestGetDurationEnv(t *testing.T) {
tests := []struct {
name string
value string
expected time.Duration
}{
{"valid duration", "45", 45 * time.Second},
{"zero duration", "0", 0},
{"invalid duration", "invalid", 30 * time.Second},
{"empty value", "", 30 * time.Second},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
os.Clearenv()
if tt.value != "" {
t.Setenv("TEST_DURATION", tt.value)
}
result := getDurationEnv("TEST_DURATION", 30*time.Second)
if result != tt.expected {
t.Errorf("Expected %v, got %v", tt.expected, result)
}
})
}
}


@ -0,0 +1,200 @@
package exporters
import (
"path/filepath"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// BenchmarkFactory_CreateExporter_Markdown benchmarks markdown exporter creation.
func BenchmarkFactory_CreateExporter_Markdown(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
b.ResetTimer()
for b.Loop() {
_, _ = factory.CreateExporter("markdown")
}
}
// BenchmarkFactory_CreateExporter_All benchmarks creating all exporter types.
func BenchmarkFactory_CreateExporter_All(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
formats := []string{"markdown", "docx", "html"}
b.ResetTimer()
for b.Loop() {
for _, format := range formats {
_, _ = factory.CreateExporter(format)
}
}
}
// BenchmarkAllExporters_Export benchmarks all exporters with the same course.
func BenchmarkAllExporters_Export(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
course := createBenchmarkCourse()
exporters := map[string]struct {
exporter any
ext string
}{
"Markdown": {NewMarkdownExporter(htmlCleaner), ".md"},
"Docx": {NewDocxExporter(htmlCleaner), ".docx"},
"HTML": {NewHTMLExporter(htmlCleaner), ".html"},
}
for name, exp := range exporters {
b.Run(name, func(b *testing.B) {
tempDir := b.TempDir()
exporter := exp.exporter.(interface {
Export(*models.Course, string) error
})
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark"+exp.ext)
_ = exporter.Export(course, outputPath)
}
})
}
}
// BenchmarkExporters_LargeCourse benchmarks exporters with large course data.
func BenchmarkExporters_LargeCourse(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
course := createLargeBenchmarkCourse()
b.Run("Markdown_Large", func(b *testing.B) {
exporter := NewMarkdownExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.md")
_ = exporter.Export(course, outputPath)
}
})
b.Run("Docx_Large", func(b *testing.B) {
exporter := NewDocxExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.docx")
_ = exporter.Export(course, outputPath)
}
})
b.Run("HTML_Large", func(b *testing.B) {
exporter := NewHTMLExporter(htmlCleaner)
tempDir := b.TempDir()
b.ResetTimer()
for b.Loop() {
outputPath := filepath.Join(tempDir, "large.html")
_ = exporter.Export(course, outputPath)
}
})
}
// createBenchmarkCourse creates a standard-sized course for benchmarking.
func createBenchmarkCourse() *models.Course {
return &models.Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "bench-course",
Title: "Benchmark Course",
Description: "Performance testing course",
NavigationMode: "menu",
Lessons: []models.Lesson{
{
ID: "lesson1",
Title: "Introduction",
Type: "lesson",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Welcome",
Paragraph: "<p>This is a test paragraph with <strong>HTML</strong> content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "Item 1"},
{Paragraph: "Item 2"},
{Paragraph: "Item 3"},
},
},
},
},
},
},
}
}
// createLargeBenchmarkCourse creates a large course for stress testing.
func createLargeBenchmarkCourse() *models.Course {
lessons := make([]models.Lesson, 50)
for i := range 50 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Description: "<p>This is lesson description with <em>formatting</em>.</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Section Heading",
Paragraph: "<p>Content with <strong>bold</strong> and <em>italic</em> text.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "Point 1"},
{Paragraph: "Point 2"},
{Paragraph: "Point 3"},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "Quiz Question",
Answers: []models.Answer{
{Title: "Answer A", Correct: false},
{Title: "Answer B", Correct: true},
{Title: "Answer C", Correct: false},
},
Feedback: "Good job!",
},
},
},
},
}
}
return &models.Course{
ShareID: "large-benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "large-bench-course",
Title: "Large Benchmark Course",
Description: "Large performance testing course",
Lessons: lessons,
},
}
}


@ -4,17 +4,21 @@ package exporters
import ( import (
"fmt" "fmt"
"os"
"strings" "strings"
"github.com/fumiama/go-docx"
"golang.org/x/text/cases"
"golang.org/x/text/language"
"github.com/kjanat/articulate-parser/internal/interfaces" "github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models" "github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services" "github.com/kjanat/articulate-parser/internal/services"
"github.com/unidoc/unioffice/document"
) )
// DocxExporter implements the Exporter interface for DOCX format. // DocxExporter implements the Exporter interface for DOCX format.
// It converts Articulate Rise course data into a Microsoft Word document // It converts Articulate Rise course data into a Microsoft Word document
// using the unioffice/document package. // using the go-docx package.
type DocxExporter struct { type DocxExporter struct {
// htmlCleaner is used to convert HTML content to plain text // htmlCleaner is used to convert HTML content to plain text
htmlCleaner *services.HTMLCleaner htmlCleaner *services.HTMLCleaner
@ -45,21 +49,17 @@ func NewDocxExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter {
// Returns: // Returns:
// - An error if creating or saving the document fails // - An error if creating or saving the document fails
func (e *DocxExporter) Export(course *models.Course, outputPath string) error { func (e *DocxExporter) Export(course *models.Course, outputPath string) error {
doc := document.New() doc := docx.New()
// Add title // Add title
titlePara := doc.AddParagraph() titlePara := doc.AddParagraph()
titleRun := titlePara.AddRun() titlePara.AddText(course.Course.Title).Size("32").Bold()
titleRun.AddText(course.Course.Title)
titleRun.Properties().SetBold(true)
titleRun.Properties().SetSize(16)
// Add description if available // Add description if available
if course.Course.Description != "" { if course.Course.Description != "" {
descPara := doc.AddParagraph() descPara := doc.AddParagraph()
descRun := descPara.AddRun()
cleanDesc := e.htmlCleaner.CleanHTML(course.Course.Description) cleanDesc := e.htmlCleaner.CleanHTML(course.Course.Description)
descRun.AddText(cleanDesc) descPara.AddText(cleanDesc)
} }
// Add each lesson // Add each lesson
@ -69,10 +69,32 @@ func (e *DocxExporter) Export(course *models.Course, outputPath string) error {
// Ensure output directory exists and add .docx extension // Ensure output directory exists and add .docx extension
if !strings.HasSuffix(strings.ToLower(outputPath), ".docx") { if !strings.HasSuffix(strings.ToLower(outputPath), ".docx") {
outputPath = outputPath + ".docx" outputPath += ".docx"
} }
return doc.SaveToFile(outputPath) // Create the file
// #nosec G304 - Output path is provided by user via CLI argument, which is expected behavior
file, err := os.Create(outputPath)
if err != nil {
return fmt.Errorf("failed to create output file: %w", err)
}
// Ensure file is closed even if WriteTo fails. Close errors are logged but not
// fatal since the document content has already been written to disk. A close
// error typically indicates a filesystem synchronization issue that doesn't
// affect the validity of the exported file.
defer func() {
if err := file.Close(); err != nil {
fmt.Fprintf(os.Stderr, "warning: failed to close output file: %v\n", err)
}
}()
// Save the document
_, err = doc.WriteTo(file)
if err != nil {
return fmt.Errorf("failed to save document: %w", err)
}
return nil
} }
// exportLesson adds a lesson to the document with appropriate formatting. // exportLesson adds a lesson to the document with appropriate formatting.
@ -81,20 +103,16 @@ func (e *DocxExporter) Export(course *models.Course, outputPath string) error {
// Parameters: // Parameters:
// - doc: The Word document being created // - doc: The Word document being created
// - lesson: The lesson data model to export // - lesson: The lesson data model to export
func (e *DocxExporter) exportLesson(doc *document.Document, lesson *models.Lesson) { func (e *DocxExporter) exportLesson(doc *docx.Docx, lesson *models.Lesson) {
// Add lesson title // Add lesson title
lessonPara := doc.AddParagraph() lessonPara := doc.AddParagraph()
lessonRun := lessonPara.AddRun() lessonPara.AddText(fmt.Sprintf("Lesson: %s", lesson.Title)).Size("28").Bold()
lessonRun.AddText(fmt.Sprintf("Lesson: %s", lesson.Title))
lessonRun.Properties().SetBold(true)
lessonRun.Properties().SetSize(14)
// Add lesson description if available // Add lesson description if available
if lesson.Description != "" { if lesson.Description != "" {
descPara := doc.AddParagraph() descPara := doc.AddParagraph()
descRun := descPara.AddRun()
cleanDesc := e.htmlCleaner.CleanHTML(lesson.Description) cleanDesc := e.htmlCleaner.CleanHTML(lesson.Description)
descRun.AddText(cleanDesc) descPara.AddText(cleanDesc)
} }
// Add each item in the lesson // Add each item in the lesson
@ -109,14 +127,12 @@ func (e *DocxExporter) exportLesson(doc *document.Document, lesson *models.Lesso
// Parameters: // Parameters:
// - doc: The Word document being created // - doc: The Word document being created
// - item: The item data model to export // - item: The item data model to export
func (e *DocxExporter) exportItem(doc *document.Document, item *models.Item) { func (e *DocxExporter) exportItem(doc *docx.Docx, item *models.Item) {
// Add item type as heading // Add item type as heading
if item.Type != "" { if item.Type != "" {
itemPara := doc.AddParagraph() itemPara := doc.AddParagraph()
itemRun := itemPara.AddRun() caser := cases.Title(language.English)
itemRun.AddText(strings.Title(item.Type)) itemPara.AddText(caser.String(item.Type)).Size("24").Bold()
itemRun.Properties().SetBold(true)
itemRun.Properties().SetSize(12)
} }
// Add sub-items // Add sub-items
@ -132,65 +148,55 @@ func (e *DocxExporter) exportItem(doc *document.Document, item *models.Item) {
// Parameters: // Parameters:
// - doc: The Word document being created // - doc: The Word document being created
// - subItem: The sub-item data model to export // - subItem: The sub-item data model to export
func (e *DocxExporter) exportSubItem(doc *document.Document, subItem *models.SubItem) { func (e *DocxExporter) exportSubItem(doc *docx.Docx, subItem *models.SubItem) {
// Add title if available // Add title if available
if subItem.Title != "" { if subItem.Title != "" {
subItemPara := doc.AddParagraph() subItemPara := doc.AddParagraph()
subItemRun := subItemPara.AddRun() subItemPara.AddText(" " + subItem.Title).Bold() // Indented
subItemRun.AddText(" " + subItem.Title) // Indented
subItemRun.Properties().SetBold(true)
} }
// Add heading if available // Add heading if available
if subItem.Heading != "" { if subItem.Heading != "" {
headingPara := doc.AddParagraph() headingPara := doc.AddParagraph()
headingRun := headingPara.AddRun()
cleanHeading := e.htmlCleaner.CleanHTML(subItem.Heading) cleanHeading := e.htmlCleaner.CleanHTML(subItem.Heading)
headingRun.AddText(" " + cleanHeading) // Indented headingPara.AddText(" " + cleanHeading).Bold() // Indented
headingRun.Properties().SetBold(true)
} }
// Add paragraph content if available // Add paragraph content if available
if subItem.Paragraph != "" { if subItem.Paragraph != "" {
contentPara := doc.AddParagraph() contentPara := doc.AddParagraph()
contentRun := contentPara.AddRun()
cleanContent := e.htmlCleaner.CleanHTML(subItem.Paragraph) cleanContent := e.htmlCleaner.CleanHTML(subItem.Paragraph)
contentRun.AddText(" " + cleanContent) // Indented contentPara.AddText(" " + cleanContent) // Indented
} }
// Add answers if this is a question // Add answers if this is a question
if len(subItem.Answers) > 0 { if len(subItem.Answers) > 0 {
answersPara := doc.AddParagraph() answersPara := doc.AddParagraph()
answersRun := answersPara.AddRun() answersPara.AddText(" Answers:").Bold()
answersRun.AddText(" Answers:")
answersRun.Properties().SetBold(true)
for i, answer := range subItem.Answers { for i, answer := range subItem.Answers {
answerPara := doc.AddParagraph() answerPara := doc.AddParagraph()
answerRun := answerPara.AddRun()
prefix := fmt.Sprintf(" %d. ", i+1) prefix := fmt.Sprintf(" %d. ", i+1)
if answer.Correct { if answer.Correct {
prefix += "✓ " prefix += "✓ "
} }
cleanAnswer := e.htmlCleaner.CleanHTML(answer.Title) cleanAnswer := e.htmlCleaner.CleanHTML(answer.Title)
answerRun.AddText(prefix + cleanAnswer) answerPara.AddText(prefix + cleanAnswer)
} }
} }
// Add feedback if available // Add feedback if available
if subItem.Feedback != "" { if subItem.Feedback != "" {
feedbackPara := doc.AddParagraph() feedbackPara := doc.AddParagraph()
feedbackRun := feedbackPara.AddRun()
cleanFeedback := e.htmlCleaner.CleanHTML(subItem.Feedback) cleanFeedback := e.htmlCleaner.CleanHTML(subItem.Feedback)
feedbackRun.AddText(" Feedback: " + cleanFeedback) feedbackPara.AddText(" Feedback: " + cleanFeedback).Italic()
feedbackRun.Properties().SetItalic(true)
} }
} }
// GetSupportedFormat returns the format name this exporter supports. // SupportedFormat returns the format name this exporter supports.
// //
// Returns: // Returns:
// - A string representing the supported format ("docx") // - A string representing the supported format ("docx")
func (e *DocxExporter) GetSupportedFormat() string { func (e *DocxExporter) SupportedFormat() string {
return "docx" return FormatDocx
} }


@ -0,0 +1,671 @@
package exporters
import (
"os"
"path/filepath"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// TestNewDocxExporter tests the NewDocxExporter constructor.
func TestNewDocxExporter(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
if exporter == nil {
t.Fatal("NewDocxExporter() returned nil")
}
// Type assertion to check internal structure
docxExporter, ok := exporter.(*DocxExporter)
if !ok {
t.Fatal("NewDocxExporter() returned wrong type")
}
if docxExporter.htmlCleaner == nil {
t.Error("htmlCleaner should not be nil")
}
}
// TestDocxExporter_SupportedFormat tests the SupportedFormat method.
func TestDocxExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
expected := "docx"
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
}
}
// TestDocxExporter_Export tests the Export method.
func TestDocxExporter_Export(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create test course
testCourse := createTestCourseForDocx()
// Create temporary directory and file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "test-course.docx")
// Test successful export
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Check that file was created
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
// Verify file has some content (basic check)
fileInfo, err := os.Stat(outputPath)
if err != nil {
t.Fatalf("Failed to get file info: %v", err)
}
if fileInfo.Size() == 0 {
t.Error("Output file is empty")
}
}
// TestDocxExporter_Export_AddDocxExtension tests that the .docx extension is added automatically.
func TestDocxExporter_Export_AddDocxExtension(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
testCourse := createTestCourseForDocx()
// Create temporary directory and file without .docx extension
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "test-course")
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Check that file was created with .docx extension
expectedPath := outputPath + ".docx"
if _, err := os.Stat(expectedPath); os.IsNotExist(err) {
t.Fatal("Output file with .docx extension was not created")
}
}
// TestDocxExporter_Export_InvalidPath tests export with invalid output path.
func TestDocxExporter_Export_InvalidPath(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
testCourse := createTestCourseForDocx()
// Try to write to invalid path
invalidPath := "/invalid/path/that/does/not/exist/file.docx"
err := exporter.Export(testCourse, invalidPath)
if err == nil {
t.Fatal("Expected error for invalid path, got nil")
}
}
// TestDocxExporter_ExportLesson tests the exportLesson method indirectly through Export.
func TestDocxExporter_ExportLesson(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create course with specific lesson content
course := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
ID: "test-course",
Title: "Test Course",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Description: "<p>Test lesson description with <strong>bold</strong> text.</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Title: "Test Item Title",
Paragraph: "<p>Test paragraph content.</p>",
},
},
},
},
},
},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "lesson-test.docx")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created successfully
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
}
// TestDocxExporter_ExportItem tests the exportItem method indirectly through Export.
func TestDocxExporter_ExportItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create course with different item types
course := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
ID: "test-course",
Title: "Item Test Course",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "Item Types Lesson",
Type: "lesson",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Title: "Text Item",
Paragraph: "<p>Text content</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>List item 1</p>"},
{Paragraph: "<p>List item 2</p>"},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What is the answer?</p>",
Answers: []models.Answer{
{Title: "Option A", Correct: false},
{Title: "Option B", Correct: true},
},
Feedback: "<p>Correct answer explanation</p>",
},
},
},
},
},
},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "items-test.docx")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created successfully
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
}
// TestDocxExporter_ExportSubItem tests the exportSubItem method indirectly through Export.
func TestDocxExporter_ExportSubItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create course with sub-item containing all possible fields
course := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
ID: "test-course",
Title: "SubItem Test Course",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "SubItem Test Lesson",
Type: "lesson",
Items: []models.Item{
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>Question Title</p>",
Heading: "<h3>Question Heading</h3>",
Paragraph: "<p>Question description with <em>emphasis</em>.</p>",
Answers: []models.Answer{
{Title: "Wrong answer", Correct: false},
{Title: "Correct answer", Correct: true},
{Title: "Another wrong answer", Correct: false},
},
Feedback: "<p>Feedback with <strong>formatting</strong>.</p>",
},
},
},
},
},
},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "subitem-test.docx")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created successfully
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
}
// TestDocxExporter_ComplexCourse tests export of a complex course structure.
func TestDocxExporter_ComplexCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create complex test course
course := &models.Course{
ShareID: "complex-test-id",
Course: models.CourseInfo{
ID: "complex-course",
Title: "Complex Test Course",
Description: "<p>This is a <strong>complex</strong> course description with <em>formatting</em>.</p>",
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Course Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Introduction Lesson",
Type: "lesson",
Description: "<p>Introduction to the course with <code>code</code> and <a href='#'>links</a>.</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Welcome</h2>",
Paragraph: "<p>Welcome to our comprehensive course!</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>Learn advanced concepts</p>"},
{Paragraph: "<p>Practice with real examples</p>"},
{Paragraph: "<p>Apply knowledge in projects</p>"},
},
},
{
Type: "multimedia",
Items: []models.SubItem{
{
Title: "<p>Video Introduction</p>",
Caption: "<p>Watch this introductory video</p>",
Media: &models.Media{
Video: &models.VideoMedia{
OriginalURL: "https://example.com/intro.mp4",
Duration: 300,
},
},
},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What will you learn in this course?</p>",
Answers: []models.Answer{
{Title: "Basic concepts only", Correct: false},
{Title: "Advanced concepts and practical application", Correct: true},
{Title: "Theory without practice", Correct: false},
},
Feedback: "<p>Excellent! This course covers both theory and practice.</p>",
},
},
},
{
Type: "image",
Items: []models.SubItem{
{
Caption: "<p>Course overview diagram</p>",
Media: &models.Media{
Image: &models.ImageMedia{
OriginalURL: "https://example.com/overview.png",
},
},
},
},
},
{
Type: "interactive",
Items: []models.SubItem{
{
Title: "<p>Interactive Exercise</p>",
},
},
},
},
},
{
ID: "lesson-2",
Title: "Advanced Topics",
Type: "lesson",
Items: []models.Item{
{
Type: "divider",
},
{
Type: "unknown",
Items: []models.SubItem{
{
Title: "<p>Custom Content</p>",
Paragraph: "<p>This is custom content type</p>",
},
},
},
},
},
},
},
}
// Create temporary output file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "complex-course.docx")
// Export course
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created and has reasonable size
fileInfo, err := os.Stat(outputPath)
if err != nil {
t.Fatalf("Failed to get file info: %v", err)
}
if fileInfo.Size() < 1000 {
t.Error("Output file seems too small for complex course content")
}
}
// TestDocxExporter_EmptyCourse tests export of an empty course.
func TestDocxExporter_EmptyCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create minimal course
course := &models.Course{
ShareID: "empty-id",
Course: models.CourseInfo{
ID: "empty-course",
Title: "Empty Course",
Lessons: []models.Lesson{},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "empty-course.docx")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
}
// TestDocxExporter_HTMLCleaning tests that HTML content is properly cleaned.
func TestDocxExporter_HTMLCleaning(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create course with HTML content that needs cleaning
course := &models.Course{
ShareID: "html-test-id",
Course: models.CourseInfo{
ID: "html-test-course",
Title: "HTML Cleaning Test",
Description: "<p>Description with <script>alert('xss')</script> and <b>bold</b> text.</p>",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Description: "<div>Lesson description with <span style='color:red'>styled</span> content.</div>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Heading with <em>emphasis</em> and &amp; entities</h1>",
Paragraph: "<p>Paragraph with &lt;code&gt; entities and <strong>formatting</strong>.</p>",
},
},
},
},
},
},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "html-cleaning-test.docx")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created (basic check that HTML cleaning didn't break export)
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
}
// TestDocxExporter_ExistingDocxExtension tests that existing .docx extension is preserved.
func TestDocxExporter_ExistingDocxExtension(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
testCourse := createTestCourseForDocx()
// Use path that already has .docx extension
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "test-course.docx")
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Check that file was created at the exact path (no double extension)
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created at expected path")
}
// Ensure no double extension was created
doubleExtensionPath := outputPath + ".docx"
if _, err := os.Stat(doubleExtensionPath); err == nil {
t.Error("Double .docx extension file should not exist")
}
}
// TestDocxExporter_CaseInsensitiveExtension tests that extension checking is case-insensitive.
func TestDocxExporter_CaseInsensitiveExtension(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
testCourse := createTestCourseForDocx()
// Test various case combinations
testCases := []string{
"test-course.DOCX",
"test-course.Docx",
"test-course.DocX",
}
for i, testCase := range testCases {
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, testCase)
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed for case %d (%s): %v", i, testCase, err)
}
// Check that file was created at the exact path (no additional extension)
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatalf("Output file was not created at expected path for case %d (%s)", i, testCase)
}
}
}
// createTestCourseForDocx creates a test course for DOCX export testing.
func createTestCourseForDocx() *models.Course {
return &models.Course{
ShareID: "test-share-id",
Course: models.CourseInfo{
ID: "test-course-id",
Title: "Test Course",
Description: "<p>Test course description with <strong>formatting</strong>.</p>",
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Test Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Description: "<p>Test lesson description</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Test Heading</h2>",
Paragraph: "<p>Test paragraph content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First list item</p>"},
{Paragraph: "<p>Second list item</p>"},
},
},
},
},
},
},
}
}
// BenchmarkDocxExporter_Export benchmarks the Export method.
func BenchmarkDocxExporter_Export(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
course := createTestCourseForDocx()
// Create temporary directory
tempDir := b.TempDir()
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-course.docx")
_ = exporter.Export(course, outputPath)
// Clean up for next iteration. Remove errors are ignored because we've already
// benchmarked the export operation; cleanup failures don't affect the benchmark
// measurements or the validity of the next iteration's export.
_ = os.Remove(outputPath)
}
}
// BenchmarkDocxExporter_ComplexCourse benchmarks export of a complex course.
func BenchmarkDocxExporter_ComplexCourse(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewDocxExporter(htmlCleaner)
// Create complex course for benchmarking
course := &models.Course{
ShareID: "benchmark-id",
Course: models.CourseInfo{
ID: "benchmark-course",
Title: "Benchmark Course",
Description: "<p>Complex course for performance testing</p>",
Lessons: make([]models.Lesson, 10), // 10 lessons
},
}
// Fill with test data
for i := range 10 {
lesson := models.Lesson{
ID: "lesson-" + string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Items: make([]models.Item, 5), // 5 items per lesson
}
for j := range 5 {
item := models.Item{
Type: "text",
Items: make([]models.SubItem, 3), // 3 sub-items per item
}
for k := range 3 {
item.Items[k] = models.SubItem{
Heading: "<h3>Heading " + string(rune(k)) + "</h3>",
Paragraph: "<p>Paragraph content with <strong>formatting</strong> for performance testing.</p>",
}
}
lesson.Items[j] = item
}
course.Course.Lessons[i] = lesson
}
tempDir := b.TempDir()
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-complex.docx")
_ = exporter.Export(course, outputPath)
// Remove errors are ignored because we're only benchmarking the export
// operation itself; cleanup failures don't affect the benchmark metrics.
_ = os.Remove(outputPath)
}
}

View File

@ -0,0 +1,101 @@
// Package exporters_test provides examples for the exporters package.
package exporters_test
import (
"fmt"
"log"
"github.com/kjanat/articulate-parser/internal/exporters"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// ExampleNewFactory demonstrates creating an exporter factory.
func ExampleNewFactory() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// Get supported formats
formats := factory.SupportedFormats()
fmt.Printf("Supported formats: %d\n", len(formats))
// Output: Supported formats: 6
}
// ExampleFactory_CreateExporter demonstrates creating exporters.
func ExampleFactory_CreateExporter() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// Create a markdown exporter
exporter, err := factory.CreateExporter("markdown")
if err != nil {
log.Fatal(err)
}
fmt.Printf("Created: %s exporter\n", exporter.SupportedFormat())
// Output: Created: markdown exporter
}
// ExampleFactory_CreateExporter_caseInsensitive demonstrates case-insensitive format names.
func ExampleFactory_CreateExporter_caseInsensitive() {
htmlCleaner := services.NewHTMLCleaner()
factory := exporters.NewFactory(htmlCleaner)
// All these work (case-insensitive)
formats := []string{"MARKDOWN", "Markdown", "markdown", "MD"}
for _, format := range formats {
exporter, _ := factory.CreateExporter(format)
fmt.Printf("%s -> %s\n", format, exporter.SupportedFormat())
}
// Output:
// MARKDOWN -> markdown
// Markdown -> markdown
// markdown -> markdown
// MD -> markdown
}
// ExampleMarkdownExporter_Export demonstrates exporting to Markdown.
func ExampleMarkdownExporter_Export() {
htmlCleaner := services.NewHTMLCleaner()
exporter := exporters.NewMarkdownExporter(htmlCleaner)
course := &models.Course{
ShareID: "example-id",
Course: models.CourseInfo{
Title: "Example Course",
Description: "<p>Course description</p>",
},
}
// Export to markdown file
err := exporter.Export(course, "output.md")
if err != nil {
log.Fatal(err)
}
fmt.Println("Export complete")
// Output: Export complete
}
// ExampleDocxExporter_Export demonstrates exporting to DOCX.
func ExampleDocxExporter_Export() {
htmlCleaner := services.NewHTMLCleaner()
exporter := exporters.NewDocxExporter(htmlCleaner)
course := &models.Course{
ShareID: "example-id",
Course: models.CourseInfo{
Title: "Example Course",
},
}
// Export to Word document
err := exporter.Export(course, "output.docx")
if err != nil {
log.Fatal(err)
}
fmt.Println("DOCX export complete")
// Output: DOCX export complete
}

View File

@ -1,5 +1,3 @@
- // Package exporters provides implementations of the Exporter interface
- // for converting Articulate Rise courses into various file formats.
package exporters
import (
@ -10,6 +8,13 @@ import (
"github.com/kjanat/articulate-parser/internal/services"
)
+ // Format constants for supported export formats.
+ const (
+ FormatMarkdown = "markdown"
+ FormatDocx = "docx"
+ FormatHTML = "html"
+ )
// Factory implements the ExporterFactory interface.
// It creates appropriate exporter instances based on the requested format.
type Factory struct {
@ -33,31 +38,22 @@ func NewFactory(htmlCleaner *services.HTMLCleaner) interfaces.ExporterFactory {
}
// CreateExporter creates an exporter for the specified format.
- // It returns an appropriate exporter implementation based on the format string.
- // Format strings are case-insensitive.
- //
- // Parameters:
- // - format: The desired export format (e.g., "markdown", "docx")
- //
- // Returns:
- // - An implementation of the Exporter interface if the format is supported
- // - An error if the format is not supported
+ // Format strings are case-insensitive (e.g., "markdown", "DOCX").
func (f *Factory) CreateExporter(format string) (interfaces.Exporter, error) {
switch strings.ToLower(format) {
- case "markdown", "md":
+ case FormatMarkdown, "md":
return NewMarkdownExporter(f.htmlCleaner), nil
- case "docx", "word":
+ case FormatDocx, "word":
return NewDocxExporter(f.htmlCleaner), nil
+ case FormatHTML, "htm":
+ return NewHTMLExporter(f.htmlCleaner), nil
default:
return nil, fmt.Errorf("unsupported export format: %s", format)
}
}
- // GetSupportedFormats returns a list of all supported export formats.
- // This includes both primary format names and their aliases.
- //
- // Returns:
- // - A string slice containing all supported format names
- func (f *Factory) GetSupportedFormats() []string {
- return []string{"markdown", "md", "docx", "word"}
+ // SupportedFormats returns a list of all supported export formats,
+ // including both primary format names and their aliases.
+ func (f *Factory) SupportedFormats() []string {
+ return []string{FormatMarkdown, "md", FormatDocx, "word", FormatHTML, "htm"}
}
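For orientation, the new format constants and the renamed SupportedFormats method can be combined by callers to validate a requested format before building an exporter. The snippet below is a hedged sketch rather than repository code: the main package, the requested value, and the slices-based check are invented for illustration, and since the packages are internal such a caller would have to live inside this module.

package main

import (
	"fmt"
	"log"
	"slices"
	"strings"

	"github.com/kjanat/articulate-parser/internal/exporters"
	"github.com/kjanat/articulate-parser/internal/services"
)

func main() {
	factory := exporters.NewFactory(services.NewHTMLCleaner())
	requested := "HTML" // hypothetical user input
	// SupportedFormats lists lowercase names and aliases, so normalize before checking.
	if !slices.Contains(factory.SupportedFormats(), strings.ToLower(requested)) {
		log.Fatalf("format %q is not supported", requested)
	}
	exporter, err := factory.CreateExporter(requested) // case-insensitive, per CreateExporter
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("using exporter:", exporter.SupportedFormat())
}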

View File

@ -0,0 +1,473 @@
package exporters
import (
"reflect"
"sort"
"strings"
"testing"
"github.com/kjanat/articulate-parser/internal/services"
)
// TestNewFactory tests the NewFactory constructor.
func TestNewFactory(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
if factory == nil {
t.Fatal("NewFactory() returned nil")
}
// Type assertion to check internal structure
factoryImpl, ok := factory.(*Factory)
if !ok {
t.Fatal("NewFactory() returned wrong type")
}
if factoryImpl.htmlCleaner == nil {
t.Error("htmlCleaner should not be nil")
}
}
// TestFactory_CreateExporter tests the CreateExporter method for all supported formats.
func TestFactory_CreateExporter(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
testCases := []struct {
name string
format string
expectedType string
expectedFormat string
shouldError bool
}{
{
name: "markdown format",
format: "markdown",
expectedType: "*exporters.MarkdownExporter",
expectedFormat: "markdown",
shouldError: false,
},
{
name: "md format alias",
format: "md",
expectedType: "*exporters.MarkdownExporter",
expectedFormat: "markdown",
shouldError: false,
},
{
name: "docx format",
format: "docx",
expectedType: "*exporters.DocxExporter",
expectedFormat: "docx",
shouldError: false,
},
{
name: "word format alias",
format: "word",
expectedType: "*exporters.DocxExporter",
expectedFormat: "docx",
shouldError: false,
},
{
name: "html format",
format: "html",
expectedType: "*exporters.HTMLExporter",
expectedFormat: "html",
shouldError: false,
},
{
name: "htm format alias",
format: "htm",
expectedType: "*exporters.HTMLExporter",
expectedFormat: "html",
shouldError: false,
},
{
name: "unsupported format",
format: "pdf",
shouldError: true,
},
{
name: "empty format",
format: "",
shouldError: true,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
exporter, err := factory.CreateExporter(tc.format)
if tc.shouldError {
if err == nil {
t.Errorf("Expected error for format '%s', but got nil", tc.format)
}
if exporter != nil {
t.Errorf("Expected nil exporter for unsupported format '%s'", tc.format)
}
return
}
if err != nil {
t.Fatalf("Unexpected error creating exporter for format '%s': %v", tc.format, err)
}
if exporter == nil {
t.Fatalf("CreateExporter returned nil for supported format '%s'", tc.format)
}
// Check type
exporterType := reflect.TypeOf(exporter).String()
if exporterType != tc.expectedType {
t.Errorf("Expected exporter type '%s' for format '%s', got '%s'", tc.expectedType, tc.format, exporterType)
}
// Check supported format
supportedFormat := exporter.SupportedFormat()
if supportedFormat != tc.expectedFormat {
t.Errorf("Expected supported format '%s' for format '%s', got '%s'", tc.expectedFormat, tc.format, supportedFormat)
}
})
}
}
// TestFactory_CreateExporter_CaseInsensitive tests that format strings are case-insensitive.
func TestFactory_CreateExporter_CaseInsensitive(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
testCases := []struct {
format string
expectedFormat string
}{
{"MARKDOWN", "markdown"},
{"Markdown", "markdown"},
{"MarkDown", "markdown"},
{"MD", "markdown"},
{"Md", "markdown"},
{"DOCX", "docx"},
{"Docx", "docx"},
{"DocX", "docx"},
{"WORD", "docx"},
{"Word", "docx"},
{"WoRd", "docx"},
{"HTML", "html"},
{"Html", "html"},
{"HtMl", "html"},
{"HTM", "html"},
{"Htm", "html"},
{"HtM", "html"},
}
for _, tc := range testCases {
t.Run(tc.format, func(t *testing.T) {
exporter, err := factory.CreateExporter(tc.format)
if err != nil {
t.Fatalf("Unexpected error for format '%s': %v", tc.format, err)
}
if exporter == nil {
t.Fatalf("CreateExporter returned nil for format '%s'", tc.format)
}
supportedFormat := exporter.SupportedFormat()
if supportedFormat != tc.expectedFormat {
t.Errorf("Expected supported format '%s' for format '%s', got '%s'", tc.expectedFormat, tc.format, supportedFormat)
}
})
}
}
// TestFactory_CreateExporter_ErrorMessages tests error messages for unsupported formats.
func TestFactory_CreateExporter_ErrorMessages(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
testCases := []string{
"pdf",
"txt",
"json",
"xml",
"unknown",
"123",
"markdown-invalid",
}
for _, format := range testCases {
t.Run(format, func(t *testing.T) {
exporter, err := factory.CreateExporter(format)
if err == nil {
t.Errorf("Expected error for unsupported format '%s', got nil", format)
}
if exporter != nil {
t.Errorf("Expected nil exporter for unsupported format '%s', got %v", format, exporter)
}
// Check error message contains the format
if err != nil && !strings.Contains(err.Error(), format) {
t.Errorf("Error message should contain the unsupported format '%s', got: %s", format, err.Error())
}
// Check error message has expected prefix
if err != nil && !strings.Contains(err.Error(), "unsupported export format") {
t.Errorf("Error message should contain 'unsupported export format', got: %s", err.Error())
}
})
}
}
// TestFactory_SupportedFormats tests the SupportedFormats method.
func TestFactory_SupportedFormats(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
formats := factory.SupportedFormats()
if formats == nil {
t.Fatal("SupportedFormats() returned nil")
}
expected := []string{"markdown", "md", "docx", "word", "html", "htm"}
// Sort both slices for comparison
sort.Strings(formats)
sort.Strings(expected)
if !reflect.DeepEqual(formats, expected) {
t.Errorf("Expected formats %v, got %v", expected, formats)
}
// Verify all returned formats can create exporters
for _, format := range formats {
exporter, err := factory.CreateExporter(format)
if err != nil {
t.Errorf("Format '%s' from SupportedFormats() should be creatable, got error: %v", format, err)
}
if exporter == nil {
t.Errorf("Format '%s' from SupportedFormats() should create non-nil exporter", format)
}
}
}
// TestFactory_SupportedFormats_Immutable tests that the returned slice is safe to modify.
func TestFactory_SupportedFormats_Immutable(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Get formats twice
formats1 := factory.SupportedFormats()
formats2 := factory.SupportedFormats()
// Modify first slice
if len(formats1) > 0 {
formats1[0] = "modified"
}
// Check that second call returns unmodified data
if len(formats2) > 0 && formats2[0] == "modified" {
t.Error("SupportedFormats() should return independent slices")
}
// Verify original functionality still works
formats3 := factory.SupportedFormats()
if len(formats3) == 0 {
t.Error("SupportedFormats() should still return formats after modification")
}
}
// TestFactory_ExporterTypes tests that created exporters are of correct types.
func TestFactory_ExporterTypes(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Test markdown exporter
markdownExporter, err := factory.CreateExporter("markdown")
if err != nil {
t.Fatalf("Failed to create markdown exporter: %v", err)
}
if _, ok := markdownExporter.(*MarkdownExporter); !ok {
t.Error("Markdown exporter should be of type *MarkdownExporter")
}
// Test docx exporter
docxExporter, err := factory.CreateExporter("docx")
if err != nil {
t.Fatalf("Failed to create docx exporter: %v", err)
}
if _, ok := docxExporter.(*DocxExporter); !ok {
t.Error("DOCX exporter should be of type *DocxExporter")
}
}
// TestFactory_HTMLCleanerPropagation tests that HTMLCleaner is properly passed to exporters.
func TestFactory_HTMLCleanerPropagation(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Test with markdown exporter
markdownExporter, err := factory.CreateExporter("markdown")
if err != nil {
t.Fatalf("Failed to create markdown exporter: %v", err)
}
markdownImpl, ok := markdownExporter.(*MarkdownExporter)
if !ok {
t.Fatal("Failed to cast to MarkdownExporter")
}
if markdownImpl.htmlCleaner == nil {
t.Error("HTMLCleaner should be propagated to MarkdownExporter")
}
// Test with docx exporter
docxExporter, err := factory.CreateExporter("docx")
if err != nil {
t.Fatalf("Failed to create docx exporter: %v", err)
}
docxImpl, ok := docxExporter.(*DocxExporter)
if !ok {
t.Fatal("Failed to cast to DocxExporter")
}
if docxImpl.htmlCleaner == nil {
t.Error("HTMLCleaner should be propagated to DocxExporter")
}
// Test with html exporter
htmlExporter, err := factory.CreateExporter("html")
if err != nil {
t.Fatalf("Failed to create html exporter: %v", err)
}
htmlImpl, ok := htmlExporter.(*HTMLExporter)
if !ok {
t.Fatal("Failed to cast to HTMLExporter")
}
if htmlImpl.htmlCleaner == nil {
t.Error("HTMLCleaner should be propagated to HTMLExporter")
}
}
// TestFactory_MultipleExporterCreation tests creating multiple exporters of same type.
func TestFactory_MultipleExporterCreation(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Create multiple markdown exporters
exporter1, err := factory.CreateExporter("markdown")
if err != nil {
t.Fatalf("Failed to create first markdown exporter: %v", err)
}
exporter2, err := factory.CreateExporter("md")
if err != nil {
t.Fatalf("Failed to create second markdown exporter: %v", err)
}
// They should be different instances
if exporter1 == exporter2 {
t.Error("Factory should create independent exporter instances")
}
// But both should be MarkdownExporter type
if _, ok := exporter1.(*MarkdownExporter); !ok {
t.Error("First exporter should be MarkdownExporter")
}
if _, ok := exporter2.(*MarkdownExporter); !ok {
t.Error("Second exporter should be MarkdownExporter")
}
}
// TestFactory_WithNilHTMLCleaner tests factory behavior with nil HTMLCleaner.
func TestFactory_WithNilHTMLCleaner(t *testing.T) {
// This tests edge case - should not panic but behavior may vary
defer func() {
if r := recover(); r != nil {
t.Errorf("Factory should handle nil HTMLCleaner gracefully, but panicked: %v", r)
}
}()
factory := NewFactory(nil)
if factory == nil {
t.Fatal("NewFactory(nil) returned nil")
}
// Try to create an exporter - this might fail or succeed depending on implementation
_, err := factory.CreateExporter("markdown")
// We don't assert on the error since nil HTMLCleaner handling is implementation-dependent
// The important thing is that it doesn't panic
_ = err
}
// TestFactory_FormatNormalization tests that format strings are properly normalized.
func TestFactory_FormatNormalization(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
// Test formats with extra whitespace
testCases := []struct {
input string
expected string
}{
{"markdown", "markdown"},
{"MARKDOWN", "markdown"},
{"Markdown", "markdown"},
{"docx", "docx"},
{"DOCX", "docx"},
{"Docx", "docx"},
}
for _, tc := range testCases {
t.Run(tc.input, func(t *testing.T) {
exporter, err := factory.CreateExporter(tc.input)
if err != nil {
t.Fatalf("Failed to create exporter for '%s': %v", tc.input, err)
}
format := exporter.SupportedFormat()
if format != tc.expected {
t.Errorf("Expected format '%s' for input '%s', got '%s'", tc.expected, tc.input, format)
}
})
}
}
// BenchmarkFactory_CreateExporter benchmarks the CreateExporter method.
func BenchmarkFactory_CreateExporter(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
for b.Loop() {
_, _ = factory.CreateExporter("markdown")
}
}
// BenchmarkFactory_CreateExporter_Docx benchmarks creating DOCX exporters.
func BenchmarkFactory_CreateExporter_Docx(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
for b.Loop() {
_, _ = factory.CreateExporter("docx")
}
}
// BenchmarkFactory_SupportedFormats benchmarks the SupportedFormats method.
func BenchmarkFactory_SupportedFormats(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
factory := NewFactory(htmlCleaner)
for b.Loop() {
_ = factory.SupportedFormats()
}
}

internal/exporters/html.go Normal file (+105 lines)
View File

@ -0,0 +1,105 @@
package exporters
import (
_ "embed"
"fmt"
"html/template"
"io"
"os"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
//go:embed html_styles.css
var defaultCSS string
//go:embed html_template.html
var htmlTemplate string
// HTMLExporter implements the Exporter interface for HTML format.
// It converts Articulate Rise course data into a structured HTML document using templates.
type HTMLExporter struct {
// htmlCleaner is used to convert HTML content to plain text when needed
htmlCleaner *services.HTMLCleaner
// tmpl holds the parsed HTML template
tmpl *template.Template
}
// NewHTMLExporter creates a new HTMLExporter instance.
// It takes an HTMLCleaner to handle HTML content conversion when plain text is needed.
//
// Parameters:
// - htmlCleaner: Service for cleaning HTML content in course data
//
// Returns:
// - An implementation of the Exporter interface for HTML format
func NewHTMLExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter {
// Parse the template with custom functions
funcMap := template.FuncMap{
"safeHTML": func(s string) template.HTML {
return template.HTML(s) // #nosec G203 - HTML content is from trusted course data
},
"safeCSS": func(s string) template.CSS {
return template.CSS(s) // #nosec G203 - CSS content is from trusted embedded file
},
}
tmpl := template.Must(template.New("html").Funcs(funcMap).Parse(htmlTemplate))
return &HTMLExporter{
htmlCleaner: htmlCleaner,
tmpl: tmpl,
}
}
// Export exports a course to HTML format.
// It generates a structured HTML document from the course data
// and writes it to the specified output path.
//
// Parameters:
// - course: The course data model to export
// - outputPath: The file path where the HTML content will be written
//
// Returns:
// - An error if writing to the output file fails
func (e *HTMLExporter) Export(course *models.Course, outputPath string) error {
f, err := os.Create(outputPath)
if err != nil {
return fmt.Errorf("failed to create file: %w", err)
}
defer f.Close()
return e.WriteHTML(f, course)
}
// WriteHTML writes the HTML content to an io.Writer.
// This allows for better testability and flexibility in output destinations.
//
// Parameters:
// - w: The writer to output HTML content to
// - course: The course data model to export
//
// Returns:
// - An error if writing fails
func (e *HTMLExporter) WriteHTML(w io.Writer, course *models.Course) error {
// Prepare template data
data := prepareTemplateData(course, e.htmlCleaner)
// Execute template
if err := e.tmpl.Execute(w, data); err != nil {
return fmt.Errorf("failed to execute template: %w", err)
}
return nil
}
// SupportedFormat returns the format name this exporter supports.
// It indicates the file format that the HTMLExporter can generate.
//
// Returns:
// - A string representing the supported format ("html")
func (e *HTMLExporter) SupportedFormat() string {
return FormatHTML
}
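Worth noting from the code above: because WriteHTML takes an io.Writer, the exporter can render a course entirely in memory as well as to a file. A minimal sketch under the same internal-module assumption; the course literal and the buffer usage are illustrative only.

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/kjanat/articulate-parser/internal/exporters"
	"github.com/kjanat/articulate-parser/internal/models"
	"github.com/kjanat/articulate-parser/internal/services"
)

func main() {
	// NewHTMLExporter returns the Exporter interface, so assert to the concrete
	// type to reach the writer-based WriteHTML method.
	exporter := exporters.NewHTMLExporter(services.NewHTMLCleaner()).(*exporters.HTMLExporter)
	course := &models.Course{
		ShareID: "in-memory-example",
		Course:  models.CourseInfo{Title: "In-Memory Course"},
	}
	var buf bytes.Buffer
	if err := exporter.WriteHTML(&buf, course); err != nil {
		log.Fatal(err)
	}
	fmt.Println("rendered", buf.Len(), "bytes of HTML")
}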

View File

@ -0,0 +1,173 @@
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
color: #333;
max-width: 800px;
margin: 0 auto;
padding: 20px;
background-color: #f9f9f9;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 2rem;
border-radius: 10px;
margin-bottom: 2rem;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
header h1 {
margin: 0;
font-size: 2.5rem;
font-weight: 300;
}
.course-description {
margin-top: 1rem;
font-size: 1.1rem;
opacity: 0.9;
}
.course-info {
background: white;
padding: 1.5rem;
border-radius: 8px;
margin-bottom: 2rem;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-info h2 {
margin-top: 0;
color: #4a5568;
border-bottom: 2px solid #e2e8f0;
padding-bottom: 0.5rem;
}
.course-info ul {
list-style: none;
padding: 0;
}
.course-info li {
margin: 0.5rem 0;
padding: 0.5rem;
background: #f7fafc;
border-radius: 4px;
}
.course-section {
background: #4299e1;
color: white;
padding: 1.5rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.course-section h2 {
margin: 0;
font-weight: 400;
}
.lesson {
background: white;
padding: 2rem;
border-radius: 8px;
margin: 2rem 0;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
border-left: 4px solid #4299e1;
}
.lesson h3 {
margin-top: 0;
color: #2d3748;
font-size: 1.5rem;
}
.lesson-description {
margin: 1rem 0;
padding: 1rem;
background: #f7fafc;
border-radius: 4px;
border-left: 3px solid #4299e1;
}
.item {
margin: 1.5rem 0;
padding: 1rem;
border-radius: 6px;
background: #fafafa;
border: 1px solid #e2e8f0;
}
.item h4 {
margin-top: 0;
color: #4a5568;
font-size: 1.2rem;
text-transform: capitalize;
}
.text-item {
background: #f0fff4;
border-left: 3px solid #48bb78;
}
.list-item {
background: #fffaf0;
border-left: 3px solid #ed8936;
}
.knowledge-check {
background: #e6fffa;
border-left: 3px solid #38b2ac;
}
.multimedia-item {
background: #faf5ff;
border-left: 3px solid #9f7aea;
}
.interactive-item {
background: #fff5f5;
border-left: 3px solid #f56565;
}
.unknown-item {
background: #f7fafc;
border-left: 3px solid #a0aec0;
}
.answers {
margin: 1rem 0;
}
.answers h5 {
margin: 0.5rem 0;
color: #4a5568;
}
.answers ol {
margin: 0.5rem 0;
padding-left: 1.5rem;
}
.answers li {
margin: 0.3rem 0;
padding: 0.3rem;
}
.correct-answer {
background: #c6f6d5;
border-radius: 3px;
font-weight: bold;
}
.correct-answer::after {
content: " ✓";
color: #38a169;
}
.feedback {
margin: 1rem 0;
padding: 1rem;
background: #edf2f7;
border-radius: 4px;
border-left: 3px solid #4299e1;
font-style: italic;
}
.media-info {
background: #edf2f7;
padding: 1rem;
border-radius: 4px;
margin: 0.5rem 0;
}
.media-info strong {
color: #4a5568;
}
hr {
border: none;
height: 2px;
background: linear-gradient(to right, #667eea, #764ba2);
margin: 2rem 0;
border-radius: 1px;
}
ul {
padding-left: 1.5rem;
}
li {
margin: 0.5rem 0;
}

View File

@ -0,0 +1,183 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{.Course.Title}}</title>
<style>
{{safeCSS .CSS}}
</style>
</head>
<body>
<header>
<h1>{{.Course.Title}}</h1>
{{if .Course.Description}}
<div class="course-description">{{safeHTML .Course.Description}}</div>
{{end}}
</header>
<section class="course-info">
<h2>Course Information</h2>
<ul>
<li><strong>Course ID:</strong> {{.Course.ID}}</li>
<li><strong>Share ID:</strong> {{.ShareID}}</li>
<li><strong>Navigation Mode:</strong> {{.Course.NavigationMode}}</li>
{{if .Course.ExportSettings}}
<li><strong>Export Format:</strong> {{.Course.ExportSettings.Format}}</li>
{{end}}
</ul>
</section>
{{range .Sections}}
{{if eq .Type "section"}}
<section class="course-section">
<h2>{{.Title}}</h2>
</section>
{{else}}
<section class="lesson">
<h3>Lesson {{.Number}}: {{.Title}}</h3>
{{if .Description}}
<div class="lesson-description">{{safeHTML .Description}}</div>
{{end}}
{{range .Items}}
{{template "item" .}}
{{end}}
</section>
{{end}}
{{end}}
</body>
</html>
{{define "item"}}
{{if eq .Type "text"}}{{template "textItem" .}}
{{else if eq .Type "list"}}{{template "listItem" .}}
{{else if eq .Type "knowledgecheck"}}{{template "knowledgeCheckItem" .}}
{{else if eq .Type "multimedia"}}{{template "multimediaItem" .}}
{{else if eq .Type "image"}}{{template "imageItem" .}}
{{else if eq .Type "interactive"}}{{template "interactiveItem" .}}
{{else if eq .Type "divider"}}{{template "dividerItem" .}}
{{else}}{{template "unknownItem" .}}
{{end}}
{{end}}
{{define "textItem"}}
<div class="item text-item">
<h4>Text Content</h4>
{{range .Items}}
{{if .Heading}}
{{safeHTML .Heading}}
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "listItem"}}
<div class="item list-item">
<h4>List</h4>
<ul>
{{range .Items}}
{{if .Paragraph}}
<li>{{.CleanText}}</li>
{{end}}
{{end}}
</ul>
</div>
{{end}}
{{define "knowledgeCheckItem"}}
<div class="item knowledge-check">
<h4>Knowledge Check</h4>
{{range .Items}}
{{if .Title}}
<p><strong>Question:</strong> {{safeHTML .Title}}</p>
{{end}}
{{if .Answers}}
<div class="answers">
<h5>Answers:</h5>
<ol>
{{range .Answers}}
<li{{if .Correct}} class="correct-answer"{{end}}>{{.Title}}</li>
{{end}}
</ol>
</div>
{{end}}
{{if .Feedback}}
<div class="feedback"><strong>Feedback:</strong> {{safeHTML .Feedback}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "multimediaItem"}}
<div class="item multimedia-item">
<h4>Media Content</h4>
{{range .Items}}
{{if .Title}}
<h5>{{.Title}}</h5>
{{end}}
{{if .Media}}
{{if .Media.Video}}
<div class="media-info">
<p><strong>Video:</strong> {{.Media.Video.OriginalURL}}</p>
{{if gt .Media.Video.Duration 0}}
<p><strong>Duration:</strong> {{.Media.Video.Duration}} seconds</p>
{{end}}
</div>
{{end}}
{{end}}
{{if .Caption}}
<div><em>{{.Caption}}</em></div>
{{end}}
{{end}}
</div>
{{end}}
{{define "imageItem"}}
<div class="item multimedia-item">
<h4>Image</h4>
{{range .Items}}
{{if and .Media .Media.Image}}
<div class="media-info">
<p><strong>Image:</strong> {{.Media.Image.OriginalURL}}</p>
</div>
{{end}}
{{if .Caption}}
<div><em>{{.Caption}}</em></div>
{{end}}
{{end}}
</div>
{{end}}
{{define "interactiveItem"}}
<div class="item interactive-item">
<h4>Interactive Content</h4>
{{range .Items}}
{{if .Title}}
<p><strong>{{.Title}}</strong></p>
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
{{define "dividerItem"}}
<hr>
{{end}}
{{define "unknownItem"}}
<div class="item unknown-item">
<h4>{{.TypeTitle}} Content</h4>
{{range .Items}}
{{if .Title}}
<p><strong>{{.Title}}</strong></p>
{{end}}
{{if .Paragraph}}
<div>{{safeHTML .Paragraph}}</div>
{{end}}
{{end}}
</div>
{{end}}
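For readers tracing how an item reaches one of these named sub-templates: the dispatch key is the lowercased item type prepared in the data-preparation file that follows, and any sub-template can also be executed on its own for inspection. A hedged, package-internal sketch; the helper name and the sample item are invented for illustration.

package exporters

import (
	"bytes"
	"fmt"

	"github.com/kjanat/articulate-parser/internal/models"
	"github.com/kjanat/articulate-parser/internal/services"
)

// renderKnowledgeCheckSketch pushes a single prepared item through the
// "knowledgeCheckItem" sub-template, reusing the exporter's parsed template
// and its safeHTML/safeCSS function map.
func renderKnowledgeCheckSketch() (string, error) {
	e := NewHTMLExporter(services.NewHTMLCleaner()).(*HTMLExporter)
	item := templateItem{
		Type: "knowledgecheck", // dispatch key is the lowercased item type
		Items: []templateSubItem{{
			Title:   "<p>2 + 2?</p>",
			Answers: []models.Answer{{Title: "4", Correct: true}},
		}},
	}
	var buf bytes.Buffer
	if err := e.tmpl.ExecuteTemplate(&buf, "knowledgeCheckItem", item); err != nil {
		return "", fmt.Errorf("render knowledge check: %w", err)
	}
	return buf.String(), nil
}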

View File

@ -0,0 +1,131 @@
package exporters
import (
"strings"
"golang.org/x/text/cases"
"golang.org/x/text/language"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// Item type constants.
const (
itemTypeText = "text"
itemTypeList = "list"
itemTypeKnowledgeCheck = "knowledgecheck"
itemTypeMultimedia = "multimedia"
itemTypeImage = "image"
itemTypeInteractive = "interactive"
itemTypeDivider = "divider"
)
// templateData represents the data structure passed to the HTML template.
type templateData struct {
Course models.CourseInfo
ShareID string
Sections []templateSection
CSS string
}
// templateSection represents a course section or lesson.
type templateSection struct {
Type string
Title string
Number int
Description string
Items []templateItem
}
// templateItem represents a course item with preprocessed data.
type templateItem struct {
Type string
TypeTitle string
Items []templateSubItem
}
// templateSubItem represents a sub-item with preprocessed data.
type templateSubItem struct {
Heading string
Paragraph string
Title string
Caption string
CleanText string
Answers []models.Answer
Feedback string
Media *models.Media
}
// prepareTemplateData converts a Course model into template-friendly data.
func prepareTemplateData(course *models.Course, htmlCleaner *services.HTMLCleaner) *templateData {
data := &templateData{
Course: course.Course,
ShareID: course.ShareID,
Sections: make([]templateSection, 0, len(course.Course.Lessons)),
CSS: defaultCSS,
}
lessonCounter := 0
for _, lesson := range course.Course.Lessons {
section := templateSection{
Type: lesson.Type,
Title: lesson.Title,
Description: lesson.Description,
}
if lesson.Type != "section" {
lessonCounter++
section.Number = lessonCounter
section.Items = prepareItems(lesson.Items, htmlCleaner)
}
data.Sections = append(data.Sections, section)
}
return data
}
// prepareItems converts model Items to template Items.
func prepareItems(items []models.Item, htmlCleaner *services.HTMLCleaner) []templateItem {
result := make([]templateItem, 0, len(items))
for _, item := range items {
tItem := templateItem{
Type: strings.ToLower(item.Type),
Items: make([]templateSubItem, 0, len(item.Items)),
}
// Set type title for unknown items
if tItem.Type != itemTypeText && tItem.Type != itemTypeList && tItem.Type != itemTypeKnowledgeCheck &&
tItem.Type != itemTypeMultimedia && tItem.Type != itemTypeImage && tItem.Type != itemTypeInteractive &&
tItem.Type != itemTypeDivider {
caser := cases.Title(language.English)
tItem.TypeTitle = caser.String(item.Type)
}
// Process sub-items
for _, subItem := range item.Items {
tSubItem := templateSubItem{
Heading: subItem.Heading,
Paragraph: subItem.Paragraph,
Title: subItem.Title,
Caption: subItem.Caption,
Answers: subItem.Answers,
Feedback: subItem.Feedback,
Media: subItem.Media,
}
// Clean HTML for list items
if tItem.Type == itemTypeList && subItem.Paragraph != "" {
tSubItem.CleanText = htmlCleaner.CleanHTML(subItem.Paragraph)
}
tItem.Items = append(tItem.Items, tSubItem)
}
result = append(result, tItem)
}
return result
}
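To pin down the numbering rule implemented above: sections keep Number at its zero value, while lessons are counted in encounter order, so section headings never consume a lesson number. Below is a hedged sketch of a package-internal test exercising that behaviour; the test name and the course literal are invented for illustration.

package exporters

import (
	"testing"

	"github.com/kjanat/articulate-parser/internal/models"
	"github.com/kjanat/articulate-parser/internal/services"
)

func TestPrepareTemplateData_LessonNumberingSketch(t *testing.T) {
	course := &models.Course{
		Course: models.CourseInfo{
			Lessons: []models.Lesson{
				{Title: "Intro Section", Type: "section"},
				{Title: "Lesson A", Type: "lesson"},
				{Title: "Lesson B", Type: "lesson"},
			},
		},
	}
	data := prepareTemplateData(course, services.NewHTMLCleaner())
	// The section stays unnumbered; the two lessons are numbered 1 and 2.
	if data.Sections[0].Number != 0 || data.Sections[1].Number != 1 || data.Sections[2].Number != 2 {
		t.Errorf("unexpected lesson numbering: %+v", data.Sections)
	}
}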

View File

@ -0,0 +1,508 @@
package exporters
import (
"os"
"path/filepath"
"strings"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// TestNewHTMLExporter tests the NewHTMLExporter constructor.
func TestNewHTMLExporter(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
if exporter == nil {
t.Fatal("NewHTMLExporter() returned nil")
}
// Type assertion to check internal structure
htmlExporter, ok := exporter.(*HTMLExporter)
if !ok {
t.Fatal("NewHTMLExporter() returned wrong type")
}
if htmlExporter.htmlCleaner == nil {
t.Error("htmlCleaner should not be nil")
}
if htmlExporter.tmpl == nil {
t.Error("template should not be nil")
}
}
// TestHTMLExporter_SupportedFormat tests the SupportedFormat method.
func TestHTMLExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
expected := "html"
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
}
}
// TestHTMLExporter_Export tests the Export method.
func TestHTMLExporter_Export(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
// Create test course
testCourse := createTestCourseForHTML()
// Create temporary directory and file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "test-course.html")
// Test successful export
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Check that file was created
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
// Read and verify content
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// Verify HTML structure
if !strings.Contains(contentStr, "<!DOCTYPE html>") {
t.Error("Output should contain HTML doctype")
}
if !strings.Contains(contentStr, "<html lang=\"en\">") {
t.Error("Output should contain HTML tag with lang attribute")
}
if !strings.Contains(contentStr, "<title>Test Course</title>") {
t.Error("Output should contain course title in head")
}
// Verify main course title
if !strings.Contains(contentStr, "<h1>Test Course</h1>") {
t.Error("Output should contain course title as main heading")
}
// Verify course information section
if !strings.Contains(contentStr, "Course Information") {
t.Error("Output should contain course information section")
}
// Verify course metadata
if !strings.Contains(contentStr, "Course ID") {
t.Error("Output should contain course ID")
}
if !strings.Contains(contentStr, "Share ID") {
t.Error("Output should contain share ID")
}
// Verify lesson content
if !strings.Contains(contentStr, "Lesson 1: Test Lesson") {
t.Error("Output should contain lesson heading")
}
// Verify CSS is included
if !strings.Contains(contentStr, "<style>") {
t.Error("Output should contain CSS styles")
}
if !strings.Contains(contentStr, "font-family") {
t.Logf("Generated HTML (first 500 chars):\n%s", contentStr[:min(500, len(contentStr))])
t.Error("Output should contain CSS font-family")
}
}
// TestHTMLExporter_Export_InvalidPath tests export with invalid output path.
func TestHTMLExporter_Export_InvalidPath(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
testCourse := createTestCourseForHTML()
// Try to export to invalid path (non-existent directory)
invalidPath := "/non/existent/path/test.html"
err := exporter.Export(testCourse, invalidPath)
if err == nil {
t.Error("Expected error for invalid output path, but got nil")
}
}
// TestHTMLExporter_ComplexCourse tests export of a course with complex content.
func TestHTMLExporter_ComplexCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
// Create complex test course
course := &models.Course{
ShareID: "complex-test-id",
Author: "Test Author",
Course: models.CourseInfo{
ID: "complex-course",
Title: "Complex Test Course",
Description: "<p>This is a <strong>complex</strong> course description.</p>",
NavigationMode: "menu",
ExportSettings: &models.ExportSettings{
Format: "scorm",
},
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Course Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Introduction Lesson",
Type: "lesson",
Description: "<p>Introduction to the course</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Welcome</h2>",
Paragraph: "<p>Welcome to our course!</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First objective</p>"},
{Paragraph: "<p>Second objective</p>"},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What will you learn?</p>",
Answers: []models.Answer{
{Title: "Nothing", Correct: false},
{Title: "Everything", Correct: true},
},
Feedback: "<p>Great choice!</p>",
},
},
},
},
},
},
},
}
// Create temporary output file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "complex-course.html")
// Export course
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Read and verify content
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// Verify various elements are present
checks := []string{
"<title>Complex Test Course</title>",
"<h1>Complex Test Course</h1>",
"This is a <strong>complex</strong> course description.",
"Course Information",
"complex-course",
"complex-test-id",
"menu",
"scorm",
"Course Section",
"Lesson 1: Introduction Lesson",
"Introduction to the course",
"<h2>Welcome</h2>",
"Welcome to our course!",
"First objective",
"Second objective",
"Knowledge Check",
"What will you learn?",
"Nothing",
"Everything",
"correct-answer",
"Great choice!",
}
for _, check := range checks {
if !strings.Contains(contentStr, check) {
t.Errorf("Output should contain: %q", check)
}
}
// Verify HTML structure
structureChecks := []string{
"<!DOCTYPE html>",
"<html lang=\"en\">",
"<head>",
"<body>",
"</html>",
"<style>",
"font-family",
}
for _, check := range structureChecks {
if !strings.Contains(contentStr, check) {
t.Errorf("Output should contain HTML structure element: %q", check)
}
}
}
// TestHTMLExporter_EmptyCourse tests export of an empty course.
func TestHTMLExporter_EmptyCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
// Create minimal course
course := &models.Course{
ShareID: "empty-id",
Course: models.CourseInfo{
ID: "empty-course",
Title: "Empty Course",
Lessons: []models.Lesson{},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "empty-course.html")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Verify file was created
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
// Read and verify basic structure
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// Verify basic HTML structure even for empty course
if !strings.Contains(contentStr, "<!DOCTYPE html>") {
t.Error("Output should contain HTML doctype")
}
if !strings.Contains(contentStr, "<title>Empty Course</title>") {
t.Error("Output should contain course title")
}
if !strings.Contains(contentStr, "<h1>Empty Course</h1>") {
t.Error("Output should contain course heading")
}
}
// TestHTMLExporter_HTMLCleaning tests that HTML content is properly handled.
func TestHTMLExporter_HTMLCleaning(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
// Create course with HTML content that needs cleaning in some places
course := &models.Course{
ShareID: "html-test-id",
Course: models.CourseInfo{
ID: "html-test-course",
Title: "HTML Test Course",
Description: "<p>Description with <script>alert('xss')</script> and <b>bold</b> text.</p>",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Description: "<div>Lesson description with <span style='color:red'>styled</span> content.</div>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>HTML Heading</h2>",
Paragraph: "<p>Content with <em>emphasis</em> and <strong>strong</strong> text.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>List item with <b>bold</b> text</p>"},
},
},
},
},
},
},
}
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "html-test.html")
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// HTML content in descriptions should be preserved
if !strings.Contains(contentStr, "<b>bold</b>") {
t.Error("Should preserve HTML formatting in descriptions")
}
// HTML content in headings should be preserved
if !strings.Contains(contentStr, "<h2>HTML Heading</h2>") {
t.Error("Should preserve HTML in headings")
}
// List items should have HTML tags stripped (cleaned)
if !strings.Contains(contentStr, "List item with bold text") {
t.Error("Should clean HTML from list items")
}
}
// createTestCourseForHTML creates a test course for HTML export tests.
func createTestCourseForHTML() *models.Course {
return &models.Course{
ShareID: "test-share-id",
Course: models.CourseInfo{
ID: "test-course-id",
Title: "Test Course",
Description: "<p>Test course description with <strong>formatting</strong>.</p>",
NavigationMode: "free",
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Test Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Description: "<p>Test lesson description</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Test Heading</h2>",
Paragraph: "<p>Test paragraph content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First list item</p>"},
{Paragraph: "<p>Second list item</p>"},
},
},
},
},
},
},
}
}
// BenchmarkHTMLExporter_Export benchmarks the Export method.
func BenchmarkHTMLExporter_Export(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
course := createTestCourseForHTML()
tempDir := b.TempDir()
for b.Loop() {
// Reuse one output path; Export's os.Create truncates the file on each iteration.
outputPath := filepath.Join(tempDir, "bench-course.html")
if err := exporter.Export(course, outputPath); err != nil {
b.Fatalf("Export failed: %v", err)
}
}
}
// BenchmarkHTMLExporter_ComplexCourse benchmarks export of a complex course.
func BenchmarkHTMLExporter_ComplexCourse(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewHTMLExporter(htmlCleaner)
// Create complex course for benchmarking
course := &models.Course{
ShareID: "benchmark-id",
Course: models.CourseInfo{
ID: "benchmark-course",
Title: "Benchmark Course",
Description: "<p>Complex course for performance testing</p>",
Lessons: make([]models.Lesson, 10), // 10 lessons
},
}
// Fill with test data
for i := range 10 {
lesson := models.Lesson{
ID: "lesson-" + string(rune(i)),
Title: "Benchmark Lesson " + string(rune(i)),
Type: "lesson",
Description: "<p>Lesson description</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Heading</h2>",
Paragraph: "<p>Paragraph with content.</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>Item 1</p>"},
{Paragraph: "<p>Item 2</p>"},
},
},
},
}
course.Course.Lessons[i] = lesson
}
tempDir := b.TempDir()
outputPath := filepath.Join(tempDir, "bench-complex.html")
for b.Loop() {
if err := exporter.Export(course, outputPath); err != nil {
b.Fatalf("Export failed: %v", err)
}
}
}
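For orientation, a minimal sketch of driving the HTML exporter from application code inside this module. The import path of the exporters package is assumed from the module layout, and internal/ packages are only importable from within the module:

// Hypothetical usage only; not part of the test suite above.
package main

import (
	"log"

	"github.com/kjanat/articulate-parser/internal/exporters"
	"github.com/kjanat/articulate-parser/internal/models"
	"github.com/kjanat/articulate-parser/internal/services"
)

func main() {
	cleaner := services.NewHTMLCleaner()
	exporter := exporters.NewHTMLExporter(cleaner)

	course := &models.Course{
		ShareID: "example-share-id",
		Course: models.CourseInfo{
			ID:    "example-course",
			Title: "Example Course",
		},
	}

	if err := exporter.Export(course, "example.html"); err != nil {
		log.Fatalf("export failed: %v", err)
	}
}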

View File

@@ -1,5 +1,3 @@
-// Package exporters provides implementations of the Exporter interface
-// for converting Articulate Rise courses into various file formats.
 package exporters
 
 import (
@@ -8,6 +6,9 @@ import (
 	"os"
 	"strings"
 
+	"golang.org/x/text/cases"
+	"golang.org/x/text/language"
+
 	"github.com/kjanat/articulate-parser/internal/interfaces"
 	"github.com/kjanat/articulate-parser/internal/models"
 	"github.com/kjanat/articulate-parser/internal/services"
@@ -34,16 +35,7 @@ func NewMarkdownExporter(htmlCleaner *services.HTMLCleaner) interfaces.Exporter
 	}
 }
 
-// Export exports a course to Markdown format.
-// It generates a structured Markdown document from the course data
-// and writes it to the specified output path.
-//
-// Parameters:
-// - course: The course data model to export
-// - outputPath: The file path where the Markdown content will be written
-//
-// Returns:
-// - An error if writing to the output file fails
+// Export converts the course to Markdown format and writes it to the output path.
 func (e *MarkdownExporter) Export(course *models.Course, outputPath string) error {
 	var buf bytes.Buffer
@@ -65,13 +57,15 @@ func (e *MarkdownExporter) Export(course *models.Course, outputPath string) error
 	buf.WriteString("\n---\n\n")
 
 	// Process lessons
-	for i, lesson := range course.Course.Lessons {
+	lessonCounter := 0
+	for _, lesson := range course.Course.Lessons {
 		if lesson.Type == "section" {
 			buf.WriteString(fmt.Sprintf("# %s\n\n", lesson.Title))
 			continue
 		}
 
-		buf.WriteString(fmt.Sprintf("## Lesson %d: %s\n\n", i+1, lesson.Title))
+		lessonCounter++
+		buf.WriteString(fmt.Sprintf("## Lesson %d: %s\n\n", lessonCounter, lesson.Title))
 
 		if lesson.Description != "" {
 			buf.WriteString(fmt.Sprintf("%s\n\n", e.htmlCleaner.CleanHTML(lesson.Description)))
@@ -85,141 +79,198 @@ func (e *MarkdownExporter) Export(course *models.Course, outputPath string) error
 		buf.WriteString("\n---\n\n")
 	}
 
-	return os.WriteFile(outputPath, buf.Bytes(), 0644)
+	// #nosec G306 - 0644 is appropriate for export files that should be readable by others
+	if err := os.WriteFile(outputPath, buf.Bytes(), 0o644); err != nil {
+		return fmt.Errorf("failed to write markdown file: %w", err)
+	}
+	return nil
 }
 
-// GetSupportedFormat returns the format name this exporter supports
-// It indicates the file format that the MarkdownExporter can generate.
-//
-// Returns:
-// - A string representing the supported format ("markdown")
-func (e *MarkdownExporter) GetSupportedFormat() string {
-	return "markdown"
+// SupportedFormat returns "markdown".
+func (e *MarkdownExporter) SupportedFormat() string {
+	return FormatMarkdown
 }
 
-// processItemToMarkdown converts a course item into Markdown format
-// and appends it to the provided buffer. It handles different item types
-// with appropriate Markdown formatting.
-//
-// Parameters:
-// - buf: The buffer to write the Markdown content to
-// - item: The course item to process
-// - level: The heading level for the item (determines the number of # characters)
+// processItemToMarkdown converts a course item into Markdown format.
+// The level parameter determines the heading level (number of # characters).
 func (e *MarkdownExporter) processItemToMarkdown(buf *bytes.Buffer, item models.Item, level int) {
 	headingPrefix := strings.Repeat("#", level)
 
 	switch item.Type {
 	case "text":
-		for _, subItem := range item.Items {
-			if subItem.Heading != "" {
-				heading := e.htmlCleaner.CleanHTML(subItem.Heading)
-				if heading != "" {
-					buf.WriteString(fmt.Sprintf("%s %s\n\n", headingPrefix, heading))
-				}
-			}
-			if subItem.Paragraph != "" {
-				paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
-				if paragraph != "" {
-					buf.WriteString(fmt.Sprintf("%s\n\n", paragraph))
-				}
-			}
-		}
+		e.processTextItem(buf, item, headingPrefix)
 	case "list":
-		for _, subItem := range item.Items {
-			if subItem.Paragraph != "" {
-				paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
-				if paragraph != "" {
-					buf.WriteString(fmt.Sprintf("- %s\n", paragraph))
-				}
-			}
-		}
-		buf.WriteString("\n")
+		e.processListItem(buf, item)
 	case "multimedia":
-		buf.WriteString(fmt.Sprintf("%s Media Content\n\n", headingPrefix))
-		for _, subItem := range item.Items {
-			if subItem.Media != nil {
-				if subItem.Media.Video != nil {
-					buf.WriteString(fmt.Sprintf("**Video**: %s\n", subItem.Media.Video.OriginalUrl))
-					if subItem.Media.Video.Duration > 0 {
-						buf.WriteString(fmt.Sprintf("**Duration**: %d seconds\n", subItem.Media.Video.Duration))
-					}
-				}
-				if subItem.Media.Image != nil {
-					buf.WriteString(fmt.Sprintf("**Image**: %s\n", subItem.Media.Image.OriginalUrl))
-				}
-			}
-			if subItem.Caption != "" {
-				caption := e.htmlCleaner.CleanHTML(subItem.Caption)
-				buf.WriteString(fmt.Sprintf("*%s*\n", caption))
-			}
-		}
-		buf.WriteString("\n")
+		e.processMultimediaItem(buf, item, headingPrefix)
 	case "image":
-		buf.WriteString(fmt.Sprintf("%s Image\n\n", headingPrefix))
-		for _, subItem := range item.Items {
-			if subItem.Media != nil && subItem.Media.Image != nil {
-				buf.WriteString(fmt.Sprintf("**Image**: %s\n", subItem.Media.Image.OriginalUrl))
-			}
-			if subItem.Caption != "" {
-				caption := e.htmlCleaner.CleanHTML(subItem.Caption)
-				buf.WriteString(fmt.Sprintf("*%s*\n", caption))
-			}
-		}
-		buf.WriteString("\n")
+		e.processImageItem(buf, item, headingPrefix)
 	case "knowledgeCheck":
-		buf.WriteString(fmt.Sprintf("%s Knowledge Check\n\n", headingPrefix))
-		for _, subItem := range item.Items {
-			if subItem.Title != "" {
-				title := e.htmlCleaner.CleanHTML(subItem.Title)
-				buf.WriteString(fmt.Sprintf("**Question**: %s\n\n", title))
-			}
-			buf.WriteString("**Answers**:\n")
-			for i, answer := range subItem.Answers {
-				correctMark := ""
-				if answer.Correct {
-					correctMark = " ✓"
-				}
-				buf.WriteString(fmt.Sprintf("%d. %s%s\n", i+1, answer.Title, correctMark))
-			}
-			if subItem.Feedback != "" {
-				feedback := e.htmlCleaner.CleanHTML(subItem.Feedback)
-				buf.WriteString(fmt.Sprintf("\n**Feedback**: %s\n", feedback))
-			}
-		}
-		buf.WriteString("\n")
+		e.processKnowledgeCheckItem(buf, item, headingPrefix)
 	case "interactive":
-		buf.WriteString(fmt.Sprintf("%s Interactive Content\n\n", headingPrefix))
-		for _, subItem := range item.Items {
-			if subItem.Title != "" {
-				title := e.htmlCleaner.CleanHTML(subItem.Title)
-				buf.WriteString(fmt.Sprintf("**%s**\n\n", title))
-			}
-		}
-
-	case "divider":
-		buf.WriteString("---\n\n")
-
-	default:
-		// Handle unknown types
-		if len(item.Items) > 0 {
-			buf.WriteString(fmt.Sprintf("%s %s Content\n\n", headingPrefix, strings.Title(item.Type)))
-			for _, subItem := range item.Items {
-				if subItem.Title != "" {
-					title := e.htmlCleaner.CleanHTML(subItem.Title)
-					buf.WriteString(fmt.Sprintf("**%s**\n\n", title))
-				}
-				if subItem.Paragraph != "" {
-					paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
-					buf.WriteString(fmt.Sprintf("%s\n\n", paragraph))
-				}
-			}
-		}
+		e.processInteractiveItem(buf, item, headingPrefix)
+	case "divider":
+		e.processDividerItem(buf)
+	default:
+		e.processUnknownItem(buf, item, headingPrefix)
 	}
 }
+
+// processTextItem handles text content with headings and paragraphs.
+func (e *MarkdownExporter) processTextItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	for _, subItem := range item.Items {
+		if subItem.Heading != "" {
+			heading := e.htmlCleaner.CleanHTML(subItem.Heading)
+			if heading != "" {
+				fmt.Fprintf(buf, "%s %s\n\n", headingPrefix, heading)
+			}
+		}
+		if subItem.Paragraph != "" {
+			paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
+			if paragraph != "" {
+				fmt.Fprintf(buf, "%s\n\n", paragraph)
+			}
+		}
+	}
+}
+
+// processListItem handles list items with bullet points.
+func (e *MarkdownExporter) processListItem(buf *bytes.Buffer, item models.Item) {
+	for _, subItem := range item.Items {
+		if subItem.Paragraph != "" {
+			paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
+			if paragraph != "" {
+				fmt.Fprintf(buf, "- %s\n", paragraph)
+			}
+		}
+	}
+	buf.WriteString("\n")
+}
+
+// processMultimediaItem handles multimedia content including videos and images.
+func (e *MarkdownExporter) processMultimediaItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	fmt.Fprintf(buf, "%s Media Content\n\n", headingPrefix)
+	for _, subItem := range item.Items {
+		e.processMediaSubItem(buf, subItem)
+	}
+	buf.WriteString("\n")
+}
+
+// processMediaSubItem processes individual media items (video/image).
+func (e *MarkdownExporter) processMediaSubItem(buf *bytes.Buffer, subItem models.SubItem) {
+	if subItem.Media != nil {
+		e.processVideoMedia(buf, subItem.Media)
+		e.processImageMedia(buf, subItem.Media)
+	}
+	if subItem.Caption != "" {
+		caption := e.htmlCleaner.CleanHTML(subItem.Caption)
+		fmt.Fprintf(buf, "*%s*\n", caption)
+	}
+}
+
+// processVideoMedia processes video media content.
+func (e *MarkdownExporter) processVideoMedia(buf *bytes.Buffer, media *models.Media) {
+	if media.Video != nil {
+		fmt.Fprintf(buf, "**Video**: %s\n", media.Video.OriginalURL)
+		if media.Video.Duration > 0 {
+			fmt.Fprintf(buf, "**Duration**: %d seconds\n", media.Video.Duration)
+		}
+	}
+}
+
+// processImageMedia processes image media content.
+func (e *MarkdownExporter) processImageMedia(buf *bytes.Buffer, media *models.Media) {
+	if media.Image != nil {
+		fmt.Fprintf(buf, "**Image**: %s\n", media.Image.OriginalURL)
+	}
+}
+
+// processImageItem handles standalone image items.
+func (e *MarkdownExporter) processImageItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	fmt.Fprintf(buf, "%s Image\n\n", headingPrefix)
+	for _, subItem := range item.Items {
+		if subItem.Media != nil && subItem.Media.Image != nil {
+			fmt.Fprintf(buf, "**Image**: %s\n", subItem.Media.Image.OriginalURL)
+		}
+		if subItem.Caption != "" {
+			caption := e.htmlCleaner.CleanHTML(subItem.Caption)
+			fmt.Fprintf(buf, "*%s*\n", caption)
+		}
+	}
+	buf.WriteString("\n")
+}
+
+// processKnowledgeCheckItem handles quiz questions and knowledge checks.
+func (e *MarkdownExporter) processKnowledgeCheckItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	fmt.Fprintf(buf, "%s Knowledge Check\n\n", headingPrefix)
+	for _, subItem := range item.Items {
+		e.processQuestionSubItem(buf, subItem)
+	}
+	buf.WriteString("\n")
+}
+
+// processQuestionSubItem processes individual question items.
+func (e *MarkdownExporter) processQuestionSubItem(buf *bytes.Buffer, subItem models.SubItem) {
+	if subItem.Title != "" {
+		title := e.htmlCleaner.CleanHTML(subItem.Title)
+		fmt.Fprintf(buf, "**Question**: %s\n\n", title)
+	}
+	e.processAnswers(buf, subItem.Answers)
+	if subItem.Feedback != "" {
+		feedback := e.htmlCleaner.CleanHTML(subItem.Feedback)
+		fmt.Fprintf(buf, "\n**Feedback**: %s\n", feedback)
+	}
+}
+
+// processAnswers processes answer choices for quiz questions.
+func (e *MarkdownExporter) processAnswers(buf *bytes.Buffer, answers []models.Answer) {
+	buf.WriteString("**Answers**:\n")
+	for i, answer := range answers {
+		correctMark := ""
+		if answer.Correct {
+			correctMark = " ✓"
+		}
+		fmt.Fprintf(buf, "%d. %s%s\n", i+1, answer.Title, correctMark)
+	}
+}
+
+// processInteractiveItem handles interactive content.
+func (e *MarkdownExporter) processInteractiveItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	fmt.Fprintf(buf, "%s Interactive Content\n\n", headingPrefix)
+	for _, subItem := range item.Items {
+		if subItem.Title != "" {
+			title := e.htmlCleaner.CleanHTML(subItem.Title)
+			fmt.Fprintf(buf, "**%s**\n\n", title)
+		}
+	}
+}
+
+// processDividerItem handles divider elements.
+func (e *MarkdownExporter) processDividerItem(buf *bytes.Buffer) {
+	buf.WriteString("---\n\n")
+}
+
+// processUnknownItem handles unknown or unsupported item types.
+func (e *MarkdownExporter) processUnknownItem(buf *bytes.Buffer, item models.Item, headingPrefix string) {
+	if len(item.Items) > 0 {
+		caser := cases.Title(language.English)
+		fmt.Fprintf(buf, "%s %s Content\n\n", headingPrefix, caser.String(item.Type))
+		for _, subItem := range item.Items {
+			e.processGenericSubItem(buf, subItem)
+		}
+	}
+}
+
+// processGenericSubItem processes sub-items for unknown types.
+func (e *MarkdownExporter) processGenericSubItem(buf *bytes.Buffer, subItem models.SubItem) {
+	if subItem.Title != "" {
+		title := e.htmlCleaner.CleanHTML(subItem.Title)
+		fmt.Fprintf(buf, "**%s**\n\n", title)
+	}
+	if subItem.Paragraph != "" {
+		paragraph := e.htmlCleaner.CleanHTML(subItem.Paragraph)
+		fmt.Fprintf(buf, "%s\n\n", paragraph)
+	}
+}
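The lessonCounter change above means section entries emit a top-level heading but no longer consume a lesson number. A sketch of that behaviour, written in the style of the test suite that follows (the test name and placement are illustrative):

// Illustrative only: a lesson that follows a section is still numbered 1.
func TestMarkdownExporter_SectionDoesNotConsumeLessonNumber(t *testing.T) {
	exporter := NewMarkdownExporter(services.NewHTMLCleaner())
	course := &models.Course{
		Course: models.CourseInfo{
			Title: "Numbering",
			Lessons: []models.Lesson{
				{ID: "s1", Title: "Intro Section", Type: "section"},
				{ID: "l1", Title: "First Lesson", Type: "lesson"},
			},
		},
	}
	outputPath := filepath.Join(t.TempDir(), "numbering.md")
	if err := exporter.Export(course, outputPath); err != nil {
		t.Fatalf("Export failed: %v", err)
	}
	content, err := os.ReadFile(outputPath)
	if err != nil {
		t.Fatalf("Failed to read output file: %v", err)
	}
	if !strings.Contains(string(content), "## Lesson 1: First Lesson") {
		t.Error("Lesson following a section should still be numbered 1")
	}
}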

View File

@@ -0,0 +1,692 @@
package exporters
import (
"bytes"
"os"
"path/filepath"
"strings"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
"github.com/kjanat/articulate-parser/internal/services"
)
// TestNewMarkdownExporter tests the NewMarkdownExporter constructor.
func TestNewMarkdownExporter(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
if exporter == nil {
t.Fatal("NewMarkdownExporter() returned nil")
}
// Type assertion to check internal structure
markdownExporter, ok := exporter.(*MarkdownExporter)
if !ok {
t.Fatal("NewMarkdownExporter() returned wrong type")
}
if markdownExporter.htmlCleaner == nil {
t.Error("htmlCleaner should not be nil")
}
}
// TestMarkdownExporter_SupportedFormat tests the SupportedFormat method.
func TestMarkdownExporter_SupportedFormat(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
expected := "markdown"
result := exporter.SupportedFormat()
if result != expected {
t.Errorf("Expected format '%s', got '%s'", expected, result)
}
}
// TestMarkdownExporter_Export tests the Export method.
func TestMarkdownExporter_Export(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
// Create test course
testCourse := createTestCourseForMarkdown()
// Create temporary directory and file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "test-course.md")
// Test successful export
err := exporter.Export(testCourse, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Check that file was created
if _, err := os.Stat(outputPath); os.IsNotExist(err) {
t.Fatal("Output file was not created")
}
// Read and verify content
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// Verify main course title
if !strings.Contains(contentStr, "# Test Course") {
t.Error("Output should contain course title as main heading")
}
// Verify course information section
if !strings.Contains(contentStr, "## Course Information") {
t.Error("Output should contain course information section")
}
// Verify course metadata
if !strings.Contains(contentStr, "- **Course ID**: test-course-id") {
t.Error("Output should contain course ID")
}
if !strings.Contains(contentStr, "- **Share ID**: test-share-id") {
t.Error("Output should contain share ID")
}
// Verify lesson content
if !strings.Contains(contentStr, "## Lesson 1: Test Lesson") {
t.Error("Output should contain lesson heading")
}
// Verify section handling
if !strings.Contains(contentStr, "# Test Section") {
t.Error("Output should contain section as main heading")
}
}
// TestMarkdownExporter_Export_InvalidPath tests export with invalid output path.
func TestMarkdownExporter_Export_InvalidPath(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
testCourse := createTestCourseForMarkdown()
// Try to write to invalid path
invalidPath := "/invalid/path/that/does/not/exist/file.md"
err := exporter.Export(testCourse, invalidPath)
if err == nil {
t.Fatal("Expected error for invalid path, got nil")
}
}
// TestMarkdownExporter_ProcessTextItem tests the processTextItem method.
func TestMarkdownExporter_ProcessTextItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Test Heading</h1>",
Paragraph: "<p>Test paragraph with <strong>bold</strong> text.</p>",
},
{
Paragraph: "<p>Another paragraph.</p>",
},
},
}
exporter.processTextItem(&buf, item, "###")
result := buf.String()
expected := "### Test Heading\n\nTest paragraph with bold text.\n\nAnother paragraph.\n\n"
if result != expected {
t.Errorf("Expected:\n%q\nGot:\n%q", expected, result)
}
}
// TestMarkdownExporter_ProcessListItem tests the processListItem method.
func TestMarkdownExporter_ProcessListItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First item</p>"},
{Paragraph: "<p>Second item with <em>emphasis</em></p>"},
{Paragraph: "<p>Third item</p>"},
},
}
exporter.processListItem(&buf, item)
result := buf.String()
expected := "- First item\n- Second item with emphasis\n- Third item\n\n"
if result != expected {
t.Errorf("Expected:\n%q\nGot:\n%q", expected, result)
}
}
// TestMarkdownExporter_ProcessMultimediaItem tests the processMultimediaItem method.
func TestMarkdownExporter_ProcessMultimediaItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "multimedia",
Items: []models.SubItem{
{
Media: &models.Media{
Video: &models.VideoMedia{
OriginalURL: "https://example.com/video.mp4",
Duration: 120,
},
},
Caption: "<p>Video caption</p>",
},
},
}
exporter.processMultimediaItem(&buf, item, "###")
result := buf.String()
if !strings.Contains(result, "### Media Content") {
t.Error("Should contain media content heading")
}
if !strings.Contains(result, "**Video**: https://example.com/video.mp4") {
t.Error("Should contain video URL")
}
if !strings.Contains(result, "**Duration**: 120 seconds") {
t.Error("Should contain video duration")
}
if !strings.Contains(result, "*Video caption*") {
t.Error("Should contain video caption")
}
}
// TestMarkdownExporter_ProcessImageItem tests the processImageItem method.
func TestMarkdownExporter_ProcessImageItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "image",
Items: []models.SubItem{
{
Media: &models.Media{
Image: &models.ImageMedia{
OriginalURL: "https://example.com/image.jpg",
},
},
Caption: "<p>Image caption</p>",
},
},
}
exporter.processImageItem(&buf, item, "###")
result := buf.String()
if !strings.Contains(result, "### Image") {
t.Error("Should contain image heading")
}
if !strings.Contains(result, "**Image**: https://example.com/image.jpg") {
t.Error("Should contain image URL")
}
if !strings.Contains(result, "*Image caption*") {
t.Error("Should contain image caption")
}
}
// TestMarkdownExporter_ProcessKnowledgeCheckItem tests the processKnowledgeCheckItem method.
func TestMarkdownExporter_ProcessKnowledgeCheckItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What is the capital of France?</p>",
Answers: []models.Answer{
{Title: "London", Correct: false},
{Title: "Paris", Correct: true},
{Title: "Berlin", Correct: false},
},
Feedback: "<p>Paris is the capital of France.</p>",
},
},
}
exporter.processKnowledgeCheckItem(&buf, item, "###")
result := buf.String()
if !strings.Contains(result, "### Knowledge Check") {
t.Error("Should contain knowledge check heading")
}
if !strings.Contains(result, "**Question**: What is the capital of France?") {
t.Error("Should contain question")
}
if !strings.Contains(result, "**Answers**:") {
t.Error("Should contain answers heading")
}
if !strings.Contains(result, "2. Paris ✓") {
t.Error("Should mark correct answer")
}
if !strings.Contains(result, "**Feedback**: Paris is the capital of France.") {
t.Error("Should contain feedback")
}
}
// TestMarkdownExporter_ProcessInteractiveItem tests the processInteractiveItem method.
func TestMarkdownExporter_ProcessInteractiveItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "interactive",
Items: []models.SubItem{
{Title: "<p>Interactive element title</p>"},
},
}
exporter.processInteractiveItem(&buf, item, "###")
result := buf.String()
if !strings.Contains(result, "### Interactive Content") {
t.Error("Should contain interactive content heading")
}
if !strings.Contains(result, "**Interactive element title**") {
t.Error("Should contain interactive element title")
}
}
// TestMarkdownExporter_ProcessDividerItem tests the processDividerItem method.
func TestMarkdownExporter_ProcessDividerItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
exporter.processDividerItem(&buf)
result := buf.String()
expected := "---\n\n"
if result != expected {
t.Errorf("Expected %q, got %q", expected, result)
}
}
// TestMarkdownExporter_ProcessUnknownItem tests the processUnknownItem method.
func TestMarkdownExporter_ProcessUnknownItem(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
item := models.Item{
Type: "unknown",
Items: []models.SubItem{
{
Title: "<p>Unknown item title</p>",
Paragraph: "<p>Unknown item content</p>",
},
},
}
exporter.processUnknownItem(&buf, item, "###")
result := buf.String()
if !strings.Contains(result, "### Unknown Content") {
t.Error("Should contain unknown content heading")
}
if !strings.Contains(result, "**Unknown item title**") {
t.Error("Should contain unknown item title")
}
if !strings.Contains(result, "Unknown item content") {
t.Error("Should contain unknown item content")
}
}
// TestMarkdownExporter_ProcessVideoMedia tests the processVideoMedia method.
func TestMarkdownExporter_ProcessVideoMedia(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
media := &models.Media{
Video: &models.VideoMedia{
OriginalURL: "https://example.com/video.mp4",
Duration: 300,
},
}
exporter.processVideoMedia(&buf, media)
result := buf.String()
if !strings.Contains(result, "**Video**: https://example.com/video.mp4") {
t.Error("Should contain video URL")
}
if !strings.Contains(result, "**Duration**: 300 seconds") {
t.Error("Should contain video duration")
}
}
// TestMarkdownExporter_ProcessImageMedia tests the processImageMedia method.
func TestMarkdownExporter_ProcessImageMedia(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
media := &models.Media{
Image: &models.ImageMedia{
OriginalURL: "https://example.com/image.jpg",
},
}
exporter.processImageMedia(&buf, media)
result := buf.String()
expected := "**Image**: https://example.com/image.jpg\n"
if result != expected {
t.Errorf("Expected %q, got %q", expected, result)
}
}
// TestMarkdownExporter_ProcessAnswers tests the processAnswers method.
func TestMarkdownExporter_ProcessAnswers(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
var buf bytes.Buffer
answers := []models.Answer{
{Title: "Answer 1", Correct: false},
{Title: "Answer 2", Correct: true},
{Title: "Answer 3", Correct: false},
}
exporter.processAnswers(&buf, answers)
result := buf.String()
if !strings.Contains(result, "**Answers**:") {
t.Error("Should contain answers heading")
}
if !strings.Contains(result, "1. Answer 1") {
t.Error("Should contain first answer")
}
if !strings.Contains(result, "2. Answer 2 ✓") {
t.Error("Should mark correct answer")
}
if !strings.Contains(result, "3. Answer 3") {
t.Error("Should contain third answer")
}
}
// TestMarkdownExporter_ProcessItemToMarkdown_AllTypes tests all item types.
func TestMarkdownExporter_ProcessItemToMarkdown_AllTypes(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
tests := []struct {
name string
itemType string
expectedText string
}{
{
name: "text item",
itemType: "text",
expectedText: "", // processTextItem handles empty items
},
{
name: "list item",
itemType: "list",
expectedText: "\n", // Empty list adds newline
},
{
name: "multimedia item",
itemType: "multimedia",
expectedText: "### Media Content",
},
{
name: "image item",
itemType: "image",
expectedText: "### Image",
},
{
name: "knowledgeCheck item",
itemType: "knowledgeCheck",
expectedText: "### Knowledge Check",
},
{
name: "interactive item",
itemType: "interactive",
expectedText: "### Interactive Content",
},
{
name: "divider item",
itemType: "divider",
expectedText: "---",
},
{
name: "unknown item",
itemType: "unknown",
expectedText: "", // Empty unknown items don't add content
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
var buf bytes.Buffer
item := models.Item{Type: tt.itemType}
exporter.processItemToMarkdown(&buf, item, 3)
result := buf.String()
if tt.expectedText != "" && !strings.Contains(result, tt.expectedText) {
t.Errorf("Expected result to contain %q, got %q", tt.expectedText, result)
}
})
}
}
// TestMarkdownExporter_ComplexCourse tests export of a complex course structure.
func TestMarkdownExporter_ComplexCourse(t *testing.T) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
// Create complex test course
course := &models.Course{
ShareID: "complex-test-id",
Author: "Test Author",
Course: models.CourseInfo{
ID: "complex-course",
Title: "Complex Test Course",
Description: "<p>This is a <strong>complex</strong> course description.</p>",
NavigationMode: "menu",
ExportSettings: &models.ExportSettings{
Format: "scorm",
},
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Course Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Introduction Lesson",
Type: "lesson",
Description: "<p>Introduction to the course</p>",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h2>Welcome</h2>",
Paragraph: "<p>Welcome to our course!</p>",
},
},
},
{
Type: "list",
Items: []models.SubItem{
{Paragraph: "<p>First objective</p>"},
{Paragraph: "<p>Second objective</p>"},
},
},
{
Type: "knowledgeCheck",
Items: []models.SubItem{
{
Title: "<p>What will you learn?</p>",
Answers: []models.Answer{
{Title: "Nothing", Correct: false},
{Title: "Everything", Correct: true},
},
Feedback: "<p>Great choice!</p>",
},
},
},
},
},
},
},
}
// Create temporary output file
tempDir := t.TempDir()
outputPath := filepath.Join(tempDir, "complex-course.md")
// Export course
err := exporter.Export(course, outputPath)
if err != nil {
t.Fatalf("Export failed: %v", err)
}
// Read and verify content
content, err := os.ReadFile(outputPath)
if err != nil {
t.Fatalf("Failed to read output file: %v", err)
}
contentStr := string(content)
// Verify various elements are present
checks := []string{
"# Complex Test Course",
"This is a complex course description.",
"- **Export Format**: scorm",
"# Course Section",
"## Lesson 1: Introduction Lesson",
"Introduction to the course",
"### Welcome",
"Welcome to our course!",
"- First objective",
"- Second objective",
"### Knowledge Check",
"**Question**: What will you learn?",
"2. Everything ✓",
"**Feedback**: Great choice!",
}
for _, check := range checks {
if !strings.Contains(contentStr, check) {
t.Errorf("Output should contain: %q", check)
}
}
}
// createTestCourseForMarkdown creates a test course for markdown export testing.
func createTestCourseForMarkdown() *models.Course {
return &models.Course{
ShareID: "test-share-id",
Author: "Test Author",
Course: models.CourseInfo{
ID: "test-course-id",
Title: "Test Course",
Description: "Test course description",
NavigationMode: "menu",
Lessons: []models.Lesson{
{
ID: "section-1",
Title: "Test Section",
Type: "section",
},
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Test Heading",
Paragraph: "Test paragraph content",
},
},
},
},
},
},
},
}
}
// BenchmarkMarkdownExporter_Export benchmarks the Export method.
func BenchmarkMarkdownExporter_Export(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := NewMarkdownExporter(htmlCleaner)
course := createTestCourseForMarkdown()
// Create temporary directory
tempDir := b.TempDir()
for b.Loop() {
outputPath := filepath.Join(tempDir, "benchmark-course.md")
_ = exporter.Export(course, outputPath)
// Clean up for next iteration. Remove errors are ignored because we've already
// benchmarked the export operation; cleanup failures don't affect the benchmark
// measurements or the validity of the next iteration's export.
_ = os.Remove(outputPath)
}
}
// BenchmarkMarkdownExporter_ProcessTextItem benchmarks the processTextItem method.
func BenchmarkMarkdownExporter_ProcessTextItem(b *testing.B) {
htmlCleaner := services.NewHTMLCleaner()
exporter := &MarkdownExporter{htmlCleaner: htmlCleaner}
item := models.Item{
Type: "text",
Items: []models.SubItem{
{
Heading: "<h1>Benchmark Heading</h1>",
Paragraph: "<p>Benchmark paragraph with <strong>bold</strong> text.</p>",
},
},
}
for b.Loop() {
var buf bytes.Buffer
exporter.processTextItem(&buf, item, "###")
}
}

Binary file not shown.

View File

@@ -0,0 +1,12 @@
# Example Course
Course description
## Course Information
- **Course ID**:
- **Share ID**: example-id
- **Navigation Mode**:
---

View File

@@ -1,5 +1,3 @@
-// Package interfaces provides the core contracts for the articulate-parser application.
-// It defines interfaces for parsing and exporting Articulate Rise courses.
 package interfaces
 
 import "github.com/kjanat/articulate-parser/internal/models"
@@ -12,9 +10,9 @@ type Exporter interface {
 	// specified output path. It returns an error if the export operation fails.
 	Export(course *models.Course, outputPath string) error
 
-	// GetSupportedFormat returns the name of the format this exporter supports.
+	// SupportedFormat returns the name of the format this exporter supports.
 	// This is used to identify which exporter to use for a given format.
-	GetSupportedFormat() string
+	SupportedFormat() string
 }
 
 // ExporterFactory creates exporters for different formats.
@@ -25,7 +23,7 @@ type ExporterFactory interface {
 	// It returns the appropriate exporter or an error if the format is not supported.
 	CreateExporter(format string) (Exporter, error)
 
-	// GetSupportedFormats returns a list of all export formats supported by this factory.
+	// SupportedFormats returns a list of all export formats supported by this factory.
 	// This is used to inform users of available export options.
-	GetSupportedFormats() []string
+	SupportedFormats() []string
 }
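To illustrate the contract after the Get- prefix removal, a minimal sketch of a custom exporter; the type and the "txt" format are hypothetical and shown as if they lived in the exporters package:

// Hypothetical plain-text exporter satisfying interfaces.Exporter.
package exporters

import (
	"fmt"
	"os"

	"github.com/kjanat/articulate-parser/internal/interfaces"
	"github.com/kjanat/articulate-parser/internal/models"
)

type PlainTextExporter struct{}

// Compile-time check that the interface is satisfied.
var _ interfaces.Exporter = (*PlainTextExporter)(nil)

// Export writes only the course title to the output path.
func (e *PlainTextExporter) Export(course *models.Course, outputPath string) error {
	content := fmt.Sprintf("%s\n", course.Course.Title)
	if err := os.WriteFile(outputPath, []byte(content), 0o644); err != nil {
		return fmt.Errorf("failed to write plain-text file: %w", err)
	}
	return nil
}

// SupportedFormat identifies this exporter to the factory.
func (e *PlainTextExporter) SupportedFormat() string {
	return "txt"
}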

View File

@@ -0,0 +1,25 @@
package interfaces
import "context"
// Logger defines the interface for structured logging.
// Implementations should provide leveled, structured logging capabilities.
type Logger interface {
// Debug logs a debug-level message with optional key-value pairs.
Debug(msg string, keysAndValues ...any)
// Info logs an info-level message with optional key-value pairs.
Info(msg string, keysAndValues ...any)
// Warn logs a warning-level message with optional key-value pairs.
Warn(msg string, keysAndValues ...any)
// Error logs an error-level message with optional key-value pairs.
Error(msg string, keysAndValues ...any)
// With returns a new logger with the given key-value pairs added as context.
With(keysAndValues ...any) Logger
// WithContext returns a new logger with context information.
WithContext(ctx context.Context) Logger
}
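A minimal sketch of satisfying this interface with the standard library's log/slog; the package and type names are illustrative, and WithContext is kept as a pass-through:

// Hypothetical slog-backed implementation of interfaces.Logger.
package logging

import (
	"context"
	"log/slog"

	"github.com/kjanat/articulate-parser/internal/interfaces"
)

type SlogLogger struct {
	l *slog.Logger
}

func NewSlogLogger(l *slog.Logger) interfaces.Logger { return &SlogLogger{l: l} }

func (s *SlogLogger) Debug(msg string, keysAndValues ...any) { s.l.Debug(msg, keysAndValues...) }
func (s *SlogLogger) Info(msg string, keysAndValues ...any)  { s.l.Info(msg, keysAndValues...) }
func (s *SlogLogger) Warn(msg string, keysAndValues ...any)  { s.l.Warn(msg, keysAndValues...) }
func (s *SlogLogger) Error(msg string, keysAndValues ...any) { s.l.Error(msg, keysAndValues...) }

// With attaches key-value context to a copy of the logger.
func (s *SlogLogger) With(keysAndValues ...any) interfaces.Logger {
	return &SlogLogger{l: s.l.With(keysAndValues...)}
}

// WithContext is a pass-through here; a real implementation could extract
// request-scoped values (e.g. a trace ID) from ctx and add them via With.
func (s *SlogLogger) WithContext(ctx context.Context) interfaces.Logger {
	return s
}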

View File

@@ -2,7 +2,11 @@
 // It defines interfaces for parsing and exporting Articulate Rise courses.
 package interfaces
 
-import "github.com/kjanat/articulate-parser/internal/models"
+import (
+	"context"
+
+	"github.com/kjanat/articulate-parser/internal/models"
+)
 
 // CourseParser defines the interface for loading course data.
 // It provides methods to fetch course content either from a remote URI
@@ -10,8 +14,9 @@
 type CourseParser interface {
 	// FetchCourse loads a course from a URI (typically an Articulate Rise share URL).
 	// It retrieves the course data from the remote location and returns a parsed Course model.
+	// The context can be used for cancellation and timeout control.
 	// Returns an error if the fetch operation fails or if the data cannot be parsed.
-	FetchCourse(uri string) (*models.Course, error)
+	FetchCourse(ctx context.Context, uri string) (*models.Course, error)
 
 	// LoadCourseFromFile loads a course from a local file.
 	// It reads and parses the course data from the specified file path.

View File

@@ -1,5 +1,3 @@
-// Package models defines the data structures representing Articulate Rise courses.
-// These structures closely match the JSON format used by Articulate Rise.
 package models
 
 // Lesson represents a single lesson or section within an Articulate Rise course.
@@ -18,7 +16,7 @@ type Lesson struct {
 	// Items is an ordered array of content items within the lesson
 	Items []Item `json:"items"`
 	// Position stores the ordering information for the lesson
-	Position interface{} `json:"position"`
+	Position any `json:"position"`
 	// Ready indicates whether the lesson is marked as complete
 	Ready bool `json:"ready"`
 	// CreatedAt is the timestamp when the lesson was created
@@ -41,9 +39,9 @@ type Item struct {
 	// Items contains the actual content elements (sub-items) of this item
 	Items []SubItem `json:"items"`
 	// Settings contains configuration options specific to this item type
-	Settings interface{} `json:"settings"`
+	Settings any `json:"settings"`
 	// Data contains additional structured data for the item
-	Data interface{} `json:"data"`
+	Data any `json:"data"`
 	// Media contains any associated media for the item
 	Media *Media `json:"media,omitempty"`
 }

View File

@@ -1,5 +1,3 @@
-// Package models defines the data structures representing Articulate Rise courses.
-// These structures closely match the JSON format used by Articulate Rise.
 package models
 
 // Media represents a media element that can be either an image or a video.
@@ -23,8 +21,8 @@ type ImageMedia struct {
 	Height int `json:"height,omitempty"`
 	// CrushedKey is the identifier for a compressed version of the image
 	CrushedKey string `json:"crushedKey,omitempty"`
-	// OriginalUrl is the URL to the full-resolution image
-	OriginalUrl string `json:"originalUrl"`
+	// OriginalURL is the URL to the full-resolution image
+	OriginalURL string `json:"originalUrl"`
 	// UseCrushedKey indicates whether to use the compressed version
 	UseCrushedKey bool `json:"useCrushedKey,omitempty"`
 }
@@ -45,6 +43,6 @@ type VideoMedia struct {
 	InputKey string `json:"inputKey,omitempty"`
 	// Thumbnail is the URL to a smaller preview image
 	Thumbnail string `json:"thumbnail,omitempty"`
-	// OriginalUrl is the URL to the source video file
-	OriginalUrl string `json:"originalUrl"`
+	// OriginalURL is the URL to the source video file
+	OriginalURL string `json:"originalUrl"`
 }
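The rename above deliberately keeps the json:"originalUrl" tag, so existing Rise payloads still decode into the renamed fields. A quick illustrative round-trip check (test name and placement are hypothetical):

package models_test

import (
	"encoding/json"
	"testing"

	"github.com/kjanat/articulate-parser/internal/models"
)

// Illustrative only: the wire format is unchanged by the Go-side rename.
func TestImageMedia_JSONTagUnchangedByRename(t *testing.T) {
	var img models.ImageMedia
	payload := []byte(`{"originalUrl":"https://example.com/a.jpg"}`)
	if err := json.Unmarshal(payload, &img); err != nil {
		t.Fatalf("unmarshal failed: %v", err)
	}
	if img.OriginalURL != "https://example.com/a.jpg" {
		t.Errorf("expected originalUrl to populate OriginalURL, got %q", img.OriginalURL)
	}
}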

View File

@@ -0,0 +1,787 @@
package models
import (
"encoding/json"
"reflect"
"testing"
)
// TestCourse_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of Course.
func TestCourse_JSONMarshalUnmarshal(t *testing.T) {
original := Course{
ShareID: "test-share-id",
Author: "Test Author",
Course: CourseInfo{
ID: "course-123",
Title: "Test Course",
Description: "A test course description",
Color: "#FF5733",
NavigationMode: "menu",
Lessons: []Lesson{
{
ID: "lesson-1",
Title: "First Lesson",
Description: "Lesson description",
Type: "lesson",
Icon: "icon-1",
Ready: true,
CreatedAt: "2023-01-01T00:00:00Z",
UpdatedAt: "2023-01-02T00:00:00Z",
},
},
ExportSettings: &ExportSettings{
Title: "Export Title",
Format: "scorm",
},
},
LabelSet: LabelSet{
ID: "labelset-1",
Name: "Test Labels",
},
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal Course to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled Course
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal Course from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled Course structs do not match")
t.Logf("Original: %+v", original)
t.Logf("Unmarshaled: %+v", unmarshaled)
}
}
// TestCourseInfo_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of CourseInfo.
func TestCourseInfo_JSONMarshalUnmarshal(t *testing.T) {
original := CourseInfo{
ID: "course-456",
Title: "Another Test Course",
Description: "Another test description",
Color: "#33FF57",
NavigationMode: "linear",
Lessons: []Lesson{
{
ID: "lesson-2",
Title: "Second Lesson",
Type: "section",
Items: []Item{
{
ID: "item-1",
Type: "text",
Family: "text",
Variant: "paragraph",
Items: []SubItem{
{
Title: "Sub Item Title",
Heading: "Sub Item Heading",
Paragraph: "Sub item paragraph content",
},
},
},
},
},
},
CoverImage: &Media{
Image: &ImageMedia{
Key: "img-123",
Type: "jpg",
Width: 800,
Height: 600,
OriginalURL: "https://example.com/image.jpg",
},
},
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal CourseInfo to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled CourseInfo
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal CourseInfo from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled CourseInfo structs do not match")
}
}
// TestLesson_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of Lesson.
func TestLesson_JSONMarshalUnmarshal(t *testing.T) {
original := Lesson{
ID: "lesson-test",
Title: "Test Lesson",
Description: "Test lesson description",
Type: "lesson",
Icon: "lesson-icon",
Ready: true,
CreatedAt: "2023-06-01T12:00:00Z",
UpdatedAt: "2023-06-01T13:00:00Z",
Position: map[string]any{"x": 1, "y": 2},
Items: []Item{
{
ID: "item-test",
Type: "multimedia",
Family: "media",
Variant: "video",
Items: []SubItem{
{
Caption: "Video caption",
Media: &Media{
Video: &VideoMedia{
Key: "video-123",
URL: "https://example.com/video.mp4",
Type: "mp4",
Duration: 120,
OriginalURL: "https://example.com/video.mp4",
},
},
},
},
Settings: map[string]any{"autoplay": false},
Data: map[string]any{"metadata": "test"},
},
},
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal Lesson to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled Lesson
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal Lesson from JSON: %v", err)
}
// Compare structures
if !compareLessons(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled Lesson structs do not match")
}
}
// TestItem_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of Item.
func TestItem_JSONMarshalUnmarshal(t *testing.T) {
original := Item{
ID: "item-json-test",
Type: "knowledgeCheck",
Family: "assessment",
Variant: "multipleChoice",
Items: []SubItem{
{
Title: "What is the answer?",
Answers: []Answer{
{Title: "Option A", Correct: false},
{Title: "Option B", Correct: true},
{Title: "Option C", Correct: false},
},
Feedback: "Well done!",
},
},
Settings: map[string]any{
"allowRetry": true,
"showAnswer": true,
},
Data: map[string]any{
"points": 10,
"weight": 1.5,
},
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal Item to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled Item
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal Item from JSON: %v", err)
}
// Compare structures
if !compareItem(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled Item structs do not match")
}
}
// TestSubItem_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of SubItem.
func TestSubItem_JSONMarshalUnmarshal(t *testing.T) {
original := SubItem{
Title: "Test SubItem Title",
Heading: "Test SubItem Heading",
Paragraph: "Test paragraph with content",
Caption: "Test caption",
Feedback: "Test feedback message",
Answers: []Answer{
{Title: "First answer", Correct: true},
{Title: "Second answer", Correct: false},
},
Media: &Media{
Image: &ImageMedia{
Key: "subitem-img",
Type: "png",
Width: 400,
Height: 300,
OriginalURL: "https://example.com/subitem.png",
CrushedKey: "crushed-123",
UseCrushedKey: true,
},
},
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal SubItem to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled SubItem
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal SubItem from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled SubItem structs do not match")
}
}
// TestAnswer_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of Answer.
func TestAnswer_JSONMarshalUnmarshal(t *testing.T) {
original := Answer{
Title: "Test answer text",
Correct: true,
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal Answer to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled Answer
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal Answer from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled Answer structs do not match")
}
}
// TestMedia_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of Media.
func TestMedia_JSONMarshalUnmarshal(t *testing.T) {
// Test with Image
originalImage := Media{
Image: &ImageMedia{
Key: "media-img-test",
Type: "jpeg",
Width: 1200,
Height: 800,
OriginalURL: "https://example.com/media.jpg",
CrushedKey: "crushed-media",
UseCrushedKey: false,
},
}
jsonData, err := json.Marshal(originalImage)
if err != nil {
t.Fatalf("Failed to marshal Media with Image to JSON: %v", err)
}
var unmarshaledImage Media
err = json.Unmarshal(jsonData, &unmarshaledImage)
if err != nil {
t.Fatalf("Failed to unmarshal Media with Image from JSON: %v", err)
}
if !reflect.DeepEqual(originalImage, unmarshaledImage) {
t.Errorf("Marshaled and unmarshaled Media with Image do not match")
}
// Test with Video
originalVideo := Media{
Video: &VideoMedia{
Key: "media-video-test",
URL: "https://example.com/media.mp4",
Type: "mp4",
Duration: 300,
Poster: "https://example.com/poster.jpg",
Thumbnail: "https://example.com/thumb.jpg",
InputKey: "input-123",
OriginalURL: "https://example.com/original.mp4",
},
}
jsonData, err = json.Marshal(originalVideo)
if err != nil {
t.Fatalf("Failed to marshal Media with Video to JSON: %v", err)
}
var unmarshaledVideo Media
err = json.Unmarshal(jsonData, &unmarshaledVideo)
if err != nil {
t.Fatalf("Failed to unmarshal Media with Video from JSON: %v", err)
}
if !reflect.DeepEqual(originalVideo, unmarshaledVideo) {
t.Errorf("Marshaled and unmarshaled Media with Video do not match")
}
}
// TestImageMedia_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of ImageMedia.
func TestImageMedia_JSONMarshalUnmarshal(t *testing.T) {
original := ImageMedia{
Key: "image-media-test",
Type: "gif",
Width: 640,
Height: 480,
OriginalURL: "https://example.com/image.gif",
CrushedKey: "crushed-gif",
UseCrushedKey: true,
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal ImageMedia to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled ImageMedia
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal ImageMedia from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled ImageMedia structs do not match")
}
}
// TestVideoMedia_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of VideoMedia.
func TestVideoMedia_JSONMarshalUnmarshal(t *testing.T) {
original := VideoMedia{
Key: "video-media-test",
URL: "https://example.com/video.webm",
Type: "webm",
Duration: 450,
Poster: "https://example.com/poster.jpg",
Thumbnail: "https://example.com/thumbnail.jpg",
InputKey: "upload-456",
OriginalURL: "https://example.com/original.webm",
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal VideoMedia to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled VideoMedia
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal VideoMedia from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled VideoMedia structs do not match")
}
}
// TestExportSettings_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of ExportSettings.
func TestExportSettings_JSONMarshalUnmarshal(t *testing.T) {
original := ExportSettings{
Title: "Custom Export Title",
Format: "xAPI",
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal ExportSettings to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled ExportSettings
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal ExportSettings from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled ExportSettings structs do not match")
}
}
// TestLabelSet_JSONMarshalUnmarshal tests JSON marshaling and unmarshaling of LabelSet.
func TestLabelSet_JSONMarshalUnmarshal(t *testing.T) {
original := LabelSet{
ID: "labelset-test",
Name: "Test Label Set",
}
// Marshal to JSON
jsonData, err := json.Marshal(original)
if err != nil {
t.Fatalf("Failed to marshal LabelSet to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled LabelSet
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal LabelSet from JSON: %v", err)
}
// Compare structures
if !reflect.DeepEqual(original, unmarshaled) {
t.Errorf("Marshaled and unmarshaled LabelSet structs do not match")
}
}
// TestEmptyStructures tests marshaling and unmarshaling of empty structures.
func TestEmptyStructures(t *testing.T) {
testCases := []struct {
name string
data any
}{
{"Empty Course", Course{}},
{"Empty CourseInfo", CourseInfo{}},
{"Empty Lesson", Lesson{}},
{"Empty Item", Item{}},
{"Empty SubItem", SubItem{}},
{"Empty Answer", Answer{}},
{"Empty Media", Media{}},
{"Empty ImageMedia", ImageMedia{}},
{"Empty VideoMedia", VideoMedia{}},
{"Empty ExportSettings", ExportSettings{}},
{"Empty LabelSet", LabelSet{}},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
// Marshal to JSON
jsonData, err := json.Marshal(tc.data)
if err != nil {
t.Fatalf("Failed to marshal %s to JSON: %v", tc.name, err)
}
// Unmarshal from JSON
result := reflect.New(reflect.TypeOf(tc.data)).Interface()
err = json.Unmarshal(jsonData, result)
if err != nil {
t.Fatalf("Failed to unmarshal %s from JSON: %v", tc.name, err)
}
// Basic validation that no errors occurred
if len(jsonData) == 0 {
t.Errorf("%s should produce some JSON output", tc.name)
}
})
}
}
// TestNilPointerSafety tests that nil pointers in optional fields are handled correctly.
func TestNilPointerSafety(t *testing.T) {
course := Course{
ShareID: "nil-test",
Course: CourseInfo{
ID: "nil-course",
Title: "Nil Pointer Test",
CoverImage: nil, // Test nil pointer
ExportSettings: nil, // Test nil pointer
Lessons: []Lesson{
{
ID: "lesson-nil",
Title: "Lesson with nil media",
Items: []Item{
{
ID: "item-nil",
Type: "text",
Items: []SubItem{
{
Title: "SubItem with nil media",
Media: nil, // Test nil pointer
},
},
Media: nil, // Test nil pointer
},
},
},
},
},
}
// Marshal to JSON
jsonData, err := json.Marshal(course)
if err != nil {
t.Fatalf("Failed to marshal Course with nil pointers to JSON: %v", err)
}
// Unmarshal from JSON
var unmarshaled Course
err = json.Unmarshal(jsonData, &unmarshaled)
if err != nil {
t.Fatalf("Failed to unmarshal Course with nil pointers from JSON: %v", err)
}
// Basic validation
if unmarshaled.ShareID != "nil-test" {
t.Error("ShareID should be preserved")
}
if unmarshaled.Course.Title != "Nil Pointer Test" {
t.Error("Course title should be preserved")
}
}
// TestJSONTagsPresence tests that JSON tags are properly defined.
func TestJSONTagsPresence(t *testing.T) {
// Test that important fields have JSON tags
courseType := reflect.TypeFor[Course]()
if courseType.Kind() == reflect.Struct {
field, found := courseType.FieldByName("ShareID")
if !found {
t.Error("ShareID field not found")
} else {
tag := field.Tag.Get("json")
if tag == "" {
t.Error("ShareID should have json tag")
}
if tag != "shareId" {
t.Errorf("ShareID json tag should be 'shareId', got '%s'", tag)
}
}
}
// Test CourseInfo
courseInfoType := reflect.TypeFor[CourseInfo]()
if courseInfoType.Kind() == reflect.Struct {
field, found := courseInfoType.FieldByName("NavigationMode")
if !found {
t.Error("NavigationMode field not found")
} else {
tag := field.Tag.Get("json")
if tag == "" {
t.Error("NavigationMode should have json tag")
}
}
}
}
// BenchmarkCourse_JSONMarshal benchmarks JSON marshaling of Course.
func BenchmarkCourse_JSONMarshal(b *testing.B) {
course := Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: CourseInfo{
ID: "benchmark-course",
Title: "Benchmark Course",
Lessons: []Lesson{
{
ID: "lesson-1",
Title: "Lesson 1",
Items: []Item{
{
ID: "item-1",
Type: "text",
Items: []SubItem{
{Title: "SubItem 1"},
},
},
},
},
},
},
}
for b.Loop() {
_, _ = json.Marshal(course)
}
}
// BenchmarkCourse_JSONUnmarshal benchmarks JSON unmarshaling of Course.
func BenchmarkCourse_JSONUnmarshal(b *testing.B) {
course := Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: CourseInfo{
ID: "benchmark-course",
Title: "Benchmark Course",
Lessons: []Lesson{
{
ID: "lesson-1",
Title: "Lesson 1",
Items: []Item{
{
ID: "item-1",
Type: "text",
Items: []SubItem{
{Title: "SubItem 1"},
},
},
},
},
},
},
}
jsonData, _ := json.Marshal(course)
for b.Loop() {
var result Course
_ = json.Unmarshal(jsonData, &result)
}
}
// compareMaps compares two any values that should be maps.
func compareMaps(original, unmarshaled any) bool {
origMap, origOk := original.(map[string]any)
unMap, unOk := unmarshaled.(map[string]any)
if !origOk || !unOk {
// If not maps, use deep equal
return reflect.DeepEqual(original, unmarshaled)
}
if len(origMap) != len(unMap) {
return false
}
for key, origVal := range origMap {
unVal, exists := unMap[key]
if !exists {
return false
}
// Handle numeric type conversion from JSON
switch origVal := origVal.(type) {
case int:
if unFloat, ok := unVal.(float64); ok {
if float64(origVal) != unFloat {
return false
}
} else {
return false
}
case float64:
if unFloat, ok := unVal.(float64); ok {
if origVal != unFloat {
return false
}
} else {
return false
}
default:
if !reflect.DeepEqual(origVal, unVal) {
return false
}
}
}
return true
}
// compareLessons compares two Lesson structs accounting for JSON type conversion.
func compareLessons(original, unmarshaled Lesson) bool {
// Compare all fields except Position and Items
if original.ID != unmarshaled.ID ||
original.Title != unmarshaled.Title ||
original.Description != unmarshaled.Description ||
original.Type != unmarshaled.Type ||
original.Icon != unmarshaled.Icon ||
original.Ready != unmarshaled.Ready ||
original.CreatedAt != unmarshaled.CreatedAt ||
original.UpdatedAt != unmarshaled.UpdatedAt {
return false
}
// Compare Position
if !compareMaps(original.Position, unmarshaled.Position) {
return false
}
// Compare Items
return compareItems(original.Items, unmarshaled.Items)
}
// compareItems compares two Item slices accounting for JSON type conversion.
func compareItems(original, unmarshaled []Item) bool {
if len(original) != len(unmarshaled) {
return false
}
for i := range original {
if !compareItem(original[i], unmarshaled[i]) {
return false
}
}
return true
}
// compareItem compares two Item structs accounting for JSON type conversion.
func compareItem(original, unmarshaled Item) bool {
// Compare basic fields
if original.ID != unmarshaled.ID ||
original.Type != unmarshaled.Type ||
original.Family != unmarshaled.Family ||
original.Variant != unmarshaled.Variant {
return false
}
// Compare Settings and Data
if !compareMaps(original.Settings, unmarshaled.Settings) {
return false
}
if !compareMaps(original.Data, unmarshaled.Data) {
return false
}
// Compare Items (SubItems)
if len(original.Items) != len(unmarshaled.Items) {
return false
}
for i := range original.Items {
if !reflect.DeepEqual(original.Items[i], unmarshaled.Items[i]) {
return false
}
}
// Compare Media
if !reflect.DeepEqual(original.Media, unmarshaled.Media) {
return false
}
return true
}
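The compareMaps helper above exists because encoding/json decodes every JSON number in a map[string]any as float64, so an int stored before marshaling comes back as a float after unmarshaling. A self-contained illustration (not part of the test file):

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	in := map[string]any{"x": 1} // int going in
	data, _ := json.Marshal(in)

	var out map[string]any
	_ = json.Unmarshal(data, &out)

	fmt.Printf("%T\n", out["x"]) // prints float64, not int
}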

View File

@ -3,6 +3,7 @@
 package services

 import (
+	"context"
 	"fmt"

 	"github.com/kjanat/articulate-parser/internal/interfaces"
@ -44,8 +45,8 @@ func (a *App) ProcessCourseFromFile(filePath, format, outputPath string) error {
 // ProcessCourseFromURI fetches a course from the provided URI and exports it to the specified format.
 // It takes the URI to fetch the course from, the desired export format, and the output file path.
 // Returns an error if fetching or exporting fails.
-func (a *App) ProcessCourseFromURI(uri, format, outputPath string) error {
-	course, err := a.parser.FetchCourse(uri)
+func (a *App) ProcessCourseFromURI(ctx context.Context, uri, format, outputPath string) error {
+	course, err := a.parser.FetchCourse(ctx, uri)
 	if err != nil {
 		return fmt.Errorf("failed to fetch course: %w", err)
 	}
@ -69,8 +70,8 @@ func (a *App) exportCourse(course *models.Course, format, outputPath string) err
 	return nil
 }

-// GetSupportedFormats returns a list of all export formats supported by the application.
+// SupportedFormats returns a list of all export formats supported by the application.
 // This information is provided by the ExporterFactory.
-func (a *App) GetSupportedFormats() []string {
-	return a.exporterFactory.GetSupportedFormats()
+func (a *App) SupportedFormats() []string {
+	return a.exporterFactory.SupportedFormats()
 }
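
The diff above threads a context.Context through ProcessCourseFromURI down to the parser's FetchCourse. A hedged sketch of what a caller might now look like; only the new signature comes from the diff, while the App wiring, timeout, export format, and output path are illustrative assumptions:

// Sketch (not repository code): a caller now supplies a context that flows
// through ProcessCourseFromURI to the HTTP fetch. The wiring of app, the
// timeout, the format, and the output path are illustrative assumptions.
package docs

import (
	"context"
	"time"

	"github.com/kjanat/articulate-parser/internal/services"
)

func exportShare(ctx context.Context, app *services.App, shareURL string) error {
	// Bound the whole fetch-and-export with a deadline; cancellation propagates
	// into the parser's HTTP request created with NewRequestWithContext.
	ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()

	return app.ProcessCourseFromURI(ctx, shareURL, "markdown", "course.md")
}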

View File

@ -0,0 +1,349 @@
package services
import (
"context"
"errors"
"testing"
"github.com/kjanat/articulate-parser/internal/interfaces"
"github.com/kjanat/articulate-parser/internal/models"
)
// MockCourseParser is a mock implementation of interfaces.CourseParser for testing.
type MockCourseParser struct {
mockFetchCourse func(ctx context.Context, uri string) (*models.Course, error)
mockLoadCourseFromFile func(filePath string) (*models.Course, error)
}
func (m *MockCourseParser) FetchCourse(ctx context.Context, uri string) (*models.Course, error) {
if m.mockFetchCourse != nil {
return m.mockFetchCourse(ctx, uri)
}
return nil, errors.New("not implemented")
}
func (m *MockCourseParser) LoadCourseFromFile(filePath string) (*models.Course, error) {
if m.mockLoadCourseFromFile != nil {
return m.mockLoadCourseFromFile(filePath)
}
return nil, errors.New("not implemented")
}
// MockExporter is a mock implementation of interfaces.Exporter for testing.
type MockExporter struct {
mockExport func(course *models.Course, outputPath string) error
mockSupportedFormat func() string
}
func (m *MockExporter) Export(course *models.Course, outputPath string) error {
if m.mockExport != nil {
return m.mockExport(course, outputPath)
}
return nil
}
func (m *MockExporter) SupportedFormat() string {
if m.mockSupportedFormat != nil {
return m.mockSupportedFormat()
}
return "mock"
}
// MockExporterFactory is a mock implementation of interfaces.ExporterFactory for testing.
type MockExporterFactory struct {
mockCreateExporter func(format string) (*MockExporter, error)
mockSupportedFormats func() []string
}
func (m *MockExporterFactory) CreateExporter(format string) (interfaces.Exporter, error) {
if m.mockCreateExporter != nil {
exporter, err := m.mockCreateExporter(format)
return exporter, err
}
return &MockExporter{}, nil
}
func (m *MockExporterFactory) SupportedFormats() []string {
if m.mockSupportedFormats != nil {
return m.mockSupportedFormats()
}
return []string{"mock"}
}
// createTestCourse creates a sample course for testing purposes.
func createTestCourse() *models.Course {
return &models.Course{
ShareID: "test-share-id",
Author: "Test Author",
Course: models.CourseInfo{
ID: "test-course-id",
Title: "Test Course",
Description: "This is a test course",
Lessons: []models.Lesson{
{
ID: "lesson-1",
Title: "Test Lesson",
Type: "lesson",
Items: []models.Item{
{
ID: "item-1",
Type: "text",
Items: []models.SubItem{
{
ID: "subitem-1",
Title: "Test Title",
Paragraph: "Test paragraph content",
},
},
},
},
},
},
},
}
}
// TestNewApp tests the NewApp constructor.
func TestNewApp(t *testing.T) {
parser := &MockCourseParser{}
factory := &MockExporterFactory{}
app := NewApp(parser, factory)
if app == nil {
t.Fatal("NewApp() returned nil")
}
if app.parser != parser {
t.Error("App parser was not set correctly")
}
// Test that the factory is set (we can't directly compare interface values)
formats := app.SupportedFormats()
if len(formats) == 0 {
t.Error("App exporterFactory was not set correctly - no supported formats")
}
}
// TestApp_ProcessCourseFromFile tests the ProcessCourseFromFile method.
func TestApp_ProcessCourseFromFile(t *testing.T) {
testCourse := createTestCourse()
tests := []struct {
name string
filePath string
format string
outputPath string
setupMocks func(*MockCourseParser, *MockExporterFactory, *MockExporter)
expectedError string
}{
{
name: "successful processing",
filePath: "test.json",
format: "markdown",
outputPath: "output.md",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockLoadCourseFromFile = func(filePath string) (*models.Course, error) {
if filePath != "test.json" {
t.Errorf("Expected filePath 'test.json', got '%s'", filePath)
}
return testCourse, nil
}
factory.mockCreateExporter = func(format string) (*MockExporter, error) {
if format != "markdown" {
t.Errorf("Expected format 'markdown', got '%s'", format)
}
return exporter, nil
}
exporter.mockExport = func(course *models.Course, outputPath string) error {
if outputPath != "output.md" {
t.Errorf("Expected outputPath 'output.md', got '%s'", outputPath)
}
if course != testCourse {
t.Error("Expected course to match testCourse")
}
return nil
}
},
},
{
name: "file loading error",
filePath: "nonexistent.json",
format: "markdown",
outputPath: "output.md",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockLoadCourseFromFile = func(filePath string) (*models.Course, error) {
return nil, errors.New("file not found")
}
},
expectedError: "failed to load course from file",
},
{
name: "exporter creation error",
filePath: "test.json",
format: "unsupported",
outputPath: "output.txt",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockLoadCourseFromFile = func(filePath string) (*models.Course, error) {
return testCourse, nil
}
factory.mockCreateExporter = func(format string) (*MockExporter, error) {
return nil, errors.New("unsupported format")
}
},
expectedError: "failed to create exporter",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
parser := &MockCourseParser{}
exporter := &MockExporter{}
factory := &MockExporterFactory{}
tt.setupMocks(parser, factory, exporter)
app := NewApp(parser, factory)
err := app.ProcessCourseFromFile(tt.filePath, tt.format, tt.outputPath)
if tt.expectedError != "" {
if err == nil {
t.Fatalf("Expected error containing '%s', got nil", tt.expectedError)
}
if !contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else if err != nil {
t.Errorf("Expected no error, got: %v", err)
}
})
}
}
// TestApp_ProcessCourseFromURI tests the ProcessCourseFromURI method.
func TestApp_ProcessCourseFromURI(t *testing.T) {
testCourse := createTestCourse()
tests := []struct {
name string
uri string
format string
outputPath string
setupMocks func(*MockCourseParser, *MockExporterFactory, *MockExporter)
expectedError string
}{
{
name: "successful processing",
uri: "https://rise.articulate.com/share/test123",
format: "docx",
outputPath: "output.docx",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockFetchCourse = func(ctx context.Context, uri string) (*models.Course, error) {
if uri != "https://rise.articulate.com/share/test123" {
t.Errorf("Expected uri 'https://rise.articulate.com/share/test123', got '%s'", uri)
}
return testCourse, nil
}
factory.mockCreateExporter = func(format string) (*MockExporter, error) {
if format != "docx" {
t.Errorf("Expected format 'docx', got '%s'", format)
}
return exporter, nil
}
exporter.mockExport = func(course *models.Course, outputPath string) error {
if outputPath != "output.docx" {
t.Errorf("Expected outputPath 'output.docx', got '%s'", outputPath)
}
return nil
}
},
},
{
name: "fetch error",
uri: "invalid-uri",
format: "docx",
outputPath: "output.docx",
setupMocks: func(parser *MockCourseParser, factory *MockExporterFactory, exporter *MockExporter) {
parser.mockFetchCourse = func(ctx context.Context, uri string) (*models.Course, error) {
return nil, errors.New("network error")
}
},
expectedError: "failed to fetch course",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
parser := &MockCourseParser{}
exporter := &MockExporter{}
factory := &MockExporterFactory{}
tt.setupMocks(parser, factory, exporter)
app := NewApp(parser, factory)
err := app.ProcessCourseFromURI(context.Background(), tt.uri, tt.format, tt.outputPath)
if tt.expectedError != "" {
if err == nil {
t.Fatalf("Expected error containing '%s', got nil", tt.expectedError)
}
if !contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else if err != nil {
t.Errorf("Expected no error, got: %v", err)
}
})
}
}
// TestApp_SupportedFormats tests the SupportedFormats method.
func TestApp_SupportedFormats(t *testing.T) {
expectedFormats := []string{"markdown", "docx", "pdf"}
parser := &MockCourseParser{}
factory := &MockExporterFactory{
mockSupportedFormats: func() []string {
return expectedFormats
},
}
app := NewApp(parser, factory)
formats := app.SupportedFormats()
if len(formats) != len(expectedFormats) {
t.Errorf("Expected %d formats, got %d", len(expectedFormats), len(formats))
}
for i, format := range formats {
if format != expectedFormats[i] {
t.Errorf("Expected format '%s' at index %d, got '%s'", expectedFormats[i], i, format)
}
}
}
// contains checks if a string contains a substring.
func contains(s, substr string) bool {
return len(s) >= len(substr) &&
(substr == "" ||
s == substr ||
(len(s) > len(substr) &&
(s[:len(substr)] == substr ||
s[len(s)-len(substr):] == substr ||
containsSubstring(s, substr))))
}
// containsSubstring checks if s contains substr as a substring.
func containsSubstring(s, substr string) bool {
for i := 0; i <= len(s)-len(substr); i++ {
if s[i:i+len(substr)] == substr {
return true
}
}
return false
}

View File

@ -0,0 +1,96 @@
// Package services_test provides examples for the services package.
package services_test
import (
"context"
"fmt"
"log"
"github.com/kjanat/articulate-parser/internal/services"
)
// ExampleNewArticulateParser demonstrates creating a new parser.
func ExampleNewArticulateParser() {
// Create a no-op logger for this example
logger := services.NewNoOpLogger()
// Create parser with defaults
parser := services.NewArticulateParser(logger, "", 0)
fmt.Printf("Parser created: %T\n", parser)
// Output: Parser created: *services.ArticulateParser
}
// ExampleNewArticulateParser_custom demonstrates creating a parser with custom configuration.
func ExampleNewArticulateParser_custom() {
logger := services.NewNoOpLogger()
// Create parser with custom base URL and timeout
parser := services.NewArticulateParser(
logger,
"https://custom.articulate.com",
60_000_000_000, // 60 seconds in nanoseconds
)
fmt.Printf("Parser configured: %T\n", parser)
// Output: Parser configured: *services.ArticulateParser
}
// ExampleArticulateParser_LoadCourseFromFile demonstrates loading a course from a file.
func ExampleArticulateParser_LoadCourseFromFile() {
logger := services.NewNoOpLogger()
parser := services.NewArticulateParser(logger, "", 0)
// In a real scenario, you'd have an actual file
// This example shows the API usage
_, err := parser.LoadCourseFromFile("course.json")
if err != nil {
log.Printf("Failed to load course: %v", err)
}
}
// ExampleArticulateParser_FetchCourse demonstrates fetching a course from a URI.
func ExampleArticulateParser_FetchCourse() {
logger := services.NewNoOpLogger()
parser := services.NewArticulateParser(logger, "", 0)
// Create a context with timeout
ctx := context.Background()
// In a real scenario, you'd use an actual share URL
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/YOUR_SHARE_ID")
if err != nil {
log.Printf("Failed to fetch course: %v", err)
}
}
// ExampleHTMLCleaner demonstrates cleaning HTML content.
func ExampleHTMLCleaner() {
cleaner := services.NewHTMLCleaner()
html := "<p>This is <strong>bold</strong> text with entities.</p>"
clean := cleaner.CleanHTML(html)
fmt.Println(clean)
// Output: This is bold text with entities.
}
// ExampleHTMLCleaner_CleanHTML demonstrates complex HTML cleaning.
func ExampleHTMLCleaner_CleanHTML() {
cleaner := services.NewHTMLCleaner()
html := `
<div>
<h1>Title</h1>
<p>Paragraph with <a href="#">link</a> and &amp; entity.</p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
</div>
`
clean := cleaner.CleanHTML(html)
fmt.Println(clean)
// Output: Title Paragraph with link and & entity. Item 1 Item 2
}

View File

@ -1,15 +1,17 @@
-// Package services provides the core functionality for the articulate-parser application.
-// It implements the interfaces defined in the interfaces package.
 package services

 import (
-	"regexp"
+	"bytes"
+	stdhtml "html"
+	"io"
 	"strings"
+
+	"golang.org/x/net/html"
 )

 // HTMLCleaner provides utilities for converting HTML content to plain text.
 // It removes HTML tags while preserving their content and converts HTML entities
-// to their plain text equivalents.
+// to their plain text equivalents using proper HTML parsing instead of regex.
 type HTMLCleaner struct{}

 // NewHTMLCleaner creates a new HTML cleaner instance.
@ -20,34 +22,47 @@ func NewHTMLCleaner() *HTMLCleaner {
 }

 // CleanHTML removes HTML tags and converts entities, returning clean plain text.
-// The function preserves the textual content of the HTML while removing markup.
-// It handles common HTML entities like &nbsp;, &amp;, etc., and normalizes whitespace.
-//
-// Parameters:
-//   - html: The HTML content to clean
-//
-// Returns:
-//   - A plain text string with all HTML elements and entities removed/converted
-func (h *HTMLCleaner) CleanHTML(html string) string {
-	// Remove HTML tags but preserve content
-	re := regexp.MustCompile(`<[^>]*>`)
-	cleaned := re.ReplaceAllString(html, "")
-	// Replace common HTML entities with their character equivalents
-	cleaned = strings.ReplaceAll(cleaned, "&nbsp;", " ")
-	cleaned = strings.ReplaceAll(cleaned, "&amp;", "&")
-	cleaned = strings.ReplaceAll(cleaned, "&lt;", "<")
-	cleaned = strings.ReplaceAll(cleaned, "&gt;", ">")
-	cleaned = strings.ReplaceAll(cleaned, "&quot;", "\"")
-	cleaned = strings.ReplaceAll(cleaned, "&#39;", "'")
-	cleaned = strings.ReplaceAll(cleaned, "&iuml;", "ï")
-	cleaned = strings.ReplaceAll(cleaned, "&euml;", "ë")
-	cleaned = strings.ReplaceAll(cleaned, "&eacute;", "é")
-	// Clean up extra whitespace by replacing multiple spaces, tabs, and newlines
-	// with a single space, then trim any leading/trailing whitespace
-	cleaned = regexp.MustCompile(`\s+`).ReplaceAllString(cleaned, " ")
-	cleaned = strings.TrimSpace(cleaned)
-	return cleaned
+// It parses the HTML into a node tree and extracts only text content,
+// skipping script and style tags. HTML entities are automatically handled
+// by the parser, and whitespace is normalized.
+func (h *HTMLCleaner) CleanHTML(htmlStr string) string {
+	// Parse the HTML into a node tree
+	doc, err := html.Parse(strings.NewReader(htmlStr))
+	if err != nil {
+		// If parsing fails, return empty string
+		// This maintains backward compatibility with the test expectations
+		return ""
+	}
+
+	// Extract text content from the node tree
+	var buf bytes.Buffer
+	extractText(&buf, doc)
+
+	// Unescape any remaining HTML entities
+	unescaped := stdhtml.UnescapeString(buf.String())
+
+	// Normalize whitespace: replace multiple spaces, tabs, and newlines with a single space
+	cleaned := strings.Join(strings.Fields(unescaped), " ")
+	return strings.TrimSpace(cleaned)
+}
+
+// extractText recursively traverses the HTML node tree and extracts text content.
+// It skips script and style tags to avoid including their content in the output.
+func extractText(w io.Writer, n *html.Node) {
+	// Skip script and style tags entirely
+	if n.Type == html.ElementNode && (n.Data == "script" || n.Data == "style") {
+		return
+	}
+	// If this is a text node, write its content
+	if n.Type == html.TextNode {
+		// Write errors are ignored because we're writing to an in-memory buffer
+		// which cannot fail in normal circumstances
+		_, _ = w.Write([]byte(n.Data))
+	}
+	// Recursively process all child nodes
+	for c := n.FirstChild; c != nil; c = c.NextSibling {
+		extractText(w, c)
+	}
 }
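
One behaviour of golang.org/x/net/html worth noting for the tests that follow: html.Parse rarely fails on ordinary malformed fragments; it repairs them into a full document with implicit html, head, and body elements, and attribute values never become text nodes. A small standalone sketch of that (illustration only, assuming the same x/net/html dependency):

// Sketch (not repository code) of how html.Parse normalizes input.
package main

import (
	"fmt"
	"strings"

	"golang.org/x/net/html"
)

func main() {
	doc, err := html.Parse(strings.NewReader(`<p title="AT&amp;T">Content</p>`))
	if err != nil {
		panic(err)
	}

	// Print the element tree; text lives only in TextNodes, attributes never do.
	var walk func(n *html.Node, depth int)
	walk = func(n *html.Node, depth int) {
		if n.Type == html.ElementNode {
			fmt.Printf("%s<%s>\n", strings.Repeat("  ", depth), n.Data)
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			walk(c, depth+1)
		}
	}
	walk(doc, 0)
	// Prints <html>, <head>, <body>, <p>: the fragment was wrapped in a full document.
}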

View File

@ -0,0 +1,322 @@
package services
import (
"strings"
"testing"
)
// TestNewHTMLCleaner tests the NewHTMLCleaner constructor.
func TestNewHTMLCleaner(t *testing.T) {
cleaner := NewHTMLCleaner()
if cleaner == nil {
t.Fatal("NewHTMLCleaner() returned nil")
}
}
// TestHTMLCleaner_CleanHTML tests the CleanHTML method with various HTML inputs.
func TestHTMLCleaner_CleanHTML(t *testing.T) {
cleaner := NewHTMLCleaner()
tests := []struct {
name string
input string
expected string
}{
{
name: "plain text (no HTML)",
input: "This is plain text",
expected: "This is plain text",
},
{
name: "empty string",
input: "",
expected: "",
},
{
name: "simple HTML tag",
input: "<p>Hello world</p>",
expected: "Hello world",
},
{
name: "multiple HTML tags",
input: "<h1>Title</h1><p>Paragraph text</p>",
expected: "TitleParagraph text",
},
{
name: "nested HTML tags",
input: "<div><h1>Title</h1><p>Paragraph with <strong>bold</strong> text</p></div>",
expected: "TitleParagraph with bold text",
},
{
name: "HTML with attributes",
input: "<p class=\"test\" id=\"para1\">Text with attributes</p>",
expected: "Text with attributes",
},
{
name: "self-closing tags",
input: "Line 1<br/>Line 2<hr/>End",
expected: "Line 1Line 2End",
},
{
name: "HTML entities - basic",
input: "AT&amp;T &lt;company&gt; &quot;quoted&quot; &nbsp; text",
expected: "AT&T <company> \"quoted\" text",
},
{
name: "HTML entities - apostrophe",
input: "It&#39;s a test",
expected: "It's a test",
},
{
name: "HTML entities - special characters",
input: "&iuml;ber &euml;lite &eacute;cart&eacute;",
expected: "ïber ëlite écarté",
},
{
name: "HTML entities - nbsp",
input: "Word1&nbsp;&nbsp;&nbsp;Word2",
expected: "Word1 Word2",
},
{
name: "mixed HTML and entities",
input: "<p>Hello &amp; welcome to <strong>our</strong> site!</p>",
expected: "Hello & welcome to our site!",
},
{
name: "multiple whitespace",
input: "Text with\t\tmultiple\n\nspaces",
expected: "Text with multiple spaces",
},
{
name: "whitespace with HTML",
input: "<p> Text with </p> <div> spaces </div> ",
expected: "Text with spaces",
},
{
name: "complex content",
input: "<div class=\"content\"><h1>Course Title</h1><p>This is a <em>great</em> course about &amp; HTML entities like &nbsp; and &quot;quotes&quot;.</p></div>",
expected: "Course TitleThis is a great course about & HTML entities like and \"quotes\".",
},
{
name: "malformed HTML",
input: "<p>Unclosed paragraph<div>Another <span>tag</p></div>",
expected: "Unclosed paragraphAnother tag",
},
{
name: "HTML comments (should be removed)",
input: "Text before<!-- This is a comment -->Text after",
expected: "Text beforeText after",
},
{
name: "script and style tags content",
input: "<script>alert('test');</script>Content<style>body{color:red;}</style>",
expected: "Content", // Script and style tags are correctly skipped
},
{
name: "line breaks and formatting",
input: "<p>Line 1</p>\n<p>Line 2</p>\n<p>Line 3</p>",
expected: "Line 1 Line 2 Line 3",
},
{
name: "only whitespace",
input: " \t\n ",
expected: "",
},
{
name: "only HTML tags",
input: "<div><p></p></div>",
expected: "",
},
{
name: "HTML with newlines",
input: "<p>\n Paragraph with\n line breaks\n</p>",
expected: "Paragraph with line breaks",
},
{
name: "complex nested structure",
input: "<article><header><h1>Title</h1></header><section><p>First paragraph with <a href=\"#\">link</a>.</p><ul><li>Item 1</li><li>Item 2</li></ul></section></article>",
expected: "TitleFirst paragraph with link.Item 1Item 2",
},
{
name: "entities in attributes (should still be processed)",
input: "<p title=\"AT&amp;T\">Content</p>",
expected: "Content",
},
{
name: "special HTML5 entities",
input: "Left arrow &larr; Right arrow &rarr;",
expected: "Left arrow ← Right arrow →", // HTML5 entities are properly handled by the parser
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := cleaner.CleanHTML(tt.input)
if result != tt.expected {
t.Errorf("CleanHTML(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
// TestHTMLCleaner_CleanHTML_LargeContent tests the CleanHTML method with large content.
func TestHTMLCleaner_CleanHTML_LargeContent(t *testing.T) {
cleaner := NewHTMLCleaner()
// Create a large HTML string
var builder strings.Builder
builder.WriteString("<html><body>")
for i := range 1000 {
builder.WriteString("<p>Paragraph ")
builder.WriteString(string(rune('0' + i%10)))
builder.WriteString(" with some content &amp; entities.</p>")
}
builder.WriteString("</body></html>")
input := builder.String()
result := cleaner.CleanHTML(input)
// Check that HTML tags are removed
if strings.Contains(result, "<") || strings.Contains(result, ">") {
t.Error("Result should not contain HTML tags")
}
// Check that content is preserved
if !strings.Contains(result, "Paragraph") {
t.Error("Result should contain paragraph content")
}
// Check that entities are converted
if strings.Contains(result, "&amp;") {
t.Error("Result should not contain unconverted HTML entities")
}
if !strings.Contains(result, "&") {
t.Error("Result should contain converted ampersand")
}
}
// TestHTMLCleaner_CleanHTML_EdgeCases tests edge cases for the CleanHTML method.
func TestHTMLCleaner_CleanHTML_EdgeCases(t *testing.T) {
cleaner := NewHTMLCleaner()
tests := []struct {
name string
input string
expected string
}{
{
name: "only entities",
input: "&amp;&lt;&gt;&quot;&#39;&nbsp;",
expected: "&<>\"'",
},
{
name: "repeated entities",
input: "&amp;&amp;&amp;",
expected: "&&&",
},
{
name: "entities without semicolon (properly converted)",
input: "&amp test &lt test",
expected: "& test < test", // Parser handles entities even without semicolons in some cases
},
{
name: "mixed valid and invalid entities",
input: "&amp; &invalid; &lt; &fake;",
expected: "& &invalid; < &fake;",
},
{
name: "unclosed tag at end",
input: "Content <p>with unclosed",
expected: "Content with unclosed",
},
{
name: "tag with no closing bracket",
input: "Content <p class='test' with no closing bracket",
expected: "Content", // Parser handles malformed HTML gracefully
},
{
name: "extremely nested tags",
input: "<div><div><div><div><div>Deep content</div></div></div></div></div>",
expected: "Deep content",
},
{
name: "empty tags with whitespace",
input: "<p> </p><div>\t\n</div>",
expected: "",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := cleaner.CleanHTML(tt.input)
if result != tt.expected {
t.Errorf("CleanHTML(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
// TestHTMLCleaner_CleanHTML_Unicode tests Unicode content handling.
func TestHTMLCleaner_CleanHTML_Unicode(t *testing.T) {
cleaner := NewHTMLCleaner()
tests := []struct {
name string
input string
expected string
}{
{
name: "unicode characters",
input: "<p>Hello 世界! Café naïve résumé</p>",
expected: "Hello 世界! Café naïve résumé",
},
{
name: "unicode with entities",
input: "<p>Unicode: 你好 &amp; emoji: 🌍</p>",
expected: "Unicode: 你好 & emoji: 🌍",
},
{
name: "mixed scripts",
input: "<div>English العربية русский 日本語</div>",
expected: "English العربية русский 日本語",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := cleaner.CleanHTML(tt.input)
if result != tt.expected {
t.Errorf("CleanHTML(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
// BenchmarkHTMLCleaner_CleanHTML benchmarks the CleanHTML method.
func BenchmarkHTMLCleaner_CleanHTML(b *testing.B) {
cleaner := NewHTMLCleaner()
input := "<div class=\"content\"><h1>Course Title</h1><p>This is a <em>great</em> course about &amp; HTML entities like &nbsp; and &quot;quotes&quot;.</p><ul><li>Item 1</li><li>Item 2</li></ul></div>"
for b.Loop() {
cleaner.CleanHTML(input)
}
}
// BenchmarkHTMLCleaner_CleanHTML_Large benchmarks the CleanHTML method with large content.
func BenchmarkHTMLCleaner_CleanHTML_Large(b *testing.B) {
cleaner := NewHTMLCleaner()
// Create a large HTML string
var builder strings.Builder
for i := range 100 {
builder.WriteString("<p>Paragraph ")
builder.WriteString(string(rune('0' + i%10)))
builder.WriteString(" with some content &amp; entities &lt;test&gt;.</p>")
}
input := builder.String()
for b.Loop() {
cleaner.CleanHTML(input)
}
}

104
internal/services/logger.go Normal file
View File

@ -0,0 +1,104 @@
package services
import (
"context"
"log/slog"
"os"
"github.com/kjanat/articulate-parser/internal/interfaces"
)
// SlogLogger implements the Logger interface using the standard library's slog package.
type SlogLogger struct {
logger *slog.Logger
}
// NewSlogLogger creates a new structured logger using slog.
// The level parameter controls the minimum log level (debug, info, warn, error).
func NewSlogLogger(level slog.Level) interfaces.Logger {
opts := &slog.HandlerOptions{
Level: level,
}
handler := slog.NewJSONHandler(os.Stdout, opts)
return &SlogLogger{
logger: slog.New(handler),
}
}
// NewTextLogger creates a new structured logger with human-readable text output.
// Useful for development and debugging.
func NewTextLogger(level slog.Level) interfaces.Logger {
opts := &slog.HandlerOptions{
Level: level,
}
handler := slog.NewTextHandler(os.Stdout, opts)
return &SlogLogger{
logger: slog.New(handler),
}
}
// Debug logs a debug-level message with optional key-value pairs.
func (l *SlogLogger) Debug(msg string, keysAndValues ...any) {
l.logger.Debug(msg, keysAndValues...)
}
// Info logs an info-level message with optional key-value pairs.
func (l *SlogLogger) Info(msg string, keysAndValues ...any) {
l.logger.Info(msg, keysAndValues...)
}
// Warn logs a warning-level message with optional key-value pairs.
func (l *SlogLogger) Warn(msg string, keysAndValues ...any) {
l.logger.Warn(msg, keysAndValues...)
}
// Error logs an error-level message with optional key-value pairs.
func (l *SlogLogger) Error(msg string, keysAndValues ...any) {
l.logger.Error(msg, keysAndValues...)
}
// With returns a new logger with the given key-value pairs added as context.
func (l *SlogLogger) With(keysAndValues ...any) interfaces.Logger {
return &SlogLogger{
logger: l.logger.With(keysAndValues...),
}
}
// WithContext returns a new logger with context information.
// Currently preserves the logger as-is, but can be extended to extract
// trace IDs or other context values in the future.
func (l *SlogLogger) WithContext(ctx context.Context) interfaces.Logger {
// Can be extended to extract trace IDs, request IDs, etc. from context
return l
}
// NoOpLogger is a logger that discards all log messages.
// Useful for testing or when logging should be disabled.
type NoOpLogger struct{}
// NewNoOpLogger creates a logger that discards all messages.
func NewNoOpLogger() interfaces.Logger {
return &NoOpLogger{}
}
// Debug does nothing.
func (l *NoOpLogger) Debug(msg string, keysAndValues ...any) {}
// Info does nothing.
func (l *NoOpLogger) Info(msg string, keysAndValues ...any) {}
// Warn does nothing.
func (l *NoOpLogger) Warn(msg string, keysAndValues ...any) {}
// Error does nothing.
func (l *NoOpLogger) Error(msg string, keysAndValues ...any) {}
// With returns the same no-op logger.
func (l *NoOpLogger) With(keysAndValues ...any) interfaces.Logger {
return l
}
// WithContext returns the same no-op logger.
func (l *NoOpLogger) WithContext(ctx context.Context) interfaces.Logger {
return l
}
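
A short usage sketch for these constructors (illustration only; the level and key/value attributes are example values, not taken from the repository):

// Sketch (not repository code) of how the logger constructors are used.
package main

import (
	"log/slog"

	"github.com/kjanat/articulate-parser/internal/services"
)

func main() {
	// Human-readable text output for development; NewSlogLogger emits JSON instead.
	logger := services.NewTextLogger(slog.LevelDebug)

	// With derives a logger that attaches these attributes to every record.
	parserLog := logger.With("component", "parser")
	parserLog.Info("fetching course", "share_id", "example-share-id")

	// NewNoOpLogger discards everything; it is the fallback used when callers pass nil.
	quiet := services.NewNoOpLogger()
	quiet.Debug("this message is dropped")
}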

View File

@ -0,0 +1,95 @@
package services
import (
"context"
"io"
"log/slog"
"testing"
)
// BenchmarkSlogLogger_Info benchmarks structured JSON logging.
func BenchmarkSlogLogger_Info(b *testing.B) {
// Create logger that writes to io.Discard to avoid benchmark noise
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42, "key3", true)
}
}
// BenchmarkSlogLogger_Debug benchmarks debug level logging.
func BenchmarkSlogLogger_Debug(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelDebug}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Debug("debug message", "operation", "test", "duration", 123)
}
}
// BenchmarkSlogLogger_Error benchmarks error logging.
func BenchmarkSlogLogger_Error(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelError}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Error("error occurred", "error", "test error", "code", 500)
}
}
// BenchmarkTextLogger_Info benchmarks text logging.
func BenchmarkTextLogger_Info(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewTextHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42)
}
}
// BenchmarkNoOpLogger benchmarks the no-op logger.
func BenchmarkNoOpLogger(b *testing.B) {
logger := NewNoOpLogger()
b.ResetTimer()
for b.Loop() {
logger.Info("test message", "key1", "value1", "key2", 42)
logger.Error("error message", "error", "test")
}
}
// BenchmarkLogger_With benchmarks logger with context.
func BenchmarkLogger_With(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
b.ResetTimer()
for b.Loop() {
contextLogger := logger.With("request_id", "123", "user_id", "456")
contextLogger.Info("operation completed")
}
}
// BenchmarkLogger_WithContext benchmarks logger with Go context.
func BenchmarkLogger_WithContext(b *testing.B) {
opts := &slog.HandlerOptions{Level: slog.LevelInfo}
handler := slog.NewJSONHandler(io.Discard, opts)
logger := &SlogLogger{logger: slog.New(handler)}
ctx := context.Background()
b.ResetTimer()
for b.Loop() {
contextLogger := logger.WithContext(ctx)
contextLogger.Info("context operation")
}
}

View File

@ -1,12 +1,12 @@
-// Package services provides the core functionality for the articulate-parser application.
-// It implements the interfaces defined in the interfaces package.
 package services

 import (
+	"context"
 	"encoding/json"
 	"fmt"
 	"io"
 	"net/http"
+	"net/url"
 	"os"
 	"regexp"
 	"time"
@ -22,32 +22,36 @@ type ArticulateParser struct {
 	BaseURL string
 	// Client is the HTTP client used to make requests to the API
 	Client *http.Client
+	// Logger for structured logging
+	Logger interfaces.Logger
 }

-// NewArticulateParser creates a new ArticulateParser instance with default settings.
-// The default configuration uses the standard Articulate Rise API URL and a
-// HTTP client with a 30-second timeout.
-func NewArticulateParser() interfaces.CourseParser {
+// NewArticulateParser creates a new ArticulateParser instance.
+// If baseURL is empty, uses the default Articulate Rise API URL.
+// If timeout is zero, uses a 30-second timeout.
+func NewArticulateParser(logger interfaces.Logger, baseURL string, timeout time.Duration) interfaces.CourseParser {
+	if logger == nil {
+		logger = NewNoOpLogger()
+	}
+	if baseURL == "" {
+		baseURL = "https://rise.articulate.com"
+	}
+	if timeout == 0 {
+		timeout = 30 * time.Second
+	}
 	return &ArticulateParser{
-		BaseURL: "https://rise.articulate.com",
+		BaseURL: baseURL,
 		Client: &http.Client{
-			Timeout: 30 * time.Second,
+			Timeout: timeout,
 		},
+		Logger: logger,
 	}
 }

-// FetchCourse fetches a course from the given URI.
-// It extracts the share ID from the URI, constructs an API URL, and fetches the course data.
-// The course data is then unmarshalled into a Course model.
-//
-// Parameters:
-//   - uri: The Articulate Rise share URL (e.g., https://rise.articulate.com/share/SHARE_ID)
-//
-// Returns:
-//   - A parsed Course model if successful
-//   - An error if the fetch fails, if the share ID can't be extracted,
-//     or if the response can't be parsed
-func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
+// FetchCourse fetches a course from the given URI and returns the parsed course data.
+// The URI should be an Articulate Rise share URL (e.g., https://rise.articulate.com/share/SHARE_ID).
+// The context can be used for cancellation and timeout control.
+func (p *ArticulateParser) FetchCourse(ctx context.Context, uri string) (*models.Course, error) {
 	shareID, err := p.extractShareID(uri)
 	if err != nil {
 		return nil, err
@ -55,11 +59,24 @@ func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
 	apiURL := p.buildAPIURL(shareID)

-	resp, err := p.Client.Get(apiURL)
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, apiURL, http.NoBody)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create request: %w", err)
+	}
+
+	resp, err := p.Client.Do(req)
 	if err != nil {
 		return nil, fmt.Errorf("failed to fetch course data: %w", err)
 	}
-	defer resp.Body.Close()
+	// Ensure response body is closed even if ReadAll fails. Close errors are logged
+	// but not fatal since the body content has already been read and parsed. In the
+	// context of HTTP responses, the body must be closed to release the underlying
+	// connection, but a close error doesn't invalidate the data already consumed.
+	defer func() {
+		if err := resp.Body.Close(); err != nil {
+			p.Logger.Warn("failed to close response body", "error", err, "url", apiURL)
+		}
+	}()

 	if resp.StatusCode != http.StatusOK {
 		return nil, fmt.Errorf("API returned status %d", resp.StatusCode)
@ -79,15 +96,8 @@ func (p *ArticulateParser) FetchCourse(uri string) (*models.Course, error) {
 }

 // LoadCourseFromFile loads an Articulate Rise course from a local JSON file.
-// The file should contain a valid JSON representation of an Articulate Rise course.
-//
-// Parameters:
-//   - filePath: The path to the JSON file containing the course data
-//
-// Returns:
-//   - A parsed Course model if successful
-//   - An error if the file can't be read or the JSON can't be parsed
 func (p *ArticulateParser) LoadCourseFromFile(filePath string) (*models.Course, error) {
+	// #nosec G304 - File path is provided by user via CLI argument, which is expected behavior
 	data, err := os.ReadFile(filePath)
 	if err != nil {
 		return nil, fmt.Errorf("failed to read file: %w", err)
@ -112,6 +122,17 @@ func (p *ArticulateParser) LoadCourseFromFile(filePath string) (*models.Course,
 //   - The share ID string if found
 //   - An error if the share ID can't be extracted from the URI
 func (p *ArticulateParser) extractShareID(uri string) (string, error) {
+	// Parse the URL to validate the domain
+	parsedURL, err := url.Parse(uri)
+	if err != nil {
+		return "", fmt.Errorf("invalid URI: %s", uri)
+	}
+
+	// Validate that it's an Articulate Rise domain
+	if parsedURL.Host != "rise.articulate.com" {
+		return "", fmt.Errorf("invalid domain for Articulate Rise URI: %s", parsedURL.Host)
+	}
+
 	re := regexp.MustCompile(`/share/([a-zA-Z0-9_-]+)`)
 	matches := re.FindStringSubmatch(uri)
 	if len(matches) < 2 {
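
A note on the new domain check above: net/url's Parse rejects very little, so input like "invalid-uri" returns no error but an empty Host, and it is the rise.articulate.com comparison that actually fails. That is why the tests below expect the "invalid domain for Articulate Rise URI" message rather than "invalid URI". A tiny sketch (illustration only, not repository code):

package main

import (
	"fmt"
	"net/url"
)

func main() {
	// url.Parse accepts almost anything; "invalid-uri" is treated as a relative path.
	u, err := url.Parse("invalid-uri")
	fmt.Println(err)           // <nil>: no parse error
	fmt.Printf("%q\n", u.Host) // "": empty host, so the domain check fails
}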

View File

@ -0,0 +1,219 @@
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/kjanat/articulate-parser/internal/models"
)
// BenchmarkArticulateParser_FetchCourse benchmarks the FetchCourse method.
func BenchmarkArticulateParser_FetchCourse(b *testing.B) {
testCourse := &models.Course{
ShareID: "benchmark-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "bench-course",
Title: "Benchmark Course",
Description: "Testing performance",
Lessons: []models.Lesson{
{
ID: "lesson1",
Title: "Lesson 1",
Type: "lesson",
},
},
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Encode errors are ignored in benchmarks; the test server's ResponseWriter
// writes are reliable and any encoding error would be a test setup issue
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{},
Logger: NewNoOpLogger(),
}
b.ResetTimer()
for b.Loop() {
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/benchmark-id")
if err != nil {
b.Fatalf("FetchCourse failed: %v", err)
}
}
}
// BenchmarkArticulateParser_FetchCourse_LargeCourse benchmarks with a large course.
func BenchmarkArticulateParser_FetchCourse_LargeCourse(b *testing.B) {
// Create a large course with many lessons
lessons := make([]models.Lesson, 100)
for i := range 100 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Description: "This is a test lesson with some description",
Items: []models.Item{
{
Type: "text",
Items: []models.SubItem{
{
Heading: "Test Heading",
Paragraph: "Test paragraph content with some text",
},
},
},
},
}
}
testCourse := &models.Course{
ShareID: "large-course-id",
Author: "Benchmark Author",
Course: models.CourseInfo{
ID: "large-course",
Title: "Large Benchmark Course",
Description: "Testing performance with large course",
Lessons: lessons,
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Encode errors are ignored in benchmarks; the test server's ResponseWriter
// writes are reliable and any encoding error would be a test setup issue
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{},
Logger: NewNoOpLogger(),
}
b.ResetTimer()
for b.Loop() {
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/large-course-id")
if err != nil {
b.Fatalf("FetchCourse failed: %v", err)
}
}
}
// BenchmarkArticulateParser_LoadCourseFromFile benchmarks loading from file.
func BenchmarkArticulateParser_LoadCourseFromFile(b *testing.B) {
testCourse := &models.Course{
ShareID: "file-test-id",
Course: models.CourseInfo{
Title: "File Test Course",
},
}
tempDir := b.TempDir()
tempFile := filepath.Join(tempDir, "benchmark.json")
data, err := json.Marshal(testCourse)
if err != nil {
b.Fatalf("Failed to marshal: %v", err)
}
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
b.Fatalf("Failed to write file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
b.ResetTimer()
for b.Loop() {
_, err := parser.LoadCourseFromFile(tempFile)
if err != nil {
b.Fatalf("LoadCourseFromFile failed: %v", err)
}
}
}
// BenchmarkArticulateParser_LoadCourseFromFile_Large benchmarks with large file.
func BenchmarkArticulateParser_LoadCourseFromFile_Large(b *testing.B) {
// Create a large course
lessons := make([]models.Lesson, 200)
for i := range 200 {
lessons[i] = models.Lesson{
ID: string(rune(i)),
Title: "Lesson " + string(rune(i)),
Type: "lesson",
Items: []models.Item{
{Type: "text", Items: []models.SubItem{{Heading: "H", Paragraph: "P"}}},
{Type: "list", Items: []models.SubItem{{Paragraph: "Item 1"}, {Paragraph: "Item 2"}}},
},
}
}
testCourse := &models.Course{
ShareID: "large-file-id",
Course: models.CourseInfo{
Title: "Large File Course",
Lessons: lessons,
},
}
tempDir := b.TempDir()
tempFile := filepath.Join(tempDir, "large-benchmark.json")
data, err := json.Marshal(testCourse)
if err != nil {
b.Fatalf("Failed to marshal: %v", err)
}
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
b.Fatalf("Failed to write file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
b.ResetTimer()
for b.Loop() {
_, err := parser.LoadCourseFromFile(tempFile)
if err != nil {
b.Fatalf("LoadCourseFromFile failed: %v", err)
}
}
}
// BenchmarkArticulateParser_ExtractShareID benchmarks share ID extraction.
func BenchmarkArticulateParser_ExtractShareID(b *testing.B) {
parser := &ArticulateParser{}
uri := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
b.ResetTimer()
for b.Loop() {
_, err := parser.extractShareID(uri)
if err != nil {
b.Fatalf("extractShareID failed: %v", err)
}
}
}
// BenchmarkArticulateParser_BuildAPIURL benchmarks API URL building.
func BenchmarkArticulateParser_BuildAPIURL(b *testing.B) {
parser := &ArticulateParser{
BaseURL: "https://rise.articulate.com",
}
shareID := "test-share-id-12345"
b.ResetTimer()
for b.Loop() {
_ = parser.buildAPIURL(shareID)
}
}

View File

@ -0,0 +1,289 @@
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"github.com/kjanat/articulate-parser/internal/models"
)
// TestArticulateParser_FetchCourse_ContextCancellation tests that FetchCourse
// respects context cancellation.
func TestArticulateParser_FetchCourse_ContextCancellation(t *testing.T) {
// Create a server that delays response
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Sleep to give time for context cancellation
time.Sleep(100 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context that we'll cancel immediately
ctx, cancel := context.WithCancel(context.Background())
cancel() // Cancel immediately
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a context cancellation error
if err == nil {
t.Fatal("Expected error due to context cancellation, got nil")
}
if !strings.Contains(err.Error(), "context canceled") {
t.Errorf("Expected context cancellation error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextTimeout tests that FetchCourse
// respects context timeout.
func TestArticulateParser_FetchCourse_ContextTimeout(t *testing.T) {
// Create a server that delays response longer than timeout
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Sleep longer than the context timeout
time.Sleep(200 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context with a very short timeout
ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a context deadline exceeded error
if err == nil {
t.Fatal("Expected error due to context timeout, got nil")
}
if !strings.Contains(err.Error(), "deadline exceeded") &&
!strings.Contains(err.Error(), "context deadline exceeded") {
t.Errorf("Expected context timeout error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextDeadline tests that FetchCourse
// respects context deadline.
func TestArticulateParser_FetchCourse_ContextDeadline(t *testing.T) {
// Create a server that delays response
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(150 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context with a deadline in the past
ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(10*time.Millisecond))
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
// Should get a deadline exceeded error
if err == nil {
t.Fatal("Expected error due to context deadline, got nil")
}
if !strings.Contains(err.Error(), "deadline exceeded") &&
!strings.Contains(err.Error(), "context deadline exceeded") {
t.Errorf("Expected deadline exceeded error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_ContextSuccess tests that FetchCourse
// succeeds when context is not canceled.
func TestArticulateParser_FetchCourse_ContextSuccess(t *testing.T) {
testCourse := &models.Course{
ShareID: "test-id",
Course: models.CourseInfo{
Title: "Test Course",
},
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Respond quickly
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Create a context with generous timeout
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
course, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
if err != nil {
t.Fatalf("Expected no error, got: %v", err)
}
if course == nil {
t.Fatal("Expected course, got nil")
}
if course.Course.Title != testCourse.Course.Title {
t.Errorf("Expected title '%s', got '%s'", testCourse.Course.Title, course.Course.Title)
}
}
// TestArticulateParser_FetchCourse_CancellationDuringRequest tests cancellation
// during an in-flight request.
func TestArticulateParser_FetchCourse_CancellationDuringRequest(t *testing.T) {
requestStarted := make(chan bool)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
requestStarted <- true
// Keep the handler running to simulate slow response
time.Sleep(300 * time.Millisecond)
testCourse := &models.Course{
ShareID: "test-id",
}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
ctx, cancel := context.WithCancel(context.Background())
// Start the request in a goroutine
errChan := make(chan error, 1)
go func() {
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
errChan <- err
}()
// Wait for request to start
<-requestStarted
// Cancel after request has started
cancel()
// Get the error
err := <-errChan
if err == nil {
t.Fatal("Expected error due to context cancellation, got nil")
}
// Should contain context canceled somewhere in the error chain
if !strings.Contains(err.Error(), "context canceled") {
t.Errorf("Expected context canceled error, got: %v", err)
}
}
// TestArticulateParser_FetchCourse_MultipleTimeouts tests behavior with
// multiple concurrent requests and timeouts.
func TestArticulateParser_FetchCourse_MultipleTimeouts(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(100 * time.Millisecond)
testCourse := &models.Course{ShareID: "test"}
// Encode errors are ignored in test setup; httptest.ResponseWriter is reliable
_ = json.NewEncoder(w).Encode(testCourse)
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
Logger: NewNoOpLogger(),
}
// Launch multiple requests with different timeouts
tests := []struct {
name string
timeout time.Duration
shouldSucceed bool
}{
{"very short timeout", 10 * time.Millisecond, false},
{"short timeout", 50 * time.Millisecond, false},
{"adequate timeout", 500 * time.Millisecond, true},
{"long timeout", 2 * time.Second, true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), tt.timeout)
defer cancel()
_, err := parser.FetchCourse(ctx, "https://rise.articulate.com/share/test-id")
if tt.shouldSucceed && err != nil {
t.Errorf("Expected success with timeout %v, got error: %v", tt.timeout, err)
}
if !tt.shouldSucceed && err == nil {
t.Errorf("Expected timeout error with timeout %v, got success", tt.timeout)
}
})
}
}

View File

@ -0,0 +1,441 @@
package services
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
"time"
"github.com/kjanat/articulate-parser/internal/models"
)
// TestNewArticulateParser tests the NewArticulateParser constructor.
func TestNewArticulateParser(t *testing.T) {
parser := NewArticulateParser(nil, "", 0)
if parser == nil {
t.Fatal("NewArticulateParser() returned nil")
}
// Type assertion to check internal structure
articulateParser, ok := parser.(*ArticulateParser)
if !ok {
t.Fatal("NewArticulateParser() returned wrong type")
}
expectedBaseURL := "https://rise.articulate.com"
if articulateParser.BaseURL != expectedBaseURL {
t.Errorf("Expected BaseURL '%s', got '%s'", expectedBaseURL, articulateParser.BaseURL)
}
if articulateParser.Client == nil {
t.Error("Client should not be nil")
}
expectedTimeout := 30 * time.Second
if articulateParser.Client.Timeout != expectedTimeout {
t.Errorf("Expected timeout %v, got %v", expectedTimeout, articulateParser.Client.Timeout)
}
}
// TestArticulateParser_FetchCourse tests the FetchCourse method.
func TestArticulateParser_FetchCourse(t *testing.T) {
// Create a test course object
testCourse := &models.Course{
ShareID: "test-share-id",
Author: "Test Author",
Course: models.CourseInfo{
ID: "test-course-id",
Title: "Test Course",
Description: "Test Description",
},
}
// Create test server
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Check request path
expectedPath := "/api/rise-runtime/boot/share/test-share-id"
if r.URL.Path != expectedPath {
t.Errorf("Expected path '%s', got '%s'", expectedPath, r.URL.Path)
}
// Check request method
if r.Method != http.MethodGet {
t.Errorf("Expected method GET, got %s", r.Method)
}
// Return mock response
w.Header().Set("Content-Type", "application/json")
if err := json.NewEncoder(w).Encode(testCourse); err != nil {
t.Fatalf("Failed to encode test course: %v", err)
}
}))
defer server.Close()
// Create parser with test server URL
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
}
tests := []struct {
name string
uri string
expectedError string
}{
{
name: "valid articulate rise URI",
uri: "https://rise.articulate.com/share/test-share-id#/",
},
{
name: "valid articulate rise URI without fragment",
uri: "https://rise.articulate.com/share/test-share-id",
},
{
name: "invalid URI format",
uri: "invalid-uri",
expectedError: "invalid domain for Articulate Rise URI:",
},
{
name: "empty URI",
uri: "",
expectedError: "invalid domain for Articulate Rise URI:",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
course, err := parser.FetchCourse(context.Background(), tt.uri)
if tt.expectedError != "" {
if err == nil {
t.Fatalf("Expected error containing '%s', got nil", tt.expectedError)
}
if !strings.Contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else {
if err != nil {
t.Fatalf("Expected no error, got: %v", err)
}
if course == nil {
t.Fatal("Expected course, got nil")
}
if course.ShareID != testCourse.ShareID {
t.Errorf("Expected ShareID '%s', got '%s'", testCourse.ShareID, course.ShareID)
}
}
})
}
}
// TestArticulateParser_FetchCourse_NetworkError tests network error handling.
func TestArticulateParser_FetchCourse_NetworkError(t *testing.T) {
// Create parser with invalid URL to simulate network error
parser := &ArticulateParser{
BaseURL: "http://localhost:99999", // Invalid port
Client: &http.Client{
Timeout: 1 * time.Millisecond, // Very short timeout
},
}
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/test-share-id")
if err == nil {
t.Fatal("Expected network error, got nil")
}
if !strings.Contains(err.Error(), "failed to fetch course data") {
t.Errorf("Expected error to contain 'failed to fetch course data', got '%s'", err.Error())
}
}
// TestArticulateParser_FetchCourse_InvalidJSON tests invalid JSON response handling.
func TestArticulateParser_FetchCourse_InvalidJSON(t *testing.T) {
// Create test server that returns invalid JSON
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Write is used for its side effect; the test verifies error handling on
// the client side, not whether the write succeeds. Ignore the error since
// httptest.ResponseWriter writes are rarely problematic in test contexts.
_, _ = w.Write([]byte("invalid json"))
}))
defer server.Close()
parser := &ArticulateParser{
BaseURL: server.URL,
Client: &http.Client{
Timeout: 5 * time.Second,
},
}
_, err := parser.FetchCourse(context.Background(), "https://rise.articulate.com/share/test-share-id")
if err == nil {
t.Fatal("Expected JSON parsing error, got nil")
}
if !strings.Contains(err.Error(), "failed to unmarshal JSON") {
t.Errorf("Expected error to contain 'failed to unmarshal JSON', got '%s'", err.Error())
}
}
// TestArticulateParser_LoadCourseFromFile tests the LoadCourseFromFile method.
func TestArticulateParser_LoadCourseFromFile(t *testing.T) {
// Create a temporary test file
testCourse := &models.Course{
ShareID: "file-test-share-id",
Author: "File Test Author",
Course: models.CourseInfo{
ID: "file-test-course-id",
Title: "File Test Course",
Description: "File Test Description",
},
}
// Create temporary directory and file
tempDir := t.TempDir()
tempFile := filepath.Join(tempDir, "test-course.json")
// Write test data to file
data, err := json.Marshal(testCourse)
if err != nil {
t.Fatalf("Failed to marshal test course: %v", err)
}
if err := os.WriteFile(tempFile, data, 0o644); err != nil {
t.Fatalf("Failed to write test file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
tests := []struct {
name string
filePath string
expectedError string
}{
{
name: "valid file",
filePath: tempFile,
},
{
name: "nonexistent file",
filePath: filepath.Join(tempDir, "nonexistent.json"),
expectedError: "failed to read file",
},
{
name: "empty path",
filePath: "",
expectedError: "failed to read file",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
course, err := parser.LoadCourseFromFile(tt.filePath)
if tt.expectedError != "" {
if err == nil {
t.Fatalf("Expected error containing '%s', got nil", tt.expectedError)
}
if !strings.Contains(err.Error(), tt.expectedError) {
t.Errorf("Expected error containing '%s', got '%s'", tt.expectedError, err.Error())
}
} else {
if err != nil {
t.Fatalf("Expected no error, got: %v", err)
}
if course == nil {
t.Fatal("Expected course, got nil")
}
if course.ShareID != testCourse.ShareID {
t.Errorf("Expected ShareID '%s', got '%s'", testCourse.ShareID, course.ShareID)
}
}
})
}
}
// TestArticulateParser_LoadCourseFromFile_InvalidJSON tests invalid JSON file handling.
func TestArticulateParser_LoadCourseFromFile_InvalidJSON(t *testing.T) {
// Create temporary file with invalid JSON
tempDir := t.TempDir()
tempFile := filepath.Join(tempDir, "invalid.json")
if err := os.WriteFile(tempFile, []byte("invalid json content"), 0o644); err != nil {
t.Fatalf("Failed to write test file: %v", err)
}
parser := NewArticulateParser(nil, "", 0)
_, err := parser.LoadCourseFromFile(tempFile)
if err == nil {
t.Fatal("Expected JSON parsing error, got nil")
}
if !strings.Contains(err.Error(), "failed to unmarshal JSON") {
t.Errorf("Expected error to contain 'failed to unmarshal JSON', got '%s'", err.Error())
}
}
// TestExtractShareID tests the extractShareID method.
func TestExtractShareID(t *testing.T) {
parser := &ArticulateParser{}
tests := []struct {
name string
uri string
expected string
hasError bool
}{
{
name: "standard articulate rise URI with fragment",
uri: "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/",
expected: "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
},
{
name: "standard articulate rise URI without fragment",
uri: "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
expected: "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
},
{
name: "URI with trailing slash",
uri: "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO/",
expected: "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
},
{
name: "short share ID",
uri: "https://rise.articulate.com/share/abc123",
expected: "abc123",
},
{
name: "share ID with hyphens and underscores",
uri: "https://rise.articulate.com/share/test_ID-123_abc",
expected: "test_ID-123_abc",
},
{
name: "invalid URI - no share path",
uri: "https://rise.articulate.com/",
hasError: true,
},
{
name: "invalid URI - wrong domain",
uri: "https://example.com/share/test123",
hasError: true,
},
{
name: "invalid URI - no share ID",
uri: "https://rise.articulate.com/share/",
hasError: true,
},
{
name: "empty URI",
uri: "",
hasError: true,
},
{
name: "malformed URI",
uri: "not-a-uri",
hasError: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result, err := parser.extractShareID(tt.uri)
if tt.hasError {
if err == nil {
t.Fatalf("Expected error for URI '%s', got nil", tt.uri)
}
} else {
if err != nil {
t.Fatalf("Expected no error for URI '%s', got: %v", tt.uri, err)
}
if result != tt.expected {
t.Errorf("Expected share ID '%s', got '%s'", tt.expected, result)
}
}
})
}
}
// TestBuildAPIURL tests the buildAPIURL method.
func TestBuildAPIURL(t *testing.T) {
parser := &ArticulateParser{
BaseURL: "https://rise.articulate.com",
}
tests := []struct {
name string
shareID string
expected string
}{
{
name: "standard share ID",
shareID: "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
expected: "https://rise.articulate.com/api/rise-runtime/boot/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO",
},
{
name: "short share ID",
shareID: "abc123",
expected: "https://rise.articulate.com/api/rise-runtime/boot/share/abc123",
},
{
name: "share ID with special characters",
shareID: "test_ID-123_abc",
expected: "https://rise.articulate.com/api/rise-runtime/boot/share/test_ID-123_abc",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := parser.buildAPIURL(tt.shareID)
if result != tt.expected {
t.Errorf("Expected URL '%s', got '%s'", tt.expected, result)
}
})
}
}
// TestBuildAPIURL_DifferentBaseURL tests buildAPIURL with different base URLs.
func TestBuildAPIURL_DifferentBaseURL(t *testing.T) {
parser := &ArticulateParser{
BaseURL: "https://custom.domain.com",
}
shareID := "test123"
expected := "https://custom.domain.com/api/rise-runtime/boot/share/test123"
result := parser.buildAPIURL(shareID)
if result != expected {
t.Errorf("Expected URL '%s', got '%s'", expected, result)
}
}
// BenchmarkExtractShareID benchmarks the extractShareID method.
func BenchmarkExtractShareID(b *testing.B) {
parser := &ArticulateParser{}
uri := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
for b.Loop() {
_, _ = parser.extractShareID(uri)
}
}
// BenchmarkBuildAPIURL benchmarks the buildAPIURL method.
func BenchmarkBuildAPIURL(b *testing.B) {
parser := &ArticulateParser{
BaseURL: "https://rise.articulate.com",
}
shareID := "N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO"
for b.Loop() {
_ = parser.buildAPIURL(shareID)
}
}
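
For reference, the table tests and benchmarks above fully describe what extractShareID and buildAPIURL must do: accept https://rise.articulate.com/share/<id> with an optional "#/" fragment or trailing slash, reject other hosts or missing IDs, and join the base URL with the rise-runtime boot endpoint. Below is a standalone sketch of one way to satisfy those tables, using only the standard library; the apiPathPrefix constant and the free-function form are assumptions, since the real methods hang off *ArticulateParser in internal/services.

package main

import (
    "fmt"
    "net/url"
    "strings"
)

const apiPathPrefix = "/api/rise-runtime/boot/share/"

// extractShareID pulls the share ID out of a rise.articulate.com share URI,
// tolerating a trailing slash or "#/" fragment and rejecting other hosts.
func extractShareID(uri string) (string, error) {
    u, err := url.Parse(uri)
    if err != nil || u.Host != "rise.articulate.com" {
        return "", fmt.Errorf("not a rise.articulate.com share URI: %q", uri)
    }
    path := strings.TrimSuffix(u.Path, "/")
    id := strings.TrimPrefix(path, "/share/")
    if id == path || id == "" || strings.Contains(id, "/") {
        return "", fmt.Errorf("no share ID found in %q", uri)
    }
    return id, nil
}

// buildAPIURL joins the base URL with the boot endpoint used by FetchCourse.
func buildAPIURL(baseURL, shareID string) string {
    return baseURL + apiPathPrefix + shareID
}

func main() {
    id, err := extractShareID("https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/")
    if err != nil {
        panic(err)
    }
    fmt.Println(buildAPIURL("https://rise.articulate.com", id))
}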


@@ -5,7 +5,17 @@ package version
 // Version information.
 var (
     // Version is the current version of the application.
-    Version = "0.1.1"
+    // Breaking changes from 0.4.x:
+    // - Renamed GetSupportedFormat() -> SupportedFormat()
+    // - Renamed GetSupportedFormats() -> SupportedFormats()
+    // - FetchCourse now requires context.Context parameter
+    // - NewArticulateParser now accepts logger, baseURL, timeout
+    // New features:
+    // - Structured logging with slog
+    // - Configuration via environment variables
+    // - Context-aware HTTP requests
+    // - Comprehensive benchmarks and examples.
+    Version = "1.0.0"
     // BuildTime is the time the binary was built.
     BuildTime = "unknown"
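
Read together with the breaking-change notes above, a 0.x caller migrates roughly as follows. This is an illustrative sketch only: the constructor and FetchCourse signatures are taken from this diff, while the surrounding wiring (config, logger, field names on the course) is assumed.

package main

import (
    "context"
    "fmt"

    "github.com/kjanat/articulate-parser/internal/config"
    "github.com/kjanat/articulate-parser/internal/services"
)

func main() {
    cfg := config.Load()
    logger := services.NewTextLogger(cfg.LogLevel)

    // 0.x: parser := services.NewArticulateParser()
    //      course, err := parser.FetchCourse(uri)
    // 1.0: the constructor takes a logger, base URL and timeout,
    //      and FetchCourse requires a context.
    parser := services.NewArticulateParser(logger, cfg.BaseURL, cfg.RequestTimeout)

    uri := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO"
    course, err := parser.FetchCourse(context.Background(), uri)
    if err != nil {
        logger.Error("failed to fetch course", "error", err)
        return
    }
    fmt.Println(course.Course.Title)
}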

main.go

@@ -4,54 +4,84 @@
 package main
 
 import (
+    "context"
     "fmt"
-    "log"
     "os"
+    "strings"
 
+    "github.com/kjanat/articulate-parser/internal/config"
     "github.com/kjanat/articulate-parser/internal/exporters"
+    "github.com/kjanat/articulate-parser/internal/interfaces"
     "github.com/kjanat/articulate-parser/internal/services"
+    "github.com/kjanat/articulate-parser/internal/version"
 )
 
 // main is the entry point of the application.
 // It handles command-line arguments, sets up dependencies,
 // and coordinates the parsing and exporting of courses.
 func main() {
-    // Dependency injection setup
+    os.Exit(run(os.Args))
+}
+
+// run contains the main application logic and returns an exit code.
+// This function is testable as it doesn't call os.Exit directly.
+func run(args []string) int {
+    // Load configuration
+    cfg := config.Load()
+
+    // Dependency injection setup with configuration
+    var logger interfaces.Logger
+    if cfg.LogFormat == "json" {
+        logger = services.NewSlogLogger(cfg.LogLevel)
+    } else {
+        logger = services.NewTextLogger(cfg.LogLevel)
+    }
+
     htmlCleaner := services.NewHTMLCleaner()
-    parser := services.NewArticulateParser()
+    parser := services.NewArticulateParser(logger, cfg.BaseURL, cfg.RequestTimeout)
     exporterFactory := exporters.NewFactory(htmlCleaner)
     app := services.NewApp(parser, exporterFactory)
 
-    // Check for required command-line arguments
-    if len(os.Args) < 4 {
-        fmt.Printf("Usage: %s <source> <format> <output>\n", os.Args[0])
-        fmt.Printf(" source: URI or file path to the course\n")
-        fmt.Printf(" format: export format (%s)\n", joinStrings(app.GetSupportedFormats(), ", "))
-        fmt.Printf(" output: output file path\n")
-        fmt.Println("\nExample:")
-        fmt.Printf(" %s articulate-sample.json markdown output.md\n", os.Args[0])
-        fmt.Printf(" %s https://rise.articulate.com/share/xyz docx output.docx\n", os.Args[0])
-        os.Exit(1)
+    // Check for version flag
+    if len(args) > 1 && (args[1] == "--version" || args[1] == "-v") {
+        fmt.Printf("%s version %s\n", args[0], version.Version)
+        fmt.Printf("Build time: %s\n", version.BuildTime)
+        fmt.Printf("Git commit: %s\n", version.GitCommit)
+        return 0
     }
 
-    source := os.Args[1]
-    format := os.Args[2]
-    output := os.Args[3]
+    // Check for help flag
+    if len(args) > 1 && (args[1] == "--help" || args[1] == "-h" || args[1] == "help") {
+        printUsage(args[0], app.SupportedFormats())
+        return 0
+    }
+
+    // Check for required command-line arguments
+    if len(args) < 4 {
+        printUsage(args[0], app.SupportedFormats())
+        return 1
+    }
+
+    source := args[1]
+    format := args[2]
+    output := args[3]
 
     var err error
 
     // Determine if source is a URI or file path
     if isURI(source) {
-        err = app.ProcessCourseFromURI(source, format, output)
+        err = app.ProcessCourseFromURI(context.Background(), source, format, output)
     } else {
         err = app.ProcessCourseFromFile(source, format, output)
     }
 
     if err != nil {
-        log.Fatalf("Error processing course: %v", err)
+        logger.Error("failed to process course", "error", err, "source", source)
+        return 1
     }
 
-    fmt.Printf("Successfully exported course to %s\n", output)
+    logger.Info("successfully exported course", "output", output, "format", format)
+    return 0
 }
 
 // isURI checks if a string is a URI by looking for http:// or https:// prefixes.
@@ -65,25 +95,17 @@ func isURI(str string) bool {
     return len(str) > 7 && (str[:7] == "http://" || str[:8] == "https://")
 }
 
-// joinStrings concatenates a slice of strings using the specified separator.
+// printUsage prints the command-line usage information.
 //
 // Parameters:
-//   - strs: The slice of strings to join
-//   - sep: The separator to insert between each string
-//
-// Returns:
-//   - A single string with all elements joined by the separator
-func joinStrings(strs []string, sep string) string {
-    if len(strs) == 0 {
-        return ""
-    }
-    if len(strs) == 1 {
-        return strs[0]
-    }
-    result := strs[0]
-    for i := 1; i < len(strs); i++ {
-        result += sep + strs[i]
-    }
-    return result
+//   - programName: The name of the program (args[0])
+//   - supportedFormats: Slice of supported export formats
+func printUsage(programName string, supportedFormats []string) {
+    fmt.Printf("Usage: %s <source> <format> <output>\n", programName)
+    fmt.Printf(" source: URI or file path to the course\n")
+    fmt.Printf(" format: export format (%s)\n", strings.Join(supportedFormats, ", "))
+    fmt.Printf(" output: output file path\n")
+    fmt.Println("\nExample:")
+    fmt.Printf(" %s articulate-sample.json markdown output.md\n", programName)
+    fmt.Printf(" %s https://rise.articulate.com/share/xyz docx output.docx\n", programName)
 }
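
The log assertions in main_test.go below grep captured stdout for "level=INFO" and "level=ERROR", which is exactly what log/slog's text handler emits, so NewTextLogger is presumably a thin wrapper along these lines. Treat this as a sketch: the lowercase names and the slog.Level parameter are assumptions, not the repository's actual API.

package main

import (
    "log/slog"
    "os"
)

// textLogger adapts *slog.Logger to the Info/Error call shape used in run().
type textLogger struct{ l *slog.Logger }

func newTextLogger(level slog.Level) *textLogger {
    h := slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{Level: level})
    return &textLogger{l: slog.New(h)}
}

func (t *textLogger) Info(msg string, args ...any)  { t.l.Info(msg, args...) }
func (t *textLogger) Error(msg string, args ...any) { t.l.Error(msg, args...) }

func main() {
    log := newTextLogger(slog.LevelInfo)
    log.Info("successfully exported course", "output", "output.md", "format", "markdown")
    // Prints something like:
    // time=... level=INFO msg="successfully exported course" output=output.md format=markdown
}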

main_test.go (new file)

@@ -0,0 +1,526 @@
package main
import (
"bytes"
"io"
"log"
"os"
"strings"
"testing"
)
// TestIsURI tests the isURI function with various input scenarios.
func TestIsURI(t *testing.T) {
tests := []struct {
name string
input string
expected bool
}{
{
name: "valid HTTP URI",
input: "http://example.com",
expected: true,
},
{
name: "valid HTTPS URI",
input: "https://example.com",
expected: true,
},
{
name: "valid Articulate Rise URI",
input: "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/",
expected: true,
},
{
name: "local file path",
input: "C:\\Users\\test\\file.json",
expected: false,
},
{
name: "relative file path",
input: "./sample.json",
expected: false,
},
{
name: "filename only",
input: "sample.json",
expected: false,
},
{
name: "empty string",
input: "",
expected: false,
},
{
name: "short string",
input: "http",
expected: false,
},
{
name: "malformed URI",
input: "htp://example.com",
expected: false,
},
{
name: "FTP URI",
input: "ftp://example.com",
expected: false,
},
{
name: "HTTP with extra characters",
input: "xhttp://example.com",
expected: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := isURI(tt.input)
if result != tt.expected {
t.Errorf("isURI(%q) = %v, want %v", tt.input, result, tt.expected)
}
})
}
}
// BenchmarkIsURI benchmarks the isURI function performance.
func BenchmarkIsURI(b *testing.B) {
testStr := "https://rise.articulate.com/share/N_APNg40Vr2CSH2xNz-ZLATM5kNviDIO#/"
for b.Loop() {
isURI(testStr)
}
}
// TestRunWithInsufficientArgs tests the run function with insufficient command-line arguments.
func TestRunWithInsufficientArgs(t *testing.T) {
tests := []struct {
name string
args []string
}{
{
name: "no arguments",
args: []string{"articulate-parser"},
},
{
name: "one argument",
args: []string{"articulate-parser", "source"},
},
{
name: "two arguments",
args: []string{"articulate-parser", "source", "format"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run the function
exitCode := run(tt.args)
// Restore stdout. Close errors are ignored: we've already captured the
// output before closing, and any close error doesn't affect test validity.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: in this test context,
// reading from a pipe that was just closed is not expected to fail, and
// we're verifying the captured output regardless.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code
if exitCode != 1 {
t.Errorf("Expected exit code 1, got %d", exitCode)
}
// Verify usage message is displayed
if !strings.Contains(output, "Usage:") {
t.Errorf("Expected usage message in output, got: %s", output)
}
if !strings.Contains(output, "export format") {
t.Errorf("Expected format information in output, got: %s", output)
}
})
}
}
// TestRunWithHelpFlags tests the run function with help flag arguments.
func TestRunWithHelpFlags(t *testing.T) {
helpFlags := []string{"--help", "-h", "help"}
for _, flag := range helpFlags {
t.Run("help_flag_"+flag, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run with help flag
args := []string{"articulate-parser", flag}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the pipe write end is already
// closed before reading, and any close error doesn't affect the test.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: we successfully wrote
// the help output to the pipe and can verify it regardless of close semantics.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code is 0 (success)
if exitCode != 0 {
t.Errorf("Expected exit code 0 for help flag %s, got %d", flag, exitCode)
}
// Verify help content is displayed
expectedContent := []string{
"Usage:",
"source: URI or file path to the course",
"format: export format",
"output: output file path",
"Example:",
"articulate-sample.json markdown output.md",
"https://rise.articulate.com/share/xyz docx output.docx",
}
for _, expected := range expectedContent {
if !strings.Contains(output, expected) {
t.Errorf("Expected help output to contain %q when using flag %s, got: %s", expected, flag, output)
}
}
})
}
}
// TestRunWithVersionFlags tests the run function with version flag arguments.
func TestRunWithVersionFlags(t *testing.T) {
versionFlags := []string{"--version", "-v"}
for _, flag := range versionFlags {
t.Run("version_flag_"+flag, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run with version flag
args := []string{"articulate-parser", flag}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the version output has already
// been written and we're about to read it; close semantics don't affect correctness.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: the output was successfully
// produced and we can verify its contents regardless of any I/O edge cases.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify exit code is 0 (success)
if exitCode != 0 {
t.Errorf("Expected exit code 0 for version flag %s, got %d", flag, exitCode)
}
// Verify version content is displayed
expectedContent := []string{
"articulate-parser version",
"Build time:",
"Git commit:",
}
for _, expected := range expectedContent {
if !strings.Contains(output, expected) {
t.Errorf("Expected version output to contain %q when using flag %s, got: %s", expected, flag, output)
}
}
})
}
}
// TestRunWithInvalidFile tests the run function with a non-existent file.
func TestRunWithInvalidFile(t *testing.T) {
// Capture stdout and stderr
oldStdout := os.Stdout
oldStderr := os.Stderr
stdoutR, stdoutW, _ := os.Pipe()
stderrR, stderrW, _ := os.Pipe()
os.Stdout = stdoutW
os.Stderr = stderrW
// Also need to redirect log output
oldLogOutput := log.Writer()
log.SetOutput(stderrW)
// Run with non-existent file
args := []string{"articulate-parser", "nonexistent-file.json", "markdown", "output.md"}
exitCode := run(args)
// Restore stdout/stderr and log output. Close errors are ignored: we've already
// written all error messages to these pipes before closing them, and the test
// only cares about verifying the captured output.
_ = stdoutW.Close()
_ = stderrW.Close()
os.Stdout = oldStdout
os.Stderr = oldStderr
log.SetOutput(oldLogOutput)
// Read captured output. Copy errors are ignored: the error messages have been
// successfully written to the pipes, and we can verify the output content
// regardless of any edge cases in pipe closure or I/O completion.
var stdoutBuf, stderrBuf bytes.Buffer
_, _ = io.Copy(&stdoutBuf, stdoutR)
_, _ = io.Copy(&stderrBuf, stderrR)
// Close read ends of pipes. Errors ignored: we've already consumed all data
// from these pipes, and close errors don't affect test assertions.
_ = stdoutR.Close()
_ = stderrR.Close()
// Verify exit code
if exitCode != 1 {
t.Errorf("Expected exit code 1 for non-existent file, got %d", exitCode)
}
// Should have error output in structured log format
output := stdoutBuf.String()
if !strings.Contains(output, "level=ERROR") && !strings.Contains(output, "failed to process course") {
t.Errorf("Expected error message about processing course, got: %s", output)
}
}
// TestRunWithInvalidURI tests the run function with an invalid URI.
func TestRunWithInvalidURI(t *testing.T) {
// Capture stdout and stderr
oldStdout := os.Stdout
oldStderr := os.Stderr
stdoutR, stdoutW, _ := os.Pipe()
stderrR, stderrW, _ := os.Pipe()
os.Stdout = stdoutW
os.Stderr = stderrW
// Also need to redirect log output
oldLogOutput := log.Writer()
log.SetOutput(stderrW)
// Run with invalid URI (will fail because we can't actually fetch)
args := []string{"articulate-parser", "https://example.com/invalid", "markdown", "output.md"}
exitCode := run(args)
// Restore stdout/stderr and log output. Close errors are ignored: we've already
// written all error messages about the invalid URI to these pipes before closing,
// and test correctness only depends on verifying the captured error output.
_ = stdoutW.Close()
_ = stderrW.Close()
os.Stdout = oldStdout
os.Stderr = oldStderr
log.SetOutput(oldLogOutput)
// Read captured output. Copy errors are ignored: the error messages have been
// successfully written and we can verify the failure output content regardless
// of any edge cases in pipe lifecycle or I/O synchronization.
var stdoutBuf, stderrBuf bytes.Buffer
_, _ = io.Copy(&stdoutBuf, stdoutR)
_, _ = io.Copy(&stderrBuf, stderrR)
// Close read ends of pipes. Errors ignored: we've already consumed all data
// and close errors don't affect the validation of the error output.
_ = stdoutR.Close()
_ = stderrR.Close()
// Should fail because the URI is invalid/unreachable
if exitCode != 1 {
t.Errorf("Expected failure (exit code 1) for invalid URI, got %d", exitCode)
}
// Should have error output in structured log format
output := stdoutBuf.String()
if !strings.Contains(output, "level=ERROR") && !strings.Contains(output, "failed to process course") {
t.Errorf("Expected error message about processing course, got: %s", output)
}
}
// TestRunWithValidJSONFile tests the run function with a valid JSON file.
func TestRunWithValidJSONFile(t *testing.T) {
// Create a temporary test JSON file
testContent := `{
"title": "Test Course",
"lessons": [
{
"id": "lesson1",
"title": "Test Lesson",
"blocks": [
{
"type": "text",
"id": "block1",
"data": {
"text": "Test content"
}
}
]
}
]
}`
tmpFile, err := os.CreateTemp("", "test-course-*.json")
if err != nil {
t.Fatalf("Failed to create temp file: %v", err)
}
// Ensure temporary test file is cleaned up. Remove errors are ignored because
// the test has already used the file for its purpose, and cleanup failures don't
// invalidate the test results (the OS will eventually clean up temp files).
defer func() {
_ = os.Remove(tmpFile.Name())
}()
if _, err := tmpFile.WriteString(testContent); err != nil {
t.Fatalf("Failed to write test content: %v", err)
}
// Close the temporary file. Errors are ignored because we've already written
// the test content and the main test logic (loading the file) doesn't depend
// on the success of closing this file descriptor.
_ = tmpFile.Close()
// Test successful run with valid file
outputFile := "test-output.md"
// Ensure test output file is cleaned up. Remove errors are ignored because the
// test has already verified the export succeeded; cleanup failures don't affect
// the test assertions.
defer func() {
_ = os.Remove(outputFile)
}()
// Save original stdout
originalStdout := os.Stdout
defer func() { os.Stdout = originalStdout }()
// Capture stdout
r, w, _ := os.Pipe()
os.Stdout = w
args := []string{"articulate-parser", tmpFile.Name(), "markdown", outputFile}
exitCode := run(args)
// Close write end and restore stdout. Close errors are ignored: we've already
// written the success message before closing, and any close error doesn't affect
// the validity of the captured output or the test assertions.
_ = w.Close()
os.Stdout = originalStdout
// Read captured output. Copy errors are ignored: the success message was
// successfully written to the pipe, and we can verify it regardless of any
// edge cases in pipe closure or I/O synchronization.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Verify successful execution
if exitCode != 0 {
t.Errorf("Expected successful execution (exit code 0), got %d", exitCode)
}
// Verify success message in structured log format
if !strings.Contains(output, "level=INFO") || !strings.Contains(output, "successfully exported course") {
t.Errorf("Expected success message in output, got: %s", output)
}
// Verify output file was created
if _, err := os.Stat(outputFile); os.IsNotExist(err) {
t.Errorf("Expected output file %s to be created", outputFile)
}
}
// TestRunIntegration tests the run function with different output formats using sample file.
func TestRunIntegration(t *testing.T) {
// Skip if sample file doesn't exist
if _, err := os.Stat("articulate-sample.json"); os.IsNotExist(err) {
t.Skip("Skipping integration test: articulate-sample.json not found")
}
formats := []struct {
format string
output string
}{
{"markdown", "test-output.md"},
{"html", "test-output.html"},
{"docx", "test-output.docx"},
}
for _, format := range formats {
t.Run("format_"+format.format, func(t *testing.T) {
// Capture stdout
oldStdout := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
// Run the function
args := []string{"articulate-parser", "articulate-sample.json", format.format, format.output}
exitCode := run(args)
// Restore stdout. Close errors are ignored: the export success message
// has already been written and we're about to read it; close semantics
// don't affect the validity of the captured output.
_ = w.Close()
os.Stdout = oldStdout
// Read captured output. Copy errors are ignored: the output was successfully
// produced and we can verify its contents regardless of any I/O edge cases.
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
output := buf.String()
// Clean up test file. Remove errors are ignored because the test has
// already verified the export succeeded; cleanup failures don't affect
// the test assertions.
defer func() {
_ = os.Remove(format.output)
}()
// Verify successful execution
if exitCode != 0 {
t.Errorf("Expected successful execution (exit code 0), got %d", exitCode)
}
// Verify success message in structured log format
if !strings.Contains(output, "successfully exported course") || !strings.Contains(output, format.output) {
t.Errorf("Expected structured success log mentioning %s in output, got: %s", format.output, output)
}
// Verify output file was created
if _, err := os.Stat(format.output); os.IsNotExist(err) {
t.Errorf("Expected output file %s to be created", format.output)
}
})
}
}
// TestMainFunction tests that the main function exists and is properly structured.
// We can't test os.Exit behavior directly, but we can verify the main function
// calls the run function correctly by testing run function behavior.
func TestMainFunction(t *testing.T) {
// Test that insufficient args return exit code 1
exitCode := run([]string{"program"})
if exitCode != 1 {
t.Errorf("Expected run to return exit code 1 for insufficient args, got %d", exitCode)
}
// Test that main function exists (this is mainly for coverage)
// The main function just calls os.Exit(run(os.Args)), which we can't test directly
// but we've tested the run function thoroughly above.
}
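
The pipe-based stdout capture above is repeated in every test; if it ever needs consolidating, a helper along these lines would do. This is a sketch, not existing code, and it relies only on the bytes, io, os and testing imports the file already has.

// captureStdout runs fn while redirecting os.Stdout to a pipe and returns
// everything fn printed. Close/copy errors are ignored for the same reasons
// documented in the tests above.
func captureStdout(t *testing.T, fn func()) string {
    t.Helper()
    old := os.Stdout
    r, w, err := os.Pipe()
    if err != nil {
        t.Fatalf("Failed to create pipe: %v", err)
    }
    os.Stdout = w
    defer func() { os.Stdout = old }()

    fn()

    _ = w.Close() // unblock the reader below
    var buf bytes.Buffer
    _, _ = io.Copy(&buf, r)
    return buf.String()
}

// Example use inside a test:
//   var exitCode int
//   output := captureStdout(t, func() { exitCode = run(args) })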


@@ -531,6 +531,11 @@ try {
     if ($Failed -gt 0) {
         exit 1
     }
+
+    # Clean up environment variables to avoid contaminating future builds
+    Remove-Item Env:GOOS -ErrorAction SilentlyContinue
+    Remove-Item Env:GOARCH -ErrorAction SilentlyContinue
+    Remove-Item Env:CGO_ENABLED -ErrorAction SilentlyContinue
 } finally {
     Pop-Location
 }


@@ -217,6 +217,14 @@ if [ "$SHOW_TARGETS" = true ]; then
     exit 0
 fi
 
+# Validate Go installation
+if ! command -v go >/dev/null 2>&1; then
+    echo "Error: Go is not installed or not in PATH"
+    echo "Please install Go from https://golang.org/dl/"
+    echo "Or if running on Windows, use the PowerShell script: scripts\\build.ps1"
+    exit 1
+fi
+
 # Validate entry point exists
 if [ ! -f "$ENTRYPOINT" ]; then
     echo "Error: Entry point file '$ENTRYPOINT' does not exist"
@@ -315,7 +323,7 @@ for idx in "${!TARGETS[@]}"; do
     fi
 
     build_cmd+=("${GO_BUILD_FLAGS_ARRAY[@]}" -o "$OUTDIR/$BIN" "$ENTRYPOINT")
-    if CGO_ENABLED=1 GOOS="$os" GOARCH="$arch" "${build_cmd[@]}" 2>"$OUTDIR/$BIN.log"; then
+    if CGO_ENABLED=0 GOOS="$os" GOARCH="$arch" "${build_cmd[@]}" 2>"$OUTDIR/$BIN.log"; then
         update_status $((idx + 1)) '✔' "$BIN done"
         rm -f "$OUTDIR/$BIN.log"
     else
@@ -356,3 +364,6 @@ if [ "$VERBOSE" = true ]; then
     echo "  ────────────────────────────────────────────────"
     printf "  Total: %d/%d successful, %s total size\n" "$success_count" "${#TARGETS[@]}" "$(numfmt --to=iec-i --suffix=B $total_size 2>/dev/null || echo "${total_size} bytes")"
 fi
+
+# Clean up environment variables to avoid contaminating future builds
+unset GOOS GOARCH CGO_ENABLED