A Node.js tool to scrape Confluence spaces and convert them to Markdown files while preserving the page hierarchy.
- 📚 Scrape all Confluence spaces or a single space
- 🔄 Convert Confluence storage format to Markdown
- 📁 Preserve page hierarchy in directory structure
- 🔁 Handle rate limiting with exponential backoff
- 🔗 Maintain page relationships and ordering
# Install dependencies
pnpm install
# Configure your Confluence instance
# Edit utils/index.js:
export const BASE_URL = "http://your-confluence-instance/rest/api";
export const ACCESS_TOKEN = "your-personal-access-token";
# Scrape all spaces
pnpm space:all
# Or scrape a specific space
pnpm space:single ENGINEERING
This project follows a documented decision-making process. Key architectural decisions:
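- API integration (docs/api-integration.md)
  - Native fetch with backoff
  - Centralized API client
  - Type-safe responses
- File structure (docs/file-structure.md)
  - Feature-based organization
  - Clear separation of concerns
  - Consistent patterns
- Error handling (docs/error-handling.md)
  - Centralized error handling
  - Retry mechanisms
  - Consistent error messages
- CLI interface (docs/cli-interface.md)
  - Command-based interface
  - Progress feedback
  - Clear usage instructions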
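As a rough illustration of the command-based interface, the sketch below shows how a script such as the one behind `pnpm space:single ENGINEERING` might read and validate its argument. The function name `parseSpaceKey`, the usage message, and the key pattern (alphanumeric, with an optional leading `~` for personal spaces) are assumptions for illustration, not the project's actual code.

```javascript
// Hypothetical argument parser; not taken from scripts/all-space-content.js.
// `argv` is expected to look like process.argv: [node, script, ...args].
function parseSpaceKey(argv) {
  const [key] = argv.slice(2);
  if (!key) {
    // No argument given: return usage instructions instead of throwing.
    return { ok: false, message: "Usage: pnpm space:single <SPACE_KEY>" };
  }
  // Assumed key pattern; adjust it if your instance allows other characters.
  if (!/^~?[A-Za-z0-9._-]+$/.test(key)) {
    return { ok: false, message: `Invalid space key: ${key}` };
  }
  return { ok: true, spaceKey: key };
}
```

A script could call `parseSpaceKey(process.argv)` and print `message` before exiting with a non-zero code when `ok` is false.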
.
├── scripts/                     # CLI Commands
│   ├── all-spaces.js            # Scrape all spaces
│   └── all-space-content.js     # Scrape single space
├── utils/                       # Shared Utilities
│   └── index.js                 # API client, helpers
└── docs/                        # Documentation
    ├── api-examples.md
    ├── api-integration.md
    ├── cli-interface.md
    ├── error-handling.md
    ├── file-structure.md
# Format code
pnpm format
The scraper creates a directory structure that mirrors your Confluence space:
confluence_markdown/
├── SPACE1/
│   ├── home/
│   │   ├── index.md (Space homepage)
│   │   └── Other Root Pages.md
│   └── Parent Page/
│       ├── index.md (Parent page content)
│       └── Child Page.md
└── SPACE2/
    └── ...
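The layout above can be read as a simple mapping rule: the space homepage and childless root pages go under `home/`, a page with children becomes a directory holding an `index.md`, and a leaf page becomes a `.md` file inside its parent's directory. The sketch below encodes that reading. The function name `pagePath`, the page object shape (`title`, `ancestors`, `hasChildren`, `isHomepage`), and the title sanitizing are assumptions, not the scraper's actual implementation.

```javascript
// Hypothetical helper illustrating the directory layout shown above.
// `page.ancestors` lists parent titles, outermost first, excluding the space homepage.
function pagePath(spaceKey, page, outputDir = "confluence_markdown") {
  // Replace characters that are unsafe in file names (an assumed rule).
  const clean = (title) => title.replace(/[\/\\:*?"<>|]/g, "-").trim();
  const parts = [outputDir, spaceKey];
  const ancestors = page.ancestors || [];

  if (page.isHomepage) {
    parts.push("home", "index.md");
  } else if (ancestors.length === 0 && !page.hasChildren) {
    // Root page without children sits next to the homepage.
    parts.push("home", `${clean(page.title)}.md`);
  } else {
    parts.push(...ancestors.map(clean));
    if (page.hasChildren) {
      // A parent page becomes a directory with its own content in index.md.
      parts.push(clean(page.title), "index.md");
    } else {
      parts.push(`${clean(page.title)}.md`);
    }
  }
  return parts.join("/");
}
```

For example, a page titled `Child Page` with `ancestors: ["Parent Page"]` in space `SPACE1` maps to `confluence_markdown/SPACE1/Parent Page/Child Page.md`.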
Configure your Confluence instance in utils/index.js:
export const BASE_URL = "http://your-confluence-instance/rest/api";
export const ACCESS_TOKEN = "your-personal-access-token";
export const OUTPUT_DIR = "confluence_markdown";
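To show how these settings might be consumed, here is a minimal sketch that checks the configuration and builds a request. It assumes the personal access token is sent as a `Bearer` token in the `Authorization` header, which is how Confluence Data Center accepts personal access tokens. The helper names `assertConfigured` and `buildRequest` are hypothetical and do not come from utils/index.js.

```javascript
// Hypothetical helpers; `config` mirrors the exports in utils/index.js.
function assertConfigured(config) {
  // Treat the placeholder values from the template as "not configured".
  if (!config.BASE_URL || config.BASE_URL.includes("your-confluence-instance")) {
    throw new Error("BASE_URL is not configured");
  }
  if (!config.ACCESS_TOKEN || config.ACCESS_TOKEN === "your-personal-access-token") {
    throw new Error("ACCESS_TOKEN is not configured");
  }
}

function buildRequest(config, path, params = {}) {
  assertConfigured(config);
  // Join base URL and path without doubling or dropping slashes.
  const url = new URL(`${config.BASE_URL.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  return {
    url: url.toString(),
    options: {
      headers: {
        Authorization: `Bearer ${config.ACCESS_TOKEN}`, // assumed auth scheme
        Accept: "application/json",
      },
    },
  };
}
```

The result can be passed straight to `fetch(url, options)`. Keeping URL building separate from the network call also makes the missing-configuration case easy to detect before any request is sent.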
The scraper handles several error cases:
- Rate limiting (429) with exponential backoff
- Network errors with retries
- Invalid space keys
- Missing configuration
- File system errors
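The rate-limiting and network-retry behavior listed above can be sketched as a small wrapper around native `fetch` (Node.js 18 or later). This is an illustration under assumptions, not the project's code: the name `fetchWithBackoff`, the default of 5 retries, the 500 ms base delay, and the use of the `Retry-After` header are all chosen for the example.

```javascript
// Hypothetical retry wrapper; defaults are illustrative, not the scraper's values.
// `fetchImpl` and `sleep` are injectable so the logic can be tested without a network.
async function fetchWithBackoff(url, options = {}, settings = {}) {
  const {
    retries = 5,
    baseDelayMs = 500,
    fetchImpl = fetch,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = settings;

  for (let attempt = 0; ; attempt++) {
    let response;
    try {
      response = await fetchImpl(url, options);
    } catch (err) {
      // Network error: retry with a doubling delay until retries run out.
      if (attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
      continue;
    }
    if (response.status !== 429) return response;
    if (attempt >= retries) {
      throw new Error(`Rate limited after ${retries} retries: ${url}`);
    }
    // Prefer the server's Retry-After (seconds) when it is a positive number.
    const retryAfter = Number(response.headers.get("Retry-After"));
    const delayMs = retryAfter > 0 ? retryAfter * 1000 : baseDelayMs * 2 ** attempt;
    await sleep(delayMs);
  }
}
```

With the defaults, successive waits are 500 ms, 1 s, 2 s, 4 s, and 8 s before the call gives up. Adding random jitter to each delay is a common refinement when several requests may be throttled at once.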
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
ISC
- markdown-it for Markdown conversion
- jsdom for HTML parsing