A TypeScript tool for building a curated catalog of European and open-source software alternatives.
This project scrapes government-vetted and expert-curated software catalogs to build a comprehensive dataset for sovereignsky.no - a website helping organizations find digital sovereignty-friendly software alternatives.
We prioritize government-vetted catalogs (trust level 5) for quality and legal compliance:
| Source | Country | Description |
|---|---|---|
| SILL France | FR | French government open source catalog |
| openCode.de | DE | German public sector software catalog |
| Developers Italia | IT | Italian government software catalog |
| EU OSS Catalogue | EU | European Commission open source catalog |
See catalog/software-sources.json for the complete source registry with trust levels.
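As a rough illustration of how a registry with trust levels might be consumed, the sketch below filters sources by a minimum trust level. The `SourceEntry` shape, its field names (`id`, `country`, `trustLevel`), and the sample ids are assumptions made for this example and may not match the actual schema in catalog/software-sources.json.

```typescript
// Hypothetical shape of a registry entry; the real schema may differ.
interface SourceEntry {
  id: string;
  country: string;
  trustLevel: number; // 5 = government-vetted, as described above
}

// Sample data mirroring the table above (ids are illustrative).
const sources: SourceEntry[] = [
  { id: "sill-france", country: "FR", trustLevel: 5 },
  { id: "opencode-de", country: "DE", trustLevel: 5 },
  { id: "developers-italia", country: "IT", trustLevel: 5 },
  { id: "eu-oss-catalogue", country: "EU", trustLevel: 5 },
  // Made-up lower-trust entry, included only to show the filter working.
  { id: "example-community-list", country: "EU", trustLevel: 3 },
];

// Keep only sources whose trust level is at or above the threshold.
function sourcesAtOrAbove(entries: SourceEntry[], minTrust: number): SourceEntry[] {
  return entries.filter((s) => s.trustLevel >= minTrust);
}

const governmentVetted = sourcesAtOrAbove(sources, 5);
console.log(governmentVetted.map((s) => s.id).join(", "));
```

Keeping the threshold as a parameter would make it easy to widen the selection to lower-trust sources without changing the filter itself.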
- Multi-taxonomy categorization - Business, technical, developer, and platform categories
- Alternative mapping - Connect proprietary tools to open-source alternatives
- Government focus - Prioritize EU-based, GDPR-compliant solutions
- Automated validation - 135 tests ensure data quality
- Modular architecture - Domain-based structure for multi-domain support
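The alternative mapping feature can be pictured as an inverted index from proprietary tool names to the open-source products that replace them. The following is a minimal sketch under assumed names: the `Product` interface, the `alternativeTo` field, and `buildAlternativeIndex` are hypothetical and not taken from the project's code, and the sample records are illustrative only.

```typescript
// Hypothetical, simplified product record; the real catalog schema may differ.
interface Product {
  name: string;
  openSource: boolean;
  alternativeTo: string[]; // proprietary tools this product can replace
}

// Illustrative sample records.
const products: Product[] = [
  { name: "Nextcloud", openSource: true, alternativeTo: ["Dropbox", "Google Drive"] },
  { name: "Mattermost", openSource: true, alternativeTo: ["Slack"] },
  { name: "Element", openSource: true, alternativeTo: ["Slack", "Microsoft Teams"] },
];

// Invert the product list into a lookup keyed by lowercase proprietary tool name.
function buildAlternativeIndex(items: Product[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const item of items) {
    for (const target of item.alternativeTo) {
      const key = target.toLowerCase();
      const existing = index.get(key) ?? [];
      existing.push(item.name);
      index.set(key, existing);
    }
  }
  return index;
}

const alternatives = buildAlternativeIndex(products);
console.log((alternatives.get("slack") ?? []).join(", ")); // prints "Mattermost, Element"
```

Lowercasing the keys is one simple way to make lookups tolerant of inconsistent capitalization across sources.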
All commands run inside the devcontainer. See CLAUDE.md.
# Install dependencies
npm install
# Run a scraper
npm run scrape euro-stack
# Run tests
npm test
# Build catalog
npm run catalog:build
# Validate
npm run catalog:validate

| Document | Description |
|---|---|
| CLAUDE.md | LLM instructions, devcontainer setup |
| docs/DATA-FLOW.md | Data pipeline, scripts, test suite, weekly workflow |
| docs/FOLDER-STRUCTURE.md | Project directory organization |
| docs/software-catalog-spec.md | Schema specification |
| src/domains/software/scrapers/README.md | Scraper development guide |
| catalog/README.md | Catalog schema documentation |
src/
├── domains/software/ # Software catalog domain
│ ├── scrapers/ # Individual scraper implementations
│ ├── catalog/ # Modular catalog building (sources, enrichments, vendors)
│ └── lib/ # Domain utilities (category-resolver)
├── lib/ # Shared utilities (http, logger, text, dates)
├── types/ # Shared TypeScript interfaces
├── commands/ # CLI tools (check-duplicates, suggest-canonical)
└── catalog/ # Catalog build orchestration
scraped/normalized/ # Scraper output (JSON files)
catalog/ # Canonical catalog data and schemas
data/software/ # Deliverable output (for Hugo/PWA)
tests/ # Integration tests
docs/ # Documentation
See docs/FOLDER-STRUCTURE.md for complete structure.
| Scraper | Products | Description |
|---|---|---|
| euro-stack | 1,103 | European software alternatives |
| switching-software | 130 | Privacy-focused alternatives |
| cncf-landscape | 1,354 | Cloud-native infrastructure |
| cloud-service-map | 575 | AWS/Azure/GCP equivalents |
| openalternative | 656 | Open source alternatives |
| sill-france | 625 | French government catalog |
- Create folder src/domains/software/scrapers/<source>/
- Implement index.ts with run() returning Software[]
- Use CategoryResolver for category mapping
- Add to test suite in src/domains/software/scrapers/output-validation.test.ts
- Run npm test to validate
See src/domains/software/scrapers/SCRAPER-TEMPLATE.md for the full checklist.
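The steps above can be sketched roughly as follows. This is not the project's actual scraper code: the `Software` interface, `SimpleCategoryResolver`, `RawRow`, and the `example-source` id are local stand-ins invented so the example is self-contained, and the real `Software` type and `CategoryResolver` API may look different. Here `run()` takes already-parsed rows and is synchronous for brevity; a real scraper would presumably fetch its input first.

```typescript
// Local stand-in for the shared Software type (assumed fields).
interface Software {
  name: string;
  url: string;
  categories: string[];
  source: string;
}

// Hypothetical minimal resolver: maps raw source labels to catalog categories.
class SimpleCategoryResolver {
  private readonly mapping: Record<string, string>;

  constructor(mapping: Record<string, string>) {
    this.mapping = mapping;
  }

  resolve(raw: string): string {
    return this.mapping[raw.trim().toLowerCase()] ?? "uncategorized";
  }
}

// Raw rows as a scraper might see them after fetching and parsing a source.
interface RawRow {
  title: string;
  link: string;
  tag: string;
}

const resolver = new SimpleCategoryResolver({
  "file sync": "storage",
  chat: "communication",
});

// Normalizes raw rows into Software[]; rows without a title are dropped.
function run(rows: RawRow[]): Software[] {
  return rows
    .filter((row) => row.title.trim().length > 0)
    .map((row) => ({
      name: row.title.trim(),
      url: row.link,
      categories: [resolver.resolve(row.tag)],
      source: "example-source", // hypothetical source id
    }));
}

const result = run([
  { title: " Nextcloud ", link: "https://nextcloud.com", tag: "File Sync" },
  { title: "", link: "https://example.org", tag: "chat" },
]);
console.log(result.length, result[0].name, result[0].categories[0]); // prints "1 Nextcloud storage"
```

Routing every raw label through a resolver with a fallback category keeps unmapped labels visible in the output instead of silently dropping records.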
npm test # Run all tests
npm run test:watch # Watch mode
npm run test:coverage # With coverage

See docs/DATA-FLOW.md#test-suite for test details.
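To give a feel for the kind of check an output validation test might perform, here is a minimal sketch. The `SoftwareRecord` shape and the three rules (non-empty unique names, http(s) URLs, at least one category) are assumptions chosen for illustration, not the project's actual test rules.

```typescript
// Hypothetical minimal record; the real Software type likely has more fields.
interface SoftwareRecord {
  name: string;
  url: string;
  categories: string[];
}

// Returns a list of human-readable problems; an empty list means the data passed.
function validateRecords(records: SoftwareRecord[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  records.forEach((record, i) => {
    const key = record.name.trim().toLowerCase();
    if (key.length === 0) {
      problems.push(`record ${i}: empty name`);
    } else if (seen.has(key)) {
      problems.push(`record ${i}: duplicate name "${record.name}"`);
    }
    seen.add(key);
    if (!/^https?:\/\//.test(record.url)) {
      problems.push(`record ${i}: url is not http(s)`);
    }
    if (record.categories.length === 0) {
      problems.push(`record ${i}: no categories`);
    }
  });
  return problems;
}

const sample: SoftwareRecord[] = [
  { name: "Nextcloud", url: "https://nextcloud.com", categories: ["storage"] },
  { name: "nextcloud", url: "ftp://example.org", categories: [] },
];
console.log(validateRecords(sample).length); // prints 3
```

Collecting all problems rather than failing on the first one tends to make a validation report more useful when a scraper's output format changes.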
- SILL France scraper (government catalog) - Implemented
- openCode.de scraper (government catalog)
- awesome-selfhosted scraper
- CI/CD pipeline