`okh-scraper`

A stand-alone service that scrapes (crawls) Open Source Hardware (OSH) projects from different platforms and other hosting technologies.
The collected data conforms to the Open Know-How (OKH) and the Open Dataset (ODS) standards.

Usage

1. Fill out the config file (`config.yml`)
2. Start the scraper

Once started, it continuously collects and updates OSH project data found on the supported and configured platforms and other locations. The fetched and converted data, plus the related scraping metadata, are stored in structured, text-based file formats (JSON, TOML, YAML and Turtle), then committed and pushed to a git repo. That repo is then synced with its forks, which other instances of this scraper push to directly.

If configured correctly, this constitutes a distributed scraping mechanism: the different scraper instances should ideally all fetch from all platforms, but each should cover a different "section" of the total set of hosted projects. For example, Thingiverse hosts about 3 million projects as of early 2025; one crawler would graze the ID range 0 to 499'999, the second one 500'000 to 999'999, and so on.
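The ID-range split described above can be sketched in a few lines of Rust. This is only an illustration of the arithmetic, not code taken from the scraper: the `IdRange` type, the `id_range_for_instance` function and its parameters are hypothetical names, and the real scraper may configure its ranges differently (for example directly in `config.yml`).

```rust
/// Inclusive range of project IDs assigned to one scraper instance.
/// (Hypothetical type, used for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IdRange {
    pub first: u64,
    pub last: u64,
}

/// Splits the IDs `0..total_ids` into consecutive chunks of `chunk_size` IDs
/// and returns the chunk for the instance with the given zero-based index.
/// Returns `None` if `chunk_size` is zero or the index lies beyond the last chunk.
pub fn id_range_for_instance(total_ids: u64, chunk_size: u64, instance: u64) -> Option<IdRange> {
    if chunk_size == 0 {
        return None;
    }
    // `checked_mul` guards against overflow for absurdly large indices.
    let first = instance.checked_mul(chunk_size)?;
    if first >= total_ids {
        return None;
    }
    // The last chunk may be shorter than `chunk_size`.
    let last = first.saturating_add(chunk_size - 1).min(total_ids - 1);
    Some(IdRange { first, last })
}

fn main() {
    // Roughly the Thingiverse figures mentioned above.
    let total_ids = 3_000_000;
    let chunk_size = 500_000;

    let mut instance = 0;
    while let Some(range) = id_range_for_instance(total_ids, chunk_size, instance) {
        println!("instance {instance}: IDs {} to {}", range.first, range.last);
        instance += 1;
    }
}
```

With these numbers, six instances would cover the whole ID space without overlap; each operator would only need to know their own instance index and the agreed chunk size.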

Building

# To get a binary for your system
cargo build --release

# To get a 64-bit binary that is portable to all Linux systems
run/rp/build

Testing

To run the unit, doc and integration tests:

run/rp/test
