stormcrawler-solr

A set of Apache Solr resources for StormCrawler that allow you to create topologies that consume from a Solr collection and store metrics, status or parsed content in Solr.

Getting started

Currently, the easiest way to get started is to generate a project from the Solr archetype:

mvn archetype:generate -DarchetypeGroupId=org.apache.stormcrawler -DarchetypeArtifactId=stormcrawler-solr-archetype -DarchetypeVersion=3.2.0-SNAPSHOT

You'll be asked to enter a groupId (e.g. com.mycompany.crawler), an artifactId (e.g. stormcrawler), a version, a package name and details about the user agent to use.

This creates a fully formed project containing a POM with the required dependencies, along with a set of resources, configuration files and sample topology classes. Enter the directory you just created (it should have the same name as the artifactId you specified earlier) and follow the instructions in the README file.

You will also need both Apache Storm (2.8.0) and Apache Solr (9.8.0) installed.
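Before generating or running the project, it can help to confirm that the required tools are reachable from your shell. The following is a minimal sketch, not part of the project itself: the helper name `check_tool` is hypothetical, and it assumes the Maven, Storm and Solr launchers are on your PATH as `mvn`, `storm` and `solr`, which depends on how you installed them.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given command is found on PATH,
# non-zero otherwise.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: the launchers are on PATH under these names.
for tool in mvn storm solr; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

This only checks that each command exists; it does not verify that the installed versions match those listed above.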

Official references:

Available resources