Skip to content

Latest commit

 

History

History
43 lines (34 loc) · 1.7 KB

File metadata and controls

43 lines (34 loc) · 1.7 KB

Twitter HashTag Count

This topology written in Java using Apache Maven, consists of one spout and two bolts. Basically, the spout tracks the Tweets and sent it to Tweet split bolt which gets the hashtags from Tweets and sent it to the hashtag count bolt. Finally, the results are placed in to a text file.

To build and test the topology locally, run the command below (you need to install Apache Storm Binaries to use this):

cd TwitterWordCount

# mvn compile exec:java -Dstorm.topology=io.github.carloshkayser.twitterwordcount.topologies.TwitterTopology -Dexec.args="keyWord1 keyWord2"

Configure the Twitter API in the src/main/resources/twitter4j.properties file.

oauth.consumerKey=
oauth.consumerSecret=
oauth.accessToken=
oauth.accessTokenSecret=

To build the topology and generate a jar file with our topology, consider the command below:

mvn clean compile assembly:single

To run the topology in a local Apache Storm cluster use the command below:

storm \
  local target/TwitterWordCount-1.0-SNAPSHOT-jar-with-dependencies.jar \
  io.github.carloshkayser.twitterwordcount.topologies.TwitterTopology \
  local keyWord1 keyWord2

To run the topology in an Apache Storm cluster use the command below:

storm \
  jar target/TwitterWordCount-1.0-SNAPSHOT-jar-with-dependencies.jar \
  io.github.carloshkayser.twitterwordcount.topologies.TwitterTopology \
  remote keyWord1 keyWord2

Resources