diff --git a/CHANGELOG.md b/CHANGELOG.md
index 65af6b5..09ed6f8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,8 @@
-# 0.1.1 (unreleased)
+# 0.2.0 (unreleased)
+* Use Storm 0.9.2. This includes two notable improvements:
+ * We can and do use the Kafka 0.8 compatible Kafka spout included in Storm 0.9.2.
+ * We use ZooKeeper 3.4.5, up from 3.3.x before.
* AvroKafkaSinkBolt should not declare any output fields because it writes to Kafka only, it does not emit any tuples.
diff --git a/README.md b/README.md
index c058242..039c1a5 100644
--- a/README.md
+++ b/README.md
@@ -152,9 +152,9 @@ Kafka topic.
Note that this example will actually run _two_ in-memory instances of ZooKeeper: the first (listening at
`127.0.0.1:2181/tcp`) is used by the Kafka instance, the second (listening at `127.0.0.1:2000/tcp`) is automatically
-started and used by the in-memory Storm cluster. This is because, when running in local aka in-memory mode, Storm does
-not allow you to reconfigure or disable its own ZooKeeper instance (see the [Storm FAQ](#FAQ-Storm) below for further
-information).
+started and used by the in-memory Storm cluster. This is because, when running in local aka in-memory mode, Storm
+until version 0.9.2 does not allow you to reconfigure or disable its own ZooKeeper instance (see the
+[Storm FAQ](#FAQ-Storm) below for further information).
**To stop the demo application you must kill or `Ctrl-C` the process in the terminal.**
@@ -164,7 +164,6 @@ way to get started with such an infrastructure is by deploying Kafka, Storm, and
[Wirbelsturm](https://github.com/miguno/wirbelsturm).
-
# Features
@@ -235,16 +234,9 @@ What features do we showcase in kafka-storm-starter? Note that we focus on show
[custom Kryo serializer for Storm](src/main/scala/com/miguno/kafkastorm/storm/TweetAvroKryoDecorator.scala) that
handles our Avro-derived Java class `Tweet` from [twitter.avsc](src/main/avro/twitter.avsc).
* Unit and integration tests are implemented with [ScalaTest](http://scalatest.org/).
-* We use [ZooKeeper 3.3.4](https://zookeeper.apache.org/) instead of the latest version 3.4.5.
- See section _Known issues_ below for why we do that.
-* We use the Kafka spout [wurstmeister/storm-kafka-0.8-plus](https://github.com/wurstmeister/storm-kafka-0.8-plus).
- Unfortunately that spout is not yet released for Scala 2.10. For that reason [@miguno](https://github.com/miguno/)
- has [forked and branched](https://github.com/miguno/storm-kafka-0.8-plus/tree/miguno_clojars) the code to add Scala
- 2.10 support, and released such a version to [Clojars](https://clojars.org/com.miguno/storm-kafka-0.8-plus_2.10).
- See [build.sbt](build.sbt) for details.
- * _Once Storm 0.9.2 is released we will migrate to the new_
- _[Kafka spout that ships with Storm](https://github.com/apache/incubator-storm/tree/master/external/storm-kafka)_
- _(which is based on the spout developed by wurstmeister)._
+* We use [ZooKeeper 3.4.5](https://zookeeper.apache.org/).
+* We use the [official Kafka spout](https://github.com/apache/incubator-storm/tree/master/external/storm-kafka) of the
+ Storm project, which is compatible with Kafka 0.8.
@@ -433,25 +425,25 @@ To create a normal ("slim") jar:
$ ./sbt clean package
- >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.1.0-SNAPSHOT.jar`
+ >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.2.0-SNAPSHOT.jar`
To create a fat jar, which includes any dependencies of kafka-storm-starter:
$ ./sbt assembly
- >>> Generates `target/scala-2.10/kafka-storm-starter-assembly-0.1.0-SNAPSHOT.jar`
+ >>> Generates `target/scala-2.10/kafka-storm-starter-assembly-0.2.0-SNAPSHOT.jar`
To create a scaladoc/javadoc jar:
$ ./sbt packageDoc
- >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.1.0-SNAPSHOT-javadoc.jar`
+ >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.2.0-SNAPSHOT-javadoc.jar`
To create a sources jar:
$ ./sbt packageSrc
- >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.1.0-SNAPSHOT-sources.jar`
+ >>> Generates `target/scala-2.10/kafka-storm-starter_2.10-0.2.0-SNAPSHOT-sources.jar`
To create API docs:
@@ -525,6 +517,7 @@ contain the messages that are being sent to the Kafka topics) under `/tmp/kafka-
You may need to manually remove this directory in case you want start from a clean state. At the moment the unit tests
do not remove this directory for you.
+
### ZooKeeper exceptions "KeeperException: NoNode for /[ZK path]" logged at INFO level
In short you can normally safely ignore those errors -- it's for a reason they are logged at INFO level and not at ERROR
@@ -567,11 +560,12 @@ for details):
where `zk-port` is the final port chosen.
-As of May 2014 it is not possible to launch a local Storm cluster via `LocalCluster` without its own embedded ZooKeeper.
-Likewise it is not possible to control on which port the embedded ZooKeeper process will listen -- it will always follow
-the `2000/tcp` based algorithm above to set the port. A JIRA ticket was opened to untangle this hard wiring between
-`LocalCluster` and ZooKeeper, cf.
-[STORM-213: Decouple In-Process ZooKeeper from LocalCluster](https://issues.apache.org/jira/browse/STORM-213).
+In Storm versions <= 0.9.2 it is not possible to launch a local Storm cluster via `LocalCluster` without its own embedded
+ZooKeeper. Likewise it is not possible to control on which port the embedded ZooKeeper process will listen -- it will
+always follow the `2000/tcp` based algorithm above to set the port.
+
+In Storm 0.9.3 and later you can configure `LocalCluster` to use a custom ZooKeeper instance, thanks to
+[STORM-213](https://issues.apache.org/jira/browse/STORM-213).
@@ -586,19 +580,6 @@ own code.
## Upstream code
-### Kryo version conflict in Storm
-
-_Note: This problem is resolved in the upcoming 0.9.2 version of Storm._
-
-There is a Kryo version conflict between Storm 0.9.1 (uses Kryo 2.17) and Twitter Chill (uses Kryo 2.21).
-
-In this code project we use the workaround to exclude Kryo (2.21) from the Twitter Chill dependency, but this may not
-be a universal workaround. Twitter have apparently run into data corruption issues with Kryo 2.17, and for that reason
-have built their own version of Storm using Kryo 2.21.
-See [CHILL-173: Kryo version conflict between Chill and Storm 0.9.1-incubating causes Avro serialization to fail](https://github.com/twitter/chill/issues/173)
-for details.
-
-
### ZooKeeper throws InstanceAlreadyExistsException during tests
You will see the following exception when running the integration tests, which you can safely ignore:
@@ -612,49 +593,26 @@ instances trying to use the same JMX setup. Since the JMX setup is not relevant
safely ignored, albeit we'd prefer to come up with a proper fix, of course.
-### ZooKeeper version 3.3.x recommended for use with Storm 0.9.1 and Kafka 0.8.x
-
-_Note: The upcoming version 0.9.2 of Storm uses ZooKeeper 3.4.5._
+### ZooKeeper version 3.3.4 recommended for use with Kafka 0.8
-At the time of writing both Storm (<= 0.9.1) and Kafka (<= 0.8.1.1) are not officially compatible with ZooKeeper 3.4.x
-yet, which is the latest stable version of ZooKeeper. Instead the use of ZooKeeper 3.3.x is recommended.
+At the time of writing Kafka 0.8 is not officially compatible with ZooKeeper 3.4.x, which is the latest stable version
+of ZooKeeper. Instead the Kafka project
+[recommends ZooKeeper 3.3.4](https://kafka.apache.org/documentation.html#zkversion).
So which version of ZooKeeper should you do pick, particularly if you are already running a ZooKeeper cluster for other
parts of your infrastructure (such as an Hadoop cluster)?
**The TL;DR version is:** Try using ZooKeeper 3.4.5 for both Kafka and Storm, but see the caveats and workarounds
-below. If you do run into problems, consider downgrading to ZooKeeper 3.3.6. If that fails, too, try 3.3.4. In the
-worst case use separate ZooKeeper clusters/versions for Storm (3.3.3) and Kafka (3.3.4).
-
-**The longer version is:** Storm versions up to and including 0.9.1 want ZK 3.3.3, but the upcoming 0.9.2 version
-relies on ZooKeeper 3.4.x.
-[All current versions of Kafka still prefer ZK 3.3.4](https://kafka.apache.org/documentation.html#zkversion).
-Generally speaking though, the best 3.3.x version of ZooKeeper is 3.3.6, which is the latest stable 3.3.x version. This
-is because 3.3.6 fixed a number of serious bugs that could lead to data corruption.
-
-_Tip: You can verify against which ZK version the code in this project is actually built by running_
-_`./sbt dependency-graph`._
-
-**The really long version is:** In the _code and tests_ of this project we cannot use ZK 3.4.x just yet because Storm
-0.9.1 is not 100% incompatible with ZK 3.4.x. For instance, Storm will throw errors if you try to run a Storm
-`LocalCluster` (for unit testing) against ZK 3.4.x. At the same time, and somewhat surprisingly, you can run a "real"
-Storm cluster against ZK 3.4.x. For instance, Netflix have reportedly been using ZK 3.4.5 in production since some
-time.
-
-* Storm and ZooKeeper: Storm versions up to and including 0.9.1 are built against ZooKeeper 3.3.3 because of Storm's
- dependency on [Netflix Curator 1.0.1](https://github.com/Netflix/curator). These versions of Zookeeper and Curator
- are very old, and the upcoming Storm 0.9.2 therefore switches to Apache Curator 2.4.0 with ZooKeeper 3.4.x.
-* Kafka and ZooKeeper: LinkedIn recommend the use of ZK 3.3.x but warn against the use of 3.3.3 because that
- version has known serious issues regarding ephemeral node deletion and session expirations. For these reasons
- LinkedIn run ZK 3.3.4 in production.
- See [ZooKeeper version](https://kafka.apache.org/documentation.html#zkversion) in the Kafka documentation.
- Lastly, there is an open Kafka JIRA ticket that covers upgrading Kafka to ZK 3.4.5, see
+below. In the worst case use separate ZooKeeper clusters/versions for Storm (3.4.5) and Kafka (3.3.4). Generally
+speaking though, the best 3.3.x version of ZooKeeper is 3.3.6, which is the latest stable 3.3.x version. This is
+because 3.3.6 fixed a number of serious bugs that could lead to data corruption.
+
+_Tip: You can verify the exact ZK version used in kafka-storm-starter by running `./sbt dependency-graph`._
+
+Notes:
+
+* There is an open Kafka JIRA ticket that covers upgrading Kafka to ZK 3.4.5, see
[KAFKA-854: Upgrade dependencies for 0.8](https://issues.apache.org/jira/browse/KAFKA-854).
-* Storm and Cloudera CDH 4.5:
- * [Storm cannot run in combination with a recent Hadoop/HBase version](http://mail-archives.apache.org/mod_mbox/storm-user/201402.mbox/%3CCADoiZqom8Wuzi9uiqT4d01cTNn2r_nOmXyZyCSqEko-vOyrQBA@mail.gmail.com%3E)
- -- The author ran into problems when using Storm in combination with Cloudera CDH 4. It looks as if he is trying
- to build a code project that lists both Storm and Hadoop/HBase as its dependencies (similar to how we combine
- Storm with Kafka), and due to that runs into ZooKeeper version conflicts as CDH 4 runs ZooKeeper 3.4.5.
* If in a production environment you run into problems when using ZooKeeper 3.4.5 with Storm <= 0.9.1, you can try
a [workaround using Google jarjar](https://groups.google.com/forum/#!topic/storm-user/TVVF_jqvD_A) in order to
deploy ZooKeeper 3.4.5 alongside Storm's/Curator's hard dependency on ZooKeeper 3.3.3.
diff --git a/build.sbt b/build.sbt
index 151b9a1..cf1094d 100644
--- a/build.sbt
+++ b/build.sbt
@@ -25,18 +25,9 @@ resolvers ++= Seq(
libraryDependencies ++= Seq(
"com.twitter" %% "bijection-core" % "0.6.2",
"com.twitter" %% "bijection-avro" % "0.6.2",
- // Chill uses Kryo 2.21, which is not fully compatible with 2.17 (used by Storm).
- // We must exclude the newer Kryo version, otherwise we run into the problem described at
- // https://github.com/thinkaurelius/titan/issues/301.
- //
- // TODO: Once Storm 0.9.2 is released we can update our dependencies to use Chill as-is (without excludes) because
- // Storm then uses Kryo 2.21 (via Carbonite 1.3.3) just like Chill does.
- "com.twitter" %% "chill" % "0.3.6"
- exclude("com.esotericsoftware.kryo", "kryo"),
- "com.twitter" % "chill-avro" % "0.3.6"
- exclude("com.esotericsoftware.kryo", "kryo"),
- "com.twitter" %% "chill-bijection" % "0.3.6"
- exclude("com.esotericsoftware.kryo", "kryo"),
+ "com.twitter" %% "chill" % "0.3.6",
+ "com.twitter" % "chill-avro" % "0.3.6",
+ "com.twitter" %% "chill-bijection" % "0.3.6",
// The excludes of jms, jmxtools and jmxri are required as per https://issues.apache.org/jira/browse/KAFKA-974.
// The exclude of slf4j-simple is because it overlaps with our use of logback with slf4j facade; without the exclude
// we get slf4j warnings and logback's configuration is not picked up.
@@ -44,17 +35,15 @@ libraryDependencies ++= Seq(
exclude("javax.jms", "jms")
exclude("com.sun.jdmk", "jmxtools")
exclude("com.sun.jmx", "jmxri")
- exclude("org.slf4j", "slf4j-simple"),
- "org.apache.storm" % "storm-core" % "0.9.1-incubating" % "provided"
+ exclude("org.slf4j", "slf4j-simple")
+ exclude("log4j", "log4j")
+ exclude("org.apache.zookeeper", "zookeeper"),
+ "org.apache.storm" % "storm-core" % "0.9.2-incubating" % "provided"
+ exclude("org.apache.zookeeper", "zookeeper")
exclude("org.slf4j", "log4j-over-slf4j"),
- // We exclude curator-framework because storm-kafka-0.8-plus recently switched from curator 1.0.1 to 1.3.3, which
- // pulls in a newer version of ZooKeeper with which Storm 0.9.1 is not yet compatible.
- //
- // TODO: Remove the exclude once Storm 0.9.2 is released, because that version depends on a newer version (3.4.x) of
- // ZooKeeper.
- "com.miguno" %% "storm-kafka-0.8-plus" % "0.5.0-SNAPSHOT"
- exclude("com.netflix.curator", "curator-framework"),
- "com.netflix.curator" % "curator-test" % "1.0.1",
+ "org.apache.storm" % "storm-kafka" % "0.9.2-incubating"
+ exclude("org.apache.zookeeper", "zookeeper"),
+ "org.apache.curator" % "curator-test" % "2.4.0",
"com.101tec" % "zkclient" % "0.4",
// Logback with slf4j facade
"ch.qos.logback" % "logback-classic" % "1.1.2",
diff --git a/sonar-project.properties b/sonar-project.properties
index 2d178aa..d88277b 100644
--- a/sonar-project.properties
+++ b/sonar-project.properties
@@ -1,7 +1,7 @@
# Required metadata
sonar.projectKey=com.miguno.kafkastorm:kafka-storm-starter
sonar.projectName=kafka-storm-starter
-sonar.projectVersion=0.1.1-SNAPSHOT
+sonar.projectVersion=0.2.0-SNAPSHOT
# Base configuration of paths
sonar.sources=src/main/java,src/main/scala
diff --git a/src/main/scala/com/miguno/kafkastorm/zookeeper/ZooKeeperEmbedded.scala b/src/main/scala/com/miguno/kafkastorm/zookeeper/ZooKeeperEmbedded.scala
index f7a9e7a..ef46cd1 100644
--- a/src/main/scala/com/miguno/kafkastorm/zookeeper/ZooKeeperEmbedded.scala
+++ b/src/main/scala/com/miguno/kafkastorm/zookeeper/ZooKeeperEmbedded.scala
@@ -1,7 +1,7 @@
package com.miguno.kafkastorm.zookeeper
-import com.netflix.curator.test.TestingServer
import kafka.utils.Logging
+import org.apache.curator.test.TestingServer
/**
* Runs an in-memory, "embedded" instance of a ZooKeeper server.
diff --git a/version.sbt b/version.sbt
index 1be9a63..03a8b07 100644
--- a/version.sbt
+++ b/version.sbt
@@ -1 +1 @@
-version in ThisBuild := "0.1.1-SNAPSHOT"
+version in ThisBuild := "0.2.0-SNAPSHOT"