Merge branch 'kafka' of github.com:davies/spark into kafka
Davies Liu committed Jan 28, 2015
2 parents a74da87 + dc1eed0 commit 9af51c4
Showing 2 changed files with 5 additions and 8 deletions.
11 changes: 4 additions & 7 deletions examples/src/main/python/streaming/kafka_wordcount.py
@@ -19,12 +19,9 @@
 Counts words in UTF8 encoded, '\n' delimited text received from the network every second.
 Usage: network_wordcount.py <zk> <topic>
-To run this on your local machine, you need to setup Kafka and create a producer first
-$ bin/zookeeper-server-start.sh config/zookeeper.properties
-$ bin/kafka-server-start.sh config/server.properties
-$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --partitions 1 --topic test
-$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
+To run this on your local machine, you need to setup Kafka and create a producer first, see
+http://kafka.apache.org/documentation.html#quickstart
 and then run the example
 `$ bin/spark-submit --driver-class-path external/kafka-assembly/target/scala-*/\
 spark-streaming-kafka-assembly-*.jar examples/src/main/python/streaming/kafka_wordcount.py \
@@ -39,7 +39,7 @@

 if __name__ == "__main__":
     if len(sys.argv) != 3:
-        print >> sys.stderr, "Usage: network_wordcount.py <zk> <topic>"
+        print >> sys.stderr, "Usage: kafka_wordcount.py <zk> <topic>"
         exit(-1)
 
     sc = SparkContext(appName="PythonStreamingKafkaWordCount")
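For context, the per-batch logic this example performs on the Kafka stream is the classic flatMap/map/reduceByKey word count. As a rough sketch in plain Python (standing in for the DStream API so the logic can be run in isolation; the function name and sample batch are illustrative, not from the commit):

```python
from collections import Counter

def word_count_batch(lines):
    """Simulate one micro-batch of the streaming word count:
    flatMap(split on spaces) -> map(word -> (word, 1)) -> reduceByKey(add)."""
    words = [w for line in lines for w in line.split(" ") if w]
    return dict(Counter(words))

# One micro-batch of '\n'-delimited text, as the example receives from Kafka.
batch = ["hello spark", "hello kafka"]
print(word_count_batch(batch))  # {'hello': 2, 'spark': 1, 'kafka': 1}
```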
2 changes: 1 addition & 1 deletion python/pyspark/streaming/kafka.py
@@ -74,7 +74,7 @@ def getClassByName(name):
 # TODO: use --jar once it also work on driver
 if not e.message or 'call a package' in e.message:
     print "No kafka package, please put the assembly jar into classpath:"
-    print " $ bin/submit --driver-class-path external/kafka-assembly/target/" + \
+    print " $ bin/spark-submit --driver-class-path external/kafka-assembly/target/" + \
         "scala-*/spark-streaming-kafka-assembly-*.jar"
 raise e
 ser = PairDeserializer(NoOpSerializer(), NoOpSerializer())
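This second hunk only corrects the command shown in an error hint: when the JVM-side Kafka helper cannot be loaded (typically because the assembly jar is not on the classpath), the code catches the error, prints a hint, and re-raises. A minimal Python 3 sketch of that guard pattern, with the loader callable and message check illustrative rather than copied from pyspark:

```python
def load_with_hint(loader):
    """Call a loader; if it fails in a way suggesting a missing assembly jar,
    print a hint before re-raising the original error."""
    try:
        return loader()
    except Exception as e:
        msg = str(e)
        if not msg or "call a package" in msg:
            print("No kafka package, please put the assembly jar into classpath:")
            print("  $ bin/spark-submit --driver-class-path external/kafka-assembly/target/"
                  "scala-*/spark-streaming-kafka-assembly-*.jar")
        raise

# On success the loader's result is returned unchanged.
print(load_with_hint(lambda: 42))  # 42
```

Re-raising (rather than swallowing the error) keeps the original stack trace while still surfacing an actionable message, which is the behavior the patched kafka.py code follows.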
