Hello, everyone!
This post describes a case where a Spark Streaming application cannot connect to Kafka.
Symptom
Exception in thread "main" org.apache.spark.SparkException: java.io.EOFException: Received -1 when reading from channel, socket has probably been closed.
java.io.EOFException: Received -1 when reading from channel, socket has probably been closed.
java.io.EOFException: Received -1 when reading from channel, socket has probably been closed.
    org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
    org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
    scala.util.Either.fold(Either.scala:97)
    org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:365)
    org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:422)
    com.huawei.bigdata.spark.examples.FemaleInfoCollectionPrint$.main(FemaleInfoCollectionPrint.scala:45)
    com.huawei.bigdata.spark.examples.FemaleInfoCollectionPrint.main(FemaleInfoCollectionPrint.scala)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:498)
    org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:762)
    org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:183)
    org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:208)
    org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:123)
    org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Possible cause
The broker list passed to the application uses port 21007, which is Kafka's secure port, so the connection cannot be established.
Solution
When submitting the job, specify port 21005 for the brokers, as follows:
spark-submit --master yarn --jars **.jar --class myclass myown.jar ip1:21005,ip2:21005,ip3:21005
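To catch this mistake before the job is submitted, a small pre-submit check can scan the broker list for the secure port. This is only a minimal sketch: the function name check_brokers is made up for illustration and is not part of Spark or Kafka, and the port number is the one given in this post.

```shell
#!/usr/bin/env bash
# Hypothetical pre-submit check (not part of Spark or Kafka).
# Assumption: 21007 is the Kafka secure port, as described in this post.
SECURE_PORT=21007

# Takes a comma-separated broker list such as "ip1:21005,ip2:21005".
# Reports each entry that uses the secure port on stderr and
# returns 1 if at least one such entry was found, 0 otherwise.
check_brokers() {
    local brokers="$1" status=0 entry
    local IFS=','
    for entry in $brokers; do
        # ${entry##*:} removes everything up to the last colon, leaving the port
        if [ "${entry##*:}" = "$SECURE_PORT" ]; then
            echo "secure port in use: $entry" >&2
            status=1
        fi
    done
    return $status
}

# Example use (placeholders as in the command above):
# check_brokers "ip1:21005,ip2:21005,ip3:21005" && spark-submit ...
```

The check only looks at the port in each entry; it does not test whether the brokers are actually reachable.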
Cause Analysis
In FI C60 and later versions, Kafka runs in security mode, and Spark Streaming in the current cluster cannot connect to Kafka in that mode. To use a Spark Streaming application, you therefore need to turn off the ACL setting in Kafka. If a user-created topic has an ACL set, the Spark Streaming application will also fail to connect. Port 21007 is Kafka's secure port, and the Spark application cannot use it.
That's all, thanks!
