Kafka is a distributed, partitioned, and replicated publish-subscribe messaging system that provides features similar to those of the Java Message Service (JMS). It offers message persistence, high throughput, distributed operation, multi-client support, and real-time processing, and it supports both online and offline message consumption. It is well suited to data collection for Internet services, such as conventional data collection, website activity tracking, aggregation of operational data for statistics systems (monitoring data), and log collection.
After a Kafka Broker receives a message, it persists the message to disk. In addition, each partition of a topic has multiple replicas stored on different Broker nodes, so if one node fails, the replicas on the other nodes can still be used.
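
The replica placement described above can be illustrated with a small simulation. This is a simplified sketch, not Kafka's actual assignment algorithm: the function names `assign_replicas` and `available_replicas`, the Broker IDs, and the round-robin placement are all assumptions made for illustration. It shows only the idea that replicas of one partition sit on different Brokers, so a single Broker failure leaves at least one replica of every partition.

```python
from typing import Dict, List


def assign_replicas(
    num_partitions: int, brokers: List[int], replication_factor: int
) -> Dict[int, List[int]]:
    """Place each partition's replicas on distinct brokers (hypothetical round-robin)."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Consecutive brokers, wrapping around, so no broker repeats within a partition.
        assignment[partition] = [
            brokers[(partition + offset) % len(brokers)]
            for offset in range(replication_factor)
        ]
    return assignment


def available_replicas(
    assignment: Dict[int, List[int]], failed_broker: int
) -> Dict[int, List[int]]:
    """Return the replicas of each partition that remain after one broker fails."""
    return {
        partition: [broker for broker in replicas if broker != failed_broker]
        for partition, replicas in assignment.items()
    }


# Three partitions, three brokers (IDs 1-3 are made up), two replicas per partition.
assignment = assign_replicas(num_partitions=3, brokers=[1, 2, 3], replication_factor=2)
survivors = available_replicas(assignment, failed_broker=2)
print(assignment)
print(survivors)
```

With a replication factor of 2 in this sketch, every partition still has at least one replica on a healthy Broker after Broker 2 fails.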