filebeat
I stopped one of the three nodes in the Kafka cluster, and Filebeat can no longer produce. It says the partitions are leaderless:
2019-01-04T20:31:18.060+0900 INFO kafka/log.go:53 Connected to broker at KAFKA-01:9092 (unregistered)
2019-01-04T20:31:18.067+0900 INFO kafka/log.go:53 client/brokers registered new broker #1 at KAFKA-01:9092
2019-01-04T20:31:18.067+0900 INFO kafka/log.go:53 kafka message: client/metadata found some partitions to be leaderless
2019-01-04T20:31:18.067+0900 INFO kafka/log.go:53 client/metadata retrying after 250ms... (2 attempts remaining)
2019-01-04T20:31:18.318+0900 INFO kafka/log.go:53 client/metadata fetching metadata for [topic-ui5.0-action] from broker KAFKA-01:9092
2019-01-04T20:31:18.321+0900 INFO kafka/log.go:53 kafka message: client/metadata found some partitions to be leaderless
2019-01-04T20:31:18.321+0900 INFO kafka/log.go:53 client/metadata retrying after 250ms... (1 attempts remaining)
Just in case, I ran a topic describe, and sure enough an error-like Leader: -1 shows up. To keep Leader: -1 from appearing, the replication-factor has to be set to the number of nodes, as shown below.
Topic:topic-ui5.0-action PartitionCount:1 ReplicationFactor:1 Configs:
Topic: topic-ui5.0-action Partition: 0 Leader: -1 Replicas: 2 Isr: 2
With replication-factor=3, killing Kafka node 1 produces the following result instead:
Topic:topic-ui5.0-action PartitionCount:3 ReplicationFactor:3 Configs:
Topic: topic-ui5.0-action Partition: 0 Leader: 2 Replicas: 1,2,3 Isr: 1,3,2
Topic: topic-ui5.0-action Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 1,3,2
Topic: topic-ui5.0-action Partition: 2 Leader: 3 Replicas: 3,1,2 Isr: 1,3,2
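When creating a topic, the simplest way to avoid this is to give it the full replication factor from the start. A minimal sketch, assuming the ZooKeeper address and a 3-partition layout:
$ kafka-topics --create --zookeeper localhost:2181 --topic topic-ui5.0-action --partitions 3 --replication-factor 3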
Consumer, Streams
Strangely, Consumers and Streams sometimes stop working even when only one of the three Kafka nodes is killed, and sometimes they keep working. The intermittent cases are the hardest to deal with. So I looked at the Kafka server.log and found the following error logs:
DEBUG [MetadataCache brokerId=1] Error while fetching metadata for __consumer_offsets-1: listener ListenerName(SASL_PLAINTEXT) not found on leader -1 (kafka.server.MetadataCache)
DEBUG [MetadataCache brokerId=1] Error while fetching metadata for __consumer_offsets-19: listener ListenerName(SASL_PLAINTEXT) not found on leader -1 (kafka.server.MetadataCache)
DEBUG [MetadataCache brokerId=1] Error while fetching metadata for __consumer_offsets-28: listener ListenerName(SASL_PLAINTEXT) not found on leader -1 (kafka.server.MetadataCache)
What is the __consumer_offsets topic? Let's run kafka-topics --describe. Strange: the Leader: -1 I hate so much shows up again, and the replication factor is 1, which means no fail-over is possible. What's more, even though offsets.topic.replication.factor=3 is set in server.properties, the ReplicationFactor is still 1. Looks like a bug. (In fact, offsets.topic.replication.factor only applies when __consumer_offsets is first auto-created; changing it afterwards does not touch the existing topic.)
$ kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 1 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 2 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 4 Leader: -1 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 5 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 6 Leader: 1 Replicas: 1 Isr: 1
... # 50 partitions
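For reference, these server.properties settings control the replication of auto-created topics; note they only apply to topics created after the setting is in place:
# server.properties
offsets.topic.replication.factor=3   # replication of __consumer_offsets at creation time
default.replication.factor=3         # replication of other auto-created topics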
The only way to fix the existing topic is to reassign its partitions with a larger replica set. First, write a reassignment JSON file:
# consumer-offsets-replication-factor.json
{"version":1,
"partitions":[
{"topic":"__consumer_offsets", "partition":0, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":1, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":2, "replicas":[1, 2, 3]},
{"topic":"__consumer_offsets", "partition":3, "replicas":[0, 1, 2]},
...
{"topic":"__consumer_offsets", "partition":3, "replicas":[3, 2, 1]} # 50개
$ kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file ./consumer-offsets-replication-factor.json --execute
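The reassignment runs asynchronously; check that it has finished with --verify before describing the topic again:
$ kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file ./consumer-offsets-replication-factor.json --verify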
$ kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,3,2
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 2 Leader: 3 Replicas: 1,2,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1,2,3 Isr: 1,3,2
Topic: __consumer_offsets Partition: 4 Leader: 2 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 5 Leader: 3 Replicas: 1,2,3 Isr: 3,1,2
... # 50 partitions
Connector
I stopped one node in the Kafka cluster, and the connector cannot consume. Again, partitions such as connect-offsets-13 have no leader:
WARN [Consumer clientId=consumer-1, groupId=connect-cluster] 8 partitions have leader brokers without a matching listener, including [connect-offsets-13, connect-offsets-4, connect-offsets-22, connect-offsets-16, connect-offsets-7, connect-offsets-10, connect-offsets-1, connect-offsets-19] (org.apache.kafka.clients.NetworkClient:961)
So I described the topics the connector created, connect.offset, connect.status, and connect.configs, and once again the replication factor is 1:
Topic:connect.configs PartitionCount:1 ReplicationFactor:1 Configs:
Topic: connect.configs Partition: 0 Leader: -1 Replicas: 2 Isr: 2
Looking through the documentation, it says you have to create these topics manually with replication-factor=3, as below. Of course, to do this you first have to delete the existing topics and re-register the connectors afterwards. Reference: http://docs.confluent.io
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-configs --replication-factor 3 --partitions 1 --config cleanup.policy=compact
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-offsets --replication-factor 3 --partitions 50 --config cleanup.policy=compact
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-status --replication-factor 3 --partitions 10 --config cleanup.policy=compact
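For reference, deleting the old topics first looks like this (this assumes delete.topic.enable=true on the brokers):
bin/kafka-topics --delete --zookeeper localhost:2181 --topic connect-configs
bin/kafka-topics --delete --zookeeper localhost:2181 --topic connect-offsets
bin/kafka-topics --delete --zookeeper localhost:2181 --topic connect-status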
Important!! Even after the connector-related topics are set to replication-factor=3, they sometimes end up back at 1 later. My guess is that the __consumer_offsets topic needs to be set to replication-factor=3 first.
Changing topic settings
Changing partitions
kafka-topics --alter --zookeeper ZOOKEEPER-01 --topic topic-ui5.0-action-json --partitions 3
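Note that the partition count can only be increased, never decreased. To confirm the change took effect:
$ kafka-topics --describe --zookeeper ZOOKEEPER-01 --topic topic-ui5.0-action-json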
Changing the replication factor
It is important to vary the first entry of replicas (1, 2, 3) across partitions. If replicas[0]=1, the next leader election will pick broker 1 as the leader; if it is 3, the next election will pick broker 3.
{"version":1,
"partitions":[
{"topic":"topic-ui5.0-action","partition":0,"replicas":[1,2,3]},
{"topic":"topic-ui5.0-action","partition":1,"replicas":[2,3,1]},
{"topic":"topic-ui5.0-action","partition":2,"replicas":[3,2,1]}
]}
kafka-reassign-partitions --zookeeper ZOOKEEPER-01 --reassignment-json-file increase-replication-factor.json --execute
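After the reassignment, you can trigger a preferred leader election so that each partition's replicas[0] actually becomes the leader; in this Kafka generation the tool is kafka-preferred-replica-election (without a JSON file it runs for all partitions):
$ kafka-preferred-replica-election --zookeeper ZOOKEEPER-01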