Suffice it to say, on ZooKeeper 3.4.6 (at least) you will have all kinds of grief in a Kerberos-secured cluster if the ZooKeeper nodes can’t be resolved from IP address back to hostname, i.e. if the reverse DNS lookup fails.
Specifically, in the /var/log/krb5kdc.log file, you will see the following…
Feb 25 09:02:48 hdp.howard.local krb5kdc[1416](info): TGS_REQ (6 etypes {18 17 16 23 1 3}) 192.168.56.101:
UNKNOWN_SERVER: authtime 0, zookeeper/[email protected] for zookeeper/[email protected],
Server not found in Kerberos database
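The server principal in that request is presumably built from whatever name the client gets back for the server's address, as the ClientCnxn code quoted further down suggests. The following is only a minimal sketch of that string concatenation, not ZooKeeper's own code: the class and method names are hypothetical, and port 2181 is assumed here as the usual ZooKeeper client port.

```java
import java.net.InetSocketAddress;

// Hypothetical helper that mirrors the principal concatenation in ClientCnxn.
final class SaslPrincipalSketch {

    // Builds "<user>/<host>" the same way the quoted snippet does.
    static String serverPrincipal(String userName, InetSocketAddress addr) {
        return userName + "/" + addr.getHostName();
    }

    public static void main(String[] args) {
        // An unresolved address keeps the name it was given; no DNS query is made.
        InetSocketAddress byName =
                InetSocketAddress.createUnresolved("hdp.howard.local", 2181);
        System.out.println(serverPrincipal("zookeeper", byName));

        // An address built from an IP literal has no stored name, so
        // getHostName() attempts a reverse lookup and falls back to the
        // IP text when that lookup fails.
        InetSocketAddress byIp = new InetSocketAddress("192.168.56.101", 2181);
        System.out.println(serverPrincipal("zookeeper", byIp));
    }
}
```

If the second line prints a principal ending in the raw IP address, the KDC is unlikely to have a matching entry, which would be consistent with the UNKNOWN_SERVER line above.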
When the reverse lookup record is present in DNS, you will see the following…
Feb 25 09:06:48 hdp.howard.local krb5kdc[1416](info): TGS_REQ (6 etypes {18 17 16 23 1 3}) 192.168.56.101:
ISSUE: authtime 1424871759, etypes {rep=18 tkt=18 ses=18}, zookeeper/[email protected]
for zookeeper/[email protected]
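To check from the JVM's point of view whether a reverse record exists for a node, something like the sketch below can help. It is an illustration under assumptions rather than a definitive diagnostic: the class and method names are made up for this post, and the default IP is simply the one from the log lines above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic: asks the JVM what name it finds for an IP literal.
final class ReverseLookupCheck {

    // Returns the name found by reverse lookup, or the IP text if none is found.
    static String reverseName(String ipLiteral) throws UnknownHostException {
        // Parsing an IP literal does not trigger a DNS query by itself.
        InetAddress addr = InetAddress.getByName(ipLiteral);
        // getHostName() performs the reverse lookup for a literal address.
        return addr.getHostName();
    }

    // True when the reverse lookup produced something other than the IP text.
    static boolean hasReverseRecord(String ipLiteral) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(ipLiteral);
        return !addr.getHostName().equals(addr.getHostAddress());
    }

    public static void main(String[] args) throws UnknownHostException {
        String ip = args.length > 0 ? args[0] : "192.168.56.101";
        System.out.println(ip + " -> " + reverseName(ip)
                + (hasReverseRecord(ip) ? "" : " (no reverse record found)"));
    }
}
```

Run it on the client machine with each ZooKeeper node's IP as the argument. The JVM uses the system resolver, so an /etc/hosts entry may satisfy the lookup just as a DNS PTR record would.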
There was a performance-related issue with reverse lookups, but I don’t think it is the same one.
Regardless, the snippet below from the org/apache/zookeeper/ClientCnxn class is the code that makes the call: addr.getHostName() supplies the host part of the server principal…
private void startConnect() throws IOException {
    if(!isFirstConnect){
        try {
            Thread.sleep(r.nextInt(1000));
        } catch (InterruptedException e) {
            LOG.warn("Unexpected exception", e);
        }
    }
    state = States.CONNECTING;
    InetSocketAddress addr;
    if (rwServerAddress != null) {
        addr = rwServerAddress;
        rwServerAddress = null;
    } else {
        addr = hostProvider.next(1000);
    }
    setName(getName().replaceAll("\\(.*\\)",
            "(" + addr.getHostName() + ":" + addr.getPort() + ")"));
    if (ZooKeeperSaslClient.isEnabled()) {
        try {
            String principalUserName = System.getProperty(
                    ZK_SASL_CLIENT_USERNAME, "zookeeper");
            zooKeeperSaslClient = new ZooKeeperSaslClient(principalUserName+"/"+addr.getHostName());
        } catch (LoginException e) {