Today I discovered a way to handle IP address changes for a Hadoop cluster managed by CDH 5.3. I think this should also apply to other CDH 5.x releases. It took me a couple of hours to figure this one out :/
I've done this before, but on CDH 4.x. On CDH 4.x, you need to access the DB holding the host information. You can find the tutorial here. Since CDH 5.x, this is no longer valid: even if you change the values directly in the DB, they will get overwritten by Cloudera Manager (CM).
How it works
It helps to have a little background on how CM works in CDH 5.x. If you go to the link mentioned above and follow the steps up to accessing the DB under CDH 5.x, you will notice that the column host_identifier inside the table hosts is no longer the FQDN of each host but some sort of hash.
This hash (or uuid) is actually a random string generated by the Cloudera Manager agent (run by the daemon cloudera-scm-agent). Now, instead of each host being identified by its IP as in CDH 4.x, each host is identified by its uuid. One advantage is that however you change the IP, the change always gets propagated to the CM server (run by the daemon cloudera-scm-server).
When the IP address changes, the next time cloudera-scm-agent sends a heartbeat to its cloudera-scm-server (set in /etc/cloudera-scm-agent/config.ini), it sends its uuid along, and if there are any changes in hostname or IP address, the DB gets updated based on the uuid. Smart compared to CDH 4.x, right?
Conflict
Now suppose several nodes have the same uuid. In my case this happened because, for development, I run a small cluster on ESX: I create one master copy and clone it to the other servers, so every clone has the same uuid. If nodes share a uuid, Cloudera Manager shows a duplicated hostname/IP for that uuid under the Hosts tab (menu).
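To confirm that cloned nodes really share a uuid before changing anything, a small check like the following may help. This is only a sketch: the function name and the "hostname uuid" input format are my own assumptions, and you would have to collect that list yourself (for example by reading the uuid file on each node).

```shell
#!/bin/sh
# find_duplicate_uuids FILE
# FILE holds one "hostname uuid" pair per line (a made-up format for this
# sketch, not something Cloudera Manager produces).
# Prints each uuid that is used by more than one host, followed by those hosts.
find_duplicate_uuids() {
    awk '
        {
            # Append this host to the list kept for its uuid.
            if ($2 in hosts) hosts[$2] = hosts[$2] " " $1
            else hosts[$2] = $1
            count[$2]++
        }
        END {
            for (u in count)
                if (count[u] > 1) print u ": " hosts[u]
        }
    ' "$1"
}
```

Any uuid it prints corresponds to the hosts that CM would show as duplicated entries.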
To solve this, you need to generate a new uuid, but how, since these are automatically managed by CM?
The uuid is kept under /var/lib/cloudera-scm-agent. There are two files there, uuid and response.avro. If you cat uuid, you get the uuid for that particular node. To change it, follow these steps:
- Stop the CM server: service cloudera-scm-server stop
- On all nodes, stop the CM agent: service cloudera-scm-agent hard_stop. hard_stop is needed because supervisord has to be restarted as well, which flushes all the settings pertaining to CM.
- Now remove both uuid and response.avro under /var/lib/cloudera-scm-agent, or just rename them, just in case.
- Make sure /etc/cloudera-scm-agent/config.ini is pointing to the right CM server.
- Start the CM server: service cloudera-scm-server start
- On all nodes, start the CM agent: service cloudera-scm-agent clean_start or service cloudera-scm-agent hard_start.
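The agent-side part of these steps could be scripted roughly as below. Treat it as a sketch under assumptions: the function names and the .bak suffix are mine, the renaming follows the "just rename it" variant of the removal step, and the server_host key is assumed to be the setting in config.ini that points the agent at the CM server.

```shell
#!/bin/sh
# Sketch only: function names and the ".bak" suffix are illustrative,
# not part of Cloudera Manager.

# Print the CM server host that an agent config file points at.
# Assumes the file contains a line of the form "server_host=<host>".
cm_server_host() {
    sed -n 's/^[[:space:]]*server_host[[:space:]]*=[[:space:]]*//p' "$1"
}

# Rename uuid and response.avro in the given agent state directory so the
# agent no longer finds them on its next start.
reset_agent_identity() {
    dir=${1:-/var/lib/cloudera-scm-agent}
    for f in uuid response.avro; do
        if [ -e "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$f.bak"
        fi
    done
}

# Example sequence on one node (as root, with the CM server already stopped):
#   service cloudera-scm-agent hard_stop
#   reset_agent_identity /var/lib/cloudera-scm-agent
#   cm_server_host /etc/cloudera-scm-agent/config.ini   # check this is your CM server
#   service cloudera-scm-agent clean_start
```

Keeping the old files under a different name means you can still look up the previous uuid if something goes wrong.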
Check CM and you should see all the nodes working now.