Today I discovered a way to handle IP address changes for a Hadoop cluster managed by CDH 5.3. I believe this also applies to other CDH 5.x releases. It took me a couple of hours to figure this one out :/
I've done this before, but that was on CDH 4.x, where you need to edit the database holding the hosts information. You can find that tutorial here. Since CDH 5.x this is no longer valid: even if you change the values in the DB, they get overwritten by Cloudera Manager (CM).
How it works
It's best to get a little background on how CM works in CDH 5.x first. If you went to the link mentioned above and followed the steps up to accessing the DB on CDH 5.x, you will notice that the column host_identifier inside the hosts table is no longer the FQDN of each host but instead some sort of hash. This hash (actually a uuid) is a random string generated by the Cloudera Manager agent (run by the daemon cloudera-scm-agent). Instead of each host being identified by its IP as in CDH 4.x, each host is now identified by its uuid. One advantage is that however you change the IP, the change always gets propagated to the CM server (run by the daemon cloudera-scm-server).
When an IP address changes, the next time cloudera-scm-agent sends a heartbeat to its cloudera-scm-server (set in /etc/cloudera-scm-agent/config.ini), it includes its uuid; if the hostname or IP address has changed, the DB gets updated based on the uuid. Smart compared to CDH 4.x, right?
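A quick way to see both halves of this mechanism is to look at the agent's uuid file and the server setting in config.ini. The snippet below is a runnable sketch: it recreates the layout in a temp directory so you can try it anywhere, and the uuid value and server hostname are placeholders (on a real node you would cat the files under /var/lib/cloudera-scm-agent and /etc/cloudera-scm-agent directly).

```shell
# Mimic the agent's on-disk identity in a temp dir (placeholder values).
AGENT_DIR=$(mktemp -d)
echo "8a7d1c2e-0000-0000-0000-000000000000" > "$AGENT_DIR/uuid"

# The agent sends this uuid with every heartbeat; CM keys the hosts table on it.
cat "$AGENT_DIR/uuid"

# The server the agent reports to is set in config.ini ([General] section).
cat > "$AGENT_DIR/config.ini" <<'EOF'
[General]
server_host=cm-server.example.com
server_port=7182
EOF
grep server_host "$AGENT_DIR/config.ini"
```

On a live node, the two files to look at are /var/lib/cloudera-scm-agent/uuid and /etc/cloudera-scm-agent/config.ini.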
Now suppose several nodes end up with the same uuid. In my case this happened because, for development, I run a small cluster on ESX: I create one master copy and clone it to the other servers, so every clone carries the same uuid. When that happens, you will see duplicated hostnames/IPs for that uuid in Cloudera Manager under the Hosts tab (menu). To solve this you need to generate new uuids, but how, since these are managed automatically by CM?
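Before fixing anything, it helps to confirm which nodes share a uuid. Here is a minimal sketch: it assumes you have already gathered each host's /var/lib/cloudera-scm-agent/uuid into a "hostname uuid" listing (for example via ssh in a loop); the demo data below simulates two clones sharing the master copy's uuid, and the hostnames/uuids are made up.

```shell
# Demo listing: one "hostname uuid" pair per line (placeholder values).
# On a real cluster you could build this with something like:
#   for h in node1 node2 node3; do
#     echo "$h $(ssh $h cat /var/lib/cloudera-scm-agent/uuid)"
#   done > /tmp/uuids.txt
cat > /tmp/uuids.txt <<'EOF'
node1 aaaa-1111
node2 bbbb-2222
node3 aaaa-1111
EOF

# Print any uuid that appears on more than one host.
awk '{print $2}' /tmp/uuids.txt | sort | uniq -d   # → aaaa-1111
```

Every host printed by `uniq -d` is a clone that needs a fresh uuid via the steps below.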
The uuid is kept under /var/lib/cloudera-scm-agent. There are two files there, uuid and response.avro. If you cat uuid, you will get the uuid for that particular node. To change it, follow these steps:
- Stop the CM server: service cloudera-scm-server stop
- On all nodes, stop the CM agent: service cloudera-scm-agent hard_stop. hard_stop is needed because we need to restart supervisord as well and flush all the settings pertaining to CM.
- Now remove both files (uuid and response.avro) under /var/lib/cloudera-scm-agent, or just rename them in case you need to roll back.
- Make sure /etc/cloudera-scm-agent/config.ini is pointing to the right CM server.
- Start the CM server: service cloudera-scm-server start
- On all nodes, start the CM agent: service cloudera-scm-agent clean_start or service cloudera-scm-agent hard_start.
Check CM, you'll see all nodes are working now.
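The steps above can be sketched as a script. This is a dry-run sketch, not a tested production tool: the host names are placeholders, it assumes passwordless ssh as root to every node, and by default (DRY_RUN=1) it only prints what it would execute.

```shell
#!/bin/sh
# Dry-run sketch of the steps above. CM_SERVER and NODES are placeholders;
# the service names and file paths are the ones from the post.
DRY_RUN=${DRY_RUN:-1}
CM_SERVER="cm-server"        # host running cloudera-scm-server (placeholder)
NODES="node1 node2 node3"    # all agent hosts (placeholders)

run() {  # run command "$2" on host "$1" via ssh, or just print it in dry-run mode
  if [ "$DRY_RUN" = 1 ]; then echo "[$1] $2"; else ssh "$1" "$2"; fi
}

run "$CM_SERVER" "service cloudera-scm-server stop"
for n in $NODES; do
  run "$n" "service cloudera-scm-agent hard_stop"
  # Rename instead of deleting, so the old identity can be restored if needed.
  run "$n" "cd /var/lib/cloudera-scm-agent && mv uuid uuid.bak && mv response.avro response.avro.bak"
done
run "$CM_SERVER" "service cloudera-scm-server start"
for n in $NODES; do
  run "$n" "service cloudera-scm-agent clean_start"
done
```

Run it once with the default DRY_RUN=1 to review the commands, then with DRY_RUN=0 to actually execute them over ssh.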