Replace a failed YB-TServer
You can replace a failed YB-TServer in a YugabyteDB cluster, as follows:
-
Install and then start a new YB-TServer, making sure it is in the same placement group as the one you are replacing. For detailed instructions, see Start YB-TServers.
-
Blacklist the failed YB-TServer by using the following command:
~/master/bin/yb-admin -master_addresses $MASTERS change_blacklist ADD $OLD_IP:9100
For details, see change_blacklist in the yb-admin reference page.
-
Wait for the data to drain from the failed YB-TServer and for the data to be loaded into the new one. You can check for the completion of rebalancing by running the following command:
~/master/bin/yb-admin -master_addresses $MASTERS get_load_move_completion
Loading and rebalancing will complete only if data from the failed YB-TServer has someplace to be stored. You would need to start the new YB-TServer first, or ensure your remaining YB-TServers have enough capacity and are in the correct placement zones.
For details on using this command, see
get_load_move_completion
. -
When the data move is complete (100%), kill the failed YB-TServer by stopping the
yb-tserver
process or terminating the VM. Then wait for the YB-TServer to be marked asDEAD
by the YB-Master leader. The YB-Master leader will mark the server asDEAD
after not responding for one minute (based on thetserver_unresponsive_timeout_ms
, with default being60000
).To verify that the failed YB-TServer is dead, open your web browser to
$MASTER_LEADER_IP:7000/tablet-servers
and check the output. -
Because the replacement YB-TServer is running and is loaded with data, remove the address for the failed YB-TServer from the blacklist, as follows:
~/master/bin/yb-admin -master_addresses $MASTERS change_blacklist REMOVE $OLD_IP:9100