Applies To
- Zenoss 4.2.3
Summary
In the event of a hardware or disk failure, it is important to have a backup of your distributed hub. The distributed hub backup can be used to restore your system in a quick and efficient manner.
Procedure
The following procedures describe how to back up, recover, and restore your distributed hub.
Backing up your distributed hub:
You can back up the hub manually or use a scheduled backup. As a best practice, set up a cron job using crontab to ensure regular backups of your distributed hubs.
Backup manually
The following is an example of how to manually back up your distributed hub:
# su zenoss
$ zenbackup --no-eventsdb --no-zodb
This creates the backup file in /opt/zenoss/backups. The file name has the format: zenbackup_[date stamp].tgz
Unless /opt/zenoss/backups is a mount point on a separate drive from the rest of Zenoss, we recommend that you schedule an additional procedure to move the resulting backup files to a different physical drive.
Note: The backup file is created in the default directory /opt/zenoss/backups if a file location is not specified.
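The recommendation above to get backup files onto a different physical drive can be scripted. The following is a minimal sketch and not part of Zenoss: the function name copy_backups and the destination /mnt/backupdrive/zenoss are hypothetical. It copies rather than moves, so the original stays in place until you have confirmed the copy.

```shell
#!/bin/sh
# Sketch only: copy Zenoss backup archives to a second location.
# copy_backups and the example destination are hypothetical names.

# copy_backups SRC DEST
# Copies every zenbackup_*.tgz in SRC to DEST, skipping files that already
# exist in DEST with identical content.
copy_backups() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "missing source directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    for f in "$src"/zenbackup_*.tgz; do
        [ -e "$f" ] || continue              # the glob matched nothing
        target=$dest/$(basename "$f")
        if [ -e "$target" ] && cmp -s "$f" "$target"; then
            continue                          # already copied, unchanged
        fi
        cp -p "$f" "$target" || return 1      # -p keeps the timestamps
    done
    return 0
}

# Example call (hypothetical destination on another drive):
# copy_backups /opt/zenoss/backups /mnt/backupdrive/zenoss
```

Once you have verified the copies, you can delete the originals to get the effect of a move.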
Backup using crontab
The following example explains how to add a crontab entry that automatically backs up your distributed hub at the beginning of every month to the default /opt/zenoss/backups directory.
- Edit the crontab file as the zenoss user:
$ crontab -e
- Add an entry, for example:
@monthly zenbackup --no-eventsdb --no-zodb
- Save and close the crontab file.
- To verify your new cron job, enter the following:
$ crontab -l
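For reference, @monthly is cron shorthand for the schedule 0 0 1 * * (midnight on the first day of each month). If you want a scripted check in addition to reading the crontab -l listing yourself, the following sketch may help. The function name has_zenbackup_job is hypothetical; it only inspects the text passed to it and changes nothing.

```shell
#!/bin/sh
# Sketch only: has_zenbackup_job is a hypothetical helper, not a Zenoss tool.

# has_zenbackup_job
# Reads a crontab listing on stdin and succeeds if it contains an active
# (not commented out) entry that mentions zenbackup.
has_zenbackup_job() {
    grep -v '^[[:space:]]*#' | grep -q 'zenbackup'
}

# Example, run as the zenoss user:
# crontab -l | has_zenbackup_job && echo "zenbackup job is scheduled"
```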
Restoring your distributed hub:
In the case of a hub hardware failure such as an unrecoverable drive, perform the following steps on the distributed hub:
Prepare the Hub
- Return the machine to physical running condition.
- Install the same version of CentOS or RHEL that is running on the main Resource Manager.
- Configure the system as it was before the failure. Ensure that you use the same IP address and name for the system.
Consult the Zenoss Service Dynamics Resource Management Installation guide for your Zenoss version and complete the prerequisite steps for installing a hub.
Note: Do not execute the steps related to deploying the hub from the Zenoss user interface.
Prepare the User
- As root, add the zenoss user with a /home/zenoss/ home directory. The following command adds a user:
# useradd zenoss -m
- Become the zenoss user:
# su zenoss
- As the zenoss user, create the directory /home/zenoss/.ssh:
$ mkdir /home/zenoss/.ssh
- As the zenoss user, change the rights for the /home/zenoss/.ssh directory:
$ chmod 700 /home/zenoss/.ssh
- Copy the zenoss user's public key from the master. This is located in the /home/zenoss/.ssh/id_rsa.pub file.
- Paste the key value into the file /home/zenoss/.ssh/authorized_keys (located in the home directory of the zenoss user on the hub target system).
- Change the rights on the /home/zenoss/.ssh/authorized_keys file:
$ chmod 600 /home/zenoss/.ssh/authorized_keys
- Exit back to the root user.
- Create the directory /opt/zenoss with the following command:
# mkdir /opt/zenoss/
- Change the directory ownership to the zenoss user with the following command:
# chown zenoss:zenoss /opt/zenoss/
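The .ssh steps above (create the directory, set its rights, add the master's public key, set the file rights) can be collected into one function. This is a minimal sketch, assuming you have already copied the master's id_rsa.pub to a local file on the hub. The function name install_authorized_key and the example file path are hypothetical. Run it as the zenoss user so that the files are owned correctly.

```shell
#!/bin/sh
# Sketch only: install_authorized_key is a hypothetical helper.

# install_authorized_key HOME_DIR PUBKEY_FILE
# Creates HOME_DIR/.ssh with mode 700 and adds the key in PUBKEY_FILE to
# HOME_DIR/.ssh/authorized_keys (mode 600) unless that exact line is
# already present.
install_authorized_key() {
    home_dir=$1
    pubkey_file=$2
    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }
    key=$(cat "$pubkey_file")
    ssh_dir=$home_dir/.ssh
    auth_file=$ssh_dir/authorized_keys
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1
    # -F fixed string, -x whole-line match, -q no output
    if ! grep -Fxq -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file" || return 1
    fi
    chmod 600 "$auth_file"
}

# Example, as the zenoss user (the key file path is hypothetical):
# install_authorized_key /home/zenoss /tmp/master_id_rsa.pub
```

Running the function a second time with the same key leaves authorized_keys unchanged, so it is safe to repeat if a step is interrupted.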
Update the Hub
To update the hub from the Resource Manager instance (or server), perform the following:
- Navigate to Advanced → Collectors to display the list of collectors and hubs.
- Click the name of your hub to display the Hub Configuration pane on the right.
- Click the Hub Configuration action wheel (gear icon) to display the drop-down selection list.
- Select Update Hub from the list. The Update Hub dialog box displays.
- Click OK in the Update Hub dialog box to update the hub. The Update Hub progress window displays.
This synchronizes the hub files to create a working hub to receive your backup file for restoration.
- From the command line of the hub system, issue the following command as the zenoss user to stop Zenoss:
$ zenoss stop
- As the zenoss user, issue the following command to search for and display any Zenoss daemons that failed to stop:
$ ps ax | grep zenoss
Consult the output from the command and verify that all Zenoss daemons are stopped. If no Zenoss daemons are running, the output lists no Zenoss daemons (the desired result). A line for the grep command itself can be ignored.
If Zenoss daemons are still listed, use their process IDs to stop them, for example:
kill 1234
where 1234 is the Zenoss daemon process ID.
- As the zenoss user, copy the restoration backup file from the external location to a directory on the local distributed hub machine.
- Restore the backed up hub files with the following command. Replace BACKUPFILEPATH with your file path:
$ZENHOME/backup/zenrestore --file=BACKUPFILEPATH
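Before running the restore, it can be worth confirming that the copied backup file arrived intact. The following is a minimal sketch, not a Zenoss tool: the function name check_backup_archive is hypothetical, and it assumes the backup is a gzip-compressed tar archive, as the .tgz extension suggests. It only lists the archive contents and extracts nothing.

```shell
#!/bin/sh
# Sketch only: check_backup_archive is a hypothetical helper.

# check_backup_archive FILE
# Succeeds only if FILE exists, is non-empty, and can be listed by tar as a
# gzip-compressed archive.
check_backup_archive() {
    archive=$1
    [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
    tar -tzf "$archive" > /dev/null 2>&1 || {
        echo "not a readable .tgz archive: $archive" >&2
        return 1
    }
}

# Example, with BACKUPFILEPATH replaced by your file path:
# check_backup_archive BACKUPFILEPATH && echo "archive is readable"
```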
Update Collectors
- Update any collectors physically running on this hub. In the UI, navigate to Advanced → Collectors to display the list of collectors and hubs.
- Click the collector name to display the Performance Collector Configuration pane on the right.
- Click the Performance Collector Configuration action wheel (gear icon) to display the drop-down selection list.
- Select Update Collector from the list. The Update Collector dialog box displays.
- Click OK in the Update Collector dialog box to update the collector.
- After the collectors are updated, log into the remote hub.
Restart Hub
- As the zenoss user, stop Zenoss on the remote hub. Issue the following command:
$ zenoss stop
- As the zenoss user, issue the following command to search for and display any Zenoss daemons that failed to stop:
$ ps ax | grep zenoss
Consult the output from the command and verify that all Zenoss daemons are stopped. If no Zenoss daemons are running, the output lists no Zenoss daemons (the desired result). A line for the grep command itself can be ignored.
If Zenoss daemons are still listed, use their process IDs to stop them, for example:
kill 1234
where 1234 is the Zenoss daemon process ID.
- On the remote hub, as the zenoss user, start Zenoss with the following command:
$ zenoss start
- Verify the Zenoss daemons continue to run after being started. Wait at least 20 seconds after the start command completes and issue the following command:
$ zenoss status
The output should show that the hub started successfully, with an appropriate status for the hub and its daemons running.
Any distributed collectors residing on different hardware and using your distributed hub must be restarted. If the restored hub uses the same name and IP address as before, the collectors should come back online and work correctly.
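The daemon check used in the stop steps above (ps ax | grep zenoss) commonly lists the grep command itself, which makes the result harder to read. The following sketch is one way to filter the listing down to process IDs. The function name zenoss_pids is hypothetical. Any process whose command line contains "zenoss" is reported, including a su zenoss shell, so review the list before passing any ID to kill.

```shell
#!/bin/sh
# Sketch only: zenoss_pids is a hypothetical helper.

# zenoss_pids
# Reads `ps ax` output on stdin and prints the PID (first column) of every
# line that mentions zenoss, skipping lines that belong to a grep command.
zenoss_pids() {
    awk '/zenoss/ && !/grep/ { print $1 }'
}

# Example, as the zenoss user:
# ps ax | zenoss_pids
```

No output from the example call means that no matching processes were found.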