Troubleshooting Cisco UCS Performance Manager Downtime
As a system administrator, I recently ran into a problem where Cisco UCS Performance Manager (based on Zenoss 5) stopped responding because its main serviced daemon had become inactive. Web access was unresponsive and every feature was unavailable. After investigating, I found that the service had become stuck in an inactive state.
To resolve this issue, I created a small script to check the status of the service and restart it if necessary. The script is based on the following code:
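```
#!/bin/bash
# Restart serviced if no matching process is found.
service=serviced
# Count matching processes, excluding grep itself and this script.
if (( $(ps aux | grep -v grep | grep -v "$0" | grep -c "$service") > 0 )); then
    echo "$service is running"
else
    service "$service" restart
fi
```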
The script first checks the status of the serviced daemon using the `ps aux` command with some basic filtering to ignore the script itself and any `grep` commands. If the daemon is running, the script prints a success message and exits. Otherwise, it restarts the service using the `service` command.
To use the script, I made it executable with `chmod +x [name of script]` and scheduled it with a standard cron job. The cron job runs the script every minute, which is sufficient for my needs since the issue only occurs occasionally.
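For reference, an every-minute cron entry might look like the sketch below. The path `/usr/local/bin/check_serviced.sh` is an assumption made for illustration, so substitute wherever you saved the script. The install command is left commented out so that nothing changes until you have reviewed the line.

```shell
#!/bin/bash
# Hypothetical install location for the watchdog script; adjust to your setup.
script_path="/usr/local/bin/check_serviced.sh"

# Five "*" fields: every minute of every hour, day, month, and weekday.
# Output is discarded so cron does not send mail on every run.
cron_line="* * * * * $script_path >/dev/null 2>&1"
echo "$cron_line"

# To append this line to the current user's crontab (keeping existing
# entries), uncomment the following:
# ( crontab -l 2>/dev/null; echo "$cron_line" ) | crontab -
```

Since the script calls `service ... restart`, the entry needs to go in the crontab of a user allowed to restart services, typically root.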
I created this script because, as far as I could find, Cisco has not published an official fix for this issue. Until something better comes out, it should do the trick in keeping the serviced daemon running and preventing downtime.
In conclusion, if your Cisco UCS Performance Manager (based on Zenoss 5) stops responding and the main serviced daemon turns out to be inactive, you may want to try this script as a temporary workaround. It is easy to create and use, and it has prevented downtime for me so far.