Applies To
- Zenoss 4.2.x
Summary
Stolen CPU cycles are CPU time during which a virtual machine (VM) is ready to run but is forced to wait for CPU resources. This happens because the VM host machine has allocated those CPU resources to other tasks, such as those for another VM.
Ideally, the percentage of stolen CPU time should approach zero. Anything above zero means there is some performance degradation, typically caused by issues such as:
- Inadequate CPU resources assigned to the VM in question.
- The physical server is oversubscribed and the VMs are competing for scarce CPU resources.
Adding the CPU steal time metric to your Linux server monitoring regimen enables you to monitor for stolen CPU cycles.
This information is also available in the article How Do I Monitor for Stolen CPU on Linux Servers? on the Zenoss wiki.
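To make the steal percentage concrete, the following minimal sketch computes it from two samples of the aggregate cpu line in /proc/stat on a Linux guest. It is an illustration only, not part of Zenoss; the function names and the sample values are hypothetical, and it assumes the usual field order in which steal is the eighth counter.

```python
def parse_cpu_line(line):
    """Return (steal_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    if len(values) < 8:
        raise ValueError("this line does not include a steal counter")
    # Assumed field order: user nice system idle iowait irq softirq steal [guest guest_nice]
    # Guest time is already counted in user time, so only the first eight fields are summed.
    return values[7], sum(values[:8])


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of the 'cpu' line."""
    steal_0, total_0 = parse_cpu_line(before)
    steal_1, total_1 = parse_cpu_line(after)
    elapsed = total_1 - total_0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (steal_1 - steal_0) / elapsed


# Hypothetical samples taken a few seconds apart.
before = "cpu  1000 0 500 8000 100 0 0 400 0 0"
after = "cpu  1300 0 600 8950 100 0 0 550 0 0"
print(steal_percent(before, after))  # 10.0
```

In this example, 150 of the 1500 ticks that elapsed between the samples were stolen, which is the same figure that top reports as %st.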
Procedure
To add the CPU steal time metric to your Linux server monitoring regimen, edit the monitoring template for Linux servers to activate the %st (steal time) counter. With this counter activated, Zenoss can monitor for and report a steal time value that exceeds your threshold. You can drill down in the Zenoss console to determine the host CPU usage. With this information, you can determine how to prevent additional time from being lost due to steal time.
To Monitor for Stolen CPU Cycles:
- Ensure that you have Net-SNMP 5.7 or higher installed on the Linux server where you want to monitor for stolen CPU.
- Edit the monitoring template used to monitor Linux servers to include the ssCpuRawSteal data source. For more information, see “Adding ssCPURawSteal to the Monitoring Template”.
- Set up a graph that displays stolen CPU when you view the Linux server. For more information, see “Setting Up a Graph That Displays Stolen CPU”.
- Set up a threshold that alerts you when CPU is being stolen. For more information, see “Setting Up a Threshold for Alerting on Stolen CPU Cycles”.
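Before editing the template, it can help to confirm that the SNMP agent on the Linux server actually returns a value for the steal OID. The sketch below is one possible check, assuming the Net-SNMP snmpget command is installed on the machine running the script and that the agent accepts SNMP v2c; the host name, the community string, and the helper names are hypothetical.

```python
import re
import subprocess

# OID for ssCpuRawSteal, as used in the monitoring template.
STEAL_OID = "1.3.6.1.4.1.2021.11.64.0"


def build_snmpget_command(host, community="public"):
    """Build the snmpget argument list; -On prints the OID numerically."""
    return ["snmpget", "-v2c", "-c", community, "-On", host, STEAL_OID]


def parse_counter(output):
    """Return the Counter32 value from snmpget output, or None if no counter was returned."""
    match = re.search(r"=\s*Counter32:\s*(\d+)", output)
    return int(match.group(1)) if match else None


def query_steal_counter(host, community="public"):
    """Run snmpget against the host and return the raw steal counter, or None."""
    result = subprocess.run(
        build_snmpget_command(host, community),
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_counter(result.stdout)


# Example call (hypothetical host and community string):
# print(query_steal_counter("linux-vm.example.com", "public"))
```

If the function returns None, the agent did not return a counter for this OID, and the data source would not collect any values for that server.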
Adding ssCPURawSteal to the Monitoring Template
To monitor for stolen CPU cycles on Linux servers, add the ssCpuRawSteal data source to the monitoring template for Linux servers:
- In the Zenoss Console, click the Advanced tab.
- Click Monitoring Templates.
- In the left tree pane, under Device, click Server/Linux.
- Add a new data source for CPU steal time:
  - In the Data Sources area, click the plus (+) sign.
  - In the Add Data Source dialog box, in the Name field, type ssCpuRawSteal.
  - In the Type field, specify SNMP, and then click Submit.
- Add the SNMP OID for ssCpuRawSteal:
  - Select the ssCpuRawSteal data source and click the Gear icon, or Edit icon.
  - Click View and Edit Details.
  - In the OID field, enter the string: 1.3.6.1.4.1.2021.11.64.0
  - Click Save.
- Edit the RRD Type for the data point:
  - Under ssCpuRawSteal, select ssCpuRawSteal.ssCpuRawSteal.
  - Click the Gear icon, or Edit icon.
  - Click View and Edit Details.
  - In the RRD Type field, select DERIVE from the drop-down list. This calculates the rate at which CPU is being stolen, as a percentage.
  - Click Save.
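The effect of the DERIVE setting can be sketched as follows. This is a simplified illustration of how a per-second rate is derived from two raw counter samples, not the RRDtool implementation; the function name and sample values are hypothetical, and it assumes the agent counts in ticks of 1/100 second, so that 1 tick per second corresponds to about 1% of one CPU.

```python
def derive_rate(prev_value, prev_time, curr_value, curr_time):
    """Per-second rate of change between two counter samples (DERIVE-style)."""
    interval = curr_time - prev_time
    if interval <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_value - prev_value) / interval


# Two hypothetical ssCpuRawSteal samples collected 300 seconds apart.
rate = derive_rate(120000, 0, 123000, 300)
print(rate)  # 10.0 ticks per second, roughly 10% of one CPU under the assumption above
```

Because the rate is a simple difference, a counter that resets (for example, after the agent restarts) produces a negative value for that interval, which is worth keeping in mind when reading the graph.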
Setting Up a Graph That Displays Stolen CPU
To set up a graph that shows stolen CPU cycles for the Linux server:
- Under Graph Definitions, ensure that a CPU Utilization graph is defined.
- Under Graph Definitions, select the CPU Utilization graph.
- Click the Gear icon, or Edit icon.
- Click Manage Graph Points.
- On the Manage Graph Points dialog box, click the plus (+) sign.
- Click Data Point.
- In the Data Point field, select ssCpuRawSteal.ssCpuRawSteal.
- Click Submit.
- Click Save.
After you configure this graph point, the CPU Utilization graph displays stolen CPU cycles when you view the Linux server.
Setting Up a Threshold for Alerting on Stolen CPU Cycles
To set up a threshold that triggers an alert when stolen CPU cycles on a Linux server exceed the threshold value:
- Under Thresholds, click the plus (+) sign.
- In the Name field, specify the name you want to use for the threshold. For example, type High Stolen CPU.
- In the Type field, select MinMaxThreshold from the drop-down list.
- Click Add.
- Select the new High Stolen CPU threshold you just created.
- Click the Gear icon, or Edit icon.
- On the Edit Threshold dialog box, under Data Points, ensure the ssCpuRawSteal_ssCpuRawSteal data point displays in the Selected column.
- In the Maximum Value field, specify a value. For example, if you want to receive an alert when more than 10% of the CPU is being stolen, type 10.
- In the Severity field, select a type for the event, such as Warning.
- In the Event Class field, specify the event class you want to use, such as /Perf/CPU.
- Click Save.
After you complete this procedure, Zenoss collects stolen CPU cycle information from your Linux server. You receive an alert when the stolen CPU cycle value exceeds the Maximum Value you specified (10% in this example).
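The logic of a minimum/maximum threshold can be pictured with the short sketch below. It is an illustration of the general idea only, not the Zenoss MinMaxThreshold implementation; the function name is hypothetical, and the maximum of 10 mirrors the example value used in the steps above.

```python
def violates_threshold(value, minimum=None, maximum=None):
    """Return True when the value falls outside the configured bounds."""
    if maximum is not None and value > maximum:
        return True
    if minimum is not None and value < minimum:
        return True
    return False


# With Maximum Value set to 10 and no minimum:
print(violates_threshold(12.5, maximum=10))  # True, an event would be raised
print(violates_threshold(10, maximum=10))    # False, the value does not exceed the maximum
```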