Applies To
- Zenoss 5.x
- Zenoss 4.x
Summary
Administrators often want the severity of an event generated by a threshold to reflect how far the threshold data point's value exceeds the maximum acceptable value. The following table illustrates a common scenario when monitoring file system usage:
File System Usage | Event Severity |
> 80% | Warning |
> 90% | Error |
> 95% | Critical |
Although Resource Manager provides this capability via the creation of multiple min/max thresholds against a single data point, administrators want to avoid generating multiple events against the same threshold value. For example, given the table above, if a monitored file system grows to 92% full, two events will be generated: a Warning event and an Error event.
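The double event at 92% can be illustrated with a minimal sketch. It models three conventional "maximum only" thresholds in plain Python; the names OVERLAPPING_THRESHOLDS and overlapping_events are hypothetical and are not part of Resource Manager.

```python
# Hypothetical model of three conventional "maximum only" thresholds.
# Each entry is (severity, maximum acceptable usage in percent).
OVERLAPPING_THRESHOLDS = [
    ("Warning", 80.0),
    ("Error", 90.0),
    ("Critical", 95.0),
]


def overlapping_events(usage_percent):
    """Return every severity whose maximum is exceeded by usage_percent."""
    return [
        severity
        for severity, maximum in OVERLAPPING_THRESHOLDS
        if usage_percent > maximum
    ]


print(overlapping_events(92.0))  # ['Warning', 'Error']
```

Because every threshold is tested independently, any value above 90% also exceeds the 80% maximum, which is why two events are open at the same time.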
A simple technique prevents multiple open events: when a Min/Max threshold’s minimum setting exceeds its maximum setting, data point values inside the range are considered “unacceptable” and will trigger events. For example, a threshold with a minimum of 90% and a maximum of 10% will trigger an event on 60% or any other value between 10% and 90% exclusive. A value of exactly 10% or 90% will not generate an event. The example procedure below takes this behavior into account to avoid gaps in the range of values covered by multiple thresholds. Administrators are able to create “excluded range” thresholds that don’t overlap with one another yet prevent multiple events for the same issue.
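The behavior described above can be sketched as a small Python function. This is only a model of the rule as stated in this article, not Zenoss source code; the function name violates and its parameters are assumptions made for illustration.

```python
def violates(value, minimum=None, maximum=None):
    """Model of the Min/Max rule described above (not Zenoss source code).

    Returns True when the value would trigger an event.
    """
    if minimum is not None and maximum is not None and minimum > maximum:
        # Inverted settings: values strictly between maximum and minimum
        # are treated as unacceptable. Both end points are excluded.
        return maximum < value < minimum
    # Normal settings: values below the minimum or above the maximum
    # are unacceptable. A blank (None) setting is not checked.
    too_low = minimum is not None and value < minimum
    too_high = maximum is not None and value > maximum
    return too_low or too_high


print(violates(60, minimum=90, maximum=10))  # True
print(violates(90, minimum=90, maximum=10))  # False
print(violates(10, minimum=90, maximum=10))  # False
```

The last two calls show the exclusive end points: a value of exactly 10 or 90 does not fall inside the excluded range.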
Monitoring Templates Background
Zenoss-supplied ("out of the box") monitoring templates should not be modified directly. This example therefore follows the recommended practice of copying the target template to a new, dedicated device class and modifying the copy.
Example Procedure
This procedure is based on the example scenario described above. Administrators can modify it for other component and device data points.
NOTES:
- Although this procedure is specific to Zenoss 5.x, the concepts can be applied to Zenoss 4.x.
- To simplify these instructions, each example value is enclosed in square brackets like this: [value]. Administrators should substitute appropriate values for their specific tasks.
Create a New Device Subclass
- In the Zenoss UI, click the INFRASTRUCTURE tab.
- In the device class tree on the left, navigate to the device class under which you will create the new subclass. In this example, select [/Server/Linux].
- With the parent device class selected, click the plus (+) button at the bottom of the left pane to launch the Add Device Class dialog.
- In the Add Device Class dialog:
- Enter the Name of your new device class. For this example, [Storage].
- Click SUBMIT.
Locate the Monitoring Template to Copy
- Click the INFRASTRUCTURE tab.
- In the right pane, click the name of a device that monitors the data point that the thresholds will test.
- In the left pane, select the Component Type associated with the data point. For this example, [File Systems]. The properties pane for the device is displayed.
- Select Templates from the Display menu.
- Note the name of the template.
- Click the ADVANCED tab.
- Click Monitoring Templates in the secondary navigation row.
- Double-click the name of the monitoring template that you noted above. For this example, [FileSystem].
- Find and select the definition of the monitoring template closest to the target device class. Note that it might be at the target device class or a device class higher in the hierarchy [/Server].
Copy the Target Monitoring Template
- Click the gear icon at the bottom of the left pane to display the gear menu.
- From the gear menu, select Copy / Override Template to display the Copy / Override dialog.
- In the Copy / Override dialog:
- Select the device class you created above as the Target. For this example, [/Server/Linux/Storage].
- Click SUBMIT.
Edit the Template Copy
- If necessary, navigate to ADVANCED and click Monitoring Templates.
- On the Monitoring Templates page, in the left pane, navigate to the new template and select it. For this example, [FileSystem > /Server/Linux/Storage].
- If there is already a threshold defined for the datapoint, remove it:
- Select the existing threshold.
- Click the minus (-) button in the Thresholds pane.
- Click OK in the Delete Threshold dialog.
- Add one threshold for each event severity to (potentially) generate. For this example, create three thresholds: one each for Warning, Error, and Critical.
To create a new threshold:
- Click the plus (+) button in the Thresholds pane. The UI displays the Add Threshold dialog.
- Enter the Threshold's Name as shown in the table below.
- Click Add to close the dialog and submit the information.
- To edit a threshold, double-click its row. This displays the Edit Threshold dialog. Make any changes and click Save to submit the information and exit the dialog.
- Edit each threshold to set its values. For this example, set the values shown below:
Threshold 1: Warning
Name | high disk warning |
Type | MinMaxThreshold |
DataPoints | usedBlocks_usedBlocks |
Severity | Warning |
Minimum Value | here.getTotalBlocks() * .900001 |
Maximum Value | here.getTotalBlocks() * .8 |
Threshold 2: Error
Name | high disk error |
Type | MinMaxThreshold |
DataPoints | usedBlocks_usedBlocks |
Severity | Error |
Minimum Value | here.getTotalBlocks() * .950001 |
Maximum Value | here.getTotalBlocks() * .9 |
Threshold 3: Critical
Name | high disk critical |
Type | MinMaxThreshold |
DataPoints | usedBlocks_usedBlocks |
Severity | Critical |
Minimum Value | (leave blank) |
Maximum Value | here.getTotalBlocks() * .95 |
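To check that the three thresholds above cover the intended ranges, the following sketch evaluates them against a few sample usedBlocks values. It is a model under stated assumptions: TOTAL_BLOCKS is a hypothetical stand-in for here.getTotalBlocks(), severities_for is an illustrative helper, and decimal.Decimal is used only so that the boundary cases in the sketch are exact.

```python
from decimal import Decimal

# Hypothetical file system size in blocks (stands in for here.getTotalBlocks()).
TOTAL_BLOCKS = Decimal(1_000_000)

# (severity, minimum, maximum) copied from the three thresholds above.
# None means the field is left blank.
THRESHOLDS = [
    ("Warning", TOTAL_BLOCKS * Decimal("0.900001"), TOTAL_BLOCKS * Decimal("0.8")),
    ("Error", TOTAL_BLOCKS * Decimal("0.950001"), TOTAL_BLOCKS * Decimal("0.9")),
    ("Critical", None, TOTAL_BLOCKS * Decimal("0.95")),
]


def severities_for(used_blocks):
    """Return the severities that would fire for a usedBlocks value."""
    fired = []
    for severity, minimum, maximum in THRESHOLDS:
        if minimum is None:
            # Maximum only: any value above the maximum triggers an event.
            hit = used_blocks > maximum
        else:
            # Minimum exceeds maximum: the range between them is excluded,
            # with both end points exclusive.
            hit = maximum < used_blocks < minimum
        if hit:
            fired.append(severity)
    return fired


for used in (800_000, 850_000, 900_000, 920_000, 950_000, 960_000):
    print(used, severities_for(used))
```

In this model, exactly 80% produces no event, exactly 90% produces only Warning, and exactly 95% produces only Error. The small margin added to each minimum does leave a very narrow overlap: a value strictly between 90% and 90.0001% (or between 95% and 95.0001%) falls inside two ranges.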
Use the New Template
To use the new thresholds, move the device(s) to be monitored into the new device class:
- Click the INFRASTRUCTURE tab.
- Navigate to the new device class in the left pane [Devices > Server > Linux > Storage].
- Click [/Server/Linux] to display its devices in the right pane.
- Select (highlight) the device row(s) in the right pane, for example [linuxsnmp.hypothetical.loc].
- Drag and drop (release) the selected device(s) onto the new device class in the left pane [/Server/Linux/Storage].
Note: Start with the mouse over the device row(s) anywhere except on a hyperlinked field (the device name and device class are hyperlinks).
When you release the dragged device(s), the UI displays the Move Devices dialog.
- Click OK to confirm the move, save, and close the dialog.
Notes about Setting Threshold Values
Component thresholds can be defined as a percentage of a total available value. For example, for FileSystem components, the expression here.getTotalBlocks() returns the maximum capacity of the file system under consideration. To set a threshold value of 95% of maximum, multiply the maximum value by .95. For example:
here.getTotalBlocks() * .95
This sets the value equal to 95% of the available disk space in that file system. Note, however, that there is an important caveat concerning exclusion of values:
MinMax threshold values are exclusive. This means that a minimum threshold of here.getTotalBlocks() * .95 will not generate an event for a data point value exactly equal to 95% of the total. To ensure that 95% is included within the range that triggers an event, the configured minimum must be slightly greater. For example, the Error threshold in the example procedure above specifies a minimum of here.getTotalBlocks() * .950001 to include (generate an event for) a value of exactly 95%.
The maximum expressions used in the example strictly follow the table in the Summary section. Note that the table specifies > (greater-than) and not ≥ (greater-than or equal to).
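The effect of the slightly larger multiplier can be worked through with concrete numbers. The sketch below assumes a hypothetical file system of 1,000,000 blocks that is exactly 95% full; in_excluded_range is an illustrative helper that models the exclusive comparison described above, not a Zenoss function.

```python
total_blocks = 1_000_000  # hypothetical result of here.getTotalBlocks()
used_blocks = 950_000     # exactly 95% of total_blocks


def in_excluded_range(value, minimum, maximum):
    """Both end points are exclusive, as described for MinMax thresholds."""
    return maximum < value < minimum


maximum = total_blocks * .9

# A minimum of exactly 95% of the total misses the value 950,000.
print(in_excluded_range(used_blocks, total_blocks * .95, maximum))      # False

# A minimum slightly above 95% of the total includes it.
print(in_excluded_range(used_blocks, total_blocks * .950001, maximum))  # True
```

With this total, the multiplier .950001 moves the minimum from about 950,000 blocks to about 950,001 blocks, which is just enough for a value of exactly 950,000 to fall inside the range.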