Applies To
- Zenoss 5.x
- Zenoss 4.x
Summary
The findposkeyerror Python script (available as part of the zenoss.toolbox project on GitHub - https://github.com/zenoss/zenoss.toolbox) scans a path in dmd (the default path is '/') and checks each object in that path, along with its attributes and relations, to make sure that every reference points to a valid object. The script logs any issues it finds and, when run with the -f flag, attempts to repair a subset of PKE issues.
Note: The term POSKeyError (PKE) refers to a POSitional Key Error. Zenoss uses a Zope object database called ZODB. Objects in this database hold references (relationships) to other objects. When a reference does not correspond to an actual object, a PKE is thrown. The terms POSKeyError, PKE, and Dangling Reference all refer to the same underlying issue.
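To make the idea of a dangling reference concrete, here is a toy sketch in plain Python. It is not ZODB or Zenoss code: the database is modeled as a dictionary that maps an object ID to the IDs it references, and both that layout and the function name find_dangling are assumptions made for illustration.

```python
def find_dangling(store):
    """Return sorted (source_oid, missing_oid) pairs for references with no target.

    `store` is a hypothetical stand-in for an object database:
    a dict mapping each object ID to the list of object IDs it references.
    """
    dangling = []
    for oid, refs in store.items():
        for ref in refs:
            if ref not in store:
                # The reference exists, but the object it names does not.
                dangling.append((oid, ref))
    return sorted(dangling)


store = {
    0x01: [0x02, 0x03],
    0x02: [],
    0x03: [0x04],  # 0x04 was never stored, so this reference dangles
}
missing = find_dangling(store)
```

In this model, looking up 0x04 is the step that would raise the error in a real database; the scan simply finds such references before anything tries to follow them.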
Symptoms of POSKeyErrors/Dangling References
Due to the low level nature of PKEs, the symptoms are quite diverse. If you see any of the following, a check for PKEs is called for:
- Yellow flare message (error banner) in the UI mentioning a POSKeyError.
- POSKeyErrors mentioned in any Zenoss daemon log files.
- Errors reported by the zodbscan or zenrelationscan scripts.
The following is an example of a POSKeyError error traceback that displays in the Zope UI:
Traceback (most recent call last):
  File "/usr/local/lib/python/ZEO/zrpc/connection.py", line 407, in handle_request
    ret = meth(*args)
  File "/usr/local/lib/python/ZODB/FileStorage/FileStorage.py", line 655, in modifiedInVersion
    pos = self._lookup_pos(oid)
  File "/usr/local/lib/python/ZODB/FileStorage/FileStorage.py", line 534, in _lookup_pos
    raise POSKeyError(oid)
POSKeyError: 0x172024
Important: Run without the -f flag, findposkeyerror does not make changes to your system. Run with the -f flag set (for example, findposkeyerror -f), the script modifies your system.
Using the findposkeyerror Script
usage:
findposkeyerror.py [-h] [-v] [-v10] [--tmpdir TMPDIR] [-s] [-f] [-n CYCLES] [-p PATH] [-u]
optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -v10, --debug         debug verbose log output (debug logging)
  --tmpdir TMPDIR       override the TMPDIR setting
  -s, --skipEvents      skip creating summary events
  -f, --fix             attempt to fix ZenRelationship objects
  -n CYCLES, --cycles CYCLES
                        maximum times to cycle (with --fix)
  -p PATH, --path PATH  base path to scan from (Devices.Server)
  -u, --unlimitedram    skip transaction.abort() - unbounded RAM, ~40% faster
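One cautious workflow is to scan read-only first and rerun with -f only if errors are reported. The sketch below assembles those two command lines using only the flags listed in the usage text. The helper names build_command and run are hypothetical, the cycle count of 3 is an arbitrary example, and the sketch assumes the script is on the PATH as findposkeyerror.

```python
import subprocess


def build_command(fix=False, cycles=None, path=None):
    """Assemble a findposkeyerror command line as an argument list."""
    cmd = ["findposkeyerror"]  # assumed to be on the PATH of the zenoss user
    if path is not None:
        cmd += ["-p", path]
    if fix:
        cmd.append("-f")
        # -n only has an effect together with --fix, per the usage text.
        if cycles is not None:
            cmd += ["-n", str(cycles)]
    return cmd


def run(cmd):
    """Run the command and return its exit status (not called here)."""
    return subprocess.call(cmd)


scan_cmd = build_command(path="/")                     # report only
fix_cmd = build_command(fix=True, cycles=3, path="/")  # modifies the system
```

Calling run(scan_cmd) executes the read-only scan; run(fix_cmd) should be used only after reviewing the scan results, since it changes the database.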
Example Script Output
[2016-02-18 13:37:45] Initializing findposkeyerror v2.0.0 (detailed log at /opt/zenoss/log/toolbox/findposkeyerror.log)
[2016-02-18 13:37:49] Examining items under the '/' path ():
[2016-02-18 13:38:17] Cycle 1 | Items Scanned: 133148 | Errors: 0 |
[2016-02-18 13:38:17] Execution finished in 0:00:31
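If the script is run on a schedule, the per-cycle summary line can be checked automatically. The following is a minimal sketch that extracts the counts with a regular expression. It assumes the summary line keeps the exact layout shown in the example output, which may differ between toolbox versions; parse_summary is a hypothetical helper name.

```python
import re

# Matches the per-cycle summary line as it appears in the example output.
SUMMARY = re.compile(r"Cycle (\d+) \| Items Scanned: (\d+) \| Errors: (\d+)")


def parse_summary(line):
    """Return the cycle number, items scanned and error count, or None."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    cycle, scanned, errors = (int(group) for group in match.groups())
    return {"cycle": cycle, "scanned": scanned, "errors": errors}


sample = "[2016-02-18 13:38:17] Cycle 1 | Items Scanned: 133148 | Errors: 0 |"
result = parse_summary(sample)
```

A non-zero "errors" value is the signal to inspect the detailed log before deciding whether to rerun with -f.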
Is this typical?
2015-01-28 09:04:50,294 INFO findposkeyerror: Initializing findposkeyerror
2015-01-28 09:04:50,294 INFO findposkeyerror: Command line options: {'debug': True, 'path': '/', 'fix': False, 'unlimitedram': False}
2015-01-28 09:04:50,295 DEBUG findposkeyerror: Acquired 'zenoss.toolbox' execution lock
2015-01-28 09:04:53,477 DEBUG findposkeyerror: ZenScriptBase connection obtained
2015-01-28 09:04:53,486 INFO findposkeyerror: Examining items under the '/' path (<Application at >)
2015-01-28 09:04:55,018 WARNING findposkeyerror: AttributeError: jobs on relationship 'jobs' of /zport/dmd/JobManager
2015-01-28 09:13:30,520 INFO findposkeyerror: findposkeyerror completed in 520.23 seconds
2015-01-28 09:13:30,522 INFO findposkeyerror: ###########################################################
Jason, you have, it seems, a single issue with /zport/dmd/JobManager. I would try executing "findposkeyerror -f" to see if the script can fix your issue; if not, I would consult the community forums to see if a solution to that issue exists.
The script does not fix the issue. I posted to the community forum over a month ago, which is why I tried posting here.
Jason, where is your forum post? Can you include a link, please?
http://www.zenoss.org/forum/3166
I am using Zenoss 4.2.5 and got one DANGLING REFERENCE while running zodbscan. How can I fix this?
2017-11-14 13:35:35,304 INFO zodbscan: Initializing zodbscan (version 2.0.0)
2017-11-14 13:35:35,304 INFO zodbscan: Command line options: {'debug': True, 'skipEvents': False, 'tmpdir': '/tmp'}
2017-11-14 13:35:35,304 DEBUG zodbscan: Acquired 'zenoss.toolbox' execution lock
2017-11-14 13:35:36,172 INFO zodbscan: Examining 3365014 items in zodb database
2017-11-14 13:44:10,978 CRITICAL zodbscan: DANGLING REFERENCE (POSKeyError) FOUND:
PATH: 10.20.0.213
TYPE:
OID: 0x001bb9a5 '\x00\x00\x00\x00\x00\x1b\xb9\xa5' 1816997
Refers to a missing object:
NAME: macaddresses
TYPE:
OID: 0x0036b1ba '\x00\x00\x00\x00\x006\xb1\xba' 3584442
2017-11-14 13:55:28,694 INFO zodbscan: 1 Dangling References were detected
2017-11-14 13:55:28,694 INFO zodbscan: zodbscan completed in 1193.39 seconds
2017-11-14 13:55:28,694 INFO zodbscan: ############################################################
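When reading a report like the one above, it helps to know that each OID line shows the same identifier three ways: hexadecimal, the raw 8-byte string, and decimal. The sketch below converts between them using only the standard library, assuming the OID is an 8-byte big-endian integer, which is consistent with the values in the log. The function name oid_forms is hypothetical.

```python
import struct


def oid_forms(hex_oid):
    """Return (hex string, raw 8-byte value, decimal) for a hex OID."""
    number = int(hex_oid, 16)          # accepts a leading "0x"
    raw = struct.pack(">Q", number)    # 8 bytes, big-endian
    return "0x%08x" % number, raw, number


# The missing object's OID from the zodbscan output above.
forms = oid_forms("0x0036b1ba")
```

This is useful when one tool reports an OID in hex (as in the traceback form POSKeyError: 0x...) and another expects the decimal or raw form.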