How to use findposkeyerror to scan for and fix ZODB dangling references

Applies To

  • Zenoss 5.x
  • Zenoss 4.x

Summary

The findposkeyerror Python script (part of the zenoss.toolbox project on GitHub: https://github.com/zenoss/zenoss.toolbox) scans a path in dmd (the default path is '/') and checks each object in that path, along with its attributes and relations, to make sure that every reference points to a valid object. The script logs any issues it finds and, when run with the -f flag, attempts to repair a subset of PKE issues.

Note: POSKeyError (PKE) is the ZODB exception raised when a requested object cannot be found in the database (POS is ZODB's abbreviation for "persistent object system"). Zenoss stores its objects in the Zope Object Database (ZODB). Objects in this database hold relationships to other objects, and a PKE exists when one or more of those relationships is incorrect: when a reference does not correspond to an actual object, a PKE is thrown. The terms POSKeyError, PKE, and Dangling Reference all refer to the same underlying issue.
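
To make the idea concrete, here is a minimal sketch of a dangling reference that uses a plain Python dict as a stand-in for the object store. It does not use the ZODB API; the names `store`, `refs`, and `find_dangling` are hypothetical and chosen for illustration only.

```python
# Toy model of an object store: OID -> object record.
# Each record lists the OIDs it refers to. This is NOT the ZODB API.
store = {
    0x01: {"name": "device", "refs": [0x02, 0x03]},
    0x02: {"name": "interface", "refs": []},
    # 0x03 is referenced by "device" but is not present in the store.
}


def find_dangling(store):
    """Return (owner_oid, missing_oid) pairs for references with no target."""
    dangling = []
    for oid, record in sorted(store.items()):
        for ref in record["refs"]:
            if ref not in store:
                dangling.append((oid, ref))
    return dangling


print(find_dangling(store))
```

In this toy model the reference from object 1 to object 3 is the dangling one; following it in a real database is what would raise the POSKeyError.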

Symptoms of POSKeyErrors/Dangling References

Because PKEs occur at a low level, their symptoms are diverse. Check for PKEs if you see any of the following:

  • Yellow flare message (error banner) in the UI mentioning a POSKeyError.
  • POSKeyErrors mentioned in any Zenoss daemon log files.
  • Errors reported by the zodbscan or zenrelationscan scripts.
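
As a rough aid for the log-file symptom, the following sketch filters log lines for the two phrasings that appear in this article's log excerpts ("POSKeyError" and "DANGLING REFERENCE"). The helper name `find_pke_lines` is hypothetical, and the pattern is an assumption about wording, not an exhaustive list of what every Zenoss daemon may log.

```python
import re

# Matches the two phrasings seen in this article's log excerpts.
PKE_PATTERN = re.compile(r"POSKeyError|DANGLING REFERENCE")


def find_pke_lines(lines):
    """Return (line_number, text) for every line that mentions a PKE."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PKE_PATTERN.search(line)
    ]


sample = [
    "2017-11-14 13:35:36,172 INFO zodbscan: Examining 3365014 items in zodb database\n",
    "2017-11-14 13:44:10,978 CRITICAL zodbscan: DANGLING REFERENCE (POSKeyError) FOUND:\n",
]
for number, text in find_pke_lines(sample):
    print(number, text)
```

To check a real log, pass an open file object instead of `sample`; the function only iterates over lines, so any iterable of strings works.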

The following is an example of a POSKeyError error traceback that displays in the Zope UI:

 Traceback (most recent call last):
   File "/usr/local/lib/python/ZEO/zrpc/connection.py", line 407, in handle_request
     ret = meth(*args)
   File "/usr/local/lib/python/ZODB/FileStorage/FileStorage.py", line 655, in modifiedInVersion
     pos = self._lookup_pos(oid)
   File "/usr/local/lib/python/ZODB/FileStorage/FileStorage.py", line 534, in _lookup_pos
     raise POSKeyError(oid)
 POSKeyError: 0x172024
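
The value at the end of the traceback is the object ID (OID) that could not be found. The zodbscan output quoted in the comments on this article prints each OID in three forms: hexadecimal, an 8-byte string, and an integer. The sketch below converts between those forms using only the standard library; `oid_forms` is a hypothetical helper name and is not part of zenoss.toolbox.

```python
import struct


def oid_forms(hex_oid):
    """Return (hex string, 8-byte big-endian value, integer) for an OID."""
    value = int(hex_oid, 16)
    packed = struct.pack(">Q", value)  # 64-bit unsigned, big-endian
    return "0x%08x" % value, packed, value


hex_form, packed, number = oid_forms("0x172024")
print(hex_form, repr(packed), number)
```

Having all three forms at hand can help when matching an OID from a traceback against an OID reported by one of the scan scripts.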

Important: Run without options, findposkeyerror only reports issues and does not make changes to your system. Run with the -f flag (for example, findposkeyerror -f), the script will make modifications to your system.

Using the findposkeyerror Script

usage:

findposkeyerror.py [-h] [-v] [-v10] [--tmpdir TMPDIR] [-s] [-f] [-n CYCLES] [-p PATH] [-u]

optional arguments:

  -h, --help                  show this help message and exit
  -v, --version               show program's version number and exit
  -v10, --debug               debug verbose log output (debug logging)
  --tmpdir TMPDIR             override the TMPDIR setting
  -s, --skipEvents            skip creating summary events
  -f, --fix                   attempt to fix ZenRelationship objects
  -n CYCLES, --cycles CYCLES  maximum times to cycle (with --fix)
  -p PATH, --path PATH        base path to scan from (Devices.Server)
  -u, --unlimitedram          skip transaction.abort() - unbounded RAM, ~40% faster
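
If you drive the script from your own tooling, one cautious pattern is to build the report-only command and the repair command separately, so that -f is never added by accident. The sketch below assembles an argument list from the options documented above; `build_command` is a hypothetical helper, and rejecting --cycles without --fix is an assumption based on the "(with --fix)" wording in the help text.

```python
def build_command(path=None, fix=False, cycles=None, skip_events=False):
    """Build a findposkeyerror argument list from the documented options."""
    if cycles is not None and not fix:
        # Assumption: the help text describes --cycles as "(with --fix)".
        raise ValueError("--cycles only applies together with --fix")
    command = ["findposkeyerror"]
    if path is not None:
        command += ["-p", path]
    if skip_events:
        command.append("-s")
    if fix:
        command.append("-f")
    if cycles is not None:
        command += ["-n", str(cycles)]
    return command


# Report-only scan first, then a separate repair pass.
print(" ".join(build_command()))
print(" ".join(build_command(fix=True, cycles=3)))
```

The returned list can be passed to `subprocess.call` on a host where findposkeyerror is installed; the sketch itself does not run the script.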

Example Script Output

[2016-02-18 13:37:45] Initializing findposkeyerror v2.0.0 (detailed log at /opt/zenoss/log/toolbox/findposkeyerror.log)
[2016-02-18 13:37:49] Examining items under the '/' path ():

[2016-02-18 13:38:17]  Cycle 1  | Items Scanned:       133148 | Errors:       0 |

[2016-02-18 13:38:17] Execution finished in 0:00:31
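
For automated checks, the per-cycle summary line in the output above can be parsed to read the error count. This is a sketch whose pattern is modelled on that single sample line; other toolbox versions may format the line differently, and `parse_summary` is a hypothetical helper name.

```python
import re

# Pattern modelled on the "Cycle N | Items Scanned: ... | Errors: ... |" sample line.
SUMMARY = re.compile(
    r"Cycle\s+(?P<cycle>\d+)\s*\|\s*Items Scanned:\s*(?P<scanned>\d+)"
    r"\s*\|\s*Errors:\s*(?P<errors>\d+)"
)


def parse_summary(line):
    """Return cycle, items scanned, and error count, or None if no match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


line = "[2016-02-18 13:38:17]  Cycle 1  | Items Scanned:       133148 | Errors:       0 |"
print(parse_summary(line))
```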

Comments

  • Jason

    Is this typical?
    2015-01-28 09:04:50,294 INFO findposkeyerror: Initializing findposkeyerror
    2015-01-28 09:04:50,294 INFO findposkeyerror: Command line options: {'debug': True, 'path': '/', 'fix': False, 'unlimitedram': False}
    2015-01-28 09:04:50,295 DEBUG findposkeyerror: Acquired 'zenoss.toolbox' execution lock
    2015-01-28 09:04:53,477 DEBUG findposkeyerror: ZenScriptBase connection obtained
    2015-01-28 09:04:53,486 INFO findposkeyerror: Examining items under the '/' path (<Application at >)
    2015-01-28 09:04:55,018 WARNING findposkeyerror: AttributeError: jobs on relationship 'jobs' of /zport/dmd/JobManager
    2015-01-28 09:13:30,520 INFO findposkeyerror: findposkeyerror completed in 520.23 seconds
    2015-01-28 09:13:30,522 INFO findposkeyerror: ###########################################################

  • Brian Bibeault

    Jason, you have, it seems, a single issue with /zport/dmd/JobManager. I would try executing "findposkeyerror -f" to see if the script can fix your issue; if not, I would consult the community forums to see if a solution to that issue exists.

  • Jason

    The script does not fix the issue. I posted to the community forum over a month ago, which is why I tried this post.

  • Brian Bibeault

    Jason, where is your forum post? Can you include a link, please?

  • Jason
  • Vijay Bandapally

    I am using Zenoss 4.2.5 and got one DANGLING REFERENCE while running zodbscan. How do I fix this?

    2017-11-14 13:35:35,304 INFO zodbscan: Initializing zodbscan (version 2.0.0)
    2017-11-14 13:35:35,304 INFO zodbscan: Command line options: {'debug': True, 'skipEvents': False, 'tmpdir': '/tmp'}
    2017-11-14 13:35:35,304 DEBUG zodbscan: Acquired 'zenoss.toolbox' execution lock
    2017-11-14 13:35:36,172 INFO zodbscan: Examining 3365014 items in zodb database
    2017-11-14 13:44:10,978 CRITICAL zodbscan: DANGLING REFERENCE (POSKeyError) FOUND:
    PATH: 10.20.0.213
    TYPE:
    OID: 0x001bb9a5 '\x00\x00\x00\x00\x00\x1b\xb9\xa5' 1816997
    Refers to a missing object:
    NAME: macaddresses
    TYPE:
    OID: 0x0036b1ba '\x00\x00\x00\x00\x006\xb1\xba' 3584442
    2017-11-14 13:55:28,694 INFO zodbscan: 1 Dangling References were detected
    2017-11-14 13:55:28,694 INFO zodbscan: zodbscan completed in 1193.39 seconds
    2017-11-14 13:55:28,694 INFO zodbscan: ############################################################
