Feature #8107

Improved Node Synchronization Feedback - Leveraging MNRead.synchronizationFailed()

Added by Monica Ihli almost 7 years ago. Updated over 6 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Architecture Design
Target version:
Start date: 2017-06-06
Due date:
% Done: 0%
Milestone: None
Product Version: *
Story Points:
Sprint:

Description

MN operators would benefit from clearer feedback from the CN so that they know when synchronization fails for a PID.

  • GMN - currently logs MNRead.synchronizationFailed() requests received as part of the regular GMN logs.
  • Metacat - also currently just logs MNRead.synchronizationFailed() requests as part of regular logs.
  • One solution might be to simply log these failures in a separate file to make it easier for node operators to identify when something fails to sync.
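
A minimal sketch of the separate-file idea, using only Python's standard `logging` module. The logger name `sync_failures`, the helper names, and the record format are assumptions made for illustration; this is not how GMN or Metacat are currently wired.

```python
import logging


def build_sync_failure_logger(path):
    """Return a logger that writes only sync-failure records to `path`.

    The logger name "sync_failures" is hypothetical, chosen for this sketch.
    """
    logger = logging.getLogger("sync_failures")
    logger.setLevel(logging.WARNING)
    # Keep these records out of the regular application log.
    logger.propagate = False
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def record_sync_failure(logger, pid, cause):
    """Write one line per failed PID so operators can grep the file."""
    logger.warning("sync failed pid=%s cause=%s", pid, cause)
```

A node would call `build_sync_failure_logger("sync_failures.log")` once at startup and `record_sync_failure(...)` from wherever it handles MNRead.synchronizationFailed(). Setting `propagate = False` is a design choice; leaving it `True` would keep the failures in the regular log as well.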

Additionally, we should confirm what information is actually sent back to MNs with MNRead.synchronizationFailed().

History

#1 Updated by Rob Nahf almost 7 years ago

  • Category changed from d1_synchronization to Architecture Design
  • Assignee deleted (Robert Waltz)

This is a good feature request. We currently communicate with the MN itself via synchronizationFailed, but this has long been seen as insufficient. Logging sync failures might be a good start, but it still requires MN operators to log in to a system to see how things are going. Would MN operators really be disciplined enough to log on to a portal, for example, to see whether anything went wrong?

We have the MN operator's email address; perhaps we should send daily digests of failures? (Submit failures to a messaging queue, then have a scheduled consumer read from the queue and send the email daily.)
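
A rough sketch of the queue-then-digest idea, using only the Python standard library. The in-process `queue.Queue`, the `(node_id, pid, cause)` tuple shape, the `operator_emails` mapping, and the sender address are all assumptions for illustration; a real deployment would more likely use an external message broker and the CN's own node registry.

```python
import queue
from collections import defaultdict
from email.message import EmailMessage


def drain_failures(q):
    """Pull every pending (node_id, pid, cause) tuple off the queue without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


def build_digests(failures, operator_emails, sender="noreply@example.org"):
    """Group failures by node and build one digest email per node.

    `operator_emails` is a hypothetical mapping of node id -> contact address.
    """
    by_node = defaultdict(list)
    for node_id, pid, cause in failures:
        by_node[node_id].append((pid, cause))

    messages = []
    for node_id, entries in sorted(by_node.items()):
        to_addr = operator_emails.get(node_id)
        if to_addr is None:
            continue  # no known contact for this node; skip it
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = to_addr
        msg["Subject"] = f"{len(entries)} synchronization failure(s) for {node_id}"
        msg.set_content("\n".join(f"{pid}: {cause}" for pid, cause in entries))
        messages.append(msg)
    return messages
```

A scheduled job (cron, for instance) would call `drain_failures` once a day and hand each resulting message to `smtplib.SMTP.send_message`. Building the messages separately from sending them keeps the grouping logic testable without a mail server.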

#2 Updated by Monica Ihli almost 7 years ago

An example of GMN's logs after a synchronize call for a bad pid:

2017-06-13 17:16:35 WARNING django.request base 26457 140507107538688 Not Found: /mn/v2/meta/electricsquirrel
2017-06-13 17:16:35 ERROR root external 26458 140507107538688 Received notification of synchronization error from CN but was unable to deserialize the DataONE Exception passed by the CN.
Exception passed by CN: <?xml version="1.0" encoding="UTF-8"?>

Synchronization task of [PID::] electricsquirrel [::PID] failed. Cause: NotFound: Unknown identifier. id="electricsquirrel"

Exception when deserializing: Deserialization failed with exception:
Traceback (most recent call last):
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/d1_common/types/exceptions.py", line 61, in deserialize
dataone_exception_xml
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/d1_common/types/generated/dataoneErrors.py", line 65, in CreateFromDocument
saxer.parse(io.BytesIO(xmld))
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 110, in parse
xmlreader.IncrementalParser.parse(self, source)
File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
self.feed(buffer)
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 213, in feed
self.parser.Parse(data, isFinal)
File "/usr/lib/python2.7/xml/sax/expatreader.py", line 354, in start_element_ns
AttributesNSImpl(newattrs, qnames))
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/pyxb/binding/saxer.py", line 370, in startElementNS
binding_object = this_state.startBindingElement(type_class, new_object_factory, element_decl, attrs)
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/pyxb/binding/saxer.py", line 207, in startBindingElement
self.__constructElement(new_object_factory, attrs)
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/pyxb/binding/saxer.py", line 135, in __constructElement
self.__bindingInstance._setAttribute(attr_en, attrs.getValue(attr_name))
File "/var/local/dataone/gmn_venv/local/lib/python2.7/site-packages/pyxb/binding/basis.py", line 2251, in _setAttribute
raise pyxb.UnrecognizedAttributeError(type(self), attr_en, self)
UnrecognizedAttributeError: Attempt to reference unrecognized attribute pid in type

On input:
<?xml version="1.0" encoding="UTF-8"?>

Synchronization task of [PID::] electricsquirrel [::PID] failed. Cause: NotFound: Unknown identifier. id="electricsquirrel"
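
Since deserialization of the full exception fails here, one possible fallback is to pull the PID and cause out of the description text itself. The sketch below assumes the description always follows the `Synchronization task of [PID::] ... [::PID] failed. Cause: ...` pattern seen in this one log sample; whether the CN uses that format consistently has not been confirmed. The function name is hypothetical.

```python
import re

# Pattern inferred from the single log sample above; treat it as an assumption.
FAILURE_RE = re.compile(
    r"Synchronization task of \[PID::\]\s*(?P<pid>.+?)\s*\[::PID\] failed\."
    r"\s*Cause:\s*(?P<cause>.*)",
    re.DOTALL,
)


def parse_failure_description(text):
    """Return {'pid': ..., 'cause': ...} or None if the text does not match."""
    match = FAILURE_RE.search(text)
    if match is None:
        return None
    return {"pid": match.group("pid"), "cause": match.group("cause").strip()}
```

Applied to the description in the log above, this returns the PID `electricsquirrel` and the cause string beginning with `NotFound:`. It is a stopgap for readable logging, not a replacement for fixing the deserialization error.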

#3 Updated by Dave Vieglais over 6 years ago

  • Target version set to CCI-2.4.0
