Bug #8850

Updated by Rob Nahf over 4 years ago

Branch D1_CN_INDEX_PROCESSOR_v2.3 introduced a new method, HttpService.getDocumentBySeriesId, which RdfXmlSubprocessor uses to get the head of a series.

It is buggy in that it determines the head of the series based only on the value of the obsoletedBy field, and in the index more than one Solr document can fit that criterion. The simple case (one that is likely to occur more often with multi-threaded indexing) is when the systemMetadata update that accompanies an update of an object is processed at some time after the update itself is indexed. In that case, both the update and the original are returned from Solr, and the client method simply returns whichever one happens to be first in the list.
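The ambiguity described above can be illustrated with a small sketch. It is not the HttpService code and not a fix adopted in any branch. The class SeriesHeadSketch, its nested Doc type, and the method pickHead are hypothetical names, and the sketch assumes each Solr document in the series also exposes an obsoletes value. It treats a document as the head only if its obsoletedBy is empty and no other document in the series claims to obsolete it, and it reports "no unambiguous head" rather than returning whichever candidate comes first.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class SeriesHeadSketch {

    /** Hypothetical, simplified view of one Solr document in a series. */
    static final class Doc {
        final String id;
        final String obsoletes;   // id this document replaces, or null
        final String obsoletedBy; // id that replaces this document, or null

        Doc(String id, String obsoletes, String obsoletedBy) {
            this.id = id;
            this.obsoletes = obsoletes;
            this.obsoletedBy = obsoletedBy;
        }
    }

    /**
     * Returns the head of the series only when exactly one document
     * qualifies; returns an empty Optional when none or several do.
     */
    static Optional<Doc> pickHead(List<Doc> seriesDocs) {
        // Collect every id that some document in the series says it obsoletes.
        Set<String> obsoleted = new HashSet<>();
        for (Doc d : seriesDocs) {
            if (d.obsoletes != null) {
                obsoleted.add(d.obsoletes);
            }
        }

        Doc head = null;
        for (Doc d : seriesDocs) {
            // An empty obsoletedBy alone is not enough: the original's
            // systemMetadata update may not have been indexed yet.
            boolean candidate = d.obsoletedBy == null && !obsoleted.contains(d.id);
            if (!candidate) {
                continue;
            }
            if (head != null) {
                return Optional.empty(); // more than one candidate: ambiguous
            }
            head = d;
        }
        return Optional.ofNullable(head);
    }
}
```

This only helps when the update carries a populated obsoletes value. A series in which neither field is populated still yields several candidates, which the sketch reports as an empty result instead of guessing.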

The v2.4 indexer, which uses "total relationship enumeration," will not use this method, so fixing this bug is only important if the 2.3 logic will be maintained.

If it will be maintained, it's important to note that the DataONE spec does not require the obsoletedBy field to be populated, although Metacat does populate it. Mutable MemberNodes may never update the obsoletedBy field of objects that are not at the head. Logic similar to that implemented for resolve should be considered for these cases.

