Project

General

Profile

Bug #7318

Indexing failure with ObjectPath not found / Was: mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs

Added by Mark Servilla over 8 years ago. Updated almost 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Metacat
Target version:
-
Start date:
2015-08-26
Due date:
% Done:

100%

Milestone:
None
Product Version:
*
Story Points:
Sprint:

Description

A set of urn:node:mnTestAEKOS PIDs have failed to be indexed by Solr in the cn-stage environment because their object paths are not present. These objects appear to have been successfully synchronized, but fail to be indexed (see below for an example):

[ INFO] 2015-08-26 17:30:26,510 (IndexTaskProcessor:getNextIndexTask:155) Start
of indexing pid: ltern.21.28
[ INFO] 2015-08-26 17:30:26,583 (IndexTaskProcessor:isObjectPathReady:263) Objec
t path for pid: ltern.21.28 is not available. Task will be retried.
[ INFO] 2015-08-26 17:30:26,588 (IndexTaskProcessor:getNextIndexTask:164) Task f
or pid: ltern.21.28 not processed.

No clear indication of an error is provided in /var/metacat/logs, /var/lib/tomcat7/logs, or /var/log/dataone/index.

The full set of meaningful non-indexed PIDs is attached - AEKOS-not-indexed.txt.

AEKOS-not-indexed.txt Magnifier (2.52 KB) Mark Servilla, 2015-08-26 20:13

opnf.txt Magnifier (13.8 KB) Mark Servilla, 2015-10-09 21:27


Related issues

Related to Member Nodes - MNDeployment #6544: TERN Australia (powered by AEKOS) Operational 2015-03-20
Related to Member Nodes - MNDeployment #6957: NRDC - Nevada Research Data Center Operational 2015-03-25

History

#1 Updated by Mark Servilla over 8 years ago

  • Subject changed from mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PID to mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs

#2 Updated by Rob Nahf over 8 years ago

  • Assignee set to Rob Nahf

Is the metacat deployed to stage using the v2 dataone libclient?

ObjectPathMap.load(Identiifer) method relies on ReplicationService.getDocumentInfoMap() to function, and swallows all exceptions thrown from that service method, logging those exceptions only when log.debug is enabled.

My hunch is that there is a problem with ReplicationService.getDocumentInfoMap(). Code for ObjectPathMap has been stable since 2012. ReplicationService has had many commits in 2014-15, the latest one in May, 2015. A lot related to use of new HttpClient.

#3 Updated by Rob Nahf over 8 years ago

we should probably enable debugging on ObjectPathMap class to check if something is wrong with ReplicationService.

And then refactor the logging to Info or Warn, because we're relying on a service outside the package. There's no logging in ReplicationService....

/**
 *  Implementation of hazelcast MapLoader interface method.
 *  For the provided Identifier (as key), returns the path to the
 *  document on the local filesystem.  Returns null if it can't 
 *  create the path. 
 */
@Override
public String load(Identifier key) 
{

    String docid = null;
    String path = null;
    try {
        docid = im.getLocalId(key.getValue());
        path = pathToDocid(docid);          
    } catch (Exception e) {
        if (logMetacat.isDebugEnabled()) {
                         e.printStackTrace();
                   }
                   return null;
    }
    return path;
}

#4 Updated by Dave Vieglais over 8 years ago

#5 Updated by Mark Servilla over 8 years ago

This same issue is affecting urn:node:mnTestNRDC1 on cn-stage.test.dataone.org:

[ INFO] 2015-09-28 08:06:13,601 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_6
0326--v1.xml
[ INFO] 2015-09-28 08:06:13,707 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60
326--v1.xml is not available. Task will be retried.
[ INFO] 2015-09-28 08:06:13,711 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.
xml not processed.

#6 Updated by Mark Servilla over 8 years ago

Attempted re-indexing of two mnTestNRDC1 objects:

NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml

NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml

Indexing again failed due to "Object path ... is not available" error.

[ INFO] 2015-10-09 21:10:32,942 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,854 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,861 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml not processed.
[ INFO] 2015-10-09 21:10:33,865 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,939 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,945 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml not processed.

#7 Updated by Mark Servilla over 8 years ago

Performing a simple grep on cn-stage.test.dataone.org (cn-stage-ucsb-1) for "Object path for pid" (unique string for this error) in /var/log/dataone/index/cn-index-processor-daemon.log.* reveals 63 occurrences from both GMN and Metacat MNs in less than a single day - see attached file. I have not verified existence of complete synchronization for all listed objects.

#8 Updated by Laura Moyers over 8 years ago

#9 Updated by Laura Moyers over 8 years ago

  • Subject changed from mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs to Indexing failure with ObjectPath not found / Was: mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs

NRDC (in stage) has been identified as having this problem. Content is successfully synchronized but not indexed. It is possible that the autogen document isn't being generated, so indexing cannot occur. Sample messages follow:

[ INFO] 2015-10-09 21:10:32,942 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,854 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,861 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml not processed.
[ INFO] 2015-10-09 21:10:33,865 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,939 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,945 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml not processed.

This is occurring across environments and for multiple MNs, but it is not the same content which is not indexed in the different environments.

#10 Updated by Ben Leinfelder over 8 years ago

A number of us worked to track this down (Jing, Skye, Chris) and saw that we could index ltern.21.28 on cn-stage-ucsb-1 when the other two CNs were offline (Tomcat stopped). This is because the object had not [Metacat] replicated to the other two CNs in the environment, but the Object Path Map (Hazelcast) is shared amongst the three when they are all running. There's no telling which CN will 'lookup' the path for the given pid and we were getting the null path for this object (either unm or orc).
Then I noticed that many of the certificates had expired, particularly the ones used for Metacat replication from CN-to-CN. Chris generated new ones and I have installed them and begun a forced replication for all content in that environment. Not sure how long that will take, but new content that is harvested should behave as expected. Let us know if not!

#11 Updated by Rob Nahf almost 8 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

Assume there were no problems upon reindexing.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 14.8 MB)