Bug #7318
Indexing failure with ObjectPath not found / Was: mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs
100%
Description
A set of urn:node:mnTestAEKOS PIDs have failed to be indexed by Solr in the cn-stage environment because their object paths are not present. These objects appear to have been successfully synchronized, but fail to be indexed (see below for an example):
[ INFO] 2015-08-26 17:30:26,510 (IndexTaskProcessor:getNextIndexTask:155) Start
of indexing pid: ltern.21.28
[ INFO] 2015-08-26 17:30:26,583 (IndexTaskProcessor:isObjectPathReady:263) Objec
t path for pid: ltern.21.28 is not available. Task will be retried.
[ INFO] 2015-08-26 17:30:26,588 (IndexTaskProcessor:getNextIndexTask:164) Task f
or pid: ltern.21.28 not processed.
No clear indication of an error is provided in /var/metacat/logs, /var/lib/tomcat7/logs, or /var/log/dataone/index.
The full set of meaningful non-indexed PIDs is attached - AEKOS-not-indexed.txt.
Related issues
History
#1 Updated by Mark Servilla over 9 years ago
- Subject changed from mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PID to mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs
#2 Updated by Rob Nahf over 9 years ago
- Assignee set to Rob Nahf
Is the metacat deployed to stage using the v2 dataone libclient?
ObjectPathMap.load(Identiifer) method relies on ReplicationService.getDocumentInfoMap() to function, and swallows all exceptions thrown from that service method, logging those exceptions only when log.debug is enabled.
My hunch is that there is a problem with ReplicationService.getDocumentInfoMap(). Code for ObjectPathMap has been stable since 2012. ReplicationService has had many commits in 2014-15, the latest one in May, 2015. A lot related to use of new HttpClient.
#3 Updated by Rob Nahf over 9 years ago
we should probably enable debugging on ObjectPathMap class to check if something is wrong with ReplicationService.
And then refactor the logging to Info or Warn, because we're relying on a service outside the package. There's no logging in ReplicationService....
/** * Implementation of hazelcast MapLoader interface method. * For the provided Identifier (as key), returns the path to the * document on the local filesystem. Returns null if it can't * create the path. */ @Override public String load(Identifier key) { String docid = null; String path = null; try { docid = im.getLocalId(key.getValue()); path = pathToDocid(docid); } catch (Exception e) { if (logMetacat.isDebugEnabled()) { e.printStackTrace(); } return null; } return path; }
#4 Updated by Dave Vieglais over 9 years ago
- Related to MNDeployment #6544: TERN Australia (powered by AEKOS) added
#5 Updated by Mark Servilla about 9 years ago
This same issue is affecting urn:node:mnTestNRDC1 on cn-stage.test.dataone.org:
[ INFO] 2015-09-28 08:06:13,601 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_6
0326--v1.xml
[ INFO] 2015-09-28 08:06:13,707 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60
326--v1.xml is not available. Task will be retried.
[ INFO] 2015-09-28 08:06:13,711 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.
xml not processed.
#6 Updated by Mark Servilla about 9 years ago
Attempted re-indexing of two mnTestNRDC1 objects:
NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml¶
NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml¶
Indexing again failed due to "Object path ... is not available" error.
[ INFO] 2015-10-09 21:10:32,942 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,854 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,861 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml not processed.
[ INFO] 2015-10-09 21:10:33,865 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml
[ INFO] 2015-10-09 21:10:33,939 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml is not available. Task will be retried.
[ INFO] 2015-10-09 21:10:33,945 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml not processed.
#7 Updated by Mark Servilla about 9 years ago
- File opnf.txt added
Performing a simple grep on cn-stage.test.dataone.org (cn-stage-ucsb-1) for "Object path for pid" (unique string for this error) in /var/log/dataone/index/cn-index-processor-daemon.log.* reveals 63 occurrences from both GMN and Metacat MNs in less than a single day - see attached file. I have not verified existence of complete synchronization for all listed objects.
#8 Updated by Laura Moyers about 9 years ago
- Related to MNDeployment #6957: NRDC - Nevada Research Data Center added
#9 Updated by Laura Moyers about 9 years ago
- Subject changed from mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs to Indexing failure with ObjectPath not found / Was: mnTestAEKOS PIDs fail to be indexed by Solr in cn-stage due to lack of "Object Path" for the PIDs
NRDC (in stage) has been identified as having this problem. Content is successfully synchronized but not indexed. It is possible that the autogen document isn't being generated, so indexing cannot occur. Sample messages follow:
[ INFO] 2015-10-09 21:10:32,942 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml [ INFO] 2015-10-09 21:10:33,854 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml is not available. Task will be retried. [ INFO] 2015-10-09 21:10:33,861 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_04_4241_60326--v1.xml not processed. [ INFO] 2015-10-09 21:10:33,865 (IndexTaskProcessor:getNextIndexTask:155) Start of indexing pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml [ INFO] 2015-10-09 21:10:33,939 (IndexTaskProcessor:isObjectPathReady:263) Object path for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml is not available. Task will be retried. [ INFO] 2015-10-09 21:10:33,945 (IndexTaskProcessor:getNextIndexTask:164) Task for pid: NRDC_NCCP_SCIENCE_METADATA_Spring1_Met_OneMin_2011_05_4241_60326--v1.xml not processed.
This is occurring across environments and for multiple MNs, but it is not the same content which is not indexed in the different environments.
#10 Updated by Ben Leinfelder about 9 years ago
A number of us worked to track this down (Jing, Skye, Chris) and saw that we could index ltern.21.28 on cn-stage-ucsb-1 when the other two CNs were offline (Tomcat stopped). This is because the object had not [Metacat] replicated to the other two CNs in the environment, but the Object Path Map (Hazelcast) is shared amongst the three when they are all running. There's no telling which CN will 'lookup' the path for the given pid and we were getting the null path for this object (either unm or orc).
Then I noticed that many of the certificates had expired, particularly the ones used for Metacat replication from CN-to-CN. Chris generated new ones and I have installed them and begun a forced replication for all content in that environment. Not sure how long that will take, but new content that is harvested should behave as expected. Let us know if not!
#11 Updated by Rob Nahf over 8 years ago
- % Done changed from 0 to 100
- Status changed from New to Closed
Assume there were no problems upon reindexing.