Project

General

Profile

Bug #8361

index processor throws unhandled NPE if task systemMetadata is null

Added by Rob Nahf almost 4 years ago. Updated over 3 years ago.

Status:
New
Priority:
Normal
Assignee:
Category:
d1_indexer
Target version:
Start date:
2018-02-12
Due date:
% Done:

0%

Milestone:
None
Product Version:
*
Story Points:
Sprint:

Description

Seen in STAGE, from among a group of task left in the 'IN PROCESS' state:

[ INFO] 2018-02-12 16:52:10,292 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: urn:uuid:eff10cb2-e454-4bc8-acc2-0927488621b0
[ERROR] 2018-02-12 16:52:10,311 (IndexTask:unMarshalSystemMetadata:221) 
org.dataone.exceptions.MarshallingException
    at org.dataone.service.util.TypeMarshaller.unmarshalTypeFromStream(TypeMarshaller.java:299)
    at org.dataone.cn.index.task.IndexTask.unMarshalSystemMetadata(IndexTask.java:213)
    at org.dataone.cn.index.task.IndexTask.isArchived(IndexTask.java:251)
    at org.dataone.cn.index.task.IndexTask.isDeleteTask(IndexTask.java:265)
    at org.dataone.cn.index.processor.IndexTaskProcessor.getNextIndexTask(IndexTaskProcessor.java:612)
    at org.dataone.cn.index.processor.IndexTaskProcessor.processIndexTaskQueue(IndexTaskProcessor.java:184)
    at org.dataone.cn.index.processor.IndexTaskProcessorJob.execute(IndexTaskProcessorJob.java:57)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: javax.xml.bind.UnmarshalException
 - with linked exception:
[org.xml.sax.SAXParseException; Premature end of file.]
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:335)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:563)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:249)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:214)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:157)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:204)
    at org.dataone.service.util.TypeMarshaller.unmarshalTypeFromStream(TypeMarshaller.java:297)
    ... 8 more
Caused by: org.xml.sax.SAXParseException; Premature end of file.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:243)
    ... 12 more
[ERROR] 2018-02-12 16:52:10,312 (JobRunShell:run:222) Job d1-cn-index-processor.d1-index-processor-job threw an unhandled Exception: 
java.lang.NullPointerException
[ERROR] 2018-02-12 16:52:10,313 (ErrorLogger:schedulerError:2360) Job (d1-cn-index-processor.d1-index-processor-job threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception. [See nested exception: java.lang.NullPointerException]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:224)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: java.lang.NullPointerException
[ WARN] 2018-02-12 16:54:08,196 (IndexTaskProcessorJob:execute:55) processing job [1806183035/org.dataone.cn.index.processor.IndexTaskProcessorJob@15d15ed9] executing index task with processor [org.dataone.cn.index.processor.IndexTaskProcessor@20621d44]
[ INFO] 2018-02-12 16:54:09,567 (IndexTaskProcessor:logProcessorLoad:245) new tasks:375, tasks previously failed: 1660

and this in postgres:

d1-index-queue=# select * from index_task where sysmetadata = '';
    id    | datesysmetamodified | deleted |               formatid                | nextexecution | objectpath |                      pid                      | priority |   status   | sysmetadata | taskmodifieddate | trycount | version 
----------+---------------------+---------+---------------------------------------+---------------+------------+-----------------------------------------------+----------+------------+-------------+------------------+----------+---------
 14083132 |       1452029399937 | f       | text/plain                            | 1518032957157 |            | urn:uuid:eff10cb2-e454-4bc8-acc2-0927488621b0 |        1 | IN PROCESS |             |    1518454330288 |        4 |       5
 14083818 |       1452029390440 | f       | eml://ecoinformatics.org/eml-2.1.1    | 1518032959192 |            | urn:uuid:0f327426-3e97-428f-810e-44a83ec92047 |        1 | IN PROCESS |             |    1518454449880 |        4 |       5
 14085245 |       1452032067206 | f       | text/plain                            | 1518032963639 |            | urn:uuid:7a31a8e1-54c1-4d10-9c7d-7cf969db4015 |        1 | IN PROCESS |             |    1518454569874 |        4 |       5
 14087397 |       1452032061198 | f       | text/csv                              | 1518032967945 |            | urn:uuid:0d07b6f5-4299-43e5-a1b8-aa37b259e51f |        1 | IN PROCESS |             |    1518454690716 |        4 |       5
 14087857 |       1452032064477 | f       | eml://ecoinformatics.org/eml-2.1.1    | 1518032968646 |            | urn:uuid:56e25ff0-dc85-4441-a4c2-74b9955cc43d |        1 | IN PROCESS |             |    1518454809882 |        4 |       5
 14088344 |       1452032063246 | f       | text/csv                              | 1518032969414 |            | urn:uuid:2fb13ee6-fb8a-4eea-b9ee-f52e85b35a7a |        1 | IN PROCESS |             |    1518454930608 |        4 |       5
 14092345 |       1452032069944 | f       | http://www.openarchives.org/ore/terms | 1518032978246 |            | urn:uuid:5429c13d-db3e-4997-985a-eee1273163d7 |        3 | IN PROCESS |             |    1518455322387 |        4 |       5
(7 rows)

The missing systemMetadata causes a parse error that in turn causes a Marshalling Exception, which I believe caused the NPE. Need to investigate the source code to see if this is the case.

The index processor should ideally handle the lack of sysmetadata more gracefully.

History

#1 Updated by Rob Nahf almost 4 years ago

  • Assignee changed from Dave Vieglais to Rob Nahf
  • Description updated (diff)

#2 Updated by Rob Nahf almost 4 years ago

  • Description updated (diff)

#3 Updated by Rob Nahf almost 4 years ago

Stage is running 2.3.6 code. The tasks are stuck in the 'IN PROCESS' status, so try reprocessing them after 2.3.7 deployment, to see if the issue was addressed.

#4 Updated by Dave Vieglais over 3 years ago

  • Sprint changed from CCI-2.4.0 to CCI-2.3.9

#5 Updated by Dave Vieglais over 3 years ago

  • Target version changed from CCI-2.3.8 to CCI-2.3.9

#6 Updated by Rob Nahf over 3 years ago

  • Target version changed from CCI-2.3.9 to CCI-2.3.10

2.3.9 is deployed...

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 14.8 MB)