Bug #7870

Metacat considers FGDC documents invalid from member node SEAD

Added by Jing Tao over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Metacat
Target version:
Start date: 2016-08-25
Due date:
% Done: 100%
Milestone: None
Product Version: *
Story Points:
Sprint:

Description

When we harvested documents from SEAD, we got this error: Fatal processing error.

I used a curl command to create an object on a Metacat instance through the DataONE API and got the same error, with more detail:
metacat 20160825-11:43:19: [WARN]: MetacatHandler.handleInsertOrUpdateAction - General error when writing eml document to the database: Fatal processing error. [edu.ucsb.nceas.metacat.MetacatHandler]
org.xml.sax.SAXException: Fatal processing error.
org.xml.sax.SAXParseException; systemId: http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
at edu.ucsb.nceas.metacat.DBSAXHandler.fatalError(DBSAXHandler.java:736)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.scanExternalID(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl.scanDoctypeDecl(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(Unknown Source)
at org.apache.xerces.impl.xs.opti.SchemaParsingConfig.parse(Unknown Source)
at org.apache.xerces.impl.xs.opti.SchemaDOMParser.parse(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.parseSchema(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaLoader.loadSchema(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.findSchemaGrammar(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.handleStartElement(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.startElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at edu.ucsb.nceas.metacat.DocumentImpl.write(DocumentImpl.java:2903)
at edu.ucsb.nceas.metacat.DocumentImpl.write(DocumentImpl.java:2667)
at edu.ucsb.nceas.metacat.DocumentImplWrapper.write(DocumentImplWrapper.java:63)
at edu.ucsb.nceas.metacat.MetacatHandler.handleInsertOrUpdateAction(MetacatHandler.java:1807)
at edu.ucsb.nceas.metacat.dataone.D1NodeService.insertOrUpdateDocument(D1NodeService.java:1385)
at edu.ucsb.nceas.metacat.dataone.D1NodeService.create(D1NodeService.java:455)
at edu.ucsb.nceas.metacat.dataone.MNodeService.create(MNodeService.java:616)
at edu.ucsb.nceas.metacat.restservice.v2.MNResourceHandler.putObject(MNResourceHandler.java:1535)
at edu.ucsb.nceas.metacat.restservice.v2.MNResourceHandler.handle(MNResourceHandler.java:289)
at edu.ucsb.nceas.metacat.restservice.D1RestServlet.doPost(D1RestServlet.java:84)

Attached are the science metadata file and the system metadata file.

fgdc.xml (2.65 KB) Jing Tao, 2016-08-25 18:52

sysmeta-fgdc.xml (1.01 KB) Jing Tao, 2016-08-25 18:53

History

#1 Updated by Jing Tao over 7 years ago

If we cached the FGDC schema files on Metacat and changed xsi:noNamespaceSchemaLocation="http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd" to the local value xsi:noNamespaceSchemaLocation="http://valley.duckdns.org/metacat/fgdc-std-001/fgdc-std-001-1998.xsd", it worked.
The main FGDC schema file includes other schema files. It seems to me that Xerces somehow can't find them remotely but can find them locally.
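
To illustrate the idea of validating against a locally cached schema copy, here is a minimal sketch using the standard javax.xml.validation API. It is not Metacat's actual code path (the stack trace above goes through DBSAXHandler and a SAX parser); the class name LocalSchemaCheck, the method isValid, and the tiny stand-in schema are all hypothetical and only take the place of the real FGDC schema files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates XML against a schema file stored on local disk,
// so no remote schema download is involved.
public class LocalSchemaCheck {

    // Stand-in schema for the demo; NOT the FGDC schema.
    public static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"metadata\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"title\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Returns true if the XML text validates against the schema file at schemaPath. */
    public static boolean isValid(String xml, Path schemaPath) throws IOException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Load the schema from the local file rather than from a remote URL.
            Schema schema = factory.newSchema(schemaPath.toFile());
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            System.err.println("Validation failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Write the stand-in schema to a temp file to play the role of the cached copy.
        Path cached = Files.createTempFile("cached-schema", ".xsd");
        Files.write(cached, SAMPLE_XSD.getBytes(StandardCharsets.UTF_8));
        System.out.println(isValid("<metadata><title>t</title></metadata>", cached));
        System.out.println(isValid("<metadata><wrong/></metadata>", cached));
    }
}
```

For the real case, the path passed to isValid would point at the cached fgdc-std-001-1998.xsd, with the schema files it includes stored next to it so that relative includes can resolve.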

#2 Updated by Jing Tao over 7 years ago

I tried upgrading Xerces to version 2.11.0 and setting the property "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation" explicitly, but neither worked.

#3 Updated by Jing Tao over 7 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

Matt found that the URL in xsi:noNamespaceSchemaLocation is redirected from http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd to https://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd. We suspected that Xerces can't download the schema because of the redirect, so we thought changing the value from http to https in this attribute would be a quick fix. Somehow the first test didn't work (we made some mistakes in testing, I believe).

Chris offered some Xerces validation code, and I modified it a little for testing. I found that the document was valid if the attribute value started with https; it gave the error (White spaces are required between publicId and systemId) if it started with http. So I believe changing http to https should work.

I did a fresh installation of Metacat 2.7.2 on my local machine and then used a curl command to successfully create the object with the attribute value starting with https. Then I created another object on dev.nceas:

https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/test-jing-11

So I believe we need to notify the operator of SEAD to change the value of xsi:noNamespaceSchemaLocation from http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd to https://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd.
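
A small sketch of how the redirect and the proposed rewrite could be checked from Java. One relevant JDK behavior: java.net.HttpURLConnection does not automatically follow a redirect that switches protocol (http to https), so a client that simply opens the http URL may receive the redirect response rather than the schema. Whether that is exactly what happened inside Xerces is not confirmed in this ticket. The class name RedirectProbe and its methods are hypothetical, and the live probe needs network access, so it only runs when --probe is passed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for inspecting a schema-location redirect.
public class RedirectProbe {

    /** True if the Location header value uses a different scheme than the requested URL. */
    public static boolean changesScheme(String requested, String location) {
        URI from = URI.create(requested);
        URI to = from.resolve(location); // also handles relative Location values
        return !from.getScheme().equalsIgnoreCase(to.getScheme());
    }

    /** Rewrites an http:// schema hint to https://; any other value is returned unchanged. */
    public static String toHttps(String schemaLocation) {
        String prefix = "http://";
        return schemaLocation.startsWith(prefix)
            ? "https://" + schemaLocation.substring(prefix.length())
            : schemaLocation;
    }

    /** Sends one HEAD request without following redirects; returns the Location header or null. */
    public static String redirectTarget(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(false);
        conn.setRequestMethod("HEAD");
        try {
            int code = conn.getResponseCode();
            return (code >= 300 && code < 400) ? conn.getHeaderField("Location") : null;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String hint = "http://www.fgdc.gov/metadata/fgdc-std-001-1998.xsd";
        System.out.println("Rewritten hint: " + toHttps(hint));
        if (args.length > 0 && args[0].equals("--probe")) {
            // Network call; the result depends on how the server is configured today.
            String target = redirectTarget(hint);
            System.out.println("Redirect target: " + target);
            if (target != null) {
                System.out.println("Switches scheme: " + changesScheme(hint, target));
            }
        }
    }
}
```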
