Story #8485

XMLSchemaService progressively builds massively long string by calling doRefresh

Added by Rob Nahf about 6 years ago. Updated about 6 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: Metacat
Target version:
Start date: 2018-03-06
Due date:
% Done: 100%
Story Points:

Description

In the static method createRegisteredNameSpaceAndLocationString() (lines 434-438), the static formatId_NamespaceLocationHash is appended to on every refresh, so its values grow progressively longer.

The value is then retrieved with getNameSpaceAndLocation(format_id) in MetacatHandler.handleInsertOrUpdateAction (line 1778) and passed to DocumentWrapper.write as an argument. After going through String.trim, it is ultimately set as a property of the XMLReader via parser.setProperty(EXTERNALSCHEMALOCATIONPROPERTY, schemaLocation);
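For reference, the last step of that path can be sketched as follows. This is a minimal illustration, not the Metacat code: the class and method names are hypothetical, and it assumes that EXTERNALSCHEMALOCATIONPROPERTY resolves to the standard Xerces external-schemaLocation property URI, which the issue does not confirm.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class SchemaLocationPropertyDemo {
    // Assumption: the constant in Metacat holds the Xerces property URI below.
    static final String EXTERNAL_SCHEMA_LOCATION =
            "http://apache.org/xml/properties/schema/external-schemaLocation";

    // Creates a reader and hands it the trimmed "namespace location ..." string.
    static XMLReader newReader(String schemaLocation) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();
            parser.setProperty(EXTERNAL_SCHEMA_LOCATION, schemaLocation.trim());
            return parser;
        } catch (Exception e) {
            throw new IllegalStateException("could not configure parser", e);
        }
    }

    // Reads the property back so the value the parser holds can be inspected.
    static String configuredSchemaLocation(String schemaLocation) {
        try {
            return (String) newReader(schemaLocation).getProperty(EXTERNAL_SCHEMA_LOCATION);
        } catch (Exception e) {
            throw new IllegalStateException("could not read property", e);
        }
    }

    public static void main(String[] args) {
        String value = "  http://example.org/ns file:/schemas/example.xsd ";
        System.out.println(configuredSchemaLocation(value));
    }
}
```

Whatever string the hash holds for a format ID is handed to the parser as is, so any growth in the hash value reaches every parse of that format.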

While processing the initial Pangaea corpus, this string grew to 100 MB in length. That probably contributed to the increasing amount of time it took to process Pangaea (and only Pangaea) metadata.

What is the reason behind appending new values to old, rather than replacing them?

Here is the problem code:

    // The hash table already has an entry for this formatId, so we append the new pair to the existing value.
    String value = formatId_NamespaceLocationHash.get(formatId);
    value += " " + xmlSchema.getFileNamespace() + " "
            + xmlSchema.getLocalFileUri();
    formatId_NamespaceLocationHash.put(formatId, value);
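The growth can be reproduced in isolation with a small sketch. Everything here is a hypothetical stand-in: the map plays the role of the static formatId_NamespaceLocationHash, refresh() imitates one pass that registers a single namespace/location pair, and the example URIs are invented.

```java
import java.util.HashMap;
import java.util.Map;

public class AppendGrowthDemo {
    // Stand-in for the static hash; it is never reset between refreshes.
    static final Map<String, String> hash = new HashMap<>();

    // Simplified stand-in for one refresh pass registering one pair.
    static void refresh(String formatId, String namespace, String location) {
        String pair = namespace + " " + location;
        if (!hash.containsKey(formatId)) {
            hash.put(formatId, pair);
        } else {
            // Same append pattern as the problem code.
            String value = hash.get(formatId);
            value += " " + pair;
            hash.put(formatId, value);
        }
    }

    // Length of the stored value after the given number of refreshes (at least 1).
    static int lengthAfter(int refreshes) {
        hash.clear(); // reset only so that each measurement starts clean
        for (int i = 0; i < refreshes; i++) {
            refresh("fmt", "http://example.org/ns", "file:/schemas/example.xsd");
        }
        return hash.get("fmt").length();
    }

    public static void main(String[] args) {
        System.out.println("after 1 refresh:     " + lengthAfter(1));
        System.out.println("after 1000 refreshes: " + lengthAfter(1000));
    }
}
```

Each refresh adds one more copy of the same pair plus a separator, so the value grows linearly with the number of refreshes even though the set of registered schemas has not changed.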

Related issues

Related to Infrastructure - Task #8514: deploy new Metacat / synchronization slows down because of increased cn.create times New 2018-03-22

History

#1 Updated by Jing Tao about 6 years ago

  • % Done changed from 0 to 100
  • Target version set to CCI-2.3.9
  • Status changed from New to Closed

The formatId_NamespaceLocationHash is now initialized at the beginning of the call, which fixes the bug. A new JUnit test method has been added as well.
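A minimal sketch of the approach described in this comment, under stated assumptions: it is not the actual Metacat patch, the class, method, and example URIs are hypothetical, and each schema is modeled as a plain {formatId, namespace, localFileUri} array. The hash is replaced with an empty one at the start of every rebuild, while pairs that share a format ID are still joined within a single pass.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RefreshResetDemo {
    // Stand-in for the static formatId_NamespaceLocationHash.
    static Map<String, String> hash = new HashMap<>();

    // Rebuilds the hash from scratch; each schema is {formatId, namespace, localFileUri}.
    static void rebuild(List<String[]> schemas) {
        hash = new HashMap<>(); // initialize at the beginning of the call
        for (String[] s : schemas) {
            String pair = s[1] + " " + s[2];
            // Several schemas may share a format ID, so join them within this pass.
            hash.merge(s[0], pair, (existing, added) -> existing + " " + added);
        }
    }

    // Value stored for "fmt" after the given number of refreshes (at least 1).
    static String valueAfter(int refreshes) {
        List<String[]> schemas = Arrays.asList(
                new String[] {"fmt", "http://example.org/a", "file:/schemas/a.xsd"},
                new String[] {"fmt", "http://example.org/b", "file:/schemas/b.xsd"});
        for (int i = 0; i < refreshes; i++) {
            rebuild(schemas);
        }
        return hash.get("fmt");
    }

    public static void main(String[] args) {
        System.out.println(valueAfter(1));
        System.out.println(valueAfter(50).equals(valueAfter(1)));
    }
}
```

A regression test in this spirit would call the refresh several times and check that the stored value is identical to the value after a single refresh.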

#2 Updated by Rob Nahf about 6 years ago

Not deployed to the CN yet. It will be part of the Metacat 2.9.0 release.

#3 Updated by Rob Nahf about 6 years ago

  • Target version changed from CCI-2.3.9 to CCI-2.3.10

#4 Updated by Rob Nahf about 6 years ago

  • Related to Task #8514: deploy new Metacat / synchronization slows down because of increased cn.create times added
