Story #8485
XMLSchemaService progressively builds massively long string by calling doRefresh
100%
Description
In the static method createRegisteredNameSpaceAndLocationString() (lines 434-438), the static formatId_NamespaceLocationHash is appended to on every refresh, so its values keep growing with each call to doRefresh.
The value is then retrieved via getNameSpaceAndLocation(format_id) by MetacatHandler.handleInsertOrUpdateAction (line 1778), passed into DocumentWrapper.write as an argument, run through a string trim, and ultimately set as a property of the XMLReader via parser.setProperty(EXTERNALSCHEMALOCATIONPROPERTY, schemaLocation);
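To illustrate the last step of that path, here is a minimal sketch of trimming a schema-location string and handing it to an XMLReader. It assumes that EXTERNALSCHEMALOCATIONPROPERTY refers to the Xerces external-schemaLocation property URI; the class and method names are hypothetical and are not taken from Metacat.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class SchemaLocationPropertyDemo {
    // Assumption: the constant named in the report maps to this Xerces property URI.
    static final String EXTERNAL_SCHEMA_LOCATION =
        "http://apache.org/xml/properties/schema/external-schemaLocation";

    // Trims the raw "namespace location ..." string, sets it on a fresh
    // XMLReader, and returns the value the reader reports back.
    static String applySchemaLocation(String raw) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();
            String schemaLocation = raw.trim();
            parser.setProperty(EXTERNAL_SCHEMA_LOCATION, schemaLocation);
            return (String) parser.getProperty(EXTERNAL_SCHEMA_LOCATION);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical namespace/location pair, padded with whitespace.
        System.out.println(applySchemaLocation("  http://example.org/ns/a /schemas/a.xsd  "));
    }
}
```

Note that trim only strips leading and trailing whitespace, so it does nothing to shrink a string that has accumulated repeated namespace/location pairs.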
While processing the initial Pangaea corpus, this string grew to 100 MB in length. This probably contributed to the increasing time it took to process Pangaea (and only Pangaea) metadata.
What is the reason behind appending new values to old, rather than replacing them?
Here is the problem code:
//the hash table already has it. We will attache the new pair to the exist value
String value = formatId_NamespaceLocationHash.get(formatId);
value += " "+ xmlSchema.getFileNamespace() + " " + xmlSchema.getLocalFileUri();
formatId_NamespaceLocationHash.put(formatId, value);
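The growth can be reproduced in isolation. The following is a minimal sketch, not the Metacat code: the map stands in for formatId_NamespaceLocationHash, and the format id, namespaces, and file paths are made up for illustration. Each simulated refresh re-registers the same two schemas and appends them to the stored value, as the snippet above does.

```java
import java.util.HashMap;
import java.util.Map;

class AppendingRefreshDemo {
    // Stand-in for the static formatId_NamespaceLocationHash (hypothetical).
    static final Map<String, String> hash = new HashMap<>();

    // One refresh pass: every schema pair is appended to whatever is stored.
    static void refresh(String formatId, String[][] schemas) {
        for (String[] schema : schemas) {
            String pair = schema[0] + " " + schema[1];
            String value = hash.get(formatId);
            if (value == null) {
                hash.put(formatId, pair);
            } else {
                // Same pattern as the reported code: append to the existing value.
                hash.put(formatId, value + " " + pair);
            }
        }
    }

    // Runs the given number of refreshes from an empty map and returns
    // the length of the accumulated string.
    static int lengthAfter(int refreshes) {
        hash.clear();
        String[][] schemas = {
            {"http://example.org/ns/a", "/schemas/a.xsd"},
            {"http://example.org/ns/b", "/schemas/b.xsd"}
        };
        for (int i = 0; i < refreshes; i++) {
            refresh("example-format", schemas);
        }
        return hash.get("example-format").length();
    }

    public static void main(String[] args) {
        System.out.println("after 1 refresh:   " + lengthAfter(1));
        System.out.println("after 10 refreshes: " + lengthAfter(10));
    }
}
```

In this sketch the string length grows linearly with the number of refreshes even though the set of registered schemas never changes, which matches the behaviour described above.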
Related issues
History
#1 Updated by Jing Tao almost 7 years ago
- % Done changed from 0 to 100
- Target version set to CCI-2.3.9
- Status changed from New to Closed
The bug has been fixed: formatId_NamespaceLocationHash is now initialized at the beginning of the call, so values from earlier refreshes are no longer carried over. A new JUnit test method has been added as well.
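The committed change is not shown in this ticket. As a rough sketch of what initializing the hash at the beginning of the call could look like, assuming a plain map and made-up format ids, namespaces, and paths, the example below clears the map before rebuilding it, so repeated refreshes yield the same string.

```java
import java.util.HashMap;
import java.util.Map;

class ResettingRefreshDemo {
    // Stand-in for the static formatId_NamespaceLocationHash (hypothetical).
    static final Map<String, String> hash = new HashMap<>();

    // Rebuilds the map from scratch: the reset at the top is the sketched fix.
    static void refresh(String formatId, String[][] schemas) {
        hash.clear();
        for (String[] schema : schemas) {
            String pair = schema[0] + " " + schema[1];
            // Within a single pass, pairs for the same format id are still joined.
            hash.merge(formatId, pair, (oldValue, newPair) -> oldValue + " " + newPair);
        }
    }

    // Runs the given number of refreshes and returns the stored string.
    static String valueAfter(int refreshes) {
        String[][] schemas = {
            {"http://example.org/ns/a", "/schemas/a.xsd"},
            {"http://example.org/ns/b", "/schemas/b.xsd"}
        };
        for (int i = 0; i < refreshes; i++) {
            refresh("example-format", schemas);
        }
        return hash.get("example-format");
    }

    public static void main(String[] args) {
        // Both lines print the same string, regardless of the refresh count.
        System.out.println(valueAfter(1));
        System.out.println(valueAfter(10));
    }
}
```

A test along these lines, comparing the value after one refresh with the value after several, is one way to guard against the regression.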
#2 Updated by Rob Nahf almost 7 years ago
Not deployed to the CN yet. It will be part of the Metacat 2.9.0 release.
#3 Updated by Rob Nahf almost 7 years ago
- Target version changed from CCI-2.3.9 to CCI-2.3.10
#4 Updated by Rob Nahf almost 7 years ago
- Related to Task #8514: deploy new Metacat / synchronization slows down because of increased cn.create times added