DataONE Tasks (https://redmine.dataone.org/), updated 2017-09-26T00:19:13Z
Infrastructure - Task #8165: Re-factor origin field in isotc211 indexing component (https://redmine.dataone.org/issues/8165?journal_id=29177), 2017-09-26T00:19:13Z, Bryce Mecum (mecum@nceas.ucsb.edu)
<p>After looking at this email thread and some example documents, I decided to try simply making the XPath more selective about where in the document it pulls party information. This was suggested in the email thread. I made the change on dev.nceas, uploaded the two example documents linked in the email thread showing the incorrect behavior, reindexed them with the modified indexing bean, and the result was the correct behavior.</p>
<p>The patch is here: <a href="https://gist.github.com/amoeba/6670f0ceb8fb1f6cb7fdaf4d46534777">https://gist.github.com/amoeba/6670f0ceb8fb1f6cb7fdaf4d46534777</a></p>
<p>Before this change, the <code>origin</code> field was populated from every <code>gmd:CI_ResponsibleParty</code> (individuals and organizations) anywhere in the document with the role originator, author, PI, or owner. After the change, only those <code>gmd:CI_ResponsibleParty</code> elements that appear under the citation are included.</p>
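<p>The before/after behavior described above can be sketched as follows. This is a minimal illustration, not the patched bean: the actual XPath is in the gist linked above, and the sample document, the names in it, the role code values, and the helper functions <code>party_names</code>, <code>origin_before</code>, and <code>origin_after</code> are all assumptions made for the example. It uses the XPath subset in Python's <code>xml.etree.ElementTree</code>, so role filtering is done in Python rather than in the path expression.</p>

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Assumed CI_RoleCode values for "originator, author, PI, owner".
ORIGIN_ROLES = {"originator", "author", "principalInvestigator", "owner"}

# Hypothetical document: one party in the metadata contact, one under the citation.
SAMPLE = """
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:contact>
    <gmd:CI_ResponsibleParty>
      <gmd:organisationName><gco:CharacterString>Metadata Office</gco:CharacterString></gmd:organisationName>
      <gmd:role><gmd:CI_RoleCode codeListValue="author">author</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty>
  </gmd:contact>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:citedResponsibleParty>
            <gmd:CI_ResponsibleParty>
              <gmd:individualName><gco:CharacterString>Jane Example</gco:CharacterString></gmd:individualName>
              <gmd:role><gmd:CI_RoleCode codeListValue="originator">originator</gmd:CI_RoleCode></gmd:role>
            </gmd:CI_ResponsibleParty>
          </gmd:citedResponsibleParty>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
""".strip()


def party_names(parties):
    """Return individual/organisation names of parties whose role is an origin role."""
    names = []
    for party in parties:
        role = party.find("gmd:role/gmd:CI_RoleCode", NS)
        if role is None or role.get("codeListValue") not in ORIGIN_ROLES:
            continue
        for name_tag in ("gmd:individualName", "gmd:organisationName"):
            node = party.find(name_tag + "/gco:CharacterString", NS)
            if node is not None and node.text:
                names.append(node.text.strip())
    return names


def origin_before(root):
    # Old behavior: every responsible party anywhere in the document.
    return party_names(root.findall(".//gmd:CI_ResponsibleParty", NS))


def origin_after(root):
    # New behavior: only responsible parties nested under gmd:citation.
    return party_names(root.findall(".//gmd:citation//gmd:CI_ResponsibleParty", NS))


root = ET.fromstring(SAMPLE)
print("before:", origin_before(root))
print("after: ", origin_after(root))
```

<p>In this sketch the metadata contact is picked up by the unrestricted query but dropped once the query is anchored under <code>gmd:citation</code>.</p>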
Infrastructure - Task #8165: Re-factor origin field in isotc211 indexing component (https://redmine.dataone.org/issues/8165?journal_id=29268), 2017-11-21T20:08:26Z, Rob Nahf (rnahf@epscor.unm.edu)
<ul><li><strong>Category</strong> set to <i>d1_indexer</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>Bryce Mecum</i></li><li><strong>Target version</strong> set to <i>CCI-2.3.7</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>30</i></li></ul><p>The unit tests for the <code>origin</code> field in the isotc211 solr parser are failing. Either the starting science metadata or the tests' expected values need to be updated to conform to the tighter restrictions on the origin source fields.</p>
<p>[ERROR] Failures: <br>
[ERROR] SolrFieldIsotc211Test.testIsotc211DistributionInfoParsing:1126->BaseSolrFieldXPathTest.testXPathParsing:72->BaseSolrFieldXPathTest.compareFields:149 For field: origin<br>
Expected: iterable containing ["NOAA/NESDIS USA, 5200 Auth Rd, Camp Springs, MD, 20746"]<br>
but: item 0: was ""<br>
[ERROR] SolrFieldIsotc211Test.testIsotc211LooselyCoupledServiceSrvAndDistrib:1138->BaseSolrFieldXPathTest.testXPathParsing:72->BaseSolrFieldXPathTest.compareFields:149 For field: origin<br>
Expected: iterable containing ["Bob"]<br>
but: item 0: was ""<br>
[ERROR] SolrFieldIsotc211Test.testIsotc211Nodc2FieldParsing:1096->BaseSolrFieldXPathTest.testXPathParsing:72->BaseSolrFieldXPathTest.compareFields:149 For field: origin<br>
Expected: iterable containing ["NEODAAS"]<br>
but: item 0: was ""<br>
[ERROR] SolrFieldIsotc211Test.testTightlyCoupledServiceSrvOnly:1144->BaseSolrFieldXPathTest.testXPathParsing:72->BaseSolrFieldXPathTest.compareFields:149 For field: origin<br>
Expected: iterable containing ["UNM"]<br>
but: item 0: was ""</p>
<p>The corresponding source files are here:</p>
<p><a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/test/resources/org/dataone/cn/index/resources/d1_testdocs/isotc211/">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/test/resources/org/dataone/cn/index/resources/d1_testdocs/isotc211/</a></p>
Infrastructure - Task #8165: Re-factor origin field in isotc211 indexing component (https://redmine.dataone.org/issues/8165?journal_id=29269), 2017-11-22T18:36:43Z, Bryce Mecum (mecum@nceas.ucsb.edu)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Closed</i></li><li><strong>% Done</strong> changed from <i>30</i> to <i>100</i></li></ul><p>The unit tests are all fixed. It also turned out that the bean XPath (and therefore the unit tests) was wrong because it hardcoded a single element name (<code>gco:CharacterString</code>) where either of two elements is common (<code>gco:CharacterString</code> and <code>gmx:Anchor</code>). I changed the XPath and the tests accordingly.</p>
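<p>As a rough illustration of accepting either element, the sketch below collects the text of <code>gmd:organisationName</code> whether it is wrapped in <code>gco:CharacterString</code> or <code>gmx:Anchor</code>. It is not the committed XPath: the sample document, the organisation names, and the helper <code>text_values</code> are hypothetical. In XPath 1.0, one way to express the same idea is a child step such as <code>*[self::gco:CharacterString or self::gmx:Anchor]</code>; Python's <code>xml.etree.ElementTree</code> does not support the <code>self::</code> axis, so the check is done on the tag name instead.</p>

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
GMX = "http://www.isotc211.org/2005/gmx"

# The two wrapper elements that may carry the text value.
TEXT_TAGS = {f"{{{GCO}}}CharacterString", f"{{{GMX}}}Anchor"}

# Hypothetical fragment: one name as plain text, one as a linked anchor.
SAMPLE = """
<gmd:CI_Citation xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 xmlns:gmx="http://www.isotc211.org/2005/gmx"
                 xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:citedResponsibleParty>
    <gmd:CI_ResponsibleParty>
      <gmd:organisationName>
        <gco:CharacterString>Plain Org</gco:CharacterString>
      </gmd:organisationName>
    </gmd:CI_ResponsibleParty>
  </gmd:citedResponsibleParty>
  <gmd:citedResponsibleParty>
    <gmd:CI_ResponsibleParty>
      <gmd:organisationName>
        <gmx:Anchor xlink:href="https://example.org/org/1">Linked Org</gmx:Anchor>
      </gmd:organisationName>
    </gmd:CI_ResponsibleParty>
  </gmd:citedResponsibleParty>
</gmd:CI_Citation>
""".strip()


def text_values(parent):
    """Return non-empty text from gco:CharacterString or gmx:Anchor children."""
    return [
        child.text.strip()
        for child in parent
        if child.tag in TEXT_TAGS and child.text and child.text.strip()
    ]


root = ET.fromstring(SAMPLE)
names = []
for org in root.iter(f"{{{GMD}}}organisationName"):
    names.extend(text_values(org))
print(names)
```

<p>A path hardcoded to <code>gco:CharacterString</code> would return only the first name here, which matches the empty-value failures seen when a document uses <code>gmx:Anchor</code> instead.</p>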
<p>The change requires a reindex of all isotc211 content, but that is not a high enough priority to do now, as we're also considering changing this XPath again in <a href="https://redmine.dataone.org/issues/8189">https://redmine.dataone.org/issues/8189</a>.</p>
Infrastructure - Task #8165: Re-factor origin field in isotc211 indexing component (https://redmine.dataone.org/issues/8165?journal_id=29271), 2017-11-22T19:03:24Z, Rob Nahf (rnahf@epscor.unm.edu)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-5 status-1 priority-4 priority-default" href="/issues/8222">Task #8222</a>: reindex all isotc211 content in production to reflect final decisions from origin field mapping</i> added</li></ul>
Infrastructure - Task #8165: Re-factor origin field in isotc211 indexing component (https://redmine.dataone.org/issues/8165?journal_id=29275), 2017-11-22T19:09:03Z, Rob Nahf (rnahf@epscor.unm.edu)
<ul><li><strong>Target version</strong> changed from <i>CCI-2.3.7</i> to <i>CCI-2.4.0</i></li></ul><p>Bryce's fixes were made in trunk, so they would need to be copied to the 2.3 branch to be pulled into earlier releases. That doesn't look like a priority yet, as other tests are failing...</p>