DataONE Tasks: Issueshttps://redmine.dataone.org/https://redmine.dataone.org/favicon.ico2020-08-06T00:06:07ZDataONE Tasks
Redmine CN REST - Bug #8867 (New): CNCore.listChecksumAlgorithms() returns incorrect listhttps://redmine.dataone.org/issues/88672020-08-06T00:06:07ZMatthew Jonesjones@nceas.ucsb.edu
<p>The definition of the ChecksumAlgorithm type in SystemMetadata allows any checksum algorithm listed in the Library of Congress vocab. But the current CNCore.listChecksumAlgorithms() implementation only returns two, MD5 and SHA-1. Need to correct this to include the full list of supported algorithms (see <a href="http://id.loc.gov/vocabulary/preservation/cryptographicHashFunctions.html">http://id.loc.gov/vocabulary/preservation/cryptographicHashFunctions.html</a>).</p>
<p>The implementation of this is in a property file, which needs to be updated with the correct list. The file (d1_cn_rest/src/test/resources/org/dataone/configuration/node.properties) currently contains:</p>
<p><code>cn.checksumAlgorithmList=SHA-1;MD5</code></p>
<p>But it should contain all of the other valid algorithms as well from the LoC.</p>
Infrastructure - Bug #8696 (New): double indexing of a resource map and another not processed bec...https://redmine.dataone.org/issues/86962018-09-12T00:18:51ZRob Nahfrnahf@epscor.unm.edu
<p>In production, the ORE 'a1a0e96a-3cde-4f3c-829c-29650b09f22b' was not processed because a member was also referenced by the ORE it obsoleted, 'dc39515e-440b-4673-9f63-962c7374bf48'. The task failed without being requeued. Below is the log output.</p>
<pre>rnahf@cn-orc-1:/var/log/dataone/index$ grep a1a0e96a-3cde-4f3c-829c-29650b09f22b cn-index-processor-daemon.log.*
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,384 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,384 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,832 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18085996, pid=a1a0e96a-3cde-4f3c-829c-29650b09f22b, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2018091015425874434.1, dateSysMetaModified=1536087134490, deleted=false, taskModifiedDate=1536619467383, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,832 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18085996, pid=a1a0e96a-3cde-4f3c-829c-29650b09f22b, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2018091015425874434.1, dateSysMetaModified=1536087134490, deleted=false, taskModifiedDate=1536619467383, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,901 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:35,402 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:35,902 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:36,402 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:36,903 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:37,403 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:37,903 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:38,403 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:38,904 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:39,404 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ERROR] 2018-09-10 22:44:39,904 (IndexTaskProcessor:checkReadinessProcessResourceMap:384) We waited for another thread to finish indexing a resource map which has the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 for a while. Now we quited and can't index id a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ERROR] 2018-09-10 22:44:39,904 (IndexTaskProcessor:processTask:297) Unable to process task for pid: a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:java.lang.Exception: We waited for another thread to finish indexing a resource map which has the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 for a while. Now we quited and can't index id a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:39,906 (IndexTaskProcessor:newOrFailedIndexTaskExists:890) IndexTaskProcess.newOrFailedIndexTaskExists for id a1a0e96a-3cde-4f3c-829c-29650b09f22b
rnahf@cn-orc-1:/var/log/dataone/index$ date
Tue Sep 11 23:46:56 UTC 2018
rnahf@cn-orc-1:/var/log/dataone/index$ grep dc39515e-440b-4673-9f63-962c7374bf48 cn-index-processor-daemon.log.*
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:12,133 (HZEventFilter:filter:127) HZEventFilter.filter - the system metadata for the index event shows shows dc39515e-440b-4673-9f63-962c7374bf48 having a newer version than the SOLR server. So this event should be granted for indexing.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:13,347 (HZEventFilter:filter:127) HZEventFilter.filter - the system metadata for the index event shows shows dc39515e-440b-4673-9f63-962c7374bf48 having a newer version than the SOLR server. So this event should be granted for indexing.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:18,677 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:18,677 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:25,783 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18086020, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619458675, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:25,783 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18086020, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619458675, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,221 (IndexTaskProcessor:processTask:284) *********************start to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:43,513 (IndexTaskProcessor:processTask:288) *********************end to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:43,519 (IndexTaskProcessor:processTask:315) Indexing complete for pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:11,604 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:11,604 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:18,731 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18086015, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619571603, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:18,732 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18086015, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619571603, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:20,164 (IndexTaskProcessor:processTask:284) *********************start to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:36,252 (IndexTaskProcessor:processTask:288) *********************end to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:36,255 (IndexTaskProcessor:processTask:315) Indexing complete for pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.7:[ INFO] 2018-09-10 21:44:09,798 (HZEventFilter:compareRaplicaList:256) HZEventFilter.compareReplicaList - the system metadata for the index event shows dc39515e-440b-4673-9f63-962c7374bf48 having the same replica list as the solr doc.
cn-index-processor-daemon.log.7:[ INFO] 2018-09-10 21:44:09,798 (HZEventFilter:filter:164) HZEventFilter.filter - the system metadata for the index event shows dc39515e-440b-4673-9f63-962c7374bf48 having the same modification date as the SOLR server. Also both have the same replica list. So this event has been filtered out for indexing (no indexing).
rnahf@cn-orc-1:/var/log/dataone/index$
</pre> Infrastructure - Bug #8500 (New): Geohash not calculated properlyhttps://redmine.dataone.org/issues/85002018-03-14T17:06:52ZDave Vieglaisdave.vieglais@gmail.com
<p>The metadata with pid = <code>{39A912EE-E2DC-4F50-A3DE-C0EF04BC1E88}</code> has bounding coords correctly indexed:</p>
<pre> <float name="eastBoundCoord">166.657</float>
<float name="westBoundCoord">-178.378</float>
<float name="southBoundCoord">-14.5596</float>
<float name="northBoundCoord">28.4536</float>
</pre>
<p>but the computed geohash is wrong:</p>
<pre> <arr name="geohash_9">
<str>ec5zd8st0</str>
</arr>
<arr name="geohash_1">
<str>e</str>
</arr>
<arr name="geohash_2">
<str>ec</str>
</arr>
<arr name="geohash_3">
<str>ec5</str>
</arr>
<arr name="geohash_4">
<str>ec5z</str>
</arr>
<arr name="geohash_5">
<str>ec5zd</str>
</arr>
<arr name="geohash_6">
<str>ec5zd8</str>
</arr>
<arr name="geohash_7">
<str>ec5zd8s</str>
</arr>
<arr name="geohash_8">
<str>ec5zd8st</str>
</arr>
</pre>
<p>Investigate if this is a systemic issue or peculiar to certain types of metadata or representation of coordinates within the metadata.</p>
Infrastructure - Bug #4674 (New): Ask Judith, Mike and Virgina Perez.2.1 to obsolete those pids w...https://redmine.dataone.org/issues/46742014-03-31T18:02:41ZJing Taotao@nceas.ucsb.edu
<p>doi:10.5063/AA/Virginia Perez.2.1<br>
judith botha.1.1<br>
judith botha.2.1<br>
judith kruger.1.1<br>
judith kruger.2.1<br>
judith kruger.3.1<br>
judith kruger.4.1<br>
judith kruger.5.1<br>
doi:10.6085/AA/ SHLX00_XXXITV2XLSR03_20111128.40.1 (PISCO)</p>
Infrastructure - Task #4210 (Testing): Metacat does not set serialVersion correctly in CNodeServi...https://redmine.dataone.org/issues/42102013-12-20T15:22:50ZChris Jonescjones@nceas.ucsb.edu
<p>For DATA and METADATA, CNodeService.archive() and D1NodeService.archive(), respectively, don't increment the serialVersion field. Check this for delete() as well. D1NodeService delegates to DocumentImpl to call the HZ put() method, so the fix needs to be there, and in CNodeService.</p>
Infrastructure - Bug #3675 (New): package relationships not available for archived objectshttps://redmine.dataone.org/issues/36752013-03-20T19:12:28ZRob Nahfrnahf@epscor.unm.edu
<p>Currently, records for obsoleted items are maintained in the solr index so its resourceMap, documents, documentedBy relationships are available, and people can "investigate the past". However, those same relationships are not available for archived items, leading to an incomplete solution for this use case (accessing package relationships of out-of-date content).</p>
<p>Archive is used to limit discoverability, but it also eliminates the ability to navigate the package relationships. </p>
<p>Note: archive is intended to be used when the owner does not want to update the object, but simply remove it. However, nothing prevents the owner from archiving obsoleted content. So, in fact, the ability to navigate the package relationships of out-of-date content cannot be guaranteed, and is subject to the individual data management practices of content owners. </p>
Member Nodes - MNDeployment #3521 (Operational): SEAD Member Nodehttps://redmine.dataone.org/issues/35212013-01-25T21:19:12ZRebecca Koskelarkoskela@unm.edu
<p>SEAD (Sustainable Environment - Actionable Data), another DataNet, would like to become a DataONE Member Node<br>
(<a href="http://sead-data.net/">http://sead-data.net/</a>)</p>
Infrastructure - Bug #3492 (In Progress): Invalid PIDs in production (whitespace)https://redmine.dataone.org/issues/34922013-01-17T15:13:44ZDave Vieglaisdave.vieglais@gmail.com
<p>Recording this for future reference. </p>
<p>There are nine PIDs in the production environment that contain whitespace. This appears to have no functional effect - sysmeta and objects can be retrieved so no action is required other than to ensure no more sneak in.</p>
<p>The PIDs in question are:</p>
<a name="guid"></a>
<h2 > guid <a href="#guid" class="wiki-anchor">¶</a></h2>
<p>doi:10.5063/AA/Virginia Perez.2.1<br>
judith kruger.3.1<br>
judith kruger.4.1<br>
judith botha.1.1<br>
judith kruger.1.1<br>
judith kruger.2.1<br>
judith kruger.5.1<br>
judith botha.2.1<br>
resourceMap_Lin Cheng-Jung.1.1<br>
resourceMap_Lin Cheng-Jung.1.2<br>
resourceMap_Lin Cheng-Jung.1.3<br>
Lin Cheng-Jung.1.1<br>
Lin Cheng-Jung.1.2<br>
Lin Cheng-Jung.1.3<br>
doi:10.6085/AA/ SHLX00_XXXITV2XLSR03_20111128.40.1</p>
Member Nodes - MNDeployment #3118 (Operational): Dryad Member Nodehttps://redmine.dataone.org/issues/31182012-08-05T17:05:51ZDave Vieglaisdave.vieglais@gmail.com
<p>The Dryad MN will operate as a tier 1 member node.</p>
<p>Base_URL: <a href="https://datadryad.org/mn">https://datadryad.org/mn</a><br>
Node_ID: urn:node:DRYAD<br>
Deployment_Contact: Ryan Scherle<br>
Software: Custom on modified DSpace (Dryad)<br>
Target_Tier: 1<br>
Content_Volume_GB: 20</p>
Requirements - Requirement #822 (New): (Requirement) Object access and manipulation should be res...https://redmine.dataone.org/issues/8222010-09-06T03:32:31ZDave Vieglaisdave.vieglais@gmail.com
<p>The system should respond within a reasonable time when a user requests content or changes are made to content.</p>
<p>Rationale</p>
<p>An unresponsive system will be an impediment to adoption.</p>
<p>Fit Criteria</p>
<ul>
<li><p>Object lists always respond within XX seconds</p></li>
<li><p>Get object always returns within XX seconds</p></li>
<li><p>resolve object always responds within XX seconds</p></li>
<li><p>create and update operations always respond within XX seconds</p></li>
</ul>
Requirements - Requirement #821 (New): (Requirement) Sponsor required functionalityhttps://redmine.dataone.org/issues/8212010-09-06T03:23:56ZDave Vieglaisdave.vieglais@gmail.com
<p>(Requirement) Sponsor required functionality</p>
Requirements - Requirement #592 (New): DataONE needs to synchronize metadata between MNs and CNshttps://redmine.dataone.org/issues/5922010-04-23T00:53:56ZMatthew Jonesjones@nceas.ucsb.edu
<p><a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> functions by having CNs that provide a central index of all metadata in the system. This allows for rapid search of the whole network while maintaining autonomy of MNs.</p>
Requirements - Requirement #411 (New): (Requirement) The infrastructure must survive destruction ...https://redmine.dataone.org/issues/4112010-03-24T15:32:54ZDave Vieglaisdave.vieglais@gmail.com
<p>Data storage nodes (Member Nodes or Coordinating Nodes) will go offline for various reasons - maintenance, hardware failure, decommissioning, etc. The infrastructure built by <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> must not be adversely affected by such events.</p>
<p>Rationale</p>
<p>This is a requirement of content persistence (<a class="issue tracker-8 status-1 priority-3 priority-lowest" title="Requirement: (Requirement) The infrastructure must support long term preservation of Data (New)" href="https://redmine.dataone.org/issues/410">#410</a>).</p>
<p>No systems have 100% uptime, and so we should design and plan for failure of MN or CN nodes.</p>
<p>Fit Criteria</p>
<ul>
<li>XX% of all Data remains available if XX% of member nodes go offline</li>
<li>The infrastructure remains functional if two CNs go offline</li>
</ul>
Requirements - Requirement #384 (New): (Requirement) Enable efficient mechanisms for users to dis...https://redmine.dataone.org/issues/3842010-03-16T22:03:21ZDave Vieglaisdave.vieglais@gmail.com
<p>There will likely be a very large number of data and metadata objects cataloged by the <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> infrastructure. These are accessible by their unique identifiers, but it is also necessary to provide mechanisms for users to discover content that is relevant to some purpose.</p>
<p>Rationale</p>
<p>Without a mechanism to search content, <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> will be little more than a large hash table - a useful functionality in itself but insufficient to help users discover useful information.</p>
<p>Fit Criteria</p>
<ul>
<li>A mechanism to search the entire holdings of <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> is available</li>
<li>Searches are possible using fields useful to to the users</li>
<li>Searches return accurate results</li>
<li>Searches support precise results</li>
<li>Searches are fast. (how fast?)</li>
<li>the user community is satisfied with the search functionality</li>
<li>Searches can be performed using tools familiar to the users (web browser, analytical tools)</li>
</ul>
Requirements - Requirement #318 (New): (Requirement) Sponsor required Y1 functionalityhttps://redmine.dataone.org/issues/3182010-03-10T17:12:42ZDave Vieglaisdave.vieglais@gmail.com
<p>The project sponsor has specified that a minimum of three Coordinating Nodes and three Member Nodes will be stood up and operational at the end of the first year of the project (i.e. 2010-07-31).</p>
<p>Rationale:: It's something the sponsor, or the directors of the sponsor need to see for assurance of progress.</p>
<p>Fit Criteria:: By the end of year one of the project (i.e. 2010-07-31) there will be at least:</p>
<ul>
<li><p>three operational coordinating nodes</p></li>
<li><p>three operational member nodes</p></li>
<li><p>Basic mirroring of content between CNs and replication of content from MNs should be enabled.</p></li>
</ul>