DataONE Tasks: Issueshttps://redmine.dataone.org/https://redmine.dataone.org/favicon.ico2018-09-12T00:18:51ZDataONE Tasks
Redmine Infrastructure - Bug #8696 (New): double indexing of a resource map and another not processed bec...https://redmine.dataone.org/issues/86962018-09-12T00:18:51ZRob Nahfrnahf@epscor.unm.edu
<p>In production, the ORE 'a1a0e96a-3cde-4f3c-829c-29650b09f22b' was not processed because a member was also referenced by the ORE it obsoleted, 'dc39515e-440b-4673-9f63-962c7374bf48'. The task failed without being requeued. Below is the log output.</p>
<pre>rnahf@cn-orc-1:/var/log/dataone/index$ grep a1a0e96a-3cde-4f3c-829c-29650b09f22b cn-index-processor-daemon.log.*
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,384 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,384 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,832 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18085996, pid=a1a0e96a-3cde-4f3c-829c-29650b09f22b, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2018091015425874434.1, dateSysMetaModified=1536087134490, deleted=false, taskModifiedDate=1536619467383, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,832 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18085996, pid=a1a0e96a-3cde-4f3c-829c-29650b09f22b, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2018091015425874434.1, dateSysMetaModified=1536087134490, deleted=false, taskModifiedDate=1536619467383, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:34,901 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:35,402 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:35,902 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:36,402 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:36,903 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:37,403 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:37,903 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:38,403 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:38,904 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:39,404 (IndexTaskProcessor:checkReadinessProcessResourceMap:369) ###################Another resource map is process the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 as well. So the thread to process id a1a0e96a-3cde-4f3c-829c-29650b09f22b has to wait 0.5 seconds.
cn-index-processor-daemon.log.6:[ERROR] 2018-09-10 22:44:39,904 (IndexTaskProcessor:checkReadinessProcessResourceMap:384) We waited for another thread to finish indexing a resource map which has the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 for a while. Now we quited and can't index id a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ERROR] 2018-09-10 22:44:39,904 (IndexTaskProcessor:processTask:297) Unable to process task for pid: a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:java.lang.Exception: We waited for another thread to finish indexing a resource map which has the referenced id ee73cf7f-1005-4b89-bab9-3a7fa01d27c6 for a while. Now we quited and can't index id a1a0e96a-3cde-4f3c-829c-29650b09f22b
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:39,906 (IndexTaskProcessor:newOrFailedIndexTaskExists:890) IndexTaskProcess.newOrFailedIndexTaskExists for id a1a0e96a-3cde-4f3c-829c-29650b09f22b
rnahf@cn-orc-1:/var/log/dataone/index$ date
Tue Sep 11 23:46:56 UTC 2018
rnahf@cn-orc-1:/var/log/dataone/index$ grep dc39515e-440b-4673-9f63-962c7374bf48 cn-index-processor-daemon.log.*
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:12,133 (HZEventFilter:filter:127) HZEventFilter.filter - the system metadata for the index event shows shows dc39515e-440b-4673-9f63-962c7374bf48 having a newer version than the SOLR server. So this event should be granted for indexing.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:13,347 (HZEventFilter:filter:127) HZEventFilter.filter - the system metadata for the index event shows shows dc39515e-440b-4673-9f63-962c7374bf48 having a newer version than the SOLR server. So this event should be granted for indexing.
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:18,677 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:18,677 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:25,783 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18086020, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619458675, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:25,783 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18086020, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619458675, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:27,221 (IndexTaskProcessor:processTask:284) *********************start to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:43,513 (IndexTaskProcessor:processTask:288) *********************end to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:44:43,519 (IndexTaskProcessor:processTask:315) Indexing complete for pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:11,604 (IndexTaskProcessor:saveTask:865) IndexTaskProcess.saveTask save the index task dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:11,604 (IndexTaskProcessor:getNextIndexTask:610) Start of indexing pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:18,731 (IndexTaskProcessor:getNextIndexTask:664) the original index task - IndexTask [id=18086015, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619571603, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:18,732 (IndexTaskProcessor:getNextIndexTask:671) the new index task - IndexTask [id=18086015, pid=dc39515e-440b-4673-9f63-962c7374bf48, formatid=http://www.openarchives.org/ore/terms, objectPath=/var/metacat/data/autogen.2017072514144216470.1, dateSysMetaModified=1536087137440, deleted=false, taskModifiedDate=1536619571603, priority=3, status=IN PROCESS]
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:20,164 (IndexTaskProcessor:processTask:284) *********************start to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:36,252 (IndexTaskProcessor:processTask:288) *********************end to process update index task for dc39515e-440b-4673-9f63-962c7374bf48 in thread 20
cn-index-processor-daemon.log.6:[ INFO] 2018-09-10 22:46:36,255 (IndexTaskProcessor:processTask:315) Indexing complete for pid: dc39515e-440b-4673-9f63-962c7374bf48
cn-index-processor-daemon.log.7:[ INFO] 2018-09-10 21:44:09,798 (HZEventFilter:compareRaplicaList:256) HZEventFilter.compareReplicaList - the system metadata for the index event shows dc39515e-440b-4673-9f63-962c7374bf48 having the same replica list as the solr doc.
cn-index-processor-daemon.log.7:[ INFO] 2018-09-10 21:44:09,798 (HZEventFilter:filter:164) HZEventFilter.filter - the system metadata for the index event shows dc39515e-440b-4673-9f63-962c7374bf48 having the same modification date as the SOLR server. Also both have the same replica list. So this event has been filtered out for indexing (no indexing).
rnahf@cn-orc-1:/var/log/dataone/index$
</pre> Infrastructure - Bug #4674 (New): Ask Judith, Mike and Virgina Perez.2.1 to obsolete those pids w...https://redmine.dataone.org/issues/46742014-03-31T18:02:41ZJing Taotao@nceas.ucsb.edu
<p>doi:10.5063/AA/Virginia Perez.2.1<br>
judith botha.1.1<br>
judith botha.2.1<br>
judith kruger.1.1<br>
judith kruger.2.1<br>
judith kruger.3.1<br>
judith kruger.4.1<br>
judith kruger.5.1<br>
doi:10.6085/AA/ SHLX00_XXXITV2XLSR03_20111128.40.1 (PISCO)</p>
Infrastructure - Task #4210 (Testing): Metacat does not set serialVersion correctly in CNodeServi...https://redmine.dataone.org/issues/42102013-12-20T15:22:50ZChris Jonescjones@nceas.ucsb.edu
<p>For DATA and METADATA, CNodeService.archive() and D1NodeService.archive(), respectively, don't increment the serialVersion field. Check this for delete() as well. D1NodeService delegates to DocumentImpl to call the HZ put() method, so the fix needs to be there, and in CNodeService.</p>
Member Nodes - MNDeployment #3521 (Operational): SEAD Member Nodehttps://redmine.dataone.org/issues/35212013-01-25T21:19:12ZRebecca Koskelarkoskela@unm.edu
<p>SEAD (Sustainable Environment - Actionable Data), another DataNet, would like to become a DataONE Member Node<br>
(<a href="http://sead-data.net/">http://sead-data.net/</a>)</p>
Infrastructure - Bug #3492 (In Progress): Invalid PIDs in production (whitespace)https://redmine.dataone.org/issues/34922013-01-17T15:13:44ZDave Vieglaisdave.vieglais@gmail.com
<p>Recording this for future reference. </p>
<p>There are nine PIDs in the production environment that contain whitespace. This appears to have no functional effect - sysmeta and objects can be retrieved so no action is required other than to ensure no more sneak in.</p>
<p>The PIDs in question are:</p>
<a name="guid"></a>
<h2 > guid <a href="#guid" class="wiki-anchor">¶</a></h2>
<p>doi:10.5063/AA/Virginia Perez.2.1<br>
judith kruger.3.1<br>
judith kruger.4.1<br>
judith botha.1.1<br>
judith kruger.1.1<br>
judith kruger.2.1<br>
judith kruger.5.1<br>
judith botha.2.1<br>
resourceMap_Lin Cheng-Jung.1.1<br>
resourceMap_Lin Cheng-Jung.1.2<br>
resourceMap_Lin Cheng-Jung.1.3<br>
Lin Cheng-Jung.1.1<br>
Lin Cheng-Jung.1.2<br>
Lin Cheng-Jung.1.3<br>
doi:10.6085/AA/ SHLX00_XXXITV2XLSR03_20111128.40.1</p>
Member Nodes - MNDeployment #3118 (Operational): Dryad Member Nodehttps://redmine.dataone.org/issues/31182012-08-05T17:05:51ZDave Vieglaisdave.vieglais@gmail.com
<p>The Dryad MN will operate as a tier 1 member node.</p>
<p>Base_URL: <a href="https://datadryad.org/mn">https://datadryad.org/mn</a><br>
Node_ID: urn:node:DRYAD<br>
Deployment_Contact: Ryan Scherle<br>
Software: Custom on modified DSpace (Dryad)<br>
Target_Tier: 1<br>
Content_Volume_GB: 20</p>
Requirements - Requirement #795 (New): (Requirement) System must support revocation of user permi...https://redmine.dataone.org/issues/7952010-08-27T14:51:51ZMark Servillamark.servilla@gmail.com
<p>The system should be able to revoke any user's permissions and, ultimately, their direct access to the system, if the user is misbehaving within the system.</p>
<p>Although it is unclear as to who assigns permissions, I believe that the final responsibility and authority for access control is the <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> administrator. As such, permissions and simple access to any part of the <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> infrastructure, and perhaps member node infrastructure that is accessed through <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a>, should be revokable.</p>
<p>Fit Criteria</p>
<ul>
<li>Administrator can change permissions for a user for any object</li>
<li>Permission changes are propagated through the system within XXX seconds</li>
<li>Read, write access rules can be altered for a user for all content in the system</li>
</ul>
Requirements - Requirement #771 (New): (Requirement) User identities should have simple string se...https://redmine.dataone.org/issues/7712010-08-11T00:46:18ZMatthew Jonesjones@nceas.ucsb.edu
<p>When user identities can be drawn from multiple providers, we need to be able to serialize both the id and the provider namespace, for example by encapsulating both in a single distinguished name (DN). Ideally this serialization would be relatively short, persistent, and human understandable, and ideally it should not contain spaces or other characters that make it difficult to utilize in a variety of contexts (such as command line applications).</p>
<p>An example DN that has worked for the KNB network is:</p>
<p>uid=jones,o=NCEAS,dc=ecoinformatics,dc=org</p>
<p>Fit Criteria</p>
<p>*<br>
*<br>
*</p>
Requirements - Requirement #767 (New): (Requirement) Users need to be able to express embargo rul...https://redmine.dataone.org/issues/7672010-08-11T00:14:36ZMatthew Jonesjones@nceas.ucsb.edu
<p>These embargo rules allow data to be published in the system, but not released until a particular date. By operating this way, users can use the system in their daily management of their data without worry of losing track of publication of the data at a later date. It encourages people to start using the system even long before they want to publicly release the data.</p>
<p>Fit Criteria</p>
<p>*<br>
*<br>
*</p>
Requirements - Requirement #766 (New): (Requirement) Users should be able to easily assign proxy ...https://redmine.dataone.org/issues/7662010-08-11T00:11:26ZMatthew Jonesjones@nceas.ucsb.edu
<p>When users need to execute processes asynchronously, they need to be able to grant proxy privileges (e.g., to a workflow or grid system) to operate on their behalf in particular contexts. In addition, at times some users want others to be able to run and access data and operate on behalf of another, such as in a faculty student situation where the student acts as a proxy for the faculty member.</p>
<p>Fit Criteria</p>
<p>*<br>
*<br>
*</p>
Requirements - Requirement #765 (New): (Requirement) Tools can access an API for authn and authzhttps://redmine.dataone.org/issues/7652010-08-10T23:23:23ZMatthew Jonesjones@nceas.ucsb.edu
<p>A standardized API allows us to build interoperable tools, and adapt existing tools to interoperate with the <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataNets">DataNets</a>. All infrastructure components should be able to use these services, including CNs, MNs, and client tools and libraries.</p>
<p>Fit Criteria</p>
<ul>
<li>Demonstrated interoperability of the API across 3 client and member node systems</li>
<li>
*</li>
</ul>
Requirements - Requirement #762 (New): (Requirement) User identities can be derived from existing...https://redmine.dataone.org/issues/7622010-08-10T22:55:56ZMatthew Jonesjones@nceas.ucsb.edu
<p>Many existing directory services are in use in the environmental sciences, and participating member nodes should be able to expose their directories through a standardized mechanism to allow users to make use of existing identities. For example, the KNB LDAP server is a federation of multiple LDAP systems from around the world, and these identities have been used in access rules for many existing objects.Rationale: Re-use of existing infrastructure reduces cost of participation and minimizes confusion over which accounts to use and which rules are associated with what account. </p>
<p>Fit Criteria</p>
<ul>
<li>The system provides a mechanism for exsiting directory services to join * The system provides a namespacing mechanism to differentiate users with the same id in different original directories (e.g., mjones@LTER, mjones@UCNRS)</li>
<li>The same email address can be associated with multiple identities</li>
<li>The same person or system can possess multiple identities </li>
<li>If a user has multiple identities, the user can express equivalence rules that indicate that they are linked, equivalent identities for the purposes of authentication and authorization</li>
</ul>
Requirements - Requirement #761 (New): (Requirement) Users can specify authorization rules for da...https://redmine.dataone.org/issues/7612010-08-10T22:53:02ZMatthew Jonesjones@nceas.ucsb.edu
<p>Users might be able to upload data and science metadata as an atomic operation, but each should be identified separately and access control rules should apply to the objects separately. For example, a user could grant public read access to a metadata object but only grant read access to certain colleagues for associated data objects.</p>
<p>Rationale: <br>
Enabling access control at the same level of granularity of objects in the system ensures that complete control over object conglomerations (packages, etc) is available.</p>
<p>Fit Criteria<br>
** All objects in the system have access control rules <br>
** Separate rules can be associated with the elements of a package during operations at the package level (e.g. @@create@@)</p>
Requirements - Requirement #411 (New): (Requirement) The infrastructure must survive destruction ...https://redmine.dataone.org/issues/4112010-03-24T15:32:54ZDave Vieglaisdave.vieglais@gmail.com
<p>Data storage nodes (Member Nodes or Coordinating Nodes) will go offline for various reasons - maintenance, hardware failure, decommissioning, etc. The infrastructure built by <a class="wiki-page new" href="https://redmine.dataone.org/projects/d1req/wiki/DataONE">DataONE</a> must not be adversely affected by such events.</p>
<p>Rationale</p>
<p>This is a requirement of content persistence (<a class="issue tracker-8 status-1 priority-3 priority-lowest" title="Requirement: (Requirement) The infrastructure must support long term preservation of Data (New)" href="https://redmine.dataone.org/issues/410">#410</a>).</p>
<p>No systems have 100% uptime, and so we should design and plan for failure of MN or CN nodes.</p>
<p>Fit Criteria</p>
<ul>
<li>XX% of all Data remains available if XX% of member nodes go offline</li>
<li>The infrastructure remains functional if two CNs go offline</li>
</ul>
Requirements - Requirement #318 (New): (Requirement) Sponsor required Y1 functionalityhttps://redmine.dataone.org/issues/3182010-03-10T17:12:42ZDave Vieglaisdave.vieglais@gmail.com
<p>The project sponsor has specified that a minimum of three Coordinating Nodes and three Member Nodes will be stood up and operational at the end of the first year of the project (i.e. 2010-07-31).</p>
<p>Rationale:: It's something the sponsor, or the directors of the sponsor need to see for assurance of progress.</p>
<p>Fit Criteria:: By the end of year one of the project (i.e. 2010-07-31) there will be at least:</p>
<ul>
<li><p>three operational coordinating nodes</p></li>
<li><p>three operational member nodes</p></li>
<li><p>Basic mirroring of content between CNs and replication of content from MNs should be enabled.</p></li>
</ul>