DataONE Tasks: Issues https://redmine.dataone.org/ 2018-01-09T02:05:23Z
Infrastructure - Task #8237 (New): Configure production environment to use new API key under KU O... https://redmine.dataone.org/issues/8237 2018-01-09T02:05:23Z Dave Vieglais dave.vieglais@gmail.com
Infrastructure - Feature #8107 (New): Improved Node Synchronization Feedback - Leveraging MNRead.... https://redmine.dataone.org/issues/8107 2017-06-06T17:30:43Z Monica Ihli email@monicaihli.com
<p>MN operators would benefit from clearer feedback from the CN when synchronization fails for a PID.</p>
<ul>
<li>GMN - currently logs MNRead.synchronizationFailed() requests received as part of the regular GMN logs.</li>
<li>Metacat - also currently just logs MNRead.synchronizationFailed() requests as part of regular logs.</li>
<li>One solution might be to simply log these failures in a separate file to make it easier for node operators to identify when something fails to sync. </li>
</ul>
<p>Additionally, we should confirm what kind of information is getting sent back to MNs with MNRead.synchronizationFailed().</p>
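<p>The separate-file idea above could look roughly like the following on a Python-based node. This is a minimal sketch, not how GMN or Metacat actually log: the logger name <code>sync_failures</code>, the function names, and the log line layout are all assumptions made for illustration.</p>

```python
import logging


def build_sync_failure_logger(path):
    """Return a logger that writes only synchronization failures to `path`.

    The logger name "sync_failures" is hypothetical.
    """
    logger = logging.getLogger("sync_failures")
    logger.setLevel(logging.WARNING)
    # Keep these records out of the regular node log.
    logger.propagate = False
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def record_sync_failure(logger, pid, detail):
    """Write one line per synchronizationFailed() notification received."""
    logger.warning("synchronizationFailed pid=%s detail=%s", pid, detail)
```

<p>A node operator could then watch that one file instead of searching the regular logs for failed PIDs.</p>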
Infrastructure - Story #8061 (New): develop queue-based processing system for the CN https://redmine.dataone.org/issues/8061 2017-04-05T22:40:24Z Rob Nahf rnahf@epscor.unm.edu
<p>The event-based mechanism for generating indexing tasks is not robust to network segregation, and it is inefficient because it triggers indexing tasks whenever system metadata are loaded into the Hazelcast map. Those loads are not "real" events, just data hydration from persistent storage.</p>
<p>Investigate using reliable queues instead. The design will want to be abstracted so that different implementations can be swapped in at a later date, so use standard messaging patterns.</p>
<p>RabbitMQ, ActiveMQ are potential implementations to use.<br>
ZeroMQ is a lower-level implementation, probably a bit more complicated, but very performant.</p>
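<p>The abstraction asked for above might be sketched as a small queue interface with swappable backends. Everything here is hypothetical (the names <code>TaskQueue</code>, <code>InMemoryTaskQueue</code> and <code>drain</code> are not part of the CN code); the in-memory backend only stands in for a broker-backed one such as RabbitMQ or ActiveMQ.</p>

```python
import queue
from abc import ABC, abstractmethod


class TaskQueue(ABC):
    """Hypothetical interface that hides the messaging implementation."""

    @abstractmethod
    def publish(self, task):
        """Enqueue one indexing task."""

    @abstractmethod
    def consume(self, timeout=0.1):
        """Return the next task, or None if nothing arrives in time."""


class InMemoryTaskQueue(TaskQueue):
    """Stand-in backend; a broker-backed class would implement the same methods."""

    def __init__(self):
        self._queue = queue.Queue()

    def publish(self, task):
        self._queue.put(task)

    def consume(self, timeout=0.1):
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None


def drain(task_queue, handler):
    """Process tasks until the queue is empty; return how many were handled."""
    handled = 0
    while True:
        task = task_queue.consume()
        if task is None:
            return handled
        handler(task)
        handled += 1
```

<p>Because callers only see <code>TaskQueue</code>, the backend can be replaced later without touching the code that publishes or consumes tasks.</p>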
Infrastructure - Story #8049 (In Progress): Support synchronization of system metadata for unhost... https://redmine.dataone.org/issues/8049 2017-03-21T05:34:38Z Rob Nahf rnahf@epscor.unm.edu
<p>As part of mutable-content MN support, allow the Member Node to keep the system metadata records for all resulting versions of its changeable entities. This lets the MN keep accurate system metadata for every version even though it no longer has the object bytes for that version.</p>
<p>Benefits:<br>
1. The MN does not orphan any objects.<br>
2. The MN can administer objects from past versions on its own node, for example by adjusting the access policy of all versions.<br>
3. The MN does not need to call cn.setObsoletedBy or leave that field empty.</p>
<p>Costs:<br>
1. Requires new logic for indexing (possibly).<br>
2. Requires new logic for registerSystemMetadata (possibly).<br>
3. Requires new logic for synchronization.</p>
<p>This is very similar to how we synchronize DATA objects, except that it does not trigger MN replication.</p>
Infrastructure - Bug #8046 (New): Mutable Member Nodes orphan unhosted objects https://redmine.dataone.org/issues/8046 2017-03-15T17:40:49Z Rob Nahf rnahf@epscor.unm.edu
<p>Mutable Member Nodes that only retain the latest version of their entities rarely do the bookkeeping work on orphaned versions they no longer host. This includes:</p>
<ol>
<li>setting the obsoletedBy field of the orphaned object</li>
<li>transferring authority to another node (CN or MN)</li>
<li>invalidating the replica in the Replica section</li>
</ol>
<p>This leads to problems in search, resolve, and updating system metadata.</p>
<p>I propose we use the new CN-carve-out Node Properties to enable the CN to perform these actions on the mutable member node's behalf. This can be done either with a blanket property such as 'isMutable' or with something more specific, because some member nodes might orphan the object but keep the system metadata: 'autoObsolete', 'autoDeorphan', 'autoReplicaAudit', or similar.</p>
Infrastructure - Bug #7967 (In Progress): The CN is not following the xml schema definition when ... https://redmine.dataone.org/issues/7967 2017-01-14T01:12:48Z Rob Nahf rnahf@epscor.unm.edu
<p>When the schemas were updated in 2013 for V2, the field 'pid' was renamed to 'identifier'. d1_common_java did not implement this change, but GMN did. So we have two supported MN software stacks "in the wild" that require different XML delivered with mn.synchronizationFailed(Session s, SynchronizationFailedException x). Fixing for one breaks the other.</p>
<p>We will need to choose whether to support 'pid' or 'identifier' as the field name.</p>
<p>The CN might be able to mitigate the impact for the short term by sending different versions of the xml, depending on which MN it is sending the synchronizationFailed message to. (This would be entirely within the synchronization code).</p>
<p>Fixing this bug requires:<br>
1) making a decision on which field name to use<br>
2) updating the documentation<br>
3) possibly updating the schema<br>
4) developing a custom mn.synchronizationFailed() client call and the logic for knowing when to send which variant</p>
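<p>The short-term mitigation (sending a different XML variant per MN) might be sketched as below. This is only an illustration under stated assumptions: the lookup table, the node identifiers, the function name, and the exact element and attribute layout of the serialized exception are hypothetical and should be checked against the schema and the MN stacks before use.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical table recording which attribute name each MN stack expects.
ID_ATTRIBUTE_BY_NODE = {
    "urn:node:mnExampleA": "pid",         # e.g. a stack built on d1_common_java
    "urn:node:mnExampleB": "identifier",  # e.g. a stack following the V2 schema
}


def build_sync_failed_xml(node_id, pid, description, detail_code,
                          default_attr="identifier"):
    """Serialize a SynchronizationFailed message using the variant the MN expects.

    The <error>/<description> layout is an assumption made for illustration.
    """
    attr = ID_ATTRIBUTE_BY_NODE.get(node_id, default_attr)
    error = ET.Element(
        "error",
        {"name": "SynchronizationFailed", "detailCode": detail_code, attr: pid},
    )
    ET.SubElement(error, "description").text = description
    return ET.tostring(error, encoding="unicode")
```

<p>Keeping the choice in one table means the variant logic stays entirely within the synchronization code, and it can be deleted once a single field name is agreed on.</p>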
DataONE API - Task #7839 (New): Online documentation places synchronize in CNRead api https://redmine.dataone.org/issues/7839 2016-06-30T17:35:01Z Robert Waltz
<p>The synchronize method is defined in the CNCore API in d1_common_java, and it really isn't a 'read' operation.</p>
Infrastructure - Story #7807 (New): cn.synchronize should support synchronization failure correct... https://redmine.dataone.org/issues/7807 2016-05-13T16:56:25Z Rob Nahf rnahf@epscor.unm.edu
<p>cn.synchronize(session, identifier) works well for its original purpose (supporting MN-driven system metadata updates, and MN-driven push synchronization), but doesn't seem to work for manual synchronization failure workflows. The main problem is that the request can only be made by the MN itself (using the MN client certificate). </p>
<p>As we envision a centralized dashboard for monitoring failed synchronization items, how do we address this situation? </p>
<p>The synchronization processing queue needs both the pid and the nodeId of the node from which to retrieve the object. The nodeId is not specified directly in the method call, but gleaned from the session by a reverse lookup from the certificate. (It uses the first node found in the NodeList where the Node.subject field matches the certificate subject.)</p>
<p>Should we allow node.contactSubjects into the algorithm?<br>
Should we add nodeId as a parameter?</p>
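<p>The reverse lookup described above, together with the first of the two options, can be sketched as follows. The <code>Node</code> class and <code>resolve_node_id</code> function are hypothetical stand-ins rather than the CN implementation, and falling back to contact subjects only after no node subject matches is an assumption about how that option might work.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Simplified, hypothetical stand-in for an entry in the NodeList."""
    identifier: str
    subjects: List[str]
    contact_subjects: List[str] = field(default_factory=list)


def resolve_node_id(node_list, cert_subject, allow_contacts=False) -> Optional[str]:
    """Return the nodeId of the first node whose subject matches the certificate.

    With allow_contacts=True, contact subjects are tried as a fallback.
    """
    for node in node_list:
        if cert_subject in node.subjects:
            return node.identifier
    if allow_contacts:
        for node in node_list:
            if cert_subject in node.contact_subjects:
                return node.identifier
    return None
```

<p>A contact subject may be listed on more than one node, in which case this lookup is ambiguous; an explicit nodeId parameter avoids that problem.</p>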
DataONE API - Bug #7578 (New): Fix 404 link to d1_instance_generator folder in documentation https://redmine.dataone.org/issues/7578 2016-01-08T22:01:20Z Bryce Mecum mecum@nceas.ucsb.edu
<p>In the MN API documentation for MNStorage.create (<a href="https://jenkins-ucsb-1.dataone.org/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html//apis/MN_APIs.html#MNStorage.create">https://jenkins-ucsb-1.dataone.org/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html//apis/MN_APIs.html#MNStorage.create</a>), I found that the following paragraph contains a broken link to d1_instance_generator:</p>
<blockquote>
<p>"The system metadata included with the create call must contain values for the elements required to be set by clients (see System Metadata). The system metadata document can be crafted by hand or preferably with a tool such as generate_sysmeta.py which is available in the d1_instance_generator Python package. See documentation included with that package for more information on its operation."</p>
</blockquote>
<p>The link to d1_instance_generator was to the SVN folder <a href="https://repository.dataone.org/software/cicore/trunk/d1_instance_generator">https://repository.dataone.org/software/cicore/trunk/d1_instance_generator</a> which is currently a 404. I think the folder moved to /d1_test_utilities_python/src/d1_test/instance_generator.</p>
Infrastructure - Story #7559 (New): Develop plan for securing application passwords in the CN stack https://redmine.dataone.org/issues/7559 2015-12-15T22:58:06Z Ben Leinfelder leinfelder@nceas.ucsb.edu
<p>There are many components that use passwords in configuration files. While we do restrict who can access our servers and what they can view when on the server, it's still not entirely secure to have property files with cleartext passwords.</p>
<p>Here are the components that are known to be configured with cleartext passwords:<br>
* d1_identity_manager (LDAP)<br>
* d1_noderegistry (LDAP)<br>
* d1_replication (postgres)<br>
* d1_portal_servlet (postgres)<br>
* Metacat (postgres)<br>
* all hazelcast connections</p>
DataONE API - Bug #7528 (New): Incorrect argument name documented for CNCore.reserveIdentifier https://redmine.dataone.org/issues/7528 2015-12-08T19:51:03Z Peter Slaughter slaughter@nceas.ucsb.edu
<p>The docs at <a href="http://jenkins-1.dataone.org/jenkins/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html/apis/CN_APIs.html#CNCore.reserveIdentifier">http://jenkins-1.dataone.org/jenkins/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html/apis/CN_APIs.html#CNCore.reserveIdentifier</a> indicate that the 'key' for the identifier to reserve is 'id', when it should actually be 'pid'. </p>
<p>Specifying a request with id (see attached bash script) produces the error: </p>
<p><?xml version="1.0" encoding="UTF-8"?>7091b0df-df6a-424e-b46c-4c31738f1936<a href="/d1:identifier">/d1:identifier</a><?xml version="1.0" encoding="UTF-8"?><br>
<br>
The given identifier is already in use: null</p>
Infrastructure - Story #7224 (New): push synchronization request status indicator: synchronizeSta... https://redmine.dataone.org/issues/7224 2015-06-18T08:30:42Z Rob Nahf rnahf@epscor.unm.edu
<p>Push synchronization (cn.synchronize, mn.updateSystemMetadata) involves an end user who might want an idea of how long the queued action will take to complete. Something as simple as returning the sync request's place in line might suffice as the indicator, or a more complete data packet, including the place in line and the queue velocity, could be attempted.</p>
<p>The real-world analogy for this kind of indicator is taking a number at the deli counter: you don't know when you will be served, but you know how many people are in front of you.</p>
<p>This option is a separate call to the CN to check the status of the sync request, so that the current place in line is returned. The advantage of this is that if the velocity of synchronization changes, the interested party can call again and get an updated value - it has more diagnostic and monitoring power. This could lead to over-use, however.</p>
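<p>The "place in line" indicator could be sketched like this. The <code>SyncQueue</code> class and its method names are hypothetical and only model the queue in memory; queue velocity is left out, and could be added by recording completion timestamps.</p>

```python
from collections import OrderedDict


class SyncQueue:
    """Hypothetical FIFO of pending sync requests, keyed by pid."""

    def __init__(self):
        self._pending = OrderedDict()  # pid -> nodeId, in arrival order

    def submit(self, pid, node_id):
        """Queue a request; a pid that is already queued keeps its place."""
        self._pending.setdefault(pid, node_id)

    def complete_next(self):
        """Remove and return the pid at the head of the line."""
        pid, _node_id = self._pending.popitem(last=False)
        return pid

    def requests_ahead(self, pid):
        """Return how many requests are in front of `pid`, or None if not queued."""
        for position, queued_pid in enumerate(self._pending):
            if queued_pid == pid:
                return position
        return None
```

<p>A status call would return <code>requests_ahead(pid)</code>; calling it again later gives an updated value as the queue moves.</p>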
DataONE API - Task #7127 (New): Finalize CNAuthentication interface https://redmine.dataone.org/issues/7127 2015-05-21T19:31:22Z Ben Leinfelder leinfelder@nceas.ucsb.edu
<p>Dave and I have been discussing what API methods may need to be defined for supporting somewhat standardized authentication. </p>
<p>So far we have a registry notion of authentication services:<br>
GET /authenticate/ -> List of authentication type names<br>
GET /authenticate/{auth_type} -> Connection info for authentication server</p>
<p>And a few service implementations:<br>
GET /portal/XXX -> initiates login procedure which may be direct (LDAP) or take you through a series of redirects as you authenticate with a chosen provider (CILogon, ORCID)<br>
GET /portal/token -> returns the token string associated with the http session<br>
GET /portal/logout -> discards the session</p>
<p>Some of these may need to be moved under the CN webapp, and some may need new DataONE types defined.</p>
Infrastructure - Task #6754 (New): Architecture API loose ends https://redmine.dataone.org/issues/6754 2015-01-13T17:59:06Z Rob Nahf rnahf@epscor.unm.edu
<p>While comparing v1 to v2 APIs, the following questions came up:</p>
<ol>
<li>The ping() methods are listed in the table as returning null, but in actuality we return a Date.</li>
<li>Are we keeping CN.search in the V2 APIs? I thought this one was deprecated.</li>
<li>The systemMetadataChanged method moved from v1.MNAuth to v2.MNRead. Was that intentional? Is it necessary?</li>
<li>synchronizationFailed returns Type.Boolean in the table, instead of the simple boolean that all other methods return. Implementations use a simple boolean.</li>
</ol>
Infrastructure - Feature #6499 (New): define and implement addFormat() method https://redmine.dataone.org/issues/6499 2014-10-02T20:13:01Z Rob Nahf rnahf@epscor.unm.edu
<p>Target: the V2 API.<br>
The method would programmatically add a new ObjectFormat and ObjectFormatIdentifier to the ObjectFormatList.</p>
<p>It could be implemented as a POST to the existing cn/v1/format API endpoint.</p>
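<p>The behaviour of the proposed method might be sketched as below. This models only the list update in memory, not the REST call; the class layout, the snake_case names, the example formatId, and the choice to reject duplicate identifiers are assumptions rather than a defined part of the API.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectFormat:
    """Simplified, hypothetical stand-in for the ObjectFormat type."""
    format_id: str
    format_name: str
    format_type: str  # e.g. "DATA", "METADATA" or "RESOURCE"


class ObjectFormatList:
    """In-memory model of the format list."""

    def __init__(self):
        self._formats = {}

    def add_format(self, object_format):
        """Register a new format and return its identifier.

        Rejecting an identifier that already exists is an assumption.
        """
        if object_format.format_id in self._formats:
            raise ValueError(f"format already registered: {object_format.format_id}")
        self._formats[object_format.format_id] = object_format
        return object_format.format_id

    def get_format(self, format_id):
        """Return the registered format, or None if it is unknown."""
        return self._formats.get(format_id)
```

<p>A POST handler for the format endpoint would parse the submitted ObjectFormat and then perform the equivalent of <code>add_format</code>.</p>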