DataONE Tasks: Issues | https://redmine.dataone.org/ | 2018-10-27T16:04:14Z | DataONE Tasks
Infrastructure - Story #8738 (In Progress): HZEventFilter performance decline with increased task... | https://redmine.dataone.org/issues/8738 | 2018-10-27T16:04:14Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>While reindexing, I noticed that creating each index task was taking about 300 ms (when the index_task table had about 30k records). Later in the index task generation, that duration increased to about 500 ms on average (it is now at 600 ms).</p>
<p>There are two calls to the database that search for the pid to check its status, and the field they filter on (pid) is not indexed. Ideally, we should index that field.</p>
<p>At the very least, the 2 queries should be reduced to one query. This could be done without changing the ORM model we're using.</p>
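<p>A minimal sketch of both suggestions, using an in-memory SQLite table as a stand-in for the Postgres <code>index_task</code> table. The index name, the status values, and the assumption that the two existing queries differ only in the status they check are illustrative and not taken from the actual HZEventFilter code.</p>

```python
import sqlite3

# In-memory SQLite stands in for the Postgres index_task table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE index_task ("
    " id INTEGER PRIMARY KEY,"
    " pid TEXT NOT NULL,"
    " status TEXT)"
)
# Index the pid column so lookups by pid do not have to scan the whole table.
# The index name is hypothetical.
conn.execute("CREATE INDEX index_task_pid_idx ON index_task (pid)")

# Hypothetical sample rows; the status strings are placeholders.
conn.executemany(
    "INSERT INTO index_task (pid, status) VALUES (?, ?)",
    [("pid-1", "NEW"), ("pid-1", "FAILED"), ("pid-2", "IN PROCESS")],
)


def statuses_for_pid(connection, pid):
    """Fetch every task status for a pid with a single query."""
    rows = connection.execute(
        "SELECT status FROM index_task WHERE pid = ?", (pid,)
    ).fetchall()
    return {status for (status,) in rows}


# One round trip answers both status checks.
found = statuses_for_pid(conn, "pid-1")
has_new = "NEW" in found
has_failed = "FAILED" in found
```

<p>The same idea should carry over to Postgres, where a btree index on <code>pid</code> can be added with a <code>CREATE INDEX</code> statement without changing the ORM model.</p>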
<p>Below is the table description in Postgres:</p>
<pre>d1-index-queue=# \d index_task
                      Table "public.index_task"
       Column        |          Type          | Collation | Nullable | Default
---------------------+------------------------+-----------+----------+---------
 id                  | bigint                 |           | not null |
 datesysmetamodified | bigint                 |           | not null |
 deleted             | boolean                |           | not null |
 formatid            | character varying(255) |           |          |
 nextexecution       | bigint                 |           | not null |
 objectpath          | text                   |           |          |
 pid                 | text                   |           | not null |
 priority            | integer                |           | not null |
 status              | character varying(255) |           |          |
 sysmetadata         | text                   |           |          |
 taskmodifieddate    | bigint                 |           | not null |
 trycount            | integer                |           | not null |
 version             | integer                |           | not null |
Indexes:
    "index_task_pkey" PRIMARY KEY, btree (id)
d1-index-queue=# \q
</pre>
CN REST - Story #8364 (In Progress): Ensure portal uses correct X509 certificates | https://redmine.dataone.org/issues/8364 | 2018-02-13T20:17:25Z | Chris Jones (cjones@nceas.ucsb.edu)
<p>We've run into issues where, after an upgrade of the <code>dataone-cn-portal</code> package on the CNs, the properties pointing to the public certificate and private key incorrectly point to the old GeoTrust wildcard files rather than the new Let's Encrypt files:<br>
<br>
cn.server.publiccert.filename=/etc/ssl/certs/<em>.test.dataone.org.crt<br>
cn.server.privatekey.filename=/etc/ssl/private/</em>.test.dataone.org.key</p>
<p>These should be (in STAGE):</p>
<p>/etc/letsencrypt/live/cn-stage.test.dataone.org/cert.pem<br>
/etc/letsencrypt/live/cn-stage.test.dataone.org/privkey.pem</p>
<p>The issue might be that these are not being set correctly during the <code>postinst</code> script run. Jing pointed out that these values are taken from the debconf database settings that get set when <code>dataone-cn-os-core</code> is installed. So although the <code>postinst</code> script might be setting the correct values, the old values might still be cached in the debconf database. If so, we'll need to clear those values during installations and upgrades.</p>
<p>Also, knowing where to look for these configuration settings can be challenging. These are referenced from <code>/var/lib/tomcat7/webapps/portal/WEB-INF/portal.properties</code>. These settings should be consolidated into <code>/etc/dataone/portal/portal.properties</code> so they also don't get blown away on war file upgrades in Tomcat.</p>
Infrastructure - Story #8227 (In Progress): ExceptionHandler regurgitates long html pages into th... | https://redmine.dataone.org/issues/8227 | 2017-12-13T21:19:23Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>It is useful to know what was returned in an error response when it was not the expected response, but HTML pages can be verbose and include excessive markup that is not useful. Especially when a GMN MN is in debugging mode and a systematic error is being returned (such as during an authentication issue), these logged HTML pages can end up being 75% of the log files and cause meaningful log lines to scroll off the end of the log rotation.</p>
<p>An option should be provided to limit the number of characters being returned in the ServiceFailure.</p>
<p>Options are to:<br>
1. eliminate the message body altogether<br>
2. truncate the message body<br>
3. only print the visible parts of the HTML (remove and elements)<br>
4. a combination of 2 and 3</p>
<p>Since this is a new feature, develop it in trunk.</p>
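<p>Options 2 to 4 could look roughly like the sketch below. It is written in Python for illustration only (the ExceptionHandler itself is not shown in this issue), the names <code>summarize_body</code> and <code>max_chars</code> are hypothetical, and because the element names in option 3 did not survive in the text above, skipping <code>script</code> and <code>style</code> content is an assumption.</p>

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}  # assumed set of non-visible elements

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def summarize_body(body, max_chars=200):
    """Keep only visible text, collapse whitespace, then truncate."""
    parser = _VisibleText()
    parser.feed(body)
    parser.close()
    text = " ".join(" ".join(parser.parts).split())
    if len(text) > max_chars:
        return text[:max_chars] + "..."
    return text
```

<p>Setting <code>max_chars</code> to 0 would approximate option 1, since only the truncation marker would remain.</p>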
Infrastructure - Story #8172 (In Progress): investigate atomic updates for some solr updates | https://redmine.dataone.org/issues/8172 | 2017-09-01T19:35:25Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>Atomic updates came to solr with v4.0. (We're currently at 5.x)</p>
<p>Atomic updates are supposed to be more efficient, and could help us with the race condition in <a class="issue tracker-5 status-5 priority-4 priority-default closed child" title="Task: Use multiple threads to index objects (Closed)" href="https://redmine.dataone.org/issues/7771">#7771</a>.<br>
(multiple tasks reading a solr record and then modifying it in divergent ways by overwriting existing values).</p>
<p>The atomic add and remove modifiers allow values to be added to and removed from multivalued fields, which is where our race conditions arise.</p>
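<p>As a rough illustration of what such an update looks like, the sketch below builds the JSON body for an atomic update that uses the <code>add</code> or <code>remove</code> modifier. The helper name, the <code>id</code> unique key, and the <code>resourceMap</code> field are assumptions chosen for the example; the real schema and indexing code may differ.</p>

```python
import json


def atomic_update(doc_id, field, modifier, values):
    """Build one Solr atomic-update document for a multivalued field.

    Only the 'add' and 'remove' modifiers discussed above are accepted.
    """
    if modifier not in ("add", "remove"):
        raise ValueError("unsupported modifier: %s" % modifier)
    # A modifier map instead of a plain value tells Solr to change the
    # field in place rather than overwrite the whole document.
    return {"id": doc_id, field: {modifier: list(values)}}


# A JSON list of documents, the shape Solr's JSON update handler accepts.
# "pid-1", "resourceMap", and "ore-2" are placeholder values.
payload = json.dumps([atomic_update("pid-1", "resourceMap", "add", ["ore-2"])])
```

<p>Because each task would send only the value it wants to add or remove, two tasks touching the same record would no longer overwrite each other's values with a stale copy.</p>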
Infrastructure - Story #8155 (In Progress): Ensure GMN fully supports the Package API | https://redmine.dataone.org/issues/8155 | 2017-08-01T16:25:32Z | Dave Vieglais (dave.vieglais@gmail.com)
<p>The package API </p>
<p><a href="https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html#MNPackage.getPackage">https://releases.dataone.org/online/api-documentation-v2.0/apis/MN_APIs.html#MNPackage.getPackage</a></p>
<p>is a convenience method for clients to download a complete data package in a single call. The result is a ZIP file in the BagIt format.</p>
<p>The goal of this story is to fully implement the Package API on GMN.</p>
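<p>One way a test could sanity-check the returned archive is sketched below. It only looks for the two structural pieces every BagIt bag has, a <code>bagit.txt</code> declaration and a <code>data/</code> payload directory, and does not verify manifests or checksums. The function name is hypothetical, and how the ZIP bytes are obtained from GMN is left out.</p>

```python
import io
import zipfile


def looks_like_bagit(zip_bytes):
    """Return True if the ZIP holds bagit.txt next to a data/ directory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
    for name in names:
        # The bag may sit at the archive root or inside a wrapping folder.
        if name == "bagit.txt" or name.endswith("/bagit.txt"):
            base = name[: -len("bagit.txt")]
            if any(n.startswith(base + "data/") for n in names):
                return True
    return False
```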
Infrastructure - Story #8081 (In Progress): develop federated broker configuration for indexing | https://redmine.dataone.org/issues/8081 | 2017-04-24T22:52:34Z | Rob Nahf (rnahf@epscor.unm.edu)
Infrastructure - Story #8049 (In Progress): Support synchronization of system metadata for unhost... | https://redmine.dataone.org/issues/8049 | 2017-03-21T05:34:38Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>As part of mutable-content MN support, allow the MemberNode to keep the system metadata records for all resultant versions of its changeable entities. This allows the MN to keep accurate system metadata for every version even though it no longer has the object bytes for that version.</p>
<p>Benefits:<br>
1. The MN does not orphan any objects.<br>
2. The MN can administer objects from past versions on its own node, for example by adjusting the access policy of all versions.<br>
3. There is no need to call cn.setObsoletedBy or leave that field empty.</p>
<p>Costs:<br>
1. requires new logic for indexing (possibly)<br>
2. requires new logic for registerSystemMetadata (possibly)<br>
3. requires new logic for synchronization</p>
<p>This is very similar to how we synchronize DATA objects, but without triggering MN replication.</p>
Infrastructure - Story #7920 (In Progress): migrate apache2 authorization rules from 2.2 conformi... | https://redmine.dataone.org/issues/7920 | 2016-10-26T18:15:04Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>Currently, our apache configs are using the 2.2 style, but we will have to upgrade at some point. </p>
<p>The access_compat module (under mods-enabled) is in place to allow us to use the old 2.2 conventions.</p>
Infrastructure - Story #7832 (In Progress): migrate from JibX to JAXB for XML binding / codegen | https://redmine.dataone.org/issues/7832 | 2016-06-24T19:12:07Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>JibX is losing support in the community - not finding willing partners for maintenance - while JAXB has become the standard XML binding framework for Java. We plan to migrate to JAXB for object un/marshalling and most likely datatype code generation.</p>
Infrastructure - Story #7407 (In Progress): object formats in the d1_common_java bootstrap list a... | https://redmine.dataone.org/issues/7407 | 2015-10-05T22:43:57Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>Mark reports that:</p>
<p>dcx is not in d1_common_java but is in production.</p>
<p>The objectFormat list also appears to include formats that are not supported science metadata formats, for example:</p>
<p><a href="http://dublincore.org/schemas/xmls/qdc/2008/02/11/simpledc.xsd">http://dublincore.org/schemas/xmls/qdc/2008/02/11/simpledc.xsd</a><br>
<a href="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd">http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd</a></p>
<p>The list should be consistent with what is in production.</p>
Infrastructure - Story #7358 (In Progress): ContactSubject on NodeList must be valid D1 ldap entry | https://redmine.dataone.org/issues/7358 | 2015-09-16T19:58:13Z | Robert Waltz
<p>Before a CN can be started, LDAP must have an approved entry for Contact Subject.</p>
<p>Contact Subject has been defaulted to CN=Robert P Waltz A904,O=Google,C=US,DC=cilogon,DC=org on all of the CN entries in the node list.</p>
<p>Since Robert P Waltz is a developer and not an organizer or director, the publicized contact on the CNs should be changed to reflect the organizational hierarchy.</p>
<p>The Contact Subject for the CNs should be the PI of the project, or at least, a Co-PI.</p>
<p>Also, the DN of this subject should be derived from the DataONE CA instead of cilogon.</p>
<p>Updating the existing systems should be trivial. The LDAP entry for each CN node will be modified, and a new LDAP entry for the new Subject will need to be added.</p>
OGC-Slender Node - Story #7149 (Testing): Implement mechanism to retrieve a list of objects avail... | https://redmine.dataone.org/issues/7149 | 2015-06-04T20:20:47Z | Dave Vieglais (dave.vieglais@gmail.com)
<p>Using Python, implement a tool that is able to retrieve a list of packages, and the objects that make up each package.</p>
Infrastructure - Story #6377 (In Progress): Review CNode initialization in ReplicationManager - i... | https://redmine.dataone.org/issues/6377 | 2014-09-10T18:03:08Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>There is complicated fallback logic for performing certain CN API calls where the ReplicationManager CNode is supposed to be the local CN baseurl, and if that fails, it falls back to using the ReplicationService (roundRobin) CN to perform the same call. However, ReplicationManager is already using the RR CN, unless the D1Client.CN_URL is different for different deployments. But in that case, ReplicationService would also use the same D1Client.CN_URL property.</p>
<p>Simplifying this would help encapsulate client interaction, and connection management. Fallback CNode procurement uses D1NodeFactory + a locally instantiated DefaultHttpMultipartRestClient for both ReplicationManager and ReplicationServices (a recent temporary solution for v2). This leads to two ConnectionManagers in the underlying HttpClient.</p>
<p>Since ReplicationManager is a complicated class, I would first look to encapsulate logic in the ReplicationService class.</p>
Infrastructure - Story #4463 (In Progress): Incorporate Node Replication Policy into replication ... | https://redmine.dataone.org/issues/4463 | 2014-03-14T21:47:09Z | Skye Roseboom (sroseboo@dataone.unm.edu)
<p>Once MN replication policies are available through the Node Repository datasource, replication processing will need to be updated to honor this default replication policy when items without any policy are synchronized to the CN.</p>
<p>This task needs to follow redmine <a class="issue tracker-4 status-5 priority-4 priority-default closed parent" title="Story: Implement NodeReplicationPolicy in Node Registry Service (Closed)" href="https://redmine.dataone.org/issues/2192">#2192</a>.</p>
Java Client - Story #3666 (In Progress): D1Client.listUpdateHistory() needs to handle changing ac... | https://redmine.dataone.org/issues/3666 | 2013-03-15T22:51:23Z | Rob Nahf (rnahf@epscor.unm.edu)
<p>The current D1Client.listUpdateHistory() method needs to gracefully handle the situation where a NotAuthorized exception is returned. The ObsoletesChain client class may need to be refactored to allow this exception to be held so it can notify the user where appropriate.</p>
<p>Ostensibly, with a NotAuthorized, the user will not have access to either the tail or the head of the chain, so the method cannot return the head or the tail, depending on how access changes.</p>
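<p>The idea of holding the exception rather than failing outright can be sketched as follows. This is a Python illustration, not the Java client code: the <code>NotAuthorized</code> class, the <code>walk_obsoleted_by</code> function, and the <code>get_obsoleted_by</code> callback are hypothetical stand-ins for whatever ObsoletesChain ends up using.</p>

```python
class NotAuthorized(Exception):
    """Stand-in for the DataONE NotAuthorized exception (hypothetical)."""


def walk_obsoleted_by(start_pid, get_obsoleted_by):
    """Follow obsoletedBy links from start_pid toward the head of the chain.

    get_obsoleted_by(pid) returns the next pid, returns None at the head,
    or raises NotAuthorized when the pid's system metadata is not readable.

    Returns (pids, held_error): the pids discovered so far, plus the
    NotAuthorized that stopped the walk, or None if the head was reached.
    """
    pids = [start_pid]
    seen = {start_pid}
    current = start_pid
    while True:
        try:
            nxt = get_obsoleted_by(current)
        except NotAuthorized as err:
            # Keep the partial chain and hand the error back to the caller.
            return pids, err
        if nxt is None or nxt in seen:
            return pids, None
        pids.append(nxt)
        seen.add(nxt)
        current = nxt
```

<p>A caller can then decide whether to show the partial chain, report that the true head is unknown, or both.</p>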