DataONE Tasks: Issueshttps://redmine.dataone.org/https://redmine.dataone.org/favicon.ico2020-07-15T18:03:40ZDataONE Tasks
Redmine Infrastructure - Bug #8866 (New): Java client tools should set a custom user agent stringhttps://redmine.dataone.org/issues/88662020-07-15T18:03:40ZBryce Mecummecum@nceas.ucsb.edu
<p>Related to <a href="https://redmine.dataone.org/issues/7047">https://redmine.dataone.org/issues/7047</a></p>
<p>It looks like nowhere in <code>d1_libclient_java</code> do we set a user agent string. Setting one is best practice, and not setting one limits our ability to customize our infrastructure around client traffic. For example, OPC is running into HTTP 413s due to overrunning their TLS renegotiation buffer, and we can't effectively whitelist their requests, which come from our Java client tools, to allow them to upload large files.</p>
Infrastructure - Story #8862 (New): Deploy a new dataone-cn-rest releasehttps://redmine.dataone.org/issues/88622020-04-23T16:24:46ZJing Taotao@nceas.ucsb.edu
<p>We have a new d1_portal jar release which fixes the issue where Tomcat on the CNs had to be restarted whenever the LE certificates were renewed. The new d1_portal jar file has been deployed to dataone-cn-portal. However, the component dataone-cn-rest was overlooked. We need to deploy it there as well.<br>
Yesterday, as a stopgap, we dropped the d1_portal-2.3.2.jar file into place on the CNs when we restarted Tomcat, so it should work now. We still need a formal release.</p>
Infrastructure - Task #8858 (New): Update CN Apache configs in version control with directives to...https://redmine.dataone.org/issues/88582020-02-05T20:02:12ZBryce Mecummecum@nceas.ucsb.edu
<p>Sitemaps are located on disk in ${tomcat_webapps_dir}/${context}/sitemaps as <code>sitemap_index.xml</code> and <code>sitemap%d.xml</code> (for each sub-sitemap).</p>
<p>The rule we've come up with is:</p>
<p><code>RewriteRule ^/(sitemap.+) /metacat/sitemaps/$1 [R=303]</code></p>
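<p>As a rough check of which request paths the rule's pattern would capture, here is a minimal Python sketch. It only mimics the regular expression and the substitution; it does not run Apache or issue the 303 redirect, and the function name <code>rewrite_target</code> is made up for illustration.</p>

```python
import re
from typing import Optional

# Same pattern as the RewriteRule above: a path that starts with "/sitemap"
# followed by at least one more character.
SITEMAP_RULE = re.compile(r"^/(sitemap.+)")


def rewrite_target(path: str) -> Optional[str]:
    """Return the path the rule would redirect to, or None if it does not match.

    Hypothetical helper for illustration only; Apache itself does the real work.
    """
    match = SITEMAP_RULE.search(path)
    if match is None:
        return None
    # "$1" in the rule is the first capture group.
    return "/metacat/sitemaps/" + match.group(1)


if __name__ == "__main__":
    for candidate in ("/sitemap_index.xml", "/sitemap1.xml", "/robots.txt"):
        print(candidate, "->", rewrite_target(candidate))
```

<p>Note that the target path already starts with <code>/metacat</code>, so it does not match the pattern again and the redirect should not loop.</p>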
Infrastructure - Story #8853 (New): Make cn.resolve smarterhttps://redmine.dataone.org/issues/88532019-11-15T16:46:12ZJing Taotao@nceas.ucsb.edu
<p>In this case the cn.resolve() operation should be ignoring the node that is marked as offline, or at least placing it last in the list.</p>
<p>This should be a high priority fix, and should be fairly simple to implement since the information is available in the node document.</p>
<ul>
<li>Dave</li>
</ul>
<blockquote>
<p>On 2019-11-14, at 21:38, Matt Jones <a href="mailto:jones@nceas.ucsb.edu">jones@nceas.ucsb.edu</a> wrote:</p>
<p>FYI, thread from today with Ethan White on ebird replication, and the resolve() api in DataONE. Relates to our conversation today about making resolve() and MetacatUI downloads smarter.</p>
<p>Matt</p>
<p>Ethan White 5:06 PM<br>
What's the right place to report data that is 404ing on DataONE?</p>
<p>Matt Jones 5:07 PM<br>
<a href="mailto:support@dataone.org">support@dataone.org</a> would work</p>
<p>5:08 PM<br>
or let me know</p>
<p>5:08 PM<br>
is it that same data set?</p>
<p>5:08 PM<br>
the Ebird one?</p>
<p>Ethan White 5:09 PM<br>
Yeah, which we had discovered had been reposted and spent a bunch of time gearing up to support again. We were in the middle of testing when it suddenly disappeared again. <a href="http://dataone.ornith.cornell.edu/metacat/d1/mn/v2/object/EOD_CLO_2016.csv.gz">http://dataone.ornith.cornell.edu/metacat/d1/mn/v2/object/EOD_CLO_2016.csv.gz</a></p>
<p>Matt Jones 5:10 PM<br>
yeah. Cornell just gave us permission to replicate the data to other nodes. They haven’t wanted us to do so in the past.</p>
<p>Ethan White 5:13 PM<br>
Thanks. That's good news. So can we expect it to reappear at some point soonish?</p>
<p>Matt Jones 5:14 PM<br>
Yeah, it's been replicated. I’m checking to see if it is properly linked to the original.</p>
<p>5:15 PM<br>
<a href="https://knb.ecoinformatics.org/view/EOD_CLO_2016.eml">https://knb.ecoinformatics.org/view/EOD_CLO_2016.eml</a></p>
<p>Ethan White 5:16 PM<br>
Thanks Matt. FYI that link I posted is the one being returned from a current search of DataONE.</p>
<p>Matt Jones 5:17 PM<br>
Yeah. Because that’s the ‘authoritative’ copy at cornell.</p>
<p>5:17 PM<br>
but Cornell’s node has been going up and down.</p>
<p>5:17 PM<br>
our resolve service lists all copies of a data set</p>
<p>5:17 PM<br>
so if one is down, you can get it from another location:</p>
<p>5:18 PM<br>
<code><br>
$ curl -H "Accept: text/xml" https://cn.dataone.org/cn/v2/resolve/EOD_CLO_2016.eml<br>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><br>
<ns2:objectLocationList xmlns:ns2="http://ns.dataone.org/service/types/v1"><br>
<identifier>EOD_CLO_2016.eml</identifier><br>
<objectLocation><br>
<nodeIdentifier>urn:node:CLOEBIRD</nodeIdentifier><br>
<baseURL>http://dataone.ornith.cornell.edu/metacat/d1/mn</baseURL><br>
<version>v1</version><br>
<version>v2</version><br>
<url>http://dataone.ornith.cornell.edu/metacat/d1/mn/v2/object/EOD_CLO_2016.eml</url><br>
</objectLocation><br>
<objectLocation><br>
<nodeIdentifier>urn:node:CN</nodeIdentifier><br>
<baseURL>https://cn.dataone.org/cn</baseURL><br>
<version>v1</version><br>
<version>v2</version><br>
<url>https://cn.dataone.org/cn/v2/object/EOD_CLO_2016.eml</url><br>
</objectLocation><br>
<objectLocation><br>
<nodeIdentifier>urn:node:KNB</nodeIdentifier><br>
<baseURL>https://knb.ecoinformatics.org/knb/d1/mn</baseURL><br>
<version>v1</version><br>
<version>v2</version><br>
<url>https://knb.ecoinformatics.org/knb/d1/mn/v2/object/EOD_CLO_2016.eml</url><br>
</objectLocation><br>
</ns2:objectLocationList><br>
</code></p>
<p>Ethan White 5:19 PM<br>
OK, thanks. That's why I thought the link in DataONE <a href="https://cn.dataone.org/cn/v2/resolve/EOD_CLO_2016.csv.gz">https://cn.dataone.org/cn/v2/resolve/EOD_CLO_2016.csv.gz</a> would take me to a working version, but clearly I just don't understand the details. We'll just use the one on KNB at least for the moment. Really appreciate your help as always.</p>
<p>Matt Jones 5:20 PM<br>
No problem. I’d love to make this all work more seamlessly.</p>
<p>5:20 PM<br>
So suggestions definitely welcome.</p>
<p>5:21 PM<br>
I expect Cornell to take their node offline altogether — so the KNB will likely be the better location.</p>
<p>5:22 PM<br>
Btw, the resolve link when executed in a browser just redirects to the first copy</p>
<p>Ethan White 5:23 PM<br>
Yeah, Cornell's closed approach to things is a pretty big disappointment, especially on data like this that is generated by volunteers. We'll just go to the KNB version permanently.</p>
<p>Matt Jones 5:23 PM<br>
whereas programmatically you get the list of locations</p>
<p>5:23 PM<br>
if you ask for XML</p>
<p>Ethan White 5:23 PM<br>
That makes sense. Thanks.</p>
<p>Matt Jones 5:23 PM<br>
and then you can choose to try one or more</p>
</blockquote>
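<p>The client-side half of this idea can be sketched in Python: parse the <code>objectLocationList</code> returned by resolve and move locations on nodes known to be offline to the end of the list. This is only a sketch under assumptions. The set of offline node identifiers is supplied by the caller here (in practice it would come from the node document), the function names are hypothetical, and nothing here reflects how the CN implements resolve.</p>

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple

Location = Tuple[str, str]  # (nodeIdentifier, url)


def parse_locations(xml_text: str) -> List[Location]:
    """Extract (nodeIdentifier, url) pairs from an objectLocationList document.

    In the response shown above only the root element carries the ns2 prefix;
    the child elements are unqualified, so plain tag names are used here.
    """
    root = ET.fromstring(xml_text)
    locations = []
    for loc in root.findall("objectLocation"):
        node_id = loc.findtext("nodeIdentifier", default="").strip()
        url = loc.findtext("url", default="").strip()
        locations.append((node_id, url))
    return locations


def order_locations(locations: List[Location],
                    offline_nodes: Iterable[str]) -> List[Location]:
    """Return locations with those on offline nodes moved to the end.

    sorted() is stable, so the original order is kept within each group.
    """
    offline = set(offline_nodes)
    return sorted(locations, key=lambda item: item[0] in offline)
```

<p>With the response above and <code>urn:node:CLOEBIRD</code> treated as offline, the Cornell copy would move to the end of the list while the other two keep their relative order.</p>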
Infrastructure - Story #8849 (New): During sync, the CN does not detect error returned from getCh...https://redmine.dataone.org/issues/88492019-11-05T19:25:48ZRoger Dahldahl@unm.edu
<p>Due to a bug, GMN returned 500 on some getChecksum() calls. The CN did not detect the 500 return status and proceeded with the sync, using "null" as the checksum.</p>
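<p>The kind of guard that seems to be missing can be sketched as follows. This is a hypothetical Python illustration, not the CN's actual synchronization code: the exception and function names are invented, and the caller is assumed to have already extracted the HTTP status and the checksum value from the getChecksum() response.</p>

```python
from typing import Optional


class ChecksumUnavailable(Exception):
    """Raised when a usable checksum could not be obtained (hypothetical name)."""


def require_checksum(status_code: int, checksum_value: Optional[str]) -> str:
    """Return the checksum, or raise instead of letting sync continue.

    Rejects any non-200 status, and also rejects a missing, empty, or
    literal "null" value so that it can never be stored as a checksum.
    """
    if status_code != 200:
        raise ChecksumUnavailable(
            "getChecksum() returned HTTP %d" % status_code
        )
    if checksum_value is None:
        raise ChecksumUnavailable("getChecksum() returned no checksum")
    cleaned = checksum_value.strip()
    if not cleaned or cleaned.lower() == "null":
        raise ChecksumUnavailable("getChecksum() returned an empty checksum")
    return cleaned
```

<p>A sync step that calls such a guard would stop and record the failure for that object rather than proceeding with "null".</p>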
Infrastructure - Story #8848 (New): A minor difference of annotation index between CN and MNhttps://redmine.dataone.org/issues/88482019-11-01T21:37:01ZJing Taotao@nceas.ucsb.edu
<p>The Solr index on the CN is:</p>
<pre><arr name="sem_annotation">
<str>http://purl.dataone.org/odo/ECSO_00000512</str>
<str>
http://ecoinformatics.org/oboe/oboe.1.2/oboe-core.owl#MeasurementType
</str>
<str>http://purl.dataone.org/odo/ECSO_00001102</str>
<str>http://purl.dataone.org/odo/ECSO_00001243</str>
<str>http://purl.dataone.org/odo/ECSO_00000629</str>
<str>http://purl.dataone.org/odo/ECSO_00000518</str>
<str>http://www.w3.org/2000/01/rdf-schema#Resource</str>
<str>http://purl.dataone.org/odo/ECSO_00000516</str>
<str>http://purl.obolibrary.org/obo/UO_0000301</str>
</arr>
</pre>
<p>The Solr index on the MN is:</p>
<pre><arr name="sem_annotation">
<str>http://purl.dataone.org/odo/ECSO_00000512</str>
<str>
http://ecoinformatics.org/oboe/oboe.1.2/oboe-core.owl#MeasurementType
</str>
<str>http://purl.dataone.org/odo/ECSO_00001102</str>
<str>http://purl.dataone.org/odo/ECSO_00001243</str>
<str>http://purl.dataone.org/odo/ECSO_00000629</str>
<str>http://purl.dataone.org/odo/ECSO_00000518</str>
<str>http://purl.dataone.org/odo/ECSO_00000516</str>
<str>http://purl.obolibrary.org/obo/UO_0000301</str>
</arr>
</pre>
<p>The CN has an extra <code><str>http://www.w3.org/2000/01/rdf-schema#Resource</str></code>.<br>
Bryce and I discussed it and thought it wouldn't affect the feature, but we still need to figure out where the extra value comes from.</p>
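<p>A comparison like this can be automated with a short Python sketch that treats each <code>sem_annotation</code> array as a set and reports the values present on only one side. The function names are hypothetical, and the inputs are assumed to be the <code>&lt;arr&gt;</code> fragments exactly as shown above.</p>

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple


def annotation_values(arr_xml: str) -> Set[str]:
    """Collect the <str> values of one <arr> fragment, ignoring whitespace."""
    root = ET.fromstring(arr_xml)
    return {(el.text or "").strip() for el in root.findall("str")}


def index_difference(cn_xml: str, mn_xml: str) -> Tuple[List[str], List[str]]:
    """Return (values only on the CN, values only on the MN), each sorted."""
    cn_values = annotation_values(cn_xml)
    mn_values = annotation_values(mn_xml)
    return sorted(cn_values - mn_values), sorted(mn_values - cn_values)
```

<p>Stripping whitespace matters here because one of the values is wrapped onto its own line in both listings.</p>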
Member Nodes - MNDeployment #8847 (In Review): Freshwater Research and Environmental Database (IG...https://redmine.dataone.org/issues/88472019-10-17T19:19:35ZAmy Forresteraforres4@utk.edu
<p>The Freshwater Research and Environmental Database is the central data repository for IGB (Leibniz-Institut of Freshwater Ecology and Inland Fisheries). It is where we store and share environmental data from observations of lakes, rivers, peatlands and other freshwater habitats. In FRED you can find continuous data collected over several decades from our long-term research programme at the lakes Müggelsee, Stechlinsee, Arendsee and the river Spree, as well as environmental data derived from short-term projects in aquatic ecosystems. All data include detailed metadata descriptions in text form to allow reuse of the data. The database can be searched for a range of aspects, such as ecosystem types or abiotic and biotic variables. Data use, where not freely accessible, shall be granted after consulting with the contact person given in the database, and is subject to the IGB Data Policy.</p>
Member Nodes - Bug #8844 (New): Server certificate is expiredhttps://redmine.dataone.org/issues/88442019-10-08T15:17:01ZDave Vieglaisdave.vieglais@gmail.com
<p>The server certificate has expired and needs to be renewed:</p>
<pre>Certificate:
Data:
Version: 3 (0x2)
Serial Number:
04:99:29:51:81:59:be:23:83:e8:a2:2d:9f:78:7c:7d:92:67
Signature Algorithm: sha256WithRSAEncryption
Issuer: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
Validity
Not Before: Jun 19 16:21:39 2018 GMT
Not After : Sep 17 16:21:39 2018 GMT
Subject: CN = dataone.sensor.nevada.edu
</pre>
Infrastructure - Task #8817 (New): Configure sitemaps on the CNhttps://redmine.dataone.org/issues/88172019-06-06T23:52:23ZBryce Mecummecum@nceas.ucsb.edu
<p>Support for sitemaps landed last fall in Metacat: <a href="https://github.com/NCEAS/metacat/pull/1283">https://github.com/NCEAS/metacat/pull/1283</a>. Sitemaps are good for users but especially for search engines, and DataONE's Search Catalog could benefit from having sitemaps enabled. A sitemap could help crawlers discover all of the datasets in DataONE. The CNs already run Metacat and should use Metacat's sitemap support to generate sitemaps for all content.</p>
<p>To enable sitemaps on the CNs, a few things seem to be needed. I'm not very familiar with how the CNs get built so I may be wrong or be missing things:</p>
<ul>
<li>Sitemaps rely on two properties in Metacat's <code>metacat.properties</code> file, which should have values: <code>sitemap.location.base=https://search.dataone.org/</code> and <code>sitemap.entry.base=https://search.dataone.org/view</code></li>
<li>The Apache config the CNs are built with needs to serve the <code>sitemap_index.xml</code> and individual sitemaps from the Tomcat webapps dir. Metacat generates sitemaps in the <code>sitemaps</code> subfolder (e.g., <code>/usr/lib/tomcat8/webapps/metacat/sitemaps</code>). A <code>Directory</code> directive should work so long as filesystem permissions are set up for Apache to see the files.</li>
<li>We need a robots.txt at search.dataone.org that points at the sitemap index file, <code>sitemap_index.xml</code>, which is the entry point for crawlers.</li>
<li>Metacat generates sitemaps with a recurring job mechanism that's internal to Metacat. AFAIK this job isn't turned on when Tomcat loads Metacat; a request has to be sent to the admin API, which turns the job on as a side effect. We might want to change this to reduce the maintenance burden and the chance of having stale sitemaps.</li>
</ul>
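<p>For the robots.txt item, a minimal Python sketch of what the file could contain is shown below. It assumes the index ends up being served at the <code>sitemap.location.base</code> value joined with <code>sitemap_index.xml</code>, which this issue does not confirm, and the permissive <code>User-agent</code>/<code>Allow</code> lines are placeholders rather than a proposed crawl policy. The function name is hypothetical.</p>

```python
from urllib.parse import urljoin


def robots_txt(location_base: str, index_name: str = "sitemap_index.xml") -> str:
    """Build robots.txt text that advertises the sitemap index.

    location_base is assumed to match Metacat's sitemap.location.base value.
    """
    sitemap_url = urljoin(location_base, index_name)
    lines = [
        "User-agent: *",  # placeholder policy: applies to all crawlers
        "Allow: /",       # placeholder policy: nothing is blocked
        "",
        "Sitemap: " + sitemap_url,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(robots_txt("https://search.dataone.org/"))
```

<p>The <code>Sitemap:</code> line is the part crawlers use to find the index; the rest of the file can follow whatever crawl policy is wanted.</p>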
<p>Dave nominated Jing for this work and has targeted this for the next CCI release. I'm not sure which that is so please select whichever one is appropriate.</p>
<p>Note this relates to <a href="https://redmine.dataone.org/issues/8693">https://redmine.dataone.org/issues/8693</a>, which we've delayed because Google's crawler infrastructure has changed and DataONE is now visible to Google. Google staff have indicated we only need to send them a robots.txt that points to our sitemaps for them to begin crawling.</p>
Member Nodes - MNDeployment #8772 (Operational): metaGRILhttps://redmine.dataone.org/issues/87722019-03-05T21:28:34ZAmy Forresteraforres4@utk.edu
<p>while deploying metacat, query Lauren about exposing downloads, views, and citation counts in UI. Response = MN feature --> passed contact Eric Beaulieu (<a href="mailto:eric.g.beaulieu@umontreal.ca">eric.g.beaulieu@umontreal.ca</a>) to MN Coordinator.</p>
<p><a href="https://oraprdnt.uqtr.uquebec.ca/pls/public/gscw030?owa_no_site=543">https://oraprdnt.uqtr.uquebec.ca/pls/public/gscw030?owa_no_site=543</a></p>
<p>Catalog of datasets of GRIL funded research projects<br>
Le Groupe de recherche interuniversitaire en limnologie (GRIL) </p>
<p><a href="https://docs.google.com/document/d/1xHj62ZhYoubIii2FJsbKDnBfn4z35N4kzZXPhbQjXc0/edit?usp=sharing" class="external">Potential Member Node Discovery Worksheet</a></p>
Member Nodes - MNDeployment #6957 (Operational): NRDC - Nevada Research Data Centerhttps://redmine.dataone.org/issues/69572015-03-25T19:43:32ZLaura Moyerslmoyers1@utk.edu
<p>The NCCP, based at the University of Nevada, Reno, is a part of the Nevada EPSCoR project. They currently have about a dozen sensor sites collecting climate data, with the intent to add 1-2 sites each year for the next 2-3 years.</p>
<p>At present, the goal is to stand up a Tier 1 GMN exposing monthly aggregated data (immutable).</p>
<p>See <a href="http://sensor.nevada.edu">http://sensor.nevada.edu</a></p>
<p>Technical POCs are Richard Kelley and Moinul Hossain.</p>
<p>Initial meeting notes: <a href="https://epad.dataone.org/pad/p/NCCP_and_DataONE">https://epad.dataone.org/pad/p/NCCP_and_DataONE</a></p>
Member Nodes - MNDeployment #6562 (Operational): BCO-DMOhttps://redmine.dataone.org/issues/65622014-11-11T20:13:41ZMatthew Jonesjones@nceas.ucsb.edu
<p>The Biological and Chemical Oceanography Data Management Office (BCO-DMO) (<a href="http://www.bco-dmo.org/">http://www.bco-dmo.org/</a>)</p>
<p>Main contact: Adam Shepherd <a href="mailto:ashepherd@whoi.edu">ashepherd@whoi.edu</a><br>
Additional Contact: Cyndy Chandler <a href="mailto:cchandler@whoi.edu">cchandler@whoi.edu</a></p>
<p>As of today, the BCO-DMO repository holds over 7142 data sets from 419 projects.</p>
<p>Implementation: Their system is based on Drupal.</p>
<p>Associated with GeoLink project which Matt Jones and Mark Schildhauer are on.</p>
Member Nodes - MNDeployment #6485 (Operational): The Digital Archaeology Record (tDAR)https://redmine.dataone.org/issues/64852014-10-01T19:21:48ZLaura Moyerslmoyers1@utk.edu
<p>Digital Antiquity (<a href="http://www.digitalantiquity.org/">http://www.digitalantiquity.org/</a>) is based at Arizona State University. </p>
<p>tDAR (<a href="http://www.tdar.org/">http://www.tdar.org/</a>) is Digital Antiquity's data repository. </p>
<p>Adam Brin (abrin at digitalantiquity dot org) is the POC. He has been in contact with us via the MNF and has been talking with Chris about his questions.</p>
Member Nodes - MNDeployment #3230 (Planning): ARM - Atmospheric Radiation Measurement member nodehttps://redmine.dataone.org/issues/32302012-09-07T00:21:59ZDave Vieglaisdave.vieglais@gmail.com
<p>This issue captures the activity associated with deployment or otherwise of the "Atmospheric Radiation Measurement" member node.</p>
Member Nodes - MNDeployment #3213 (Operational): University of Illinois, Chicago member nodehttps://redmine.dataone.org/issues/32132012-09-05T03:22:28ZDave Vieglaisdave.vieglais@gmail.com
<p>This issue is to track the deployment of the UIC member node.</p>
<p>This MN is slated to be, initially at least, a replication target member node.</p>