DataONE Tasks: Issueshttps://redmine.dataone.org/https://redmine.dataone.org/favicon.ico2018-03-19T21:57:31ZDataONE Tasks
Infrastructure - Task #8505 (New): Review solr terms for coverage of data citation elementshttps://redmine.dataone.org/issues/85052018-03-19T21:57:31ZDave Vieglaisdave.vieglais@gmail.com
<p>Review the solr search terms and determine how a DataCite version 4 record could be generated from the values available in the solr search index and what changes may be necessary to support such an operation. </p>
<p>Indicate which DataCite fields cannot be populated.</p>
<p>Indicate which DataONE solr fields could be used but are populated incorrectly or inconsistently.</p>
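<p>A minimal sketch of how such a review could be started, assuming a solr document is available as a plain dictionary. The solr field names in the mapping (<code>identifier</code>, <code>origin</code>, <code>title</code>, <code>datasource</code>, <code>pubDate</code>) and the function name are illustrative assumptions that should be checked against the published search schema; the six property names are the mandatory properties of DataCite 4.</p>

```python
# Hypothetical mapping from DataONE solr fields to DataCite 4 properties.
# The solr field names on the left are assumptions, not a confirmed mapping.
SOLR_TO_DATACITE = {
    "identifier": "identifier",
    "origin": "creators",
    "title": "titles",
    "datasource": "publisher",
    "pubDate": "publicationYear",
}

# Mandatory properties in DataCite metadata schema version 4.
DATACITE_MANDATORY = [
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
]


def datacite_from_solr(doc):
    """Return (partial DataCite record, list of mandatory properties missing)."""
    record = {}
    for solr_field, datacite_prop in SOLR_TO_DATACITE.items():
        value = doc.get(solr_field)
        # Treat absent and empty values the same way: not populated.
        if value in (None, "", []):
            continue
        record[datacite_prop] = value
    if "publicationYear" in record:
        # Keep only the year portion of an ISO 8601 date string.
        record["publicationYear"] = str(record["publicationYear"])[:4]
    missing = [p for p in DATACITE_MANDATORY if p not in record]
    return record, missing
```

<p>Running this over a sample of index documents and tallying the <code>missing</code> lists would give a first estimate of which DataCite fields cannot be populated from the index as it stands.</p>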
Infrastructure - Story #8504 (New): Support creation of data citation record from solr recordhttps://redmine.dataone.org/issues/85042018-03-19T21:53:13ZDave Vieglaisdave.vieglais@gmail.com
<p>The goal of this story is to ensure that elements in the solr search schema are available and appropriately populated to support generation of DataCite version 4.x or later records.</p>
<p>By ensuring support for this schema, it can also be asserted that suitable citation metadata can be provided in landing pages and other renderings of content provided by DataONE.</p>
<p>Resources:</p>
<ul>
<li><a href="https://schema.datacite.org/meta/kernel-4.1/" class="external">DataCite Schema version 4</a></li>
<li><a href="http://indexer-documentation.readthedocs.io/en/latest/generated/solr_schema.html" class="external">DataONE solr Search fields</a></li>
<li><a href="https://rd-alliance.org/group/data-citation-wg/outcomes/data-citation-recommendation.html" class="external">RDA Data Citation Recommendations</a></li>
</ul>
Infrastructure - Task #8171 (New): Document hazelcast components to be moved to postgres storagehttps://redmine.dataone.org/issues/81712017-09-01T15:25:21ZDave Vieglaisdave.vieglais@gmail.com
<p>Hazelcast is currently used to manage distributed data structures across the Coordinating Nodes.</p>
<p>The goal of this task is to document the structures that can be switched to direct postgres storage, the portions of code that need to be adjusted as a result of the change, and how change notification should be broadcast (e.g. should postgres triggers be used to notify of a change to system metadata or should the process changing system metadata be responsible for broadcasting notice of the change through rabbit MQ?).</p>
Infrastructure - Task #8170 (New): Identify the components that are currently managed in LDAP on ...https://redmine.dataone.org/issues/81702017-09-01T15:20:22ZDave Vieglaisdave.vieglais@gmail.com
<p>The node list, for example, is currently stored in the LDAP database running on the Coordinating Nodes.</p>
<p>The goals of this task are to identify the components that are currently being persisted in LDAP and for each indicate the systems and code that will need to be adjusted to switch from LDAP to postgres storage.</p>
Infrastructure - Task #8169 (New): Document process for changing the postgres masterhttps://redmine.dataone.org/issues/81692017-09-01T15:09:50ZDave Vieglaisdave.vieglais@gmail.com
<p>It will at times be necessary to change the master postgres instance to one of the slaves. This process should be straightforward and ideally scriptable.</p>
<p>The goal of this task is to document the process for changing the master postgres server.</p>
Infrastructure - Task #8168 (New): Evaluate options for postgres replication configurationhttps://redmine.dataone.org/issues/81682017-09-01T15:00:33ZDave Vieglaisdave.vieglais@gmail.com
<p>The goal of this task is to research and document the tools and configuration needed to support replication of postgres databases across a wide area network.</p>
<ul>
<li>Server configuration</li>
<li>Security of connections (TLS, certificates, etc.)</li>
<li>Necessary firewall rules and network configurations</li>
</ul>
Infrastructure - Story #8167 (New): Enable postgres replication between Coordinating Nodeshttps://redmine.dataone.org/issues/81672017-09-01T14:57:03ZDave Vieglaisdave.vieglais@gmail.com
<p>The goal of this story is to enable replication of postgres databases between Coordinating Nodes.</p>
<p>The replication process should:</p>
<ul>
<li>Be robust across a wide area network with potentially high latency</li>
<li>Recover consistency after being disconnected, deliberately or otherwise</li>
<li>Assume a single master with at least two, and potentially several, slaves</li>
<li>Support easy switching of the master to one of the slaves (e.g. switching from cn-ucsb-1 to cn-orc-1)</li>
</ul>
Member Nodes - Task #8146 (New): Failed Sync Parserhttps://redmine.dataone.org/issues/81462017-07-19T00:46:35ZMonica Ihliemail@monicaihli.com
<p>Develop a cron-managed parser to report failed CN sync attempt callbacks.</p>
Member Nodes - Support #8019 (New): Follow up on Preferred Replication Target Testinghttps://redmine.dataone.org/issues/80192017-02-14T19:51:25ZMonica Ihliemail@monicaihli.com
<p>Testing of specified preferred replication targets is currently in progress with LTER. Follow progress on this functionality as it would pertain to NCEI GMN.</p>
Member Nodes - MNDeployment #7969 (Operational): FEMC (Forest Ecosystem Monitoring Cooperative) -...https://redmine.dataone.org/issues/79692017-01-18T01:20:16ZLaura Moyerslmoyers1@utk.edu
<p>Jim Duncan from VMC contacted us via the "Contact Us" page inquiring about participation with DataONE. See meeting notes from 11 Jan 2017 for summary: <a href="https://epad.dataone.org/pad/p/VMC_and_DataONE">https://epad.dataone.org/pad/p/VMC_and_DataONE</a> </p>
<p>From <a href="http://www.uvm.edu/vmc/about/">http://www.uvm.edu/vmc/about/</a></p>
<p>"The mission of the Vermont Monitoring Cooperative is to serve Vermont through improved understanding of long-term trends, annual conditions, and interdisciplinary relationships of the physical, chemical, and biological components of forested ecosystems in Vermont.</p>
<p>The VMC also promotes the efficient coordination of multi-disciplinary environmental monitoring and research activities among federal, state, university, and private-sector agencies with common interests in the long-term health, management, and protection of Vermont's forested ecosystems."</p>
Member Nodes - Bug #7896 (In Progress): Dryad data sets don't mark new versions correctly in Syst...https://redmine.dataone.org/issues/78962016-09-30T18:38:57ZMatthew Jonesjones@nceas.ucsb.edu
<p>Dryad currently has 74,299 metadata documents reported in its DataONE profile, while on their site they indicate having only 14,217 data packages (<a href="http://datadryad.org">http://datadryad.org</a>). I believe this is because Dryad registers each new version of their metadata and data files with a new timestamped PID, but has not provided the appropriate 'obsoletes' information in their system metadata to indicate that the newer PID replaces an original PID. Thus, these newer versions are being counted in DataONE as independent data sets, when in fact they are just revisions. For a concrete example, see:</p>
<p>Original metadata: <a href="https://cn.dataone.org/cn/v2/meta/http://dx.doi.org/10.5061/dryad.0t407/1%3Fver=2016-09-30T10:03:37.256-04:00">https://cn.dataone.org/cn/v2/meta/http://dx.doi.org/10.5061/dryad.0t407/1%3Fver=2016-09-30T10:03:37.256-04:00</a><br>
New version: <a href="https://cn.dataone.org/cn/v2/meta/http://dx.doi.org/10.5061/dryad.0t407/1%3Fver=2016-09-30T10:08:08.080-04:00">https://cn.dataone.org/cn/v2/meta/http://dx.doi.org/10.5061/dryad.0t407/1%3Fver=2016-09-30T10:08:08.080-04:00</a></p>
<p>That new version is not marked as obsoleting the original, and so they both show up as independent data sets. Ideally, the obsoletes field would be set for the newer of the two. Dryad would also probably benefit from using the SID field in system metadata to indicate that the two versions are part of the same series with the given DOI (e.g., in this case, <a href="http://dx.doi.org/10.5061/dryad.0t407/1">http://dx.doi.org/10.5061/dryad.0t407/1</a>). This lack of obsoletes fields should be easily fixed because the PIDs in Dryad seem to consistently use the underlying DOI with a timestamped version field, so a script could probably be written to update all of the system metadata using a simple timestamp comparison to determine version order for the PIDs.</p>
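<p>The timestamp comparison described above could be sketched roughly as follows. This assumes every PID has the form <code>&lt;DOI URL&gt;?ver=&lt;ISO 8601 timestamp&gt;</code>, as in the example, and it only computes the proposed <code>obsoletes</code>, <code>obsoletedBy</code>, and series values; writing them back into system metadata is not shown. The function names are hypothetical.</p>

```python
from collections import defaultdict
from datetime import datetime


def split_pid(pid):
    """Split a Dryad-style PID into its base DOI and version timestamp."""
    base, sep, stamp = pid.partition("?ver=")
    if not sep:
        raise ValueError("PID has no ?ver= timestamp: " + pid)
    return base, datetime.fromisoformat(stamp)


def obsoletes_chain(pids):
    """Map each PID to its proposed obsoletes/obsoletedBy/seriesId values."""
    series = defaultdict(list)
    for pid in pids:
        base, when = split_pid(pid)
        series[base].append((when, pid))

    updates = {}
    for base, versions in series.items():
        versions.sort()  # oldest first, by timezone-aware timestamp
        for _, pid in versions:
            updates[pid] = {"seriesId": base}
        # Link each version to the one immediately before it.
        for (_, older), (_, newer) in zip(versions, versions[1:]):
            updates[newer]["obsoletes"] = older
            updates[older]["obsoletedBy"] = newer
    return updates
```

<p>Applied to the two PIDs in the example, the 10:08:08 version would be proposed as obsoleting the 10:03:37 version, with both sharing the underlying DOI as their series identifier.</p>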
<p>I believe the same issues exist for resource maps and data files as well, although possibly to a lesser extent for the data files.</p>
Member Nodes - MNDeployment #7895 (Operational): Pangaeahttps://redmine.dataone.org/issues/78952016-09-26T20:18:22ZMatthew Jonesjones@nceas.ucsb.edu
<p>At the 2016 RDA plenary in Denver, I made contact with Markus Stocker <a href="mailto:markus.stocker@gmail.com">markus.stocker@gmail.com</a> who is on the staff at the Pangaea (<a href="https://pangaea.de/">https://pangaea.de/</a>) repository in Germany about the possibility of bringing Pangaea into DataONE as a MN. Markus was enthusiastic, and said he would bring it to the attention of the leads at Pangaea, namely Michael Diepenbroek <a href="mailto:mdiepenbroek@pangaea.de">mdiepenbroek@pangaea.de</a> and Uwe Schindler <a href="mailto:uschindler@pangaea.de">uschindler@pangaea.de</a>. He has since replied, saying that there is support for doing so but minimal resources. He suggested that they would join if we 'do the work', which given their existing web service interfaces, probably means some form of Slender Node. They have ElasticSearch, PMH, and other endpoints accessible, as well as data subsetting services. Their metadata format largely appears to be a custom Pangaea schema. Their web services are listed and accessible here: <a href="http://ws.pangaea.de/">http://ws.pangaea.de/</a></p>
<p>Markus' response follows:</p>
<p>Hi Matt,</p>
<p>Was great meeting and talking to you at RDA.</p>
<p>As promised, I have raised the idea of a PANGAEA - DataONE integration in internal discussions. There is agreement here that this is something interesting to explore.</p>
<p>We will need to discuss the details but given the little free capacity on our side, I am afraid any integration would have to be "easy on us." As a start, the easiest for us would be if DataONE integrates via our suite of web services [1]. If you take a look at those services, is it possible for you to gauge whether this is a reasonable starting point, and can we draft a plan detailing how an integration could work based on those services?</p>
<p>Cheers, m.</p>
<p>We need to discuss and send a reply.</p>
Member Nodes - MNDeployment #7296 (Planning): GCOOShttps://redmine.dataone.org/issues/72962015-08-18T01:42:47ZLaura Moyerslmoyers1@utk.edu
<p>The Gulf of Mexico Coastal Ocean Observing System (GCOOS) is part of the Integrated Ocean Observing System. </p>
<p><a href="http://gcoos.org/">http://gcoos.org/</a></p>
Member Nodes - MNDeployment #6548 (Operational): R2R Repositoryhttps://redmine.dataone.org/issues/65482014-11-06T20:47:48ZMatthew Jonesjones@nceas.ucsb.edu
<p>The Rolling Deck to Repository (R2R) (<a href="http://www.rvdata.us/">http://www.rvdata.us/</a>)</p>
<p>Main contact: Bob Arko <a href="mailto:arko@ldeo.columbia.edu">arko@ldeo.columbia.edu</a></p>
<p>As of today, the R2R repository holds over 17,444,740 data files from 3686 cruises using 26 vessels.</p>
<p>Implementation: Interested in deploying this using DSpace which they already have operational.</p>
<p>Associated with GeoLink project which Matt Jones and Mark Schildhauer are on.</p>
Member Nodes - MNDeployment #5451 (Operational): Cary Institute (via Figshare)https://redmine.dataone.org/issues/54512014-05-29T19:28:05ZBruce Wilsonbwilso27@utk.edu
<p>figshare allows researchers to publish all of their research outputs in an easily citable, sharable and discoverable manner. All file formats can be published, including videos and datasets. Optional peer review process. figshare uses creative commons licensing. </p>