DataONE Tasks: Issueshttps://redmine.dataone.org/https://redmine.dataone.org/favicon.ico2020-07-02T19:06:39ZDataONE Tasks
Redmine Infrastructure - Task #8865 (Closed): Configure dataone.org web server to redirect DataONE datase...https://redmine.dataone.org/issues/88652020-07-02T19:06:39ZBryce Mecummecum@nceas.ucsb.edu
<p>As scoped out in <a href="https://hpad.dataone.org/8h3o_7VPTIibo5xL9bz24w">https://hpad.dataone.org/8h3o_7VPTIibo5xL9bz24w</a>, we'd like to be able to referenced DataONE resources in a linked-open-data manner. This is useful because it forms the basis of building more interested things on top of them. However, those resources (e.g., Data Packages) don't currently have, subjectively, suitable IRIs, though they have a variety of URLs.</p>
<p>The ideas in the above proposal are multi-tiered but a good first start can be achieved immediately: Support a PIRI space for "Datasets" (DataONE Data Packages) by redirecting requests from their IRI form:</p>
<p><code>https://dataone.org/datasets/$ID</code></p>
<p>to their URL form:</p>
<p><code>https://search.dataone.org/view/$ID</code>.</p>
<p>Such a redirection can be achieved within our Apache configuration using <code>mod_rewrite</code> and a rule similar to:</p>
<pre>RewriteRule "^/datasets/(.+)$" "https://search.dataone.org/view/$1" [L,R]
</pre> Infrastructure - Task #8858 (New): Update CN Apache configs in version control with directives to...https://redmine.dataone.org/issues/88582020-02-05T20:02:12ZBryce Mecummecum@nceas.ucsb.edu
<p>Sitemaps are located on disk in ${tomcat_webapps_dir}/${context}/sitemaps as <code>sitemap_index.xml</code> and <code>sitemap%d.xml</code> (for each sub-sitemap).</p>
<p>The rule we've come up with is:</p>
<p><code>RewriteRule ^/(sitemap.+) /metacat/sitemaps/$1 [R=303]</code></p>
Infrastructure - Bug #8836 (Closed): CSS and JS not loading on ProvONE documentationhttps://redmine.dataone.org/issues/88362019-08-12T21:54:25ZBryce Mecummecum@nceas.ucsb.edu
<p>I was looking at <a href="http://jenkins-1.dataone.org/jenkins/view/Documentation%20Projects/job/ProvONE-Documentation-trunk/ws/provenance/ProvONE/v1/provone.html">http://jenkins-1.dataone.org/jenkins/view/Documentation%20Projects/job/ProvONE-Documentation-trunk/ws/provenance/ProvONE/v1/provone.html</a> today and Chrome and Safari are both upset with how its CSS and JS are set up and are refusing to load either. Steps to reproduce:</p>
<ol>
<li>Go to <a href="http://jenkins-1.dataone.org/jenkins/view/Documentation%20Projects/job/ProvONE-Documentation-trunk/ws/provenance/ProvONE/v1/provone.html">http://jenkins-1.dataone.org/jenkins/view/Documentation%20Projects/job/ProvONE-Documentation-trunk/ws/provenance/ProvONE/v1/provone.html</a></li>
<li>See no styles are loaded, hide/show examples buttons don't work. See errors in devtools about CSP errors.</li>
</ol>
Infrastructure - Bug #8802 (Closed): Titles in EML records that use <value> in <title> do not get...https://redmine.dataone.org/issues/88022019-05-18T00:55:24ZBryce Mecummecum@nceas.ucsb.edu
<p>Margaret O'Brien over at EDI noticed this and sent it my way. She found an EML record that serializes the title in a schema-valid but unusual way:</p>
<pre>...snip...
<title>
<value>Forest-wide bird survey at 183 sample sites the Andrews Experimental Forest from 2009-present (Reformatted to ecocomDP Design Pattern)</value>
</title>
...snip...
</pre>
<p>See <a href="https://search.dataone.org/view/https://pasta.lternet.edu/package/metadata/eml/edi/359/1">https://search.dataone.org/view/https://pasta.lternet.edu/package/metadata/eml/edi/359/1</a> and notice the citation is missing its title (which is powered by Solr) and also see the accompanying Solr doc at <a href="https://search.dataone.org/cn/v2/query/solr/?q=id:%22https://pasta.lternet.edu/package/metadata/eml/edi/359/1%22">https://search.dataone.org/cn/v2/query/solr/?q=id:%22https://pasta.lternet.edu/package/metadata/eml/edi/359/1%22</a> and notice the title field is not set.</p>
<p>This is a bit tricky. It's pretty clearly <em>not</em> endorsed in the EML spec, as per:</p>
<p><em>i18nNonEmptyStringType</em>: (emphasis mine)</p>
<blockquote>
<p>This type specifies a content pattern for all elements that require language translations. The xml:lang attribute can be used to define the default language for element content. <em>Additional translations should be included as child 'value' elements</em> that also have an optional xml:lang</p>
</blockquote>
<p>So <code>value</code> in <code>title</code> is intended to be used like this:</p>
<pre><title>
My title
<value xml:lang="fr">Mon titre</value>
</title>
</pre>
<p>and not how it is in <a href="https://search.dataone.org/view/https://pasta.lternet.edu/package/metadata/eml/edi/359/1">https://search.dataone.org/view/https://pasta.lternet.edu/package/metadata/eml/edi/359/1</a>.</p>
<p>Do we want to tweak the indexer to support this or is it actually a good thing that the indexer didn't pick this up because it's, subjectively, not well-formed EML?</p>
Infrastructure - Task #8775 (In Progress): Make taxonomic rank fields in Solr index non-case-sens...https://redmine.dataone.org/issues/87752019-03-11T22:39:44ZBryce Mecummecum@nceas.ucsb.edu
<p>In the current system, EML documents with taxonomic coverage get indexed into fields such as <code>species</code> if they contain XML such as:</p>
<pre>...snip
<taxonomicCoverage>
<taxonomicClassification>
<taxonRankName>Species</taxonRankName>
<taxonRankValue>Some species</taxonRankValue>
...snip
</pre>
<p>The field values are extracted using the XPath in In <a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml:">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml:</a></p>
<pre>//taxonomicClassification/taxonRankValue[../taxonRankName="Species"]/text()
</pre>
<p>We ran into a case where the <code>taxonRankName</code> had been entered as 'species' instead of 'Species' and we decided that the XPath is too restrictive and that the strictness is needless and surprising. This change should result in a slight but negligible decrease in performance.</p>
<ul>
<li>Change all EML taxonomy fields to also match the lowercase form of each taxonomic rank</li>
<li>Check over other indexing field definitions related to taxonomy to make sure the above change is consistent</li>
</ul>
Infrastructure - Task #8755 (New): Expand EML indexing support for EML 2.2https://redmine.dataone.org/issues/87552018-12-19T00:55:12ZBryce Mecummecum@nceas.ucsb.eduInfrastructure - Task #8754 (New): Add EML 2.2 to CN formats listhttps://redmine.dataone.org/issues/87542018-12-19T00:54:05ZBryce Mecummecum@nceas.ucsb.eduInfrastructure - Task #8753 (New): Add support for EML 2.2 (indexing, view)https://redmine.dataone.org/issues/87532018-12-19T00:43:52ZBryce Mecummecum@nceas.ucsb.edu
<p>EML 2.2 is nearly released and, with it, comes changes to indexing and the view service. The key points are that EML now supports Markdown and semantic annotations but some more stuff was added too. See the in-progress What's New in EML 2.2.0 page <a href="https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md">https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md</a> for details.</p>
<p>I'll track these with sub-tasks but here's an overview:</p>
<p><strong>Formats</strong></p>
<p>We need to expand the list of formats to include EML 2.2.</p>
<p><strong>Indexing</strong></p>
<p>EML 2.2 requires adding new fields and changes to some existing fields. For example, abstracts in EML can now also include a <code>markdown</code> child element which we'll also want to store in Solr. We also want to add in support for semantic annotations and probably structured funding information since that's been requested so much.</p>
<p>Support for semantic annotations will likely use the same approach as our previous work on semantic search where incoming annotations were subject to materialization during the indexing process. To explain what that means, the idea is that, when an annotation for term X comes in, the index processor loads the relevant ontology, materializes the superclass hierarchy for the term, and stores the entire hierarchy in Solr. For EML records with multiple annotations, only the unique set needs to be stored.</p>
<p><strong>View service</strong></p>
<p>As mentioned above, EML 2.2 changes how some existing elements (e.g., abstract) work and adds some new ones (e.g., annotations. The existing EML 2 stylesheets will work for EML 2.2 because 2.2 is backwards compatible but we'll want to extend them to support what's new and changed in EML 2.2</p>
<p>Of note: EML now supports storing Markdown where previously only EML TextType (basically DocBook) was allowed. Our plan on Metacat/MetacatUI to support this is to render the Markdown on the client side which does involve some security concerns (see <a href="https://github.com/NCEAS/metacatui/issues/860">https://github.com/NCEAS/metacatui/issues/860</a> for tracking).</p>
Infrastructure - Bug #8629 (Closed): unable to find valid certificate path to requested target wh...https://redmine.dataone.org/issues/86292018-06-25T20:34:40ZBryce Mecummecum@nceas.ucsb.edu
<p>This bug came from Mark Schildhauer and Margaret O'Brien.</p>
<p>While using Protege to import <a href="https://purl.dataone.org/obo/ENVO_import.owl">https://purl.dataone.org/obo/ENVO_import.owl</a>, the following error pops up:</p>
<pre>sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Full Stack Trace
-----------------------------------------------------------------------------------------
org.semanticweb.owlapi.io.OWLOntologyCreationIOException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:207)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.actualParse(OWLOntologyManagerImpl.java:1099)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1055)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:1011)
at org.protege.editor.owl.model.io.OntologyLoader.loadOntologyInternal(OntologyLoader.java:101)
at org.protege.editor.owl.model.io.OntologyLoader.lambda$loadOntologyInOtherThread$210(OntologyLoader.java:60)
at org.protege.editor.owl.model.io.OntologyLoader$$Lambda$102/1971532877.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1889)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1884)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1883)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1456)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1440)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStreamFromContentEncoding(AbstractOWLParser.java:165)
at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStream(AbstractOWLParser.java:127)
at org.semanticweb.owlapi.io.AbstractOWLParser.getInputSource(AbstractOWLParser.java:232)
at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser.parse(RDFXMLParser.java:72)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:197)
... 10 more
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1937)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1478)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:212)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:969)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:904)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1050)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1363)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1391)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1375)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1512)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1440)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2942)
at java.net.URLConnection.getContentEncoding(URLConnection.java:523)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getContentEncoding(HttpsURLConnectionImpl.java:410)
at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStream(AbstractOWLParser.java:122)
... 13 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
at sun.security.validator.Validator.validate(Validator.java:260)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1460)
... 28 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:145)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:131)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)
... 34 more
</pre>
<p>To reproduce:</p>
<ul>
<li>Open Protege</li>
<li>Open from URL</li>
<li>Paste and open '<a href="https://purl.dataone.org/obo/ENVO_import.owl">https://purl.dataone.org/obo/ENVO_import.owl</a>'</li>
<li>See the stack trace</li>
</ul>
<p>That PURL link redirects to a GitHub raw URL which <em>does not</em> reproduce this error. The version of Protege I'm using makes use of its own version of Java:</p>
<pre>❯ /Applications/Protégé.app/Contents/Plugins/JRE/Contents/Home/jre/bin/java -version
java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
</pre>
<p>A quick Google reveals it could be because Java isn't getting enough of the certificate chain back from the web server but quick run of <a href="https://www.ssllabs.com/ssltest/analyze.html?d=purl.dataone.org">https://www.ssllabs.com/ssltest/analyze.html?d=purl.dataone.org</a> makes everything look in order.</p>
<p>Any ideas?</p>
Infrastructure - Bug #8612 (Closed): Improperly formatted Alternate Data Access URLshttps://redmine.dataone.org/issues/86122018-06-12T22:37:39ZBryce Mecummecum@nceas.ucsb.edu
<p>If I go to</p>
<p><a href="https://search.dataone.org/#view/urn:uuid:8d639e70-55eb-40aa-b1a4-29712fd31b63">https://search.dataone.org/#view/urn:uuid:8d639e70-55eb-40aa-b1a4-29712fd31b63</a></p>
<p>and look at he underlying URL in the Address field, I see:</p>
<p><a href="https://search.dataone.org/%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20http://data.eol.ucar.edu/codiac/dss/id=102.036%0A">https://search.dataone.org/%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20http://data.eol.ucar.edu/codiac/dss/id=102.036%0A</a></p>
<p>which should be </p>
<p><a href="http://data.eol.ucar.edu/codiac/dss/id=102.036%0A">http://data.eol.ucar.edu/codiac/dss/id=102.036%0A</a></p>
<p>Looks like the underlying XSLT isn't going deep enough down the path hierarchy</p>
Infrastructure - Decision #8189 (New): Proposal to change the roles mapped to the origin Solr fie...https://redmine.dataone.org/issues/81892017-10-02T18:04:37ZBryce Mecummecum@nceas.ucsb.edu
<p>While discussing changing the behavior of the origin field in the ISO indexing component (<a href="https://redmine.dataone.org/issues/8165">https://redmine.dataone.org/issues/8165</a>) to make it more selective about where in the document originators are pulled, Matt Jones (over email) suggested we revisit the set of roles as well. Let's do that in this Issue.</p>
<p>The current set of roles mapped to the origin field are:</p>
<ul>
<li><em>originator</em>: party who created the resource</li>
<li><em>author</em>: party who authored the resource</li>
<li><em>owner</em>: party that owns the resource</li>
<li><em>principalInvestigator</em>: key party responsible for gathering information and conducting research</li>
</ul>
<p>This current set of roles may be surprising to some/many users so a possible outcome of this Issue is to greatly improve the content in our search index. This would have impacts on the CN and MNs running Metacat.</p>
<p>Key points:</p>
<ul>
<li>Matt's proposal is to exclude principalInvestigator from this list</li>
<li>The Research Workspace Member Node appears to be using the principalInvestigator role for one or more persons they want in their citation so if we follow Matt's proposal we may need to discuss this with them</li>
<li>I would lobby for only including originator and author but my reading of the definitions is a naïve one</li>
</ul>
<p>I'd like us to have a discussion on this, make the relevant change to the codebase, and then bring the discussion back to the MN operators.</p>
<p>Relevant links:</p>
<ul>
<li><a href="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml">http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml</a> (official? definitions for CI_RoleCode)</li>
<li><a href="https://geo-ide.noaa.gov/wiki/index.php?title=ISO_19115_and_19115-2_CodeList_Dictionaries">https://geo-ide.noaa.gov/wiki/index.php?title=ISO_19115_and_19115-2_CodeList_Dictionaries</a> (NOAA wiki entry for the role codes)</li>
</ul>
Infrastructure - Task #8165 (Closed): Re-factor origin field in isotc211 indexing componenthttps://redmine.dataone.org/issues/81652017-08-29T22:40:30ZBryce Mecummecum@nceas.ucsb.edu
<p>The xpath selectors we use in the origin field in the isotc211 indexing component (bean?) were found to be incorrect for a particular use case and a larger group of us agreed that the usage was incorrect. We should re-visit which xpaths are being used, re-deploy, and re-index the affected content.</p>
<p>I'm pasting in an email chain that initiated the creation of this Issue so we have the full background:</p>
<p>From Chris Turner at Axiom</p>
<blockquote>
<p>Hi Laura and Matt,</p>
<p>Since the launch of the Research Workspace member node, we've no noticed that the dataset citation given at the top of the page doesn't match how we or the PIs would like the official citations to be formatted. There are two issues: selection of contact names for the citation, and appearance of the DOI. </p>
<p>It looks like contact names are being pulled from several parts of the metadata record, the section describing the resource itself (gmd:MD_DataIdentification/gmd:citation/...) and from the section describing associated or aggregated resources (gmd:MD_DataIdentification/gmd:aggregationInfo/...). Here are two examples:</p>
<p>Mary Anne Bishop, Ben Gray, and Scott Pegau. 2017. Fish Predation on Juvenile Herring in Prince William Sound, Alaska, 2009-2012, EVOS Prince William Sound Herring Program. Research Workspace. 10.24431/rw1k1z.</p>
<p>Mary Anne Bishop, Anne Schaefer, Kathy Kuletz, Molly McCammon, Katrina Hoffman, et al. 2017. Fall and Winter Seabird Abundance Data, Prince William Sound, 2007-2017, Gulf Watch Alaska Pelagic Component. Research Workspace. 10.24431/rw1k1w.</p>
<p>In the first exmaple, Scott Pegau is listed in the dataset citation, though in the metadata he is not connected to the dataset but to as the PI for the Herring Program, a larger work referenced in the 'aggregationInfo' section. The second example is the same - McCammon, Hoffman, et al. are pulled from the 'aggregationInfo' element. </p>
<p>Please let me know if I understand correctly how the contacts are being selected for the citation. If I do have it right, is there anything that we can do about it, on our end or in the member node? </p>
<p>The DOI display issue is simpler. DataCite and CrossRef best practices are that DOIs should be displayed as complete URLs, with '<a href="https://doi.org/">https://doi.org/</a>' appearing before the DOI code assigned to a resource. That's how they're formatted in the metadata records, but the URL-esque formatting is stripped out for the citation and for display in the member node.</p>
<p>Can we display the DOI, both in the citation and in the metadata page as a full link?</p>
<p>Please advise on the best way to remedy these. I apologize if this is not the correct venue for this conversation. Please let me know if it makes more sense to continue talking about this on Slack.</p>
<p>Thanks in advance for your help.</p>
<ul>
<li>Chris</li>
</ul>
</blockquote>
<p>From Matt Jones:</p>
<blockquote>
<p>Bryce worked on a new stylesheet for ISO metadata, and so that is scheduled to be released soon. So, any changes would be good to propose even sooner to get them folded in. </p>
<p>If I am interpreting Chris correctly, he's saying that we are indiscriminately pulling Responsible_Party entries regardless of their context in the document, and that it would be best practice to only cite the ResponsibleParty instances that are part of the citatiion in gmd:MD_DataIdentification/gmd:citation/. This makes total sense to me, and is what we do with other metadata standards. So, we need to look into what is causing the behavior, and figure out if it is a stylesheet change, an indexing change, or both. </p>
<p>Matt</p>
</blockquote>
<p>From Chris Turner:</p>
<blockquote>
<p>Hi all, <br>
Matt's understanding is correct. As it is now. ResponsibleParty elements are being pulled in independent of where they appear in the record. Doing as he says and pulling only from gmd:MD_DataIdentification/gmd:citation would go be a good simple fix. We'd also like the pull the contact name from gmd:MD_DataIdentification/gmd:pointOfContact, too.<br>
I'm available the September 5th-8th to chat if we need to talk that week. </p>
<p>From Laura Moyers:</p>
<p>Thanks, Chris! </p>
<p>Matt, do you think what Chris describes would be a stylesheet change that Bryce could incorporate into his current changes? Would we need to talk with other ISO metadata users about this?</p>
<p>Thanks<br>
Laura</p>
</blockquote>
<p>From Matt Jones:</p>
<blockquote>
<p>I suspect it should be straightforward, and Bryce may have already handled it. We were just talking yesterday about getting his stylesheet changes into a Metacat release and deployed on the CN -- its been in a holding pattern for some reason. So, Jing and Bryce are going to work on getting those display improvements pushed out, as they were requested by a number of members. I don't think Chris' proposal would be at all controversial -- our current display is clearly misleading and would be improved, so I think its a clear win. So. I'll cc Bryce and hopefully he can comment on whether and what work would be needed to support proper citation displays for ISO records.</p>
<p>Matt</p>
</blockquote>
<p>From Rob Nahf:</p>
<blockquote>
<p>Will these discussed changes also be reflected in the DataONE solr index? If so, we would likely need to reindex all content of that format after making changes to the parser.<br><br>
(It sounds like there's broad community support for the change, so reindexing probably wouldn't negatively impact anyone...)</p>
</blockquote>
<p>From Bryce Mecum:</p>
<blockquote>
<p>Hey all: Yes, as Matt guesses this is pretty straight forward to fix. The dataset citations in our search and landing pages are powered by our Solr index and we would just need to change the relevant indexing routine and reindex the content. After a quick look at how things are working now, I agree that some change is needed. The information contained in this thread is super helpful so thanks, Chris, for the high level of detail in your original email.</p>
<p>All of that work is on our end and I can coordinate with the DataONE CI team on the changes and we'll let everyone here know when the changes have been made.</p>
</blockquote>
<p>The current set of XPaths can be found at <a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-isotc211-base.xml">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-isotc211-base.xml</a> which, at the time of writing this, have this bean set for the origin field:<br>
<br>
<br>
<br>
<br>
<br>
<br>
</p>
<p>The sub-tasks here are:</p>
<ul>
<li>Figure out what <em>should</em> go in there instead</li>
<li>Probably consult some folks for confirmation</li>
<li>Update the Bean</li>
<li>Re-index affected documents once change has been reployed</li>
</ul>
Infrastructure - Story #7859 (New): Add formatID for the STL 3d model file formathttps://redmine.dataone.org/issues/78592016-08-04T19:02:58ZBryce Mecummecum@nceas.ucsb.edu
<p>The STL file format is a domain standard file format for storing 3d models and is the most common way I've managed 3d models used while 3d printing. Given that 3d printing is seeing increased usage in the sciences, I would say this is a good candidate for inclusion in the controlled list of format ids.</p>
<p>Type: DATA<br>
Id: STL<br>
Name: StereoLithography File Format<br>
Media type: application/sla (unofficial)<br>
Extension: .stl</p>
<p>There is an ASCII form and a Binary form of this format. They don't see to be distinguished according to any standard. What do we do in this case?</p>
<p>References: <br>
- <a href="https://en.wikipedia.org/wiki/STL_(file_format)">https://en.wikipedia.org/wiki/STL_(file_format)</a><br>
- <a href="https://reference.wolfram.com/language/ref/format/STL.html">https://reference.wolfram.com/language/ref/format/STL.html</a></p>
Infrastructure - Story #7668 (New): Determine how indexing of data packages should workhttps://redmine.dataone.org/issues/76682016-03-02T00:16:25ZBryce Mecummecum@nceas.ucsb.edu
<p>I've discovered (with Lauren's help) a strange requirement for how the resource maps for nested data packages have to be written. In order to get nested data packages correctly indexed in Solr so that the 'resourceMap' field of the resource map being nested is set to the parent resource map's PID, you have to create the appropriate set of @cito:documents@ statements in addition to the expected @ore:aggregates@ statements.</p>
<p>I expected the following to be sufficient (pardon the highly abstracted RDF, examples are linked below):</p>
<p>parent_resource_map#aggregation ore:aggregates child_resource_map<br>
parent_resource_map#aggregation ore:aggregates metadata_object</p>
<p>but I also had to add a @cito:documents@ statement between the <em>parent resource map's metadata object</em> and the resource maps being nested</p>
<p>parent_resource_map#aggregation ore:aggregates child_resource_map<br>
parent_resource_map#aggregation ore:aggregates metadata_object</p>
<p>parent_metadata_object cito:documents child_resource_map</p>
<p>The documentation does not suggest this and I found it confusing. A real life example of what I expected to work is here: <a href="https://gist.github.com/amoeba/c7a6ba269c5a1f78db1d">https://gist.github.com/amoeba/c7a6ba269c5a1f78db1d</a><br>
What I actually had to insert is here: <a href="https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/resourceMap_urn:uuid:ab17b047-a341-4d06-b433-92eed90dacec">https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/resourceMap_urn:uuid:ab17b047-a341-4d06-b433-92eed90dacec</a></p>
<p>Is the need for the @cito:documents@ statement(s) really required and is this the intended behavior? I've made this issue in the hopes we can talk about it.</p>
<p>I suggest updating the API docs with whatever we decide, and hopefully that update will include example RDF for a nested data package.</p>
Infrastructure - Task #7466 (In Progress): Some objects not accessible on the CN via REST APIhttps://redmine.dataone.org/issues/74662015-11-04T18:41:38ZBryce Mecummecum@nceas.ucsb.edu
<p>While doing other work, I noticed that a good number (not sure how many) of objects listed on the CN's Solr index are not accessible via the REST API get() and resolve() methods. Instead of returning the object, they return a NotFound error. </p>
<p>To reproduce,</p>
<ol>
<li>Visit <a href="https://cn.dataone.org/cn/v1/query/solr/?fl=identifier,title,authoritativeMN,datasource&q=formatType:METADATA+AND+-obsoletedBy:*&rows=100&start=0">https://cn.dataone.org/cn/v1/query/solr/?fl=identifier,title,authoritativeMN,datasource&q=formatType:METADATA+AND+-obsoletedBy:*&rows=100&start=0</a></li>
<li>Pick a PID from the query result, e.g.</li>
</ol>
<ul>
<li>knb-lter-cap.148.9</li>
<li>CLOEBDMETADATA.10242013.1</li>
</ul>
<ol>
<li>Attempt to resolve() or get() the object via the REST API like: <a href="https://cn.dataone.org/cn/v1/object/CLOEBDMETADATA.10242013.1">https://cn.dataone.org/cn/v1/object/CLOEBDMETADATA.10242013.1</a></li>
<li>Receive a NotFound error instead of the object.</li>
</ol>
<p>Notes:</p>
<p>In IRC, Skye noticed that the objects can be retrieved via their respective MN so it appears this issue may be a Metacat replication issue.</p>