DataONE Tasks: Issues | https://redmine.dataone.org/ | 2020-02-05T20:02:12Z
Infrastructure - Task #8858 (New): Update CN Apache configs in version control with directives to... | https://redmine.dataone.org/issues/8858 | 2020-02-05T20:02:12Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Sitemaps are located on disk in ${tomcat_webapps_dir}/${context}/sitemaps as <code>sitemap_index.xml</code> and <code>sitemap%d.xml</code> (for each sub-sitemap).</p>
<p>The rule we've come up with is:</p>
<p><code>RewriteRule ^/(sitemap.+) /metacat/sitemaps/$1 [R=303]</code></p>
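<p>To make the effect of the rule concrete, here is a minimal sketch that replays the same pattern in Python. It assumes the rule is evaluated in server or virtual-host context (where the matched URL path keeps its leading slash) and that the context is <code>metacat</code>, as hard-coded in the rule. The <code>rewrite</code> helper is hypothetical and only mimics the substitution; the actual redirect is issued by Apache.</p>

```python
import re

# Python stand-in for the Apache pattern. mod_rewrite uses PCRE, but a
# pattern this simple behaves the same way in Python's re module.
SITEMAP_RULE = re.compile(r"^/(sitemap.+)")


def rewrite(path):
    """Return the redirect target for a request path, or None if the rule does not match."""
    match = SITEMAP_RULE.match(path)
    if match is None:
        return None
    # $1 in the RewriteRule corresponds to the first capture group.
    return "/metacat/sitemaps/" + match.group(1)


if __name__ == "__main__":
    for path in ("/sitemap_index.xml", "/sitemap1.xml", "/metacat/sitemaps/sitemap1.xml"):
        print(path, "->", rewrite(path))
```

<p>Because the pattern is anchored at <code>^/sitemap</code>, the redirected path under <code>/metacat/sitemaps/</code> does not match again, so the rule should not redirect in a loop.</p>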
Infrastructure - Task #8775 (In Progress): Make taxonomic rank fields in Solr index non-case-sens... | https://redmine.dataone.org/issues/8775 | 2019-03-11T22:39:44Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>In the current system, EML documents with taxonomic coverage get indexed into fields such as <code>species</code> if they contain XML such as:</p>
<pre>...snip
<taxonomicCoverage>
<taxonomicClassification>
<taxonRankName>Species</taxonRankName>
<taxonRankValue>Some species</taxonRankValue>
...snip
</pre>
<p>The field values are extracted using the XPath in <a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml</a>:</p>
<pre>//taxonomicClassification/taxonRankValue[../taxonRankName="Species"]/text()
</pre>
<p>We ran into a case where the <code>taxonRankName</code> had been entered as 'species' instead of 'Species'. We decided that the XPath is too restrictive and that its case-sensitive match is needless and surprising. Relaxing it should cost only a negligible amount of indexing performance.</p>
<ul>
<li>Change all EML taxonomy fields to also match the lowercase form of each taxonomic rank</li>
<li>Check over other indexing field definitions related to taxonomy to make sure the above change is consistent</li>
</ul>
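<p>As a rough illustration of the intended matching behavior, the sketch below extracts rank values while ignoring the case of <code>taxonRankName</code>. It is not the index processor's code: it uses Python's standard library rather than the XPath in the Spring context file, the sample document is made up, and <code>rank_values</code> is a hypothetical helper. It also goes slightly further than the first bullet by matching any casing, not only the all-lowercase form. In XPath 1.0 itself, a comparable effect is usually achieved with <code>translate()</code> to lowercase the rank name before comparing.</p>

```python
import xml.etree.ElementTree as ET

# Made-up sample input: one rank name in lowercase, one capitalized.
EML_SNIPPET = """
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>species</taxonRankName>
    <taxonRankValue>Some species</taxonRankValue>
  </taxonomicClassification>
  <taxonomicClassification>
    <taxonRankName>Genus</taxonRankName>
    <taxonRankValue>Some genus</taxonRankValue>
  </taxonomicClassification>
</taxonomicCoverage>
"""


def rank_values(root, rank):
    """Collect taxonRankValue text where taxonRankName equals `rank`, ignoring case."""
    values = []
    for classification in root.iter("taxonomicClassification"):
        name = classification.findtext("taxonRankName", default="")
        value = classification.findtext("taxonRankValue")
        if value is not None and name.lower() == rank.lower():
            values.append(value)
    return values


root = ET.fromstring(EML_SNIPPET)
species = rank_values(root, "Species")
genus = rank_values(root, "genus")
```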
Infrastructure - Task #8755 (New): Expand EML indexing support for EML 2.2 | https://redmine.dataone.org/issues/8755 | 2018-12-19T00:55:12Z | Bryce Mecum (mecum@nceas.ucsb.edu)
Infrastructure - Task #8754 (New): Add EML 2.2 to CN formats list | https://redmine.dataone.org/issues/8754 | 2018-12-19T00:54:05Z | Bryce Mecum (mecum@nceas.ucsb.edu)
Infrastructure - Task #8753 (New): Add support for EML 2.2 (indexing, view) | https://redmine.dataone.org/issues/8753 | 2018-12-19T00:43:52Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>EML 2.2 is nearly released and, with it, come changes to indexing and the view service. The key additions are support for Markdown and semantic annotations, along with a number of smaller changes. See the in-progress What's New in EML 2.2.0 page <a href="https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md">https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md</a> for details.</p>
<p>I'll track these with sub-tasks but here's an overview:</p>
<p><strong>Formats</strong></p>
<p>We need to expand the list of formats to include EML 2.2.</p>
<p><strong>Indexing</strong></p>
<p>EML 2.2 requires adding new fields and changing some existing ones. For example, abstracts in EML can now also include a <code>markdown</code> child element, which we'll also want to store in Solr. We also want to add support for semantic annotations and probably structured funding information, since that has been requested often.</p>
<p>Support for semantic annotations will likely use the same approach as our previous work on semantic search where incoming annotations were subject to materialization during the indexing process. To explain what that means, the idea is that, when an annotation for term X comes in, the index processor loads the relevant ontology, materializes the superclass hierarchy for the term, and stores the entire hierarchy in Solr. For EML records with multiple annotations, only the unique set needs to be stored.</p>
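<p>The materialization step described above can be sketched as follows. This is only an illustration under stated assumptions: the ontology is a made-up in-memory mapping from each term to its direct superclasses (the real index processor would load an actual ontology), the <code>ex:</code> terms and both function names are hypothetical, and the sketch assumes the annotated term itself is stored along with its superclasses.</p>

```python
# Hypothetical toy ontology: each term maps to its direct superclasses.
SUPERCLASSES = {
    "ex:SeaSurfaceTemperature": ["ex:WaterTemperature"],
    "ex:WaterTemperature": ["ex:Temperature"],
    "ex:Temperature": ["ex:PhysicalProperty"],
    "ex:PhysicalProperty": [],
    "ex:Salinity": ["ex:PhysicalProperty"],
}


def materialize(term, superclasses):
    """Return the term plus all of its transitive superclasses."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(superclasses.get(current, []))
    return seen


def terms_to_index(annotations, superclasses):
    """Union the materialized hierarchies so each term is stored only once."""
    unique = set()
    for term in annotations:
        unique |= materialize(term, superclasses)
    return sorted(unique)


indexed = terms_to_index(["ex:SeaSurfaceTemperature", "ex:Salinity"], SUPERCLASSES)
```

<p>Here both annotations share the superclass <code>ex:PhysicalProperty</code>, which ends up in the result once, matching the point that only the unique set needs to be stored.</p>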
<p><strong>View service</strong></p>
<p>As mentioned above, EML 2.2 changes how some existing elements (e.g., abstract) work and adds some new ones (e.g., annotations). The existing EML 2 stylesheets will work for EML 2.2 because 2.2 is backwards compatible, but we'll want to extend them to support what's new and changed in EML 2.2.</p>
<p>Of note: EML now supports storing Markdown where previously only EML TextType (basically DocBook) was allowed. Our plan for supporting this in Metacat/MetacatUI is to render the Markdown on the client side, which does raise some security concerns (see <a href="https://github.com/NCEAS/metacatui/issues/860">https://github.com/NCEAS/metacatui/issues/860</a> for tracking).</p>
Search UI - Story #8574 (New): PANGAEA Temporary Fix: SID only in Data Citation | https://redmine.dataone.org/issues/8574 | 2018-04-30T13:26:15Z | Monica Ihli (email@monicaihli.com)
<p>Adjust appearance of data citation in the case of formatid <a href="http://www.isotc211.org/2005/gmd-pangaea">http://www.isotc211.org/2005/gmd-pangaea</a> such that the SID alone appears in the data citation and not the PID. This is intended as a temporary fix to be in place only while a longer term strategy for customizing appearances of data citations is designed and implemented. Once that longer term strategy is in place, whatever temporary code level changes were implemented to accommodate PANGAEA should be removed.</p>
Infrastructure - Decision #8189 (New): Proposal to change the roles mapped to the origin Solr fie... | https://redmine.dataone.org/issues/8189 | 2017-10-02T18:04:37Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>While discussing changing the behavior of the origin field in the ISO indexing component (<a href="https://redmine.dataone.org/issues/8165">https://redmine.dataone.org/issues/8165</a>) to make it more selective about where in the document originators are pulled from, Matt Jones (over email) suggested we revisit the set of roles as well. Let's do that in this Issue.</p>
<p>The current set of roles mapped to the origin field are:</p>
<ul>
<li><em>originator</em>: party who created the resource</li>
<li><em>author</em>: party who authored the resource</li>
<li><em>owner</em>: party that owns the resource</li>
<li><em>principalInvestigator</em>: key party responsible for gathering information and conducting research</li>
</ul>
<p>The current set of roles may be surprising to many users, so a possible outcome of this Issue is a much improved search index. Any change would affect the CN and the MNs running Metacat.</p>
<p>Key points:</p>
<ul>
<li>Matt's proposal is to exclude principalInvestigator from this list</li>
<li>The Research Workspace Member Node appears to be using the principalInvestigator role for one or more persons they want in their citation so if we follow Matt's proposal we may need to discuss this with them</li>
<li>I would lobby for only including originator and author but my reading of the definitions is a naïve one</li>
</ul>
<p>I'd like us to have a discussion on this, make the relevant change to the codebase, and then bring the discussion back to the MN operators.</p>
<p>Relevant links:</p>
<ul>
<li><a href="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml">http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml</a> (official? definitions for CI_RoleCode)</li>
<li><a href="https://geo-ide.noaa.gov/wiki/index.php?title=ISO_19115_and_19115-2_CodeList_Dictionaries">https://geo-ide.noaa.gov/wiki/index.php?title=ISO_19115_and_19115-2_CodeList_Dictionaries</a> (NOAA wiki entry for the role codes)</li>
</ul>
Search UI - Story #7754 (In Progress): Support for XSL transform of various metadata formats | https://redmine.dataone.org/issues/7754 | 2016-04-27T14:43:23Z | Dave Vieglais (dave.vieglais@gmail.com)
<p>Currently the DCX and ISO metadata formats are rendered in the view service using Solr output rather than a transform of the XML metadata. This results in a less than satisfactory rendering.</p>
<p>The goal of this story is to implement XSLT for the metadata formats that currently rely on the Solr-only rendering.</p>