Story #8504: Support creation of data citation record from solr record - Infrastructure - DataONE Tasks

Story #8504

Support creation of data citation record from solr record

Added by Dave Vieglais almost 7 years ago. Updated over 6 years ago.

Status:

New

Priority:

Normal

Assignee:

Monica Ihli

Category:

d1_indexer

Target version:

CCI-2.4.0

Start date:

2018-03-19

Due date:

% Done:

Story Points:

Sprint:

Infrastructure backlog

Description

The goal of this story is to ensure that elements in the Solr search schema are available and appropriately populated to support generation of DataCite version 4.x or later records.

Supporting this schema also makes it possible to assert that suitable citation metadata can be provided in landing pages and other renderings of content provided by DataONE.

Resources:

Subtasks

Related issues

History

#1 Updated by Dave Vieglais almost 7 years ago

Blocks Decision #8189: Proposal to change the roles mapped to the origin Solr field for ISO docs added

#2 Updated by Monica Ihli almost 7 years ago

Procedural notes:

Evaluate DataCite v.4+ requirements.
Identify currently indexed values for each formatID.
Identify and address any gaps between DataCite requirements and currently indexed elements.
Ensure that the citations in search.dataone.org display appropriately.

Also:

Start with EML & ISO
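
The gap check described in the procedural notes could be sketched roughly as follows. This is only an illustration: the six property names are the DataCite 4.x mandatory properties, but the mapping to Solr field names (ASSUMED_FIELD_MAP) and the helper find_gaps are hypothetical and would need to be confirmed against the actual index schema for each formatID.

```python
# DataCite 4.x mandatory properties.
DATACITE_MANDATORY = [
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
]

# ASSUMPTION: illustrative mapping from DataCite property to Solr field.
# These field choices are not taken from the ticket and must be verified.
ASSUMED_FIELD_MAP = {
    "identifier": "id",
    "creators": "origin",
    "titles": "title",
    "publisher": "datasource",
    "publicationYear": "pubDate",
    "resourceType": "formatType",
}


def find_gaps(solr_doc):
    """Return the DataCite properties whose mapped Solr field is absent or empty."""
    gaps = []
    for prop in DATACITE_MANDATORY:
        value = solr_doc.get(ASSUMED_FIELD_MAP[prop])
        if value in (None, "", []):
            gaps.append(prop)
    return gaps


# Made-up record standing in for one Solr result document.
example_doc = {
    "id": "example-pid",
    "title": "Example dataset",
    "origin": [],
    "pubDate": "2018-03-19T00:00:00Z",
}
print(find_gaps(example_doc))
```

Running the same check over a sample of records per formatID would give a rough list of which elements need indexing work first.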

#3 Updated by Bryce Mecum over 6 years ago

Just a note, Chris Turner has been waiting on progress on this since August 24, 2017. When/if any progress gets made w/r/t ISO citations, it'd be good to email him to let him know what's changed and how it affects the RW node.

#4 Updated by Bryce Mecum over 6 years ago

I re-read what I just wrote above and want to add two things:

I was initially assigned and didn't get this done, so the tardiness is largely my fault.
Chris Turner is specifically looking for how ISO* format documents get turned into citations and, even more specifically, how we populate the origin Solr field.

#5 Updated by Dave Vieglais over 6 years ago

Metadata extraction and index population rules are documented at:

indexer-documentation.readthedocs.io

with the ISO TC211 rules under http://indexer-documentation.readthedocs.io/en/latest/generated/proc_isotc211Subprocessor.html

It looks like the Solr index origin field is populated with:

//gmd:identificationInfo/gmd:MD_DataIdentification/
  gmd:citation/gmd:CI_Citation/
  gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[
    gmd:role/gmd:CI_RoleCode/text() = "owner" or
    gmd:role/gmd:CI_RoleCode/text() = "originator" or
    gmd:role/gmd:CI_RoleCode/text() = "principalInvestigator" or
    gmd:role/gmd:CI_RoleCode/text() = "author"
  ]/gmd:individualName/*/text()
|
//gmd:identificationInfo/gmd:MD_DataIdentification/
  gmd:citation/gmd:CI_Citation/
  gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[
    (gmd:role/gmd:CI_RoleCode/text() = "owner" or
     gmd:role/gmd:CI_RoleCode/text() = "originator" or
     gmd:role/gmd:CI_RoleCode/text() = "principalInvestigator" or
     gmd:role/gmd:CI_RoleCode/text() = "author") and
    (not(gmd:individualName) or
     gmd:individualName[@gco:nilReason = "missing"])
  ]/gmd:organisationName/*/text()
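
To make the rule easier to reason about, here is a minimal Python sketch that mirrors its logic using only the standard library. It does not evaluate the XPath itself (xml.etree.ElementTree supports only a subset of XPath), and it is not the indexer's implementation. The function names, the sample document, and the names in it are made up for illustration. It approximates */text() with the leading text of each child element.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
ORIGIN_ROLES = {"owner", "originator", "principalInvestigator", "author"}
PARTY_PATH = (
    ".//gmd:identificationInfo/gmd:MD_DataIdentification/"
    "gmd:citation/gmd:CI_Citation/"
    "gmd:citedResponsibleParty/gmd:CI_ResponsibleParty"
)
NIL_REASON = "{%s}nilReason" % NS["gco"]


def child_texts(party, tag):
    """Approximate <tag>/*/text(): leading text of each child of each <tag>."""
    return [c.text for el in party.findall(tag, NS) for c in el if c.text]


def extract_origin(xml_string):
    """Hypothetical helper: collect origin values the way the rule describes."""
    root = ET.fromstring(xml_string)
    origins = []
    for party in root.iterfind(PARTY_PATH, NS):
        roles = {r.text for r in party.findall("gmd:role/gmd:CI_RoleCode", NS)}
        if not roles & ORIGIN_ROLES:
            continue  # role is not one of the four mapped roles
        names = party.findall("gmd:individualName", NS)
        # First branch: individual names of matching parties.
        origins.extend(child_texts(party, "gmd:individualName"))
        # Second branch: organisation name when the individual name is
        # absent or flagged with gco:nilReason="missing".
        if not names or any(n.get(NIL_REASON) == "missing" for n in names):
            origins.extend(child_texts(party, "gmd:organisationName"))
    return origins


def party(role, person=None, org=None):
    """Build one made-up citedResponsibleParty fragment for the sample."""
    parts = []
    if person:
        parts.append("<gmd:individualName><gco:CharacterString>%s"
                     "</gco:CharacterString></gmd:individualName>" % person)
    if org:
        parts.append("<gmd:organisationName><gco:CharacterString>%s"
                     "</gco:CharacterString></gmd:organisationName>" % org)
    parts.append("<gmd:role><gmd:CI_RoleCode>%s</gmd:CI_RoleCode></gmd:role>" % role)
    return ("<gmd:citedResponsibleParty><gmd:CI_ResponsibleParty>%s"
            "</gmd:CI_ResponsibleParty></gmd:citedResponsibleParty>" % "".join(parts))


SAMPLE = (
    '<gmd:MD_Metadata xmlns:gmd="%(gmd)s" xmlns:gco="%(gco)s">'
    "<gmd:identificationInfo><gmd:MD_DataIdentification>"
    "<gmd:citation><gmd:CI_Citation>" % NS
    + party("author", person="Jane Doe", org="Example Lab")
    + party("originator", org="Example Survey")
    + party("publisher", person="John Roe")
    + "</gmd:CI_Citation></gmd:citation>"
    "</gmd:MD_DataIdentification></gmd:identificationInfo>"
    "</gmd:MD_Metadata>"
)

print(extract_origin(SAMPLE))
```

In the sample, the author contributes an individual name, the originator has no individual name so its organisation name is used, and the publisher is skipped because that role is not in the list.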
