Task #5938

MNDeployment #3708: Minnesota Population Center

Task #5921: MPC: Testing

Task #5922: MPC: Registration in environment

Task #5933: MPC: Content Review

Task #5937: MPC: Verify Science Metadata

MPC: Verify Science Metadata content

Added by Laura Moyers almost 10 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 2014-07-18
Due date:
% Done: 100%
Estimated time: 0.00 h
Story Points:
Sprint:

Description

Verify that the Science Metadata is present, complete, and correctly formatted,
i.e., that it can be opened in the applications designed to handle the formats.

ipumsi_6-3_at_2001.dc.xml - Example MPC Dublin Core Metadata (9.66 KB) Chris Jones, 2014-08-27 15:36

History

#1 Updated by Laura Moyers almost 10 years ago

  • Target version changed from Deploy by end of Y5Q4 to Deploy by end of Y1Q1

#2 Updated by Chris Jones almost 10 years ago

Thus far the DC-based metadata looks good, but one issue is the content of the abstract element (see the attached example file).
The abstract contains a list of variables collected within the dataset, as opposed to a prose description of the dataset, collection methods, conclusions, etc.:

<dc:abstract>Record type; Country; Year; IPUMS sample identifier; Household serial number; Number of person records in the household;
Household weight; Subsample number; Donated household; Group quarters status; Number of unrelated persons;
Continent and region of country; 1st subnational geographic level, world [consistent boundaries over time];
NUTS1 Region, Europe; NUTS2 Region, Europe; ...
</dc:abstract>

Per the Dublin Core definition, the abstract should contain "A summary of the resource." This list may technically be a summary, but it differs from most abstracts. I'll assign this to Bruce so that he can make a decision on it with Tracy, Wendy, Fabio, and others.
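
A rough way to flag abstracts like the one above across many files is sketched below. It is only a heuristic, not anything MPC or DataONE uses: the function name classify_abstract and the default threshold of 5 semicolons are assumptions made up for illustration. It labels text as "list" when it has at least that many semicolons and more semicolons than periods.

```shell
#!/bin/bash

# Hypothetical heuristic (not part of any DataONE tooling):
# print "list" when the text looks like a semicolon-separated term list,
# otherwise print "prose".
#   $1 = abstract text
#   $2 = minimum number of semicolons to count as a list (assumed default: 5)
classify_abstract() {
    local text="$1"
    local min_semicolons="${2:-5}"
    # Strip every character except ';' (or '.') and measure what is left.
    local only_semis="${text//[^;]/}"
    local only_periods="${text//[^.]/}"
    local semis=${#only_semis}
    local periods=${#only_periods}
    if (( semis >= min_semicolons && semis > periods )); then
        echo "list"
    else
        echo "prose"
    fi
}
```

The abstract text could be pulled from a file with something like xmllint --xpath "string(//*[local-name()='abstract'])" FILE and passed as the first argument. Anything flagged "list" still needs a human look.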

#3 Updated by Bruce Wilson almost 10 years ago

Per discussion with Wendy 28-aug-14, they've regenerated the metadata to put the list of terms into dc:tableOfContents and the contents of dcit:abstract are now in dc:abstract.

#4 Updated by Bruce Wilson almost 10 years ago

This also includes making sure that the correct formatId is specified in systemMetadata for the metadata. Since MPC is using QualifiedDC, the correct format ID is http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
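
The formatId check could be scripted along these lines. This is a minimal sketch, not the check that was actually run for this ticket: the function names are hypothetical, and it assumes the system metadata carries an unprefixed formatId element on a single line.

```shell
#!/bin/bash

# The QualifiedDC format ID named in this ticket.
expected_format_id="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"

# Print the text of the first <formatId> element read from stdin.
# Assumption: the element is unprefixed and opens and closes on one line.
extract_format_id() {
    sed -n 's:.*<formatId>\(.*\)</formatId>.*:\1:p' | head -n 1
}

# Return 0 when the system metadata on stdin declares the expected formatId.
has_expected_format_id() {
    [ "$(extract_format_id)" = "${expected_format_id}" ]
}
```

Usage, assuming the member node serves system metadata at the DataONE v1 meta endpoint:

    curl -s "${baseurl}/v1/meta/${id}" | has_expected_format_id || echo "unexpected formatId: ${id}"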

#5 Updated by Chris Jones almost 10 years ago

  • Assignee changed from Bruce Wilson to Chris Jones
  • Remaining hours set to 0.0
  • Status changed from In Progress to Closed

I've validated all of the content against the schema:

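#!/bin/bash
# List every object with the QualifiedDC formatId on the member node and download each one.
# Requires curl and xmlstarlet (installed as "xml" on this system).

baseurl="https://dataone-test.pop.umn.edu/mn"
formatId="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"

# -G with --data-urlencode lets curl URL-encode the formatId query parameter.
list=$(curl -s -G --data-urlencode "formatId=${formatId}" "${baseurl}/v1/object")
identifiers=$(xml sel -t -m "//objectInfo" -v "identifier" -n <<< "${list}")

# -p avoids an error when the directory already exists.
mkdir -p ./scimeta

for id in ${identifiers}; do
    echo "${id}"
    curl -s -o "./scimeta/${id}" "${baseurl}/v1/object/${id}"
done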

and

#!/bin/bash
# Fetch the QualifiedDC schema and the schemas it imports, then validate each downloaded file.

curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dc.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcmitype.xsd

# --valid also requests DTD validation; --schema performs the XSD check.
for file in ./scimeta/*; do
    xmllint --noout --valid --schema ./qualifieddc.xsd "${file}"
done

xmllint complains about not having a DTD (the --valid option requests DTD validation, and these files declare none), but the files are schema valid. Looks good.

#6 Updated by Laura Moyers over 9 years ago

  • Target version changed from Deploy by end of Y1Q1 to Deploy by end of NCTE

#7 Updated by Laura Moyers over 9 years ago

  • Target version changed from Deploy by end of NCTE to Operational

#8 Updated by Laura Moyers over 9 years ago

  • Estimated time set to 0.00
  • % Done changed from 0 to 100
