Task #5938

MNDeployment #3708: Minnesota Population Center

Task #5921: MPC: Testing

Task #5922: MPC: Registration in environment

Task #5933: MPC: Content Review

Task #5937: MPC: Verify Science Metadata

MPC: Verify Science Metadata content

Added by Laura Moyers over 9 years ago. Updated about 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 2014-07-18
Due date:
% Done: 100%
Estimated time: 0.00 h
Story Points:
Sprint:

Description

Verify that the Science Metadata is present, complete and correctly formatted.
I.e., can be opened in the applications that are designed to handle the formats.

ipumsi_6-3_at_2001.dc.xml - Example MPC Dublin Core Metadata (9.66 KB) Chris Jones, 2014-08-27 15:36

History

#1 Updated by Laura Moyers over 9 years ago

  • Target version changed from Deploy by end of Y5Q4 to Deploy by end of Y1Q1

#2 Updated by Chris Jones over 9 years ago

Thus far the DC-based metadata looks good, but one issue is the content of the abstract element (see the attached example file).
The abstract contains a list of variables collected within the dataset, as opposed to a prose description of the dataset, collection methods, conclusions, etc.:

<dc:abstract>Record type; Country; Year; IPUMS sample identifier; Household serial number; Number of person records in the household;
Household weight; Subsample number; Donated household; Group quarters status; Number of unrelated persons;
Continent and region of country; 1st subnational geographic level, world [consistent boundaries over time];
NUTS1 Region, Europe; NUTS2 Region, Europe; ...
</dc:abstract>

The content of the abstract should contain "A summary of the resource." This list may technically be a summary, but differs from most abstracts. I'll assign this to Bruce in order to make a decision on this with Tracy, Wendy, Fabio, and others.
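A rough way to flag abstracts of this kind across many records is to compare the number of semicolons with the number of periods in the text. The sketch below is only a heuristic: the function name looks_like_term_list and the threshold of 5 semicolons are assumptions made for illustration, not part of any DataONE or MPC tooling.

```shell
#!/bin/bash

# Heuristic (hypothetical): succeed when the text reads like a
# semicolon-delimited term list rather than a prose summary.
# Assumption: 5 or more semicolons, outnumbering periods, indicates a list.
looks_like_term_list() {
    local text="$1"
    # Strip every character except ";" (or "."), then measure what is left.
    local only_semicolons="${text//[^;]/}"
    local only_periods="${text//[^.]/}"
    [ "${#only_semicolons}" -ge 5 ] && [ "${#only_semicolons}" -gt "${#only_periods}" ]
}

# Example call with an abstract taken from the attached file's pattern.
if looks_like_term_list "Record type; Country; Year; IPUMS sample identifier; Household serial number; Household weight"; then
    echo "abstract looks like a variable list"
fi
```

A flagged record still needs a human decision; the check only narrows down which abstracts to read.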

#3 Updated by Bruce Wilson over 9 years ago

Per discussion with Wendy 28-aug-14, they've regenerated the metadata to put the list of terms into dc:tableOfContents and the contents of dcit:abstract are now in dc:abstract.

#4 Updated by Bruce Wilson over 9 years ago

This also includes making sure that the correct formatId is specified in systemMetadata for the metadata. Since MPC is using QualifiedDC, the correct format ID is http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
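One way to spot-check the formatId is to pull it out of a locally saved system metadata document and compare it with the expected value. This is a minimal sketch: it assumes the document carries a single unprefixed <formatId> element on one line, and the function names extract_format_id and check_format_id are hypothetical helpers, not DataONE commands.

```shell
#!/bin/bash

# Expected formatId for Qualified Dublin Core metadata, as stated in this ticket.
expected_format_id="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"

# Print the text of the first <formatId> element found in the given file.
# Assumption: the element is unprefixed and sits on a single line.
extract_format_id() {
    sed -n 's|.*<formatId>\(.*\)</formatId>.*|\1|p' "$1" | head -n 1
}

# Return 0 if the file's formatId matches the expected value, 1 otherwise.
check_format_id() {
    local actual
    actual=$(extract_format_id "$1")
    if [ "${actual}" = "${expected_format_id}" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (found '${actual}')"
        return 1
    fi
}
```

Calling check_format_id on each saved system metadata file in a loop would list any record still registered under a different format.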

#5 Updated by Chris Jones over 9 years ago

  • Assignee changed from Bruce Wilson to Chris Jones
  • Remaining hours set to 0.0
  • Status changed from In Progress to Closed

I've validated all of the content against the schema:

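#!/bin/bash
# Download every Qualified Dublin Core science metadata object from the MPC test member node.

baseurl="https://dataone-test.pop.umn.edu/mn"
formatId="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"

# List the objects with this formatId; the URL is quoted so the shell does not glob the "?".
list=$(curl -s "${baseurl}/v1/object?formatId=${formatId}")
# Extract one identifier per line with xmlstarlet (invoked here as "xml").
identifiers=$(xml sel -t -m "//objectInfo" -v "identifier" -n <<< "${list}")

mkdir -p ./scimeta

# Identifiers are assumed to contain no whitespace or "/" characters.
for id in ${identifiers}; do
    echo "${id}"
    curl -s -o "./scimeta/${id}" "${baseurl}/v1/object/${id}"
done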

and

#!/bin/bash
# Fetch the Qualified Dublin Core schema and the schemas it imports.

curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dc.xsd
curl -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcmitype.xsd

# Validate each downloaded file against the schema.
# Note: --valid also requests DTD validation, in addition to the --schema check.
for file in ./scimeta/*; do
    xmllint --noout --valid --schema ./qualifieddc.xsd "${file}"
done

xmllint complains about not having a DTD (a side effect of the --valid flag, which requests DTD validation), but the files are schema valid. Looks good.
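With many files, the DTD complaints can bury the schema results. The sketch below tallies the schema outcomes from a saved log, assuming the xmllint output of the validation loop above was captured to a file (for example by appending 2> validation.log to the loop) and that xmllint reports each file with a line ending in "validates" or "fails to validate". The function name summarize_xmllint_log and the log file name are hypothetical.

```shell
#!/bin/bash

# Summarize schema validation results from a captured xmllint log.
# Assumption: each validated file yields one line ending in " validates"
# or " fails to validate"; DTD-related lines are ignored.
summarize_xmllint_log() {
    local log="$1"
    local passed failed
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    passed=$(grep -c ' validates$' "${log}" || true)
    failed=$(grep -c ' fails to validate$' "${log}" || true)
    echo "schema-valid: ${passed}, schema-invalid: ${failed}"
    # Succeed only when no file failed schema validation.
    [ "${failed}" -eq 0 ]
}
```

The function's exit status makes it usable as a gate in a larger script, while the printed counts give a quick overview.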

#6 Updated by Laura Moyers over 9 years ago

  • Target version changed from Deploy by end of Y1Q1 to Deploy by end of NCTE

#7 Updated by Laura Moyers about 9 years ago

  • Target version changed from Deploy by end of NCTE to Operational

#8 Updated by Laura Moyers about 9 years ago

  • Estimated time set to 0.00
  • % Done changed from 0 to 100
