Task #5938
MNDeployment #3708: Minnesota Population Center
Task #5921: MPC: Testing
Task #5922: MPC: Registration in environment
Task #5933: MPC: Content Review
Task #5937: MPC: Verify Science Metadata
MPC: Verify Science Metadata content
100%
Description
Verify that the Science Metadata is present, complete and correctly formatted.
I.e., can be opened in the applications that are designed to handle the formats.
History
#1 Updated by Laura Moyers over 10 years ago
- Target version changed from Deploy by end of Y5Q4 to Deploy by end of Y1Q1
#2 Updated by Chris Jones over 10 years ago
- File ipumsi_6-3_at_2001.dc.xml added
- Status changed from New to In Progress
- Assignee set to Bruce Wilson
Thus far the DC-based metadata looks good, but one issue is the content of the abstract element (see the attached example file).
The abstract contains a list of variables collected within the dataset, as opposed to a prose description of the dataset, collection methods, conclusions, etc.:
<dc:abstract>Record type; Country; Year; IPUMS sample identifier; Household serial number; Number of person records in the household;
Household weight; Subsample number; Donated household; Group quarters status; Number of unrelated persons;
Continent and region of country; 1st subnational geographic level, world [consistent boundaries over time];
NUTS1 Region, Europe; NUTS2 Region, Europe; ...
</dc:abstract>
The content of the abstract should contain "A summary of the resource." This list may technically be a summary, but differs from most abstracts. I'll assign this to Bruce in order to make a decision on this with Tracy, Wendy, Fabio, and others.
#3 Updated by Bruce Wilson over 10 years ago
Per discussion with Wendy 28-aug-14, they've regenerated the metadata to put the list of terms into dc:tableOfContents and the contents of dcit:abstract are now in dc:abstract.
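A rough way to spot records whose dc:abstract still holds a delimited variable list rather than prose is sketched below. The semicolon-versus-period heuristic and the function names `abstract_text` and `looks_like_list` are assumptions made for illustration, not something MPC or DataONE actually used, and the extraction assumes a literal `dc:` prefix with no attributes on the element.

```shell
#!/bin/bash
# Print the text of the dc:abstract element in the document read from stdin.
# Assumes a literal "dc:" prefix and no attributes (a sketch, not a full XML parser).
abstract_text() {
    tr '\n' ' ' | sed -n 's|.*<dc:abstract>\(.*\)</dc:abstract>.*|\1|p'
}

# Succeed (exit 0) when the abstract contains more semicolons than periods,
# a hypothetical signal that it is a delimited list rather than prose.
looks_like_list() {
    local text semis periods
    text=$(abstract_text)
    semis=$(( $(printf '%s' "${text}" | tr -cd ';' | wc -c) ))
    periods=$(( $(printf '%s' "${text}" | tr -cd '.' | wc -c) ))
    [ "${semis}" -gt "${periods}" ]
}

# Example over previously downloaded files (the ./scimeta directory is assumed):
# for f in ./scimeta/*; do looks_like_list < "${f}" && echo "list-like abstract: ${f}"; done
```

Any record flagged this way would still need a human look, since a short prose abstract can legitimately contain semicolons.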
#4 Updated by Bruce Wilson over 10 years ago
This also includes making sure that the correct formatId is specified in systemMetadata for the metadata. Since MPC is using QualifiedDC, the correct format ID is http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
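The formatId check could be scripted along these lines. This is a minimal sketch: the function names `extract_format_id` and `has_format_id` are hypothetical, the sed expression assumes an unprefixed formatId element that sits on a single line, and the commented usage assumes the Member Node serves System Metadata at `/v1/meta/{pid}`.

```shell
#!/bin/bash
# Print the formatId found in a System Metadata document read from stdin.
# Assumes an unprefixed <formatId> element on one line (a sketch, not a full XML parser).
extract_format_id() {
    sed -n 's|.*<formatId>\(.*\)</formatId>.*|\1|p'
}

# Succeed (exit 0) when the System Metadata on stdin declares the expected formatId.
has_format_id() {
    local expected="$1"
    [ "$(extract_format_id)" = "${expected}" ]
}

# Example against a Member Node (assumed usage; ${id} is an identifier from the object list):
# curl -s "${baseurl}/v1/meta/${id}" | has_format_id "${formatId}" || echo "wrong formatId: ${id}"
```

Keeping the extraction separate from the comparison makes it easy to print the offending value when a record fails the check.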
#5 Updated by Chris Jones over 10 years ago
- Assignee changed from Bruce Wilson to Chris Jones
- Remaining hours set to 0.0
- Status changed from In Progress to Closed
I've validated all of the content against the schema:
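#!/bin/bash
# Download every Science Metadata object of the given format from the Member Node.
baseurl="https://dataone-test.pop.umn.edu/mn"
formatId="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"
# List the objects registered with this formatId (URL quoted so the shell leaves the "?" alone).
list=$(curl -s "${baseurl}/v1/object?formatId=${formatId}")
# Extract one identifier per line; "xml" is the XMLStarlet command.
identifiers=$(xml sel -t -m "//objectInfo" -v "identifier" -n <<< "${list}")
mkdir -p ./scimeta
for id in ${identifiers}; do
    echo "${id}"
    curl -s -o "./scimeta/${id}" "${baseurl}/v1/object/${id}"
done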
and
#!/bin/bash
# Fetch the Qualified Dublin Core schema and its companion schemas (-L follows any redirects).
curl -L -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd
curl -L -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd
curl -L -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dc.xsd
curl -L -O http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcmitype.xsd
# Validate each downloaded document; --valid additionally requests DTD validation.
for file in ./scimeta/*; do
    xmllint --noout --valid --schema ./qualifieddc.xsd "${file}"
done
xmllint complains about not having a DTD (the --valid flag requests DTD validation, and these files declare none), but the files are schema valid. Looks good.
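For a larger set of records, the per-file results could be tallied rather than read off the terminal. The helper below is a sketch under assumptions: `check_all` is a hypothetical name, and it relies only on the exit status of whatever command it is given, so the xmllint invocation in the usage comment is one possible choice rather than the command that was actually run.

```shell
#!/bin/bash
# check_all DIR CMD [ARGS...]
# Run "CMD ARGS FILE" for every regular file in DIR, report failures on stderr,
# and print "<passed> <failed>" on stdout.
check_all() {
    local dir="$1"; shift
    local passed=0 failed=0 file
    for file in "${dir}"/*; do
        [ -f "${file}" ] || continue
        if "$@" "${file}" > /dev/null 2>&1; then
            passed=$((passed + 1))
        else
            failed=$((failed + 1))
            echo "FAILED: ${file}" >&2
        fi
    done
    echo "${passed} ${failed}"
}

# Example (assumes the schemas sit in the current directory and the files in ./scimeta):
# check_all ./scimeta xmllint --noout --schema ./qualifieddc.xsd
```

Omitting --valid in that usage example limits the check to schema validation, so a missing DTD is not counted as a failure.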
#6 Updated by Laura Moyers about 10 years ago
- Target version changed from Deploy by end of Y1Q1 to Deploy by end of NCTE
#7 Updated by Laura Moyers almost 10 years ago
- Target version changed from Deploy by end of NCTE to Operational
#8 Updated by Laura Moyers almost 10 years ago
- Estimated time set to 0.00
- % Done changed from 0 to 100