Task #3724

MNDeployment #3552: USGS CSAS

Re-typing ORE documents

Added by Skye Roseboom over 11 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: Skye Roseboom
Target version: -
Start date: 2013-04-23
Due date:
% Done: 100%
Story Points:
Sprint:

Description

Similar issue to the ORNL DAAC ORE document issue, where some documents were synchronized with the incorrect formatId ('application/octet-stream').

This needs to be resolved in the same manner: by manually updating the formatId on the CN and on the MN.

For DataONE to update the CN, we need a list of pids to correct.

pids to retype to ORE.txt (8.67 KB) Skye Roseboom, 2013-06-21 17:59


Related issues

Related to Member Nodes - Task #3839: ORE documents contain references to non-existent science metadata pids Closed 2013-06-24

History

#1 Updated by Ranjeet Devarakonda over 11 years ago

#2 Updated by Skye Roseboom over 11 years ago

Hi Ranjeet

I created a list of pids from the CN solr index, filtering for USGS CSAS documents that are not typed as ORE and whose ids start with resourceMap:

https://cn.dataone.org/cn/v1/query/solr/?rows=1000&q=datasource:urn\:node\:USGSCSAS%20id:resourceMap*%20-formatType:RESOURCE&fl=id,formatId

for a list of just pids:

https://cn.dataone.org/cn/v1/query/solr/?rows=1000&q=datasource:urn\:node\:USGSCSAS%20id:resourceMap*%20-formatType:RESOURCE&fl=id&wt=csv

It looks to be 304 documents. Some of these might also be on the archived list. Can you confirm this list looks good? If so, we will proceed with updating the formatId for these documents - from 'application/octet-stream' to 'http://www.openarchives.org/ore/terms'.
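As a rough illustration of how the query URLs above are put together, here is a minimal Python sketch. The function name `build_pid_query` and its parameters are hypothetical and not part of any DataONE client library. It only assembles the URL and does not contact the CN. The result is percent-encoded, so it will not match the hand-written links character for character, but it decodes to the same query.

```python
from urllib.parse import quote, urlencode

# Solr query endpoint taken from the links above.
CN_SOLR = "https://cn.dataone.org/cn/v1/query/solr/"


def build_pid_query(node_id, id_prefix, exclude_resource_type=True, rows=1000):
    """Build a CN Solr URL that lists pids for one member node (hypothetical helper)."""
    # Colons inside the node identifier must be backslash-escaped for Solr,
    # otherwise they are read as field:value separators.
    escaped_node = node_id.replace(":", "\\:")
    terms = [f"datasource:{escaped_node}", f"id:{id_prefix}*"]
    if exclude_resource_type:
        # Keep only documents that are NOT already typed as resource maps.
        terms.append("-formatType:RESOURCE")
    params = {"rows": rows, "q": " ".join(terms), "fl": "id", "wt": "csv"}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return CN_SOLR + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_pid_query("urn:node:USGSCSAS", "resourceMap"))
```

Passing `exclude_resource_type=False` drops the `-formatType:RESOURCE` filter when every matching id is wanted regardless of its current type.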

Ranjeet indicated that all of these resource maps are to be archived, so updating their formatId is not a priority. The new resource maps that need their formatId updated have DOI identifiers, so I could not locate them by naming scheme in the search index.

#3 Updated by Ranjeet Devarakonda over 11 years ago

  • Assignee changed from Ranjeet Devarakonda to Skye Roseboom

All the PIDs you sent should be archived (as we added DOIs to all resourceMaps).

Here are the PIDs that should have the formatId changed to "http://www.openarchives.org/ore/terms":
https://cn.dataone.org/cn/v1/query/solr/?rows=1000&q=datasource:urn\:node\:USGSCSAS%20id:resourceMap_doi*&fl=id&wt=csv

Also, I won't be able to set the PIDs to the archived state. I'd need your help doing this directly at the CN.

#4 Updated by Skye Roseboom over 11 years ago

Created text file from solr query provided by Ranjeet: pids_to_retype_to_ORE.txt
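For reference, turning the CSV response of such a query into a plain pid list could look roughly like the sketch below. It assumes the response has a single `id` column with a header row, which is what `fl=id&wt=csv` is expected to return. The helper names and the sample pids are hypothetical, and this is not necessarily how the attached file was produced.

```python
import csv
import io


def pids_from_solr_csv(csv_text):
    """Extract pids from a one-column Solr CSV response (hypothetical helper)."""
    # The csv module handles quoted values, in case a pid contains a comma.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    # Drop the header row if present (assumed to be the field name "id").
    if rows and rows[0] == ["id"]:
        rows = rows[1:]
    return [row[0] for row in rows]


def write_pid_file(pids, path):
    """Write one pid per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for pid in pids:
            handle.write(pid + "\n")


if __name__ == "__main__":
    # Made-up sample response; real pids would come from the CN query.
    sample = "id\nresourceMap_doi_example_1\nresourceMap_doi_example_2\n"
    write_pid_file(pids_from_solr_csv(sample), "pids_to_retype_to_ORE.txt")
```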

#5 Updated by Skye Roseboom over 11 years ago

  • Assignee changed from Skye Roseboom to Chris Jones

The issue is complicated by the fact that the original documents were not copied to the CN, because they were typed as 'data' documents. Once these documents are retyped as ORE, they need to be copied to the CN for indexing. The current CN operational process does not allow the formatId to be changed, so the backdoor method of changing the formatId in the system metadata on the CN does not fully effect the needed change: the documents still would not be copied to the CN.

Just FYI: if the MN had 'obsoleted' the documents with new documents carrying the proper formatId, this issue would not exist. The new documents would be copied to the CN and indexed, and the old documents would remain as 'data' type documents.

#6 Updated by Skye Roseboom over 11 years ago

  • Assignee changed from Chris Jones to Skye Roseboom
  • Status changed from New to In Progress

#7 Updated by Skye Roseboom over 11 years ago

  • Status changed from In Progress to Testing

#8 Updated by Skye Roseboom over 11 years ago

  • Status changed from Testing to In Review

#9 Updated by Skye Roseboom over 11 years ago

This issue should be closed.

These documents were originally uploaded as ORE packages. The issue is that the ORE document contents do not seem to reference existing science metadata documents. I will open a new issue to capture the ORE content issue.

#10 Updated by Skye Roseboom over 11 years ago

  • Status changed from In Review to Closed
  • Remaining hours set to 0.0
