Task #3980
MNDeployment #3558: CDL Merritt
Missing content
0%
Description
Started noticing some 404 (NotFound) errors from the CDL/Merritt MN today. It appears that earlier versions of documents have been removed from the MN. This appears to coincide with the rollout of corrected ORE/RDF resource maps; that is a guess based on the pid structure and content of the following example and on the date of the 'new' version of the content.
These two documents appear to have the same content:
https://cn.dataone.org/cn/v1/object/ark:/13030/m5416x9f/1/cadwsap-s2010005-011.xml
https://cn.dataone.org/cn/v1/object/ark:/13030/m5416x9f/2/cadwsap-s2010005-011.xml
However, the first (with the '1' before 'cadwsap') does not appear on the source MN:
https://merritt.cdlib.org:8084/knb/d1/mn/v1/object/ark:/13030/m5416x9f/1/cadwsap-s2010005-011.xml
Although the second does ('2'):
https://merritt.cdlib.org:8084/knb/d1/mn/v1/object/ark:/13030/m5416x9f/2/cadwsap-s2010005-011.xml
Similar response for /meta requests on the same pids.
Another example (this time an RDF/ORE):
https://cn.dataone.org/cn/v1/meta/ark:/13030/m5s46rvg/1/mrt-dataone-map.rdf
https://cn.dataone.org/cn/v1/meta/ark:/13030/m5s46rvg/2/mrt-dataone-map.rdf
Same pattern: the first version ('1') does not appear on the MN:
https://merritt.cdlib.org:8084/knb/d1/mn/v1/meta/ark:/13030/m5s46rvg/1/mrt-dataone-map.rdf
although the second version ('2') does appear on the MN:
https://merritt.cdlib.org:8084/knb/d1/mn/v1/meta/ark:/13030/m5s46rvg/2/mrt-dataone-map.rdf
The old content is still present on the CN because it has not been 'archived'. As a result, these documents continue to appear in the CN search index and object list, even though they no longer appear to exist on the source MN.
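The CN/MN comparison described above could be checked in bulk with something like the following. This is only a sketch: the base URLs are taken from the examples in this ticket, the function names are made up for illustration, and leaving "/" and ":" unescaped in the pid is an assumption that simply mirrors the URLs shown here.

```python
import urllib.error
import urllib.parse
import urllib.request

# Base URLs copied from the examples in this ticket.
CN_BASE = "https://cn.dataone.org/cn/v1"
MN_BASE = "https://merritt.cdlib.org:8084/knb/d1/mn/v1"


def meta_url(base, pid):
    """Build a /meta URL for pid (hypothetical helper)."""
    # Assumption: keep "/" and ":" unescaped to match the URLs in this ticket.
    return f"{base}/meta/{urllib.parse.quote(pid, safe='/:')}"


def http_status(url, timeout=30):
    """Return the HTTP status code of a GET on url (e.g. 200 or 404)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def missing_on_mn(pids, status=http_status):
    """Return the pids that resolve on the CN but give a 404 on the MN."""
    orphans = []
    for pid in pids:
        on_cn = status(meta_url(CN_BASE, pid)) == 200
        gone_from_mn = status(meta_url(MN_BASE, pid)) == 404
        if on_cn and gone_from_mn:
            orphans.append(pid)
    return orphans
```

The `status` argument is injectable so the comparison logic can be exercised without touching the live CN or MN.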
Need to discuss possible solutions. One option: generate a list of 'old' pids to be archived at the CN, to clean up content that no longer appears on the MN.
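A rough sketch of how a candidate list of 'old' pids might be derived from the pid structure alone. It assumes pids look like `ark:/<naan>/<object>/<version>/<filename>`, as in the examples above, and that every version below the highest one for a given object and filename is superseded. That is a guess from the two examples, not a confirmed rule, so any resulting list should still be checked against the MN before anything is archived. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed pid layout: ark:/<naan>/<object>/<version>/<filename>
PID_RE = re.compile(r"^(ark:/\d+/[^/]+)/(\d+)/(.+)$")


def superseded_pids(pids):
    """Return pids whose version is lower than the latest seen for the same object and filename."""
    versions = defaultdict(list)
    for pid in pids:
        m = PID_RE.match(pid)
        if m is None:
            continue  # Skip pids that do not follow the assumed layout.
        obj, version, name = m.group(1), int(m.group(2)), m.group(3)
        versions[(obj, name)].append((version, pid))

    old = []
    for entries in versions.values():
        latest = max(v for v, _ in entries)
        old.extend(pid for v, pid in sorted(entries) if v < latest)
    return old
```

Pids that do not match the assumed layout are skipped rather than guessed at, so they would need to be reviewed by hand.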
History
#1 Updated by Skye Roseboom about 11 years ago
- File Merrit-PIDs.txt added
- File OneShare-PIDs.txt added
Adding file listings of the pids that should be archived for Merritt/CDL and ONEShare.
#2 Updated by Skye Roseboom about 11 years ago
- Assignee changed from John Kunze to Chris Jones
Chris, can we 'archive' these pids for Merritt and ONEShare with the script you created?