Task #4140
MNDeployment #3118: Dryad Member Node
ORE documents reference pids that do not appear in object list
100%
Description
After 10/28 test run:
https://dev.datadryad.org/mn/object/?count=10&formatId=http://www.openarchives.org/ore/terms
reports: 1931
https://cn-dev-ucsb-1.test.dataone.org/cn/v1/object/?count=10&formatId=http://www.openarchives.org/ore/terms
reports: 1913
While most ORE documents synchronized to the CN, about 1/3 of the successfully synchronized ORE documents (about 650) have not been indexed.
Triaging some of the ORE documents that have not been indexed, it appears they reference identifiers that are not present on the dryad /object list endpoint:
ORE
http://dx.doi.org/10.5061/dryad.5p6b1?format=d1rem&ver=2012-06-29T12:01:50.683-04:00
missing:
http://dx.doi.org/10.5061/dryad.5p6b1/1/bitstream
ORE
http://dx.doi.org/10.5061/dryad.8164?format=d1rem&ver=2011-09-02T13:08:10.676-04:00
missing:
http://dx.doi.org/10.5061/dryad.8164/5/bitstream
http://dx.doi.org/10.5061/dryad.8164/2/bitstream
http://dx.doi.org/10.5061/dryad.8164/3/bitstream
http://dx.doi.org/10.5061/dryad.8164/4/bitstream
http://dx.doi.org/10.5061/dryad.8164/6/bitstream
ORE
http://dx.doi.org/10.5061/dryad.8m8r1?format=d1rem&ver=2012-06-26T15:24:29.813-04:00
missing:
http://dx.doi.org/10.5061/dryad.8m8r1/1/bitstream
http://dx.doi.org/10.5061/dryad.8m8r1/2/bitstream
http://dx.doi.org/10.5061/dryad.8m8r1/3/bitstream
ORE
http://dx.doi.org/10.5061/dryad.f385721n?format=d1rem&ver=2012-06-09T13:47:30.984-04:00
missing:
http://dx.doi.org/10.5061/dryad.f385721n/1/bitstream
http://dx.doi.org/10.5061/dryad.f385721n/1?ver=2012-06-09T13:17:44.181-04:00
I do not yet see a pattern. It appears the dryad /object list may not be presenting all identifiers in use by the ORE docs. All the reported 'missing' identifiers can be found via dryad's /meta/{pid} and /object/{pid} services but I do not find them when using /object?formatId= or just object list slicing.
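The check described above (compare the pids an ORE document references against the pids on the object list) can be sketched roughly as follows. This is a minimal illustration, not the tooling actually used during triage: the function names `listed_pids` and `missing_pids` are hypothetical, the sample XML and its `example:` identifiers are made up, and the sketch only assumes that each object list entry carries its pid in an `identifier` element.

```python
import xml.etree.ElementTree as ET


def listed_pids(object_list_xml):
    """Collect every pid found in an object list XML response.

    Assumption: each entry exposes its pid in an element whose local
    name is "identifier". Namespaces are ignored on purpose.
    """
    root = ET.fromstring(object_list_xml)
    pids = set()
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "identifier" and elem.text:
            pids.add(elem.text.strip())
    return pids


def missing_pids(referenced, listed):
    """Return the referenced pids that are absent from the object list."""
    return sorted(set(referenced) - set(listed))


# Made-up sample data, for illustration only.
SAMPLE_OBJECT_LIST = """
<d1:objectList xmlns:d1="http://ns.dataone.org/service/types/v1"
               count="2" start="0" total="2">
  <objectInfo><identifier>example:pid-1</identifier></objectInfo>
  <objectInfo><identifier>example:pid-2</identifier></objectInfo>
</d1:objectList>
""".strip()

# Pids that a hypothetical ORE document references.
referenced_by_ore = ["example:pid-1", "example:pid-3"]

listed = listed_pids(SAMPLE_OBJECT_LIST)
print(missing_pids(referenced_by_ore, listed))
```

In practice the listed set would have to be built from every page of the /object list (slicing with start and count), since a pid missing from one slice may simply sit on another.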
History
#1 Updated by Ryan Scherle about 11 years ago
We determined that most of these errors were due to items with restricted access. These items were correctly filtered out of the object lists, but not out of the resource maps. We have improved the filtering to make the resource map and object list consistent.
A few other items were the result of dirty data on our development server.
Please re-test.
#2 Updated by Ryan Scherle about 11 years ago
- Assignee changed from Ryan Scherle to Skye Roseboom
#3 Updated by Skye Roseboom about 11 years ago
- Status changed from New to In Progress
#4 Updated by Skye Roseboom about 11 years ago
- Status changed from In Progress to Testing
#5 Updated by Skye Roseboom almost 11 years ago
- Assignee changed from Skye Roseboom to Ryan Scherle
Test run 11/21.
https://dev.datadryad.org/mn/object/?count=0&formatId=http://www.openarchives.org/ore/terms
-- shows 1932
https://cn-dev-ucsb-1.test.dataone.org/cn/v1/query/solr/?q=datasource:urn\:node\:mnTestDRYAD%20formatType:RESOURCE
-- shows 1586
So we got a couple hundred more ORE to index.
Inspecting the ORE that did not process seems to indicate that this problem still exists:
ORE: http://dx.doi.org/10.5061/dryad.505?format=d1rem&ver=2011-07-28T14:52:47.474-04:00
references data files:
http://dx.doi.org/10.5061/dryad.505/1/bitstream
http://dx.doi.org/10.5061/dryad.505/2/bitstream
However, neither of these pids was harvested by the CN and neither appears on dev.datadryad.org's object list, although /meta/{pid} and /object/{pid} work.
ORE: http://dx.doi.org/10.5061/dryad.389?format=d1rem&ver=2011-07-28T15:13:04.871-04:00
references data file:
http://dx.doi.org/10.5061/dryad.389/1/bitstream
However, https://dev.datadryad.org/mn/object/?count=1000&formatId=text/html yields 0 results (the pid does not appear on the object list). /meta/{pid} and /object/{pid} work.
ORE: http://dx.doi.org/10.5061/dryad.509?format=d1rem&ver=2011-07-28T14:53:53.030-04:00
references data file:
http://dx.doi.org/10.5061/dryad.509/1/bitstream
However, this pid does not seem to appear on dev.datadryad.org's object list, although /meta/{pid} and /object/{pid} work.
Seems to be the same type of error as before. Are these particular ORE meant to be working?
#6 Updated by Skye Roseboom almost 11 years ago
- Status changed from Testing to In Progress
#7 Updated by Ryan Scherle almost 11 years ago
- Assignee changed from Ryan Scherle to Skye Roseboom
We have corrected an inconsistency in the object list. Please re-test this.
#8 Updated by Laura Moyers almost 11 years ago
- Target version changed from Deploy by end of Y5Q2 to Deploy by end of Y5Q3
#9 Updated by Laura Moyers over 10 years ago
- Target version changed from Deploy by end of Y5Q3 to Operational
#10 Updated by Skye Roseboom over 10 years ago
- Remaining hours set to 0.0
- Status changed from In Progress to Closed
Closing this issue as Dryad is now in production and these issues appear resolved.