MNDeployment #3557: LTER Network
Resolve access policy discrepancies between system metadata and science metadata
While working to synchronize all content on the LTER MN, I noticed an inexplicably high number (13,914) of documents that returned a NotAuthorized exception when calling getSystemMetadata() on the production CNs. This seemed odd, since the vast majority of these documents are EML science metadata. I compared this with a selection of documents on the MN and found that the MN, too, returns a NotAuthorized exception. However, looking directly at some of the EML documents on disk on the CNs, I see public:read ACLs in the EML. For instance, for doi:10.6073/AA/knb-lter-bes.392.39, we get a NotAuthorized exception, but the EML states:
On the CN, the xml_access table includes the uid="BES",o=lter,dc=ecoinformatics,dc=org:all ACL, but not the public:read one. My guess is that somewhere in Metacat's SystemMetadataFactory we've missed adding some ACLs to system metadata, but I haven't confirmed this. Either way, for the documents in the file attached to this ticket, we need to iterate through them, confirm a public:read ACL in the EML, and call CNAuthorization.setAccessPolicy() to update the system metadata appropriately.
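As a rough illustration of the "confirm a public:read ACL in the EML" step, here is a minimal Python sketch. It assumes the EML uses the usual unqualified access/allow/principal/permission elements, and it only looks at allow rules (it ignores deny rules and the allowFirst/denyFirst ordering). The helper names has_public_read and pids_needing_public_read are hypothetical, and the actual CNAuthorization.setAccessPolicy() call is deliberately left out. A match only marks a candidate for review, not a confirmed discrepancy.

```python
import xml.etree.ElementTree as ET


def has_public_read(eml_text):
    """Return True if any <allow> rule grants 'read' to the 'public' principal.

    Simplification (assumption): <deny> rules and the access 'order'
    attribute are not evaluated, so this is a candidate check only.
    """
    root = ET.fromstring(eml_text)
    for allow in root.iter("allow"):
        principals = {(p.text or "").strip().lower()
                      for p in allow.findall("principal")}
        permissions = {(p.text or "").strip().lower()
                       for p in allow.findall("permission")}
        if "public" in principals and "read" in permissions:
            return True
    return False


def pids_needing_public_read(eml_by_pid):
    """Given {pid: EML text}, list the pids whose EML declares public read.

    These are the candidates for a later setAccessPolicy() update,
    which is not performed here.
    """
    return [pid for pid, eml in eml_by_pid.items() if has_public_read(eml)]
```

The returned pids would then be checked by hand or passed to whatever client code performs the system metadata update.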
My query script encountered ServiceFailure exceptions on a number of pids, so I will re-run it on that subset and then update this list with a complete count.
#1 Updated by Ben Leinfelder over 10 years ago
It's entirely possible to change access control rules after EML has been inserted. Your example EML file is not readable by public as far as Metacat (and by extension, DataONE) is concerned:
#3 Updated by Chris Jones over 10 years ago
I've updated the NotAuthorized file and have these pids remaining. They look to be accessible on the MN, but the CN is throwing a ServiceFailure:
#4 Updated by Mark Servilla over 10 years ago
Here is a list of the affected sites and the number of problem IDs based on the content of "lter-not-authorized.txt"; I only filtered on the canonical scope string for each site (e.g., knb-lter-lno) and ignored other odd names or those that contained "test":
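For anyone wanting to reproduce this kind of tally, a minimal Python sketch of the filtering follows. It is not the script actually used. It assumes one pid per line in lter-not-authorized.txt and that a canonical scope is "knb-lter-" plus a three-letter site code followed by a dot. The name count_by_site is hypothetical.

```python
import re
from collections import Counter

# Assumed canonical scope: "knb-lter-" + three lowercase letters, immediately
# followed by "." and not preceded by another word character or hyphen.
SCOPE_RE = re.compile(r"(?<![\w-])knb-lter-[a-z]{3}(?=\.)")


def count_by_site(pids):
    """Count pids per canonical site scope, skipping any pid containing 'test'."""
    counts = Counter()
    for pid in pids:
        pid = pid.strip()
        if not pid or "test" in pid.lower():
            continue
        match = SCOPE_RE.search(pid)
        if match:
            counts[match.group(0)] += 1
    return counts
```

Opening the file and passing it directly, as in count_by_site(open("lter-not-authorized.txt")), would give the per-site counts under those assumptions.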
#5 Updated by Mark Servilla over 10 years ago
I have shared a Google spreadsheet that contains the result of site IM queries regarding public read access: https://docs.google.com/spreadsheet/ccc?key=0AvmNJnP7eHevdGMwcGpHRDR5RUMxNVlTc2FyZWQ4T1E&usp=sharing.