The SeriesIdResolver class in d1_index_processor shouldn't use the D1Client.getCN().getSystemMetadata method
The d1_index_processor component is shared by both the cn indexer and the Metacat mn indexer.
In the getPid(Identifier sid) method of the SeriesIdResolver class, it uses the line:
SystemMetadata fetchedSysmeta = D1Client.getCN().getSystemMetadata(null, identifier);
It works perfectly with the CN indexers. For MNs, it may cause problems:
1. The head object may not yet have been synchronized to the CN when indexing happens on the MN.
2. The MN may be configured NOT to be synchronized at all.
#6 Updated by Chris Jones almost 3 years ago
- Assignee changed from Rob Nahf to Jing Tao
- Priority changed from Normal to Immediate
Jing, since we have a critical issue with the ESS-DIVE MN that this bug affects, I'm re-assigning this to you and bumping the priority. While Rob's idea of creating an interface may be a longer-term solution, I'm thinking that a shorter-term solution would be to add a property that sets which host to use for getSystemMetadata() calls. Let's discuss this, and at least get an RC tag out the door that we can try on ESS-DIVE.
#7 Updated by Chris Jones almost 3 years ago
- Priority changed from Immediate to High
Jing and I discussed this ticket. We've found a workaround with the ESS-DIVE MN so that their packages with members that have seriesId synchronize correctly. We had to set the rightsHolder back to the original group DN that had been set a few objects back in the chain (it had been set to the ORCID id of the owning researcher). By doing this, the indexer matched the Subject string with the Subject string of the current HEAD of the series, so it determined that the Subject was authorized (i.e. no series hijacking).
This illuminated the issue: the SeriesIdResolver class is not expanding group membership, causing NotAuthorized exceptions for group members. This will be fixed.
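To illustrate the group-membership problem described above, here is a minimal sketch of a rights-holder check that expands groups before comparing subjects. It is not the SeriesIdResolver code: the class name, the method names, and the in-memory group map (a stand-in for whatever CN.getSubjectInfo() would return) are all hypothetical, and the direction of the comparison is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from d1_index_processor.
class SeriesAuthSketch {

    /**
     * Returns the subject itself plus every group it belongs to.
     * groupsBySubject is an assumed stand-in for a subject-info lookup.
     */
    static Set<String> expand(String subject, Map<String, Set<String>> groupsBySubject) {
        Set<String> equivalents = new HashSet<>();
        equivalents.add(subject);
        equivalents.addAll(groupsBySubject.getOrDefault(subject, Collections.emptySet()));
        return equivalents;
    }

    /**
     * True if the new object's rights holder, or any group it is a member of,
     * equals the rights holder of the current HEAD of the series.
     */
    static boolean isAuthorized(String newRightsHolder, String headRightsHolder,
                                Map<String, Set<String>> groupsBySubject) {
        return expand(newRightsHolder, groupsBySubject).contains(headRightsHolder);
    }
}
```

With a plain string comparison, an ORCID rights holder would never match a group DN on the HEAD object. With expansion, a researcher who is a member of that group passes the check, while an unrelated subject is still rejected.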
Also, having the Member Node rely on CN.getSystemMetadata() for local indexing of content can be problematic for MNs not registered in any CN environment. So, our plan is to add a baseUrl configuration parameter in Metacat (that defaults to being unset) so the indexer can locally call MN.getSystemMetadata(). (Jing noticed that the indexer doesn't have access to the Hazelcast map, so there's no other internal call to use.) In this way, MNs can be configured to use their own sysmeta, but still call CN.getSubjectInfo() for group expansion when needed. CN deployments will work as they do now, still calling CN.getSystemMetadata().
I'm lowering the urgency of this ticket since we found an immediate workaround, but it's still high priority. It has arisen because ESS-DIVE is using the seriesId field to the fullest extent (assigning DOIs there), and I think they may be the first MN to really exercise this, and these bugs are being exposed just from daily use.
#8 Updated by Jing Tao almost 3 years ago
Chris, thank you for the summary. Just one thing needs to be clarified - even though the indexer can access the Hazelcast map internally, the returned result is NOT the head version of the series chain if you call systemmetamap.get(seriesId). Only the API calls (cn/mn.getSystemMetadata(seriesId)) can return the head version of the series chain.
#9 Updated by Jing Tao over 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
In this method, we will check whether a property named mn.dataone.baseURL is defined. If it is defined, the method will use the value of this property to get the head version of the system metadata. Otherwise, it will still call cn.getSystemMetadata.
It works for both CN and MN.
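As a rough sketch of the selection logic described in this update (not the committed code), the choice between the MN and the CN could look like the following. Only the property name mn.dataone.baseURL comes from this ticket. The SysmetaSource interface, the factory argument, and the rule that a blank value counts as unset are assumptions made for illustration.

```java
import java.util.Properties;
import java.util.function.Function;

// Hypothetical sketch; the types below stand in for the real DataONE client classes.
class SysmetaEndpointSketch {

    // Property name taken from the ticket.
    static final String MN_BASE_URL_PROPERTY = "mn.dataone.baseURL";

    /** Assumed stand-in for a node that can resolve an identifier to system metadata. */
    interface SysmetaSource {
        String getSystemMetadata(String identifier);
    }

    /**
     * Returns an MN-backed source when mn.dataone.baseURL is set to a non-blank
     * value; otherwise falls back to the CN source.
     */
    static SysmetaSource choose(Properties props, SysmetaSource cn,
                                Function<String, SysmetaSource> mnFactory) {
        String baseUrl = props.getProperty(MN_BASE_URL_PROPERTY);
        if (baseUrl != null && !baseUrl.trim().isEmpty()) {
            // Assumption: the factory builds a client for the given MN base URL.
            return mnFactory.apply(baseUrl.trim());
        }
        // Property unset: keep the existing CN behaviour.
        return cn;
    }
}
```

Leaving the property unset keeps the CN behaviour unchanged, which matches the plan that the parameter defaults to being unset.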