change default replication implementation
For objects smaller than a threshold size, change the default number of replicas from the current 0 to 2 within the replication management system on the CNs.
During today's MN Forum call, the MN operators discussed the impact of the government shutdown on data availability and noted that DataONE's replication system could effectively improve data availability during massive service shutdowns like the one we have seen this week. We discussed the pros and cons of changing the default replication policy so that, in the absence of an explicit ReplicationPolicy for an object, DataONE Coordinating Nodes would create two replicas of any object smaller than a certain threshold size, probably around 500 MB today. This threshold would increase over time as network bandwidth improves, so it should be configurable on the CN without code changes. This would ensure that the replication capabilities of DataONE are more fully utilized, while keeping control of replication firmly in the hands of Member Nodes (if they don't want replicas, they simply set replicationAllowed=false in the replication policy in system metadata).
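To make the proposed rule concrete, here is a minimal sketch of the decision logic in Python. It is not the CN implementation: the function name, the dataclass, and the constants are hypothetical, and 500 MB is taken as 500 * 10^6 bytes as an assumption. Only the behavior (explicit policy wins, otherwise two replicas below the threshold, otherwise none) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical defaults. The ticket only says "around 500 MB" and "two replicas";
# on a real CN these would be read from configuration, not hard-coded.
DEFAULT_THRESHOLD_BYTES = 500 * 10**6
DEFAULT_REPLICAS = 2


@dataclass
class ReplicationPolicy:
    """Simplified stand-in for the replication policy in system metadata."""
    replication_allowed: bool
    number_replicas: int = 0


def effective_replica_count(
    size_bytes: int,
    policy: Optional[ReplicationPolicy] = None,
    threshold_bytes: int = DEFAULT_THRESHOLD_BYTES,
    default_replicas: int = DEFAULT_REPLICAS,
) -> int:
    """Return how many replicas should be created for an object."""
    if policy is not None:
        # An explicit policy always wins, so Member Nodes keep control.
        if not policy.replication_allowed:
            return 0
        return policy.number_replicas
    # No explicit policy: apply the proposed size-based default.
    if size_bytes < threshold_bytes:
        return default_replicas
    return 0
```

Under these assumptions, an object of 10 MB with no policy would get two replicas, an object of 2 GB with no policy would get none, and any object with replicationAllowed=false would get none regardless of size.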
I have created a design document that describes our envisioned system here:
#2 Updated by Dave Vieglais over 10 years ago
My preference would be to make no changes and use a tool to assist MNs with updating the replication policy on their content through the CNReplication.setReplicationPolicy method. If it doesn't do so already, the CLI would seem to be the right tool for a generic implementation.
#4 Updated by Chris Jones almost 10 years ago
- Target version changed from 2014.4-Block.1.2 to 2014.14-Block.2.3
- Due date changed from 2014-02-01 to 2014-04-12
- Assignee changed from Chris Jones to Matthew Jones
This policy decision needs consensus from MN operators. Reassigning to Matt for now for discussion at the CCIT 2014 meeting.