Feature #4038
Updated by Matthew Jones about 11 years ago
For objects smaller than a threshold size, change the default number of replicas to two (instead of the current 0) within the replication management system on CNs.
During today's MN Forum call, the MN operators discussed the impact of the government shutdown on data availability and noted that DataONE's replication system could effectively improve data availability in the midst of massive service shutdowns like the one we have seen this week. We discussed the pros and cons of changing the default replication policy so that, in the absence of an explicit ReplicationPolicy for an object, DataONE Coordinating Nodes would create two replicas of any object smaller than a certain threshold size, probably around 500 MB today. This threshold would increase over time as network bandwidth improves, so it should be configurable on the CN without code changes. This would ensure that the replication capabilities of DataONE are more fully utilized, while keeping control of replication firmly in the hands of Member Nodes (if they don't want replicas, they simply set replicationAllowed=false or set the number of replicas to 0 in the replication policy in system metadata).
I have created a design document that describes our envisioned system here:
http://mule1.dataone.org/ArchitectureDocs-current/design/ReplicationOverview.html
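To make the proposed default concrete, here is a minimal sketch of the decision rule described above. It is not the CN implementation: the names `ReplicationPolicy`, `target_replicas`, `DEFAULT_REPLICAS`, and `SIZE_THRESHOLD_BYTES` are hypothetical, the policy class is a simplified stand-in for the one in system metadata, and the 500 MB value is only the rough figure mentioned in the discussion.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical CN-side settings. In the proposal these would be read from
# CN configuration so they can change without a code change.
DEFAULT_REPLICAS = 2
SIZE_THRESHOLD_BYTES = 500 * 1000 * 1000  # "around 500 MB", illustrative only


@dataclass
class ReplicationPolicy:
    """Simplified stand-in for the replication policy in system metadata."""
    replication_allowed: bool
    number_replicas: int


def target_replicas(
    size_bytes: int,
    policy: Optional[ReplicationPolicy] = None,
    threshold: int = SIZE_THRESHOLD_BYTES,
    default_replicas: int = DEFAULT_REPLICAS,
) -> int:
    """Return how many replicas the CN would aim for under the proposed rule."""
    if policy is not None:
        # An explicit policy always wins, so Member Nodes keep control.
        if not policy.replication_allowed:
            return 0
        return policy.number_replicas
    # No explicit policy: apply the default only to objects under the threshold.
    return default_replicas if size_bytes < threshold else 0
```

Under these assumptions, a 10 MB object with no policy would get two replicas, a 600 MB object with no policy would get none, and any object whose policy sets replicationAllowed=false or a replica count of 0 would never be replicated, whatever its size.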