Task #2624

Story #2622: d1_replication should prioritize MN replication tasks based on load, failures, and bandwidth factors

Modify ReplicationManager.createAndQueueTasks() to limit replication tasks based on current MN load

Added by Chris Jones over 12 years ago. Updated over 12 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: d1_replication
Start date: 2012-04-20
Due date:
% Done: 100%
Milestone: CCI-1.0.0
Product Version: *
Story Points:
Sprint:

Description

As a first pass at throttling replicate() calls to possibly bogged-down MNs, change createAndQueueTasks() so that it evaluates how many tasks are currently pending for a given node and caps the calls based on a configurable limit.

See the epad for algorithm details:

http://epad.dataone.org/20120420-replication-priority-queue
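
For illustration only, the capping idea in the description could look roughly like the sketch below. It is not the actual ReplicationManager code: the class name PendingTaskLimiter, the method filterEligible(), the node identifiers, and the per-node pending-count map are all hypothetical, and the real algorithm is the one described in the epad.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; an assumption for illustration, not part of d1_replication. */
final class PendingTaskLimiter {

    // Assumed to come from configuration (the "configurable limit" in the description).
    private final int maxPendingPerNode;

    PendingTaskLimiter(int maxPendingPerNode) {
        if (maxPendingPerNode < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        this.maxPendingPerNode = maxPendingPerNode;
    }

    /**
     * Returns the candidate nodes whose pending task count is still below
     * the limit, preserving input order. Nodes that are absent from the map
     * are treated as having zero pending tasks.
     */
    List<String> filterEligible(List<String> candidateNodes,
                                Map<String, Integer> pendingByNode) {
        List<String> eligible = new ArrayList<>();
        for (String node : candidateNodes) {
            int pending = pendingByNode.getOrDefault(node, 0);
            if (pending < maxPendingPerNode) {
                eligible.add(node);
            }
        }
        return eligible;
    }
}
```

With a limit of 2, a node that already has 2 pending tasks would be skipped, while nodes with 1 or 0 pending tasks would still receive new replicate() calls.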

History

#1 Updated by Chris Jones over 12 years ago

  • Status changed from New to In Progress

Added the prioritizeNodes() method to implement the new prioritization algorithms. Needs testing.

#2 Updated by Chris Jones over 12 years ago

  • Status changed from In Progress to Closed

Added the getRequestFactors(), getFailureFactors(), and getBandwidthFactors() methods that feed into prioritizeNodes(). Skye moved these methods into a new ReplicationPrioritizationStrategy class for modularity and independent testing. Testing showed that the prioritization scheme works correctly now that MNs get a 5-try grace period before scores are calculated.
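
As a rough sketch of how a grace period can keep a new MN from being penalized too early, the example below returns a neutral factor until a node has at least five recorded attempts. This is not the ReplicationPrioritizationStrategy code: the class GraceFailureFactor, the method failureFactor(), the success-ratio formula, and the reading of "5 try" as "fewer than five attempts" are all assumptions made for illustration.

```java
/** Hypothetical illustration; not the actual getFailureFactors() logic. */
final class GraceFailureFactor {

    // Taken from the "5-try grace period" in the comment above; how the
    // real code counts and applies the tries is an assumption here.
    static final int GRACE_ATTEMPTS = 5;

    private GraceFailureFactor() {
    }

    /**
     * Returns 1.0 (neutral) while the node is inside the grace period,
     * otherwise the fraction of attempts that succeeded (assumed formula).
     */
    static double failureFactor(int successes, int failures) {
        if (successes < 0 || failures < 0) {
            throw new IllegalArgumentException("counts must be >= 0");
        }
        int attempts = successes + failures;
        if (attempts < GRACE_ATTEMPTS) {
            return 1.0; // too little history to score the node fairly
        }
        return (double) successes / attempts;
    }
}
```

Under these assumptions, a node with three failures and no successes still scores 1.0, while a node with one success in five attempts scores 0.2.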
