Story #2622: d1_replication should prioritize MN replication tasks based on load, failures, and bandwidth factors
Modify Metacat's MNodeService.replicate() to queue requests
At the moment, calls to replicate() are handled immediately, regardless of how many requests have been made or how large the objects being replicated are. A dozen or so large replica requests can saturate the available bandwidth, and a sheer volume of requests could overwhelm Tomcat/Apache's ability to handle them.
Add a queue to Metacat (possibly via Hazelcast) to manage incoming calls to replicate(). Queue processing should be capped by a configurable maximum, based either on the number of requests currently being processed or on the load those requests cause (bandwidth in particular).
#2 Updated by Ben Leinfelder over 11 years ago
We're switching to a simpler implementation for now and keeping the original queue idea on the back burner in case this approach proves insufficient.
1. Share an ExecutorService with a fixed number of threads available for replicate() tasks.
2. Each replicate() call is submitted to it for execution.
3. That's it.
The number of threads in the pool is set to the number of available processors minus 1.
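A minimal sketch of the approach described above, not the actual Metacat code. The class name ReplicateTaskPool and its methods are hypothetical, and the floor of one thread is an added assumption: on a single-processor host "available processors minus 1" is 0, and Executors.newFixedThreadPool() rejects a non-positive thread count.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; not Metacat's actual class.
final class ReplicateTaskPool {

    // Available processors minus 1, floored at 1 (the floor is an assumption,
    // added so that a single-processor host still gets a worker thread).
    static int poolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    // One pool shared by every replicate() call.
    private static final ExecutorService EXECUTOR =
            Executors.newFixedThreadPool(poolSize());

    // Hand the replication work to the pool and return immediately.
    // When all threads are busy, the task waits in the pool's internal queue.
    static Future<?> submit(Runnable replicateTask) {
        return EXECUTOR.submit(replicateTask);
    }

    // Stop accepting new tasks; tasks already submitted still run.
    static void shutdown() {
        EXECUTOR.shutdown();
    }
}
```

With something like this, replicate() would wrap its existing work in a Runnable and call ReplicateTaskPool.submit(...). The fixed pool caps how many replications run at once, but its internal queue is unbounded and it does not account for object size or bandwidth, which is where the original queue idea would still come in.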