Story #8158: Synchronization appears to fail under high load - Infrastructure - DataONE Tasks

Added by Dave Vieglais over 7 years ago. Updated over 6 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Dave Vieglais

Category:

d1_synchronization

Target version:

CCI-2.3.7

Start date:

2017-08-04

Due date:

% Done:

100%

Story Points:

Sprint:

Infrastructure backlog

Description

When the number of objects that need to be processed by synchronization is high, the process fails or crashes. For example, the Pangaea node, with 325k objects, fails on initial sync. The cause needs to be evaluated and the synchronization process refactored so that arbitrarily large numbers of entries can be synchronized.

History

#1 Updated by Dave Vieglais over 7 years ago

Assignee changed from Robert Waltz to Dave Vieglais

#2 Updated by Dave Vieglais about 7 years ago

Sprint set to Infrastructure backlog

#3 Updated by Dave Vieglais about 7 years ago

Target version changed from CCI-2.4.0 to CCI-2.3.7

#4 Updated by Rob Nahf almost 7 years ago

Harvest was refactored to better keep the queue under 50k items (it doesn't harvest everything if the queue is too long). Synchronizing in stage shows sync failing only because of Hz errors (connection issues, CONCURRENT_MAP_REMOVES).

So, I believe the sync code is robust with respect to high load.
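
For illustration, the harvest throttling described in #4 might look roughly like the sketch below. This is not the d1_synchronization code: the function name `harvest_batch`, the in-memory `deque`, and the identifier strings are all hypothetical, and only the 50k threshold comes from the comment above. The idea is that harvest stops adding work once the queue reaches the limit and leaves the rest for a later run.

```python
from collections import deque
from typing import Deque, Iterator

# Threshold taken from comment #4; everything else here is an assumption.
MAX_QUEUE_SIZE = 50_000


def harvest_batch(source: Iterator[str], queue: Deque[str],
                  max_queue_size: int = MAX_QUEUE_SIZE) -> int:
    """Move identifiers from `source` into `queue` until the queue is full.

    Returns the number of identifiers added. Identifiers still left in
    `source` are not consumed and are picked up by a later call.
    """
    added = 0
    # Check the queue length before pulling, so nothing is taken from the
    # source unless there is room for it.
    while len(queue) < max_queue_size:
        try:
            pid = next(source)
        except StopIteration:
            break  # source exhausted, nothing more to harvest
        queue.append(pid)
        added += 1
    return added


# Hypothetical usage: a node with 325k objects, harvested in bounded passes.
source = iter(f"pid-{i}" for i in range(325_000))
queue: Deque[str] = deque()

first_pass = harvest_batch(source, queue)   # fills the queue up to the limit
for _ in range(20_000):                     # stand-in for sync workers draining
    queue.popleft()
second_pass = harvest_batch(source, queue)  # tops the queue back up
```

With this shape the queue length is bounded regardless of how many objects the node exposes, which is the property the refactoring in #4 was aiming for.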