Task #2177

Story #2166: Hazelcast cluster errors need to be isolated

Use Lock.tryLock() (not Lock.lock()) in d1_replication

Added by Chris Jones over 12 years ago. Updated about 12 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: d1_replication
Start date: 2012-01-09
Due date:
% Done: 100%
Milestone: CCI-1.0.0
Product Version: *
Story Points:
Sprint:

Description

To avoid potential deadlocks in Hazelcast, convert calls to Lock.lock() into calls to Lock.tryLock() with a timeout. Also introduce a queue structure so that operations whose lock attempt fails can be retried later.
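As a rough illustration of the pattern described above, the following sketch uses a plain java.util.concurrent.locks.Lock as a stand-in for the Hazelcast lock. The class name TryLockRetrySketch, the methods processOrQueue() and retryPending(), and the millisecond timeout parameter are all hypothetical and are not taken from d1_replication; only tryLock(timeout, unit) and unlock() are standard Lock methods.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.function.Consumer;

// Hypothetical sketch: tryLock() with a timeout plus a retry queue.
class TryLockRetrySketch {

    // Identifiers whose lock attempt failed and that should be retried later.
    private final Queue<String> retryQueue = new ConcurrentLinkedQueue<>();

    /**
     * Runs the operation under the lock if the lock is obtained within the
     * timeout. Otherwise queues the identifier for a later retry.
     * Returns true only if the operation actually ran.
     */
    boolean processOrQueue(Lock lock, String pid, Consumer<String> operation,
                           long timeoutMillis) {
        boolean locked = false;
        try {
            locked = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (locked) {
                operation.accept(pid);
                return true;
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and treat this as a failed attempt.
            Thread.currentThread().interrupt();
        } finally {
            if (locked) {
                lock.unlock();
            }
        }
        retryQueue.add(pid);
        return false;
    }

    /** Re-attempts each currently queued identifier once; returns how many succeeded. */
    int retryPending(Lock lock, Consumer<String> operation, long timeoutMillis) {
        int attempts = retryQueue.size();
        int succeeded = 0;
        for (int i = 0; i < attempts; i++) {
            String pid = retryQueue.poll();
            if (pid == null) {
                break;
            }
            if (processOrQueue(lock, pid, operation, timeoutMillis)) {
                succeeded++;
            }
        }
        return succeeded;
    }

    int pending() {
        return retryQueue.size();
    }
}
```

Unlike lock(), a timed tryLock() gives up after the timeout, so a caller never blocks indefinitely on a lock held by a stalled cluster member; the cost is that failed operations have to be tracked and retried, which is the role of the queue here.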

History

#1 Updated by Chris Jones over 12 years ago

  • Subject changed from Use Lock.tryLock() (not Lock.lock()) d1_replication to Use Lock.tryLock() (not Lock.lock()) in d1_replication

#2 Updated by Chris Jones over 12 years ago

  • Status changed from New to In Progress

#3 Updated by Chris Jones about 12 years ago

  • Status changed from In Progress to Closed

I've created a ReplicationEventListener class that queues events for processing. The entryAdded and entryUpdated events each trigger a tryLock() on an event string based on the incoming identifier. Whichever CN instance gets the lock first queues the identifier, and createAndQueueTasks() is then called for it.
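The flow described in this comment might look roughly like the sketch below. It is not the actual ReplicationEventListener: the class name, the lock-name prefix, the lockFor() helper, and the body of createAndQueueTasks() are assumptions for illustration, and a local map of ReentrantLock objects stands in for cluster-wide Hazelcast locks so the example runs without a cluster.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a listener that queues identifiers under tryLock().
class ReplicationEventListenerSketch {

    // Assumption: stands in for named, cluster-wide locks.
    private final ConcurrentMap<String, Lock> locks = new ConcurrentHashMap<>();
    // Identifiers waiting to be turned into replication tasks.
    private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
    // Records placeholder task creation so the sketch has observable output.
    final List<String> processed = new CopyOnWriteArrayList<>();

    /** Returns the lock for the event string derived from the identifier. */
    Lock lockFor(String identifier) {
        String eventString = "replication-event-" + identifier; // hypothetical format
        return locks.computeIfAbsent(eventString, k -> new ReentrantLock());
    }

    boolean entryAdded(String identifier) {
        return queueEvent(identifier);
    }

    boolean entryUpdated(String identifier) {
        return queueEvent(identifier);
    }

    // Only the caller that wins the lock queues the identifier.
    private boolean queueEvent(String identifier) {
        Lock lock = lockFor(identifier);
        if (!lock.tryLock()) {
            return false; // another instance holds the lock and handles the event
        }
        try {
            eventQueue.add(identifier);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Drains the queue, calling createAndQueueTasks() for each identifier. */
    int processQueue() {
        int count = 0;
        String identifier;
        while ((identifier = eventQueue.poll()) != null) {
            createAndQueueTasks(identifier);
            count++;
        }
        return count;
    }

    // Placeholder: the real method would build replication tasks.
    void createAndQueueTasks(String identifier) {
        processed.add(identifier);
    }
}
```

Because the non-blocking tryLock() returns immediately, a listener that loses the race simply drops the event rather than waiting, which keeps event-handling threads from piling up behind a contended lock.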
