Story #3470
CN cluster communication needs to be monitored
100%
Description
The Coordinating Node (CN) environments rely heavily on consistent network communication, especially for the Hazelcast cluster. Our services can get out of sync when the network is partitioned and one or more members drop from the cluster. We need to monitor a few key states, with cluster membership being the most important so far. We need to enable operational alerts through Nagios monitoring when the cluster gets partitioned. We also need to respond programmatically to partitioned clusters (read-only mode?), and we need to develop a custom merge policy for when the cluster comes back into communication, so that set, map, and queue entries that are out of sync get back into sync in both number and content.
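As a rough illustration of the membership alert described above, here is a minimal sketch of the decision logic a Nagios-style check could use. It is not the implemented solution: the function name, the result type, and the hostnames are hypothetical, and how the observed member list is obtained from Hazelcast is deliberately left out. The only fixed convention used is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from dataclasses import dataclass

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical container for the exit code and the status line."""
    code: int
    message: str


def check_cluster_membership(expected, observed):
    """Compare the members one CN currently sees with the expected member list.

    `expected` and `observed` are iterables of member identifiers
    (for example host names). Obtaining `observed` from the cluster
    is outside the scope of this sketch.
    """
    expected = set(expected)
    observed = set(observed)

    if not observed:
        # No data at all: we cannot tell whether the cluster is healthy.
        return CheckResult(UNKNOWN, "UNKNOWN - no membership data available")

    missing = sorted(expected - observed)
    unexpected = sorted(observed - expected)

    if missing:
        # At least one expected member is not visible: possible partition.
        return CheckResult(
            CRITICAL,
            "CRITICAL - cluster partitioned, missing members: " + ", ".join(missing),
        )
    if unexpected:
        # Everyone expected is present, but so is a member we did not plan for.
        return CheckResult(
            WARNING,
            "WARNING - unexpected members: " + ", ".join(unexpected),
        )
    return CheckResult(OK, f"OK - all {len(expected)} members present")
```

A wrapper script would print `message` and exit with `code`. For instance, calling `check_cluster_membership(["cn-a.example.org", "cn-b.example.org"], ["cn-a.example.org"])` (placeholder hostnames) yields a CRITICAL result naming the missing member.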
History
#1 Updated by Chris Jones almost 12 years ago
- Remaining hours set to 0.0
- Due date set to 2013-01-19
- Tracker changed from Task to Story
#2 Updated by Chris Jones almost 12 years ago
- Subject changed from CN cluster communication needs to be monitored and maintained to CN cluster communication needs to be monitored and consistency maintained
#3 Updated by Chris Jones almost 12 years ago
- Description updated
#4 Updated by Chris Jones almost 12 years ago
- Description updated
#5 Updated by Skye Roseboom over 11 years ago
- Target version changed from 2013.2-Block.1.1 to 2013.12-Block.2.2
- Due date changed from 2013-01-19 to 2013-03-30
#6 Updated by Skye Roseboom over 11 years ago
- Milestone changed from CCI-1.1.1 to CCI-1.2
- Status changed from New to In Progress
#7 Updated by Skye Roseboom over 11 years ago
- Target version changed from 2013.12-Block.2.2 to 2013.16-Block.2.4
- Due date changed from 2013-03-30 to 2013-04-27
#8 Updated by Skye Roseboom over 11 years ago
- Target version changed from 2013.16-Block.2.4 to 2013.30-Block.4.3
- Due date changed from 2013-04-27 to 2013-08-03
#9 Updated by Skye Roseboom over 11 years ago
- Target version changed from 2013.30-Block.4.3 to 2013.35-Block.5.1
- Due date changed from 2013-08-03 to 2013-09-07
#10 Updated by Skye Roseboom about 11 years ago
- Due date changed from 2013-09-07 to 2013-10-26
- Target version changed from 2013.35-Block.5.1 to 2013.42-Block.5.4
#11 Updated by Chris Jones almost 11 years ago
- Due date changed from 2013-10-26 to 2014-02-15
- Target version changed from 2013.42-Block.5.4 to 2014.6-Block.1.3
#12 Updated by Robert Waltz over 10 years ago
- Subject changed from CN cluster communication needs to be monitored and consistency maintained to CN cluster communication needs to be monitored
#13 Updated by Skye Roseboom over 10 years ago
- Due date changed from 2014-02-15 to 2014-04-12
- Target version changed from 2014.6-Block.1.3 to 2014.14-Block.2.3
#14 Updated by Skye Roseboom over 10 years ago
- Due date changed from 2014-04-12 to 2014-04-26
- Target version changed from 2014.14-Block.2.3 to 2014.16-Block.2.4
#15 Updated by Skye Roseboom over 10 years ago
- Due date changed from 2014-04-26 to 2014-05-10
- Target version changed from 2014.16-Block.2.4 to 2014.18-Block.3.1
#16 Updated by Robert Waltz about 10 years ago
- Due date changed from 2014-05-10 to 2014-09-24
- Target version changed from 2014.18-Block.3.1 to Maintenance Backlog
#17 Updated by Skye Roseboom about 10 years ago
- Status changed from In Progress to Closed
Monitoring is performed with Splunk log monitoring and email notifications.