Story #3375

Updated by Chris Jones about 12 years ago

We've seen several recurring issues in each environment: Metacat's PostgreSQL system metadata tables drift out of sync, replication suffers from inconsistent replica statuses across CNs, and the hzIdentifiers set iterator intermittently fails to iterate through the entire Set. One major factor may be communication problems between Hazelcast cluster instances, which show up in the catalina.out logs as:
<pre>
WARNING: /64.106.40.7:5701 [DataONE] hz.1.InThread Closing socket to endpoint Address[160.36.13.152:5701], Cause:java.io.EOFException
</pre>
These are known issues in the Hazelcast forum and issue list for 1.9.X, and the recommended fix is to upgrade to Hazelcast 2.X, where the connection framework has been significantly rewritten. This story documents the components that need to be modified to handle the 2.X API changes.
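To gauge how often these socket closures occur in a given environment, and between which cluster members, the warning shown above can be tallied from catalina.out. The following is a minimal sketch, not part of the planned refactoring: it assumes every warning follows exactly the format of the single line quoted above, and the function name @count_eof_closures@ and the sample lines are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: all warnings match the one example line quoted in this story.
PATTERN = re.compile(
    r"WARNING: /(?P<local>[\d.]+:\d+) \[(?P<group>[^\]]+)\] \S+ "
    r"Closing socket to endpoint Address\[(?P<remote>[\d.]+:\d+)\], "
    r"Cause:java\.io\.EOFException"
)


def count_eof_closures(lines):
    """Count EOFException socket closures per (local, remote) address pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("local"), match.group("remote"))] += 1
    return counts


# Sample input: the warning from this story, repeated, plus an unrelated line.
SAMPLE = [
    "WARNING: /64.106.40.7:5701 [DataONE] hz.1.InThread Closing socket to "
    "endpoint Address[160.36.13.152:5701], Cause:java.io.EOFException",
    "INFO: some unrelated log line",
    "WARNING: /64.106.40.7:5701 [DataONE] hz.1.InThread Closing socket to "
    "endpoint Address[160.36.13.152:5701], Cause:java.io.EOFException",
]

if __name__ == "__main__":
    # Read the log file given as the first argument, else fall back to SAMPLE.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            result = count_eof_closures(handle)
    else:
        result = count_eof_closures(SAMPLE)
    for (local, remote), n in result.most_common():
        print(f"{local} -> {remote}: {n}")
```

Running the script against the catalina.out from each CN before and after the upgrade would give a rough comparison of how frequently connections between members are dropped.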

The plan is to use the Hazelcast 2.4.x series. However, there is an outstanding HazelcastClient connection bug (see https://github.com/hazelcast/hazelcast/issues/315) that affects all versions of Hazelcast from 1.9.3 to 2.4. It is fixed in 2.4.1, which has not been released yet. In the meantime, we will use Hudson to build Hazelcast 2.4.1 from the tag and refactor the code against that build, then switch to the official 2.4.1 release once the Hazelcast group pushes it to Maven Central.
