<p><strong>DataONE Tasks: Infrastructure - Story #8525: timeout exceptions thrown from Hazelcast disable synchronization</strong></p>
<p><em>Updated by Rob Nahf (rnahf@epscor.unm.edu) on 2018-03-27T22:46:29Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=30126">journal 30126</a>)</p>
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-5 priority-4 priority-default closed" href="/issues/7706">Bug #7706</a>: Hazelcast Runtime exception halts synchronization</i> added</li></ul>
<p><em>Updated by Rob Nahf (rnahf@epscor.unm.edu) on 2018-03-28T17:10:02Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=30129">journal 30129</a>)</p>
<p>We're going to need to know what type of exception is the cause. It seems to be a generic RuntimeException; see the source code: <a href="https://github.com/hazelcast/hazelcast/blob/maintenance-2.x/hazelcast/src/main/java/com/hazelcast/impl/ClientServiceException.java">https://github.com/hazelcast/hazelcast/blob/maintenance-2.x/hazelcast/src/main/java/com/hazelcast/impl/ClientServiceException.java</a></p>
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-13T00:09:31Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31076">journal 31076</a>)</p>
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Urgent</i></li></ul><p>We are seeing this issue with increased frequency, possibly related to the volume of content held in the map.</p>
<p>In the short term, this service needs to be configured to retry on timeout to prevent synchronization from shutting down on a single timeout.</p>
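<p>A minimal sketch of what such a retry wrapper could look like (a hypothetical helper, not existing d1_synchronization code; it assumes Hazelcast client timeouts surface as <code>RuntimeException</code>, as in the trace below):</p>

```java
import java.util.concurrent.Callable;

// Hypothetical retry helper (not existing code): re-attempts an operation a
// few times on RuntimeException -- which is how the Hazelcast 2.x client
// surfaces timeouts -- before rethrowing, so that a single timeout does not
// halt the sync daemon.
public class RetryOnTimeout {

    public static <T> T withRetries(Callable<T> op, int maxAttempts, long backoffMs)
            throws Exception {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (RuntimeException e) {
                last = e;
                Thread.sleep(backoffMs * attempt); // linear backoff between attempts
            }
        }
        throw last; // all attempts failed; propagate the final timeout
    }
}
```

The sync loop would then wrap calls like <code>containsKey</code> in <code>withRetries(...)</code> instead of letting the first <code>RuntimeException</code> propagate out of <code>SyncObjectTask.call</code>.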
<p>The longer-term solution involves deprecating Hazelcast for inter-CN synchronization of content. A typical error is:</p>
<pre>[ERROR] 2018-11-12 21:17:34,239 [ProcessDaemonTask1] (SyncObjectTaskManager:run:84) java.util.concurrent.ExecutionException: java.lang.RuntimeException: [CONCURRENT_MAP_CONTAINS_KEY] Operation Timeout (with no response!): 0
java.util.concurrent.ExecutionException: java.lang.RuntimeException: [CONCURRENT_MAP_CONTAINS_KEY] Operation Timeout (with no response!): 0
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.dataone.cn.batch.synchronization.SyncObjectTaskManager.run(SyncObjectTaskManager.java:76)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: [CONCURRENT_MAP_CONTAINS_KEY] Operation Timeout (with no response!): 0
    at com.hazelcast.impl.ClientServiceException.readData(ClientServiceException.java:63)
    at com.hazelcast.nio.Serializer$DataSerializer.read(Serializer.java:104)
    at com.hazelcast.nio.Serializer$DataSerializer.read(Serializer.java:79)
    at com.hazelcast.nio.AbstractSerializer.toObject(AbstractSerializer.java:121)
    at com.hazelcast.nio.AbstractSerializer.toObject(AbstractSerializer.java:156)
    at com.hazelcast.client.ClientThreadContext.toObject(ClientThreadContext.java:72)
    at com.hazelcast.client.IOUtil.toObject(IOUtil.java:34)
    at com.hazelcast.client.ProxyHelper.getValue(ProxyHelper.java:186)
    at com.hazelcast.client.ProxyHelper.doOp(ProxyHelper.java:146)
    at com.hazelcast.client.ProxyHelper.doOp(ProxyHelper.java:140)
    at com.hazelcast.client.MapClientProxy.containsKey(MapClientProxy.java:219)
    at org.dataone.cn.batch.synchronization.type.AbstractListenableMapAdapter.containsKey(AbstractListenableMapAdapter.java:49)
    at org.dataone.cn.batch.synchronization.type.SyncQueueFacade.poll(SyncQueueFacade.java:220)
    at org.dataone.cn.batch.synchronization.tasks.SyncObjectTask.call(SyncObjectTask.java:131)
    at org.dataone.cn.batch.synchronization.tasks.SyncObjectTask.call(SyncObjectTask.java:73)
    ... 2 more
</pre>
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-13T17:10:42Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31078">journal 31078</a>)</p>
<p>Just noticed that in <code>/etc/dataone/storage/hazelcast.xml</code> the number of backups is set to 3, which means HZ is trying to make 4 copies of everything (self + 3 backups). That of course can't work, since there are only 3 CNs.</p>
<p><code>backup-count</code> should be set to <code>2</code> for both <code>hzSystemMetadata</code> and <code>hzObjectPath</code> (in fact, for all structures).</p>
<p>Additional suggestions:</p>
<p>Allow read from the backup copies. Backups are created synchronously, so a backup copy will be consistent with the owner's copy. Hence it should be safe to read a backup copy instead of always requesting the owner's copy which is the default.</p>
<pre><read-backup-data>true</read-backup-data>
</pre>
<p>The eviction policy is currently set to dump 25% of the entries when the maximum of 3,000,000 is reached. This seems completely arbitrary and is likely using a huge amount of memory. Instead, we could set the limit much lower and rely on efficient population of the map on demand from postgres. Suggest keeping 5000 entries in memory:</p>
<pre><max-size policy="cluster_wide_map_size">5000</max-size>
</pre>
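<p>Taken together, the suggested map settings might look something like this in <code>hazelcast.xml</code>. This is a sketch against the Hazelcast 2.x schema (element names per the 2.4.1 manual); the <code>LRU</code> eviction policy shown is an assumption, and the values come from the suggestions above:</p>
<pre><map name="hzSystemMetadata">
  <backup-count>2</backup-count>
  <read-backup-data>true</read-backup-data>
  <eviction-policy>LRU</eviction-policy>
  <eviction-percentage>25</eviction-percentage>
  <max-size policy="cluster_wide_map_size">5000</max-size>
</map>
</pre>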
<p>On startup, <code>MapLoader.loadAllKeys</code> is used to pre-populate the hazelcast map. However, entries are also loaded on demand when a key (i.e. PID) is requested. Is there any benefit to pre-populating the map when there is little chance that the populated entries will be relevant to the current operations? Suggest changing <code>MapLoader.loadAllKeys</code> to return NULL. Then entries will be loaded on demand. This would be a change in Metacat.</p>
<p>See: <a href="https://docs.hazelcast.org/docs/2.4.1/manual/single_html/index.html#Persistence">https://docs.hazelcast.org/docs/2.4.1/manual/single_html/index.html#Persistence</a></p>
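<p>As a sketch, the Metacat-side change could look like the class below. The method signatures follow Hazelcast's <code>MapLoader</code> interface; the class name and DB lookup are stand-ins (the real implementation is Metacat's <code>SystemMetadataMap</code>, which reads from postgres):</p>

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;

// Sketch of a Hazelcast 2.x MapLoader-style class. The method names follow
// the MapLoader interface; the bodies are placeholders for Metacat's real
// SystemMetadataMap, which loads system metadata from postgres.
public class SystemMetadataLoaderSketch {

    // Cache miss: load a single entry on demand (DB query stubbed here).
    public String load(String pid) {
        return "sysmeta:" + pid; // placeholder for the postgres lookup
    }

    // Bulk variant of load(); returning null forces per-key load() calls.
    public Map<String, String> loadAll(Collection<String> pids) {
        return null;
    }

    // Returning null here tells Hazelcast not to pre-populate the map at
    // startup; entries are then loaded lazily as keys are requested.
    public Set<String> loadAllKeys() {
        return null;
    }
}
```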
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-15T18:27:56Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31088">journal 31088</a>)</p>
<p>WAN replication is available in the version of Hazelcast we are using (2.4.1), though it becomes an enterprise "feature" in later versions. If we are planning to drop Hazelcast for inter-CN sync, then enabling WAN replication may buy us some stability before we need to change technologies.</p>
<p><a href="https://docs.hazelcast.org/docs/2.4.1/manual/single_html/index.html#WanReplication">https://docs.hazelcast.org/docs/2.4.1/manual/single_html/index.html#WanReplication</a></p>
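<p>For reference, the shape of a 2.x WAN replication config, adapted from the example in the 2.4.1 manual linked above. Cluster names, passwords, addresses, and the map name are placeholders:</p>
<pre><wan-replication name="cn-wan-cluster">
  <target-cluster group-name="cn-target" group-password="cn-target-pass">
    <replication-impl>com.hazelcast.impl.wan.WanNoDelayReplication</replication-impl>
    <end-points>
      <address>10.0.0.2:5701</address>
    </end-points>
  </target-cluster>
</wan-replication>

<map name="hzSystemMetadata">
  <wan-replication-ref name="cn-wan-cluster">
    <merge-policy>hz.PASS_THROUGH</merge-policy>
  </wan-replication-ref>
</map>
</pre>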
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-15T18:34:07Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31089">journal 31089</a>)</p>
<p>With respect to <code>MapLoader.loadAllKeys</code>, it appears this has already been done:</p>
<p><a href="https://github.com/NCEAS/metacat/blob/master/src/edu/ucsb/nceas/metacat/dataone/hazelcast/SystemMetadataMap.java#L114">https://github.com/NCEAS/metacat/blob/master/src/edu/ucsb/nceas/metacat/dataone/hazelcast/SystemMetadataMap.java#L114</a></p>
<p><em>Updated by Rob Nahf (rnahf@epscor.unm.edu) on 2018-11-15T19:08:21Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31090">journal 31090</a>)</p>
<p>Reducing the number of backups seems like a no-brainer.</p>
<p>Regarding pre-population and how many entries to hold in memory, I think we need to consider the impact on lookup performance and the hit distribution. Reducing from 750k entries in memory to 5k really ups the odds that a db lookup + serialization will be needed.</p>
<p>If 95% of the time an HZ map lookup is going to take the same amount of time as a CN.getSystemMetadata call, it may not be worth using hazelcast at all.</p>
<p>Ideally we could pre-populate the map (upon startup) with items likely to be needed, based on archive status and modification date. </p>
<p>I also think that, because of how we index OREs, we need a larger cache, or else a large ORE will cycle the cache for clients. We didn't seem to have pre-loading or stability issues until recently, so I would suggest keeping 100k-200k entries in memory. That's still at least 35% of what we are currently using, and would scale when going from 3 CNs to 2 CNs.</p>
<p>Given the current analysis of xerces processing time for the view service, maybe there's a similar performance drag affecting systemMetadata serialization. JAXB serialization may also be slower than JiBX.</p>
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-16T17:16:04Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31091">journal 31091</a>)</p>
<p>The following changes have been deployed on production:</p>
<ol>
<li><code><backup-count>2</backup-count></code></li>
<li><code><max-size policy="cluster_wide_map_size">50000</max-size></code></li>
<li><code><read-backup-data>true</read-backup-data></code></li>
</ol>
<p>After the tomcat restart, sync appears stable (previously, sync was failing with timeouts after an hour or so).</p>
<p>Performance of <code>getSystemMetadata</code>, <code>getObject</code>, and the view service appear unaffected.</p>
<p>With respect to Rob's comments: the reduction was from 3,000,000 entries held in memory. I dropped it down to 50k, which may be on the low side, but the primary goal is to get synchronization stable first.</p>
<p>The system metadata map is used to mirror sysmeta between the CNs. An alternative is to use postgres replication, but that would involve major refactoring.</p>
<p>We need to do more profiling to determine what the bottlenecks are for basic operations like <code>getSystemMetadata</code>. I suspect authorization is the main blocker, but we really need to measure.</p>
<p>With respect to JAXB/JiBX and xerces: this may well be a bottleneck as well; moving in and out of XML is pretty slow.</p>
<p><em>Updated by Dave Vieglais (dave.vieglais@gmail.com) on 2018-11-16T17:25:12Z</em> (<a href="https://redmine.dataone.org/issues/8525?journal_id=31092">journal 31092</a>)</p>
<ul><li><strong>% Done</strong> changed from <i>0</i> to <i>30</i></li><li><strong>Priority</strong> changed from <i>Urgent</i> to <i>High</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul><p>Adding tasks to ensure the config changes are persisted across updates.</p>