Bug #7631
synchronization not processing v1 Member Nodes
100%
Description
Following up on issue with EDACGSTORE; Marco "found that cn.dataone.org synchronization is only happening for KNB and GOA since about 2016-01-28 10:05:00 GMT"
KNB and GOA are v2 Member Nodes. No other nodes are synchronizing.
Subtasks
Related issues
History
#1 Updated by Rob Nahf almost 9 years ago
The stack trace shows all threads (more than 20) blocked waiting for the RestClient monitor, which is held by a thread inside RestClient.doRequestNoBody, a synchronized method.
All NodeCommunication instances for the same version share one RestClient, so a single hung call inside the synchronized doRequestNoBody method can block all communication to v1 nodes.
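The blocking mechanism described above can be reproduced without any network code. The following is a minimal sketch, not the libclient implementation: `SharedClientDemo`, `hungRequest`, and `quickRequest` are hypothetical names, and a latch stands in for the socket read that never returns. One difference from the real trace: the lock holder here parks in `await()` rather than sitting RUNNABLE in a native socket read, but the second caller is affected the same way.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a shared REST client; this is NOT the real RestClient.
class SharedClientDemo {

    // Simulates a request that hangs until `release` is counted down.
    // Because the method is synchronized, the caller holds this object's
    // monitor for the entire duration of the call.
    synchronized void hungRequest(CountDownLatch entered, CountDownLatch release)
            throws InterruptedException {
        entered.countDown();   // signal that the monitor is now held
        release.await();       // stand-in for a socket read that never times out
    }

    // A second, unrelated request on the same instance.
    synchronized String quickRequest() {
        return "ok";
    }

    // Returns the state observed for a second caller while the first call hangs.
    static Thread.State observeSecondCallerState() throws InterruptedException {
        SharedClientDemo client = new SharedClientDemo();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread hung = new Thread(() -> {
            try {
                client.hungRequest(entered, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hung-call");
        Thread waiter = new Thread(client::quickRequest, "second-caller");

        hung.start();
        entered.await();       // the first thread now owns the monitor
        waiter.start();

        // Poll until the second thread reaches the monitor (give up after ~5 s).
        Thread.State state = waiter.getState();
        for (int i = 0; i < 500 && state != Thread.State.BLOCKED; i++) {
            Thread.sleep(10);
            state = waiter.getState();
        }

        release.countDown();   // let the "hung" call finish so both threads exit
        hung.join();
        waiter.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeSecondCallerState());
    }
}
```

The second caller is expected to be observed in `Thread.State.BLOCKED`, the same state the waiting worker threads report in the dump below.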
We use timeouts, but HttpClient 4.3.3 apparently can fail to time out; see https://issues.apache.org/jira/browse/HTTPCLIENT-1478.
Now that RestClient passes a RequestConfiguration in with each call, there is less reason to keep the methods synchronized, and removing the synchronization minimizes the effect of a hung call in multithreaded applications.
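As an illustration of why per-call configuration makes the lock unnecessary, here is a minimal sketch under stated assumptions: `CallSettings` and `UnsynchronizedClientDemo` are hypothetical names, not libclient classes, and a latch again stands in for the hung network call. Because each call receives its own immutable settings and the method touches no shared mutable state, a hung call ties up only its own thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical immutable per-call settings, standing in for the request
// configuration that is passed in with each call.
final class CallSettings {
    final int connectTimeoutMs;
    final int socketTimeoutMs;

    CallSettings(int connectTimeoutMs, int socketTimeoutMs) {
        this.connectTimeoutMs = connectTimeoutMs;
        this.socketTimeoutMs = socketTimeoutMs;
    }
}

// Hypothetical client; this is NOT the real RestClient.
class UnsynchronizedClientDemo {

    // Not synchronized: everything the call needs arrives as arguments,
    // so concurrent callers do not contend for this object's monitor.
    String request(CallSettings settings, CountDownLatch hangUntil)
            throws InterruptedException {
        if (hangUntil != null) {
            hangUntil.await();   // stand-in for a call that never times out
        }
        return "ok (socket timeout " + settings.socketTimeoutMs + " ms)";
    }

    // Returns true if a second call finishes while the first is still hung.
    static boolean secondCallCompletesWhileFirstHangs() throws InterruptedException {
        UnsynchronizedClientDemo client = new UnsynchronizedClientDemo();
        CallSettings settings = new CallSettings(5000, 30000);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch secondDone = new CountDownLatch(1);

        Thread hung = new Thread(() -> {
            try {
                started.countDown();
                client.request(settings, release);   // hangs until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hung-call");
        Thread second = new Thread(() -> {
            try {
                client.request(settings, null);      // returns immediately
                secondDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "second-caller");

        hung.start();
        started.await();
        second.start();
        boolean completed = secondDone.await(5, TimeUnit.SECONDS);

        release.countDown();     // clean up the hung thread
        hung.join();
        second.join();
        return completed;
    }
}
```

This does not fix the underlying failure to time out; it only limits the damage to the one thread making the hung call.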
Attempts to recreate the hung HttpClient call have failed (trying to call CLO eBird.)
"SynchronizeTask910" daemon prio=10 tid=0x00007ff8c0009800 nid=0x61f1 waiting for monitor entry [0x00007ff964fce000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.dataone.client.rest.RestClient.doRequestMMBody(RestClient.java:245)
- waiting to lock (a org.dataone.client.rest.RestClient)
at org.dataone.client.rest.RestClient.doPostRequest(RestClient.java:192)
"SynchronizationQuartzScheduler_Worker-35" prio=10 tid=0x0000000002df5000 nid=0x2e15 waiting for monitor entry [0x00007ff98dddc000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.dataone.client.rest.RestClient.doRequestNoBody(RestClient.java:212)
- waiting to lock (a org.dataone.client.rest.RestClient)
at org.dataone.client.rest.RestClient.doGetRequest(RestClient.java:148)
"SynchronizationQuartzScheduler_Worker-38" prio=10 tid=0x0000000002dfb000 nid=0x2e18 runnable [0x00007ff98dad8000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554)
at sun.security.ssl.InputRecord.read(InputRecord.java:509)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:946)
- locked (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
- locked (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:117)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.dataone.client.rest.RestClient.doRequest(RestClient.java:294)
at org.dataone.client.rest.RestClient.doRequestNoBody(RestClient.java:230) <-- a synchronized method
- locked (a org.dataone.client.rest.RestClient)
at org.dataone.client.rest.RestClient.doGetRequest(RestClient.java:148)
at org.dataone.client.rest.HttpMultipartRestClient.doGetRequest(HttpMultipartRestClient.java:329)
at org.dataone.client.rest.HttpMultipartRestClient.doGetRequest(HttpMultipartRestClient.java:318)
at org.dataone.client.v1.impl.MultipartMNode.listObjects(MultipartMNode.java:214)
#2 Updated by Rob Nahf almost 9 years ago
- Related to Task #7633: refactor libclient to remove synchronized from the 2 key methods added
#3 Updated by Rob Nahf almost 9 years ago
- Status changed from In Progress to Testing
- % Done changed from 30 to 50
Tasks for operations are complete. This turned out to be a non-testable bug, so the solution in libclient is inferred to be correct based on the overlap of symptoms and stack traces.
#4 Updated by Dave Vieglais almost 9 years ago
- Status changed from Testing to Closed
- % Done changed from 50 to 100