Project

General

Profile

Bug #7644

LDAP Resource contention leads to InterruptedNamingException

Added by Robert Waltz almost 9 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Robert Waltz
Category:
d1_cn_noderegistry
Start date:
2016-02-16
Due date:
% Done:

100%

Story Points:
Sprint:

Description

Noticed a lot of Exceptions in the cn-process-daemon.log file in /var/log/dataone/daemon directory that contains the text 'Interrupted during LDAP operation'

The exception is thrown from this class: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/com/sun/jndi/ldap/Connection.java?av=f#477

[ERROR] 2016-02-16 18:49:01,059 (ReplicaAuditingDelegate:isCNodeReplica:126) Unable to get node from node registry service for node ref: urn:node:CN
org.dataone.service.exceptions.ServiceFailure: Interrupted during LDAP operation
at org.dataone.service.cn.impl.v2.NodeRegistryService.getNode(NodeRegistryService.java:161)
[ERROR] 2016-02-16 11:23:13,128 (ReplicaAuditingDelegate:isCNodeReplica:126) Unable to get node from node registry service for node ref: urn:node:PISCO
org.dataone.service.exceptions.ServiceFailure: Interrupted during LDAP operation
at org.dataone.service.cn.impl.v2.NodeRegistryService.getNode(NodeRegistryService.java:161)
[ERROR] 2016-02-16 11:29:14,620 (ReplicaAuditingDelegate:isCNodeReplica:126) Unable to get node from node registry service for node ref: urn:node:SANPARKS
org.dataone.service.exceptions.ServiceFailure: Interrupted during LDAP operation

[ERROR] 2016-02-16 11:18:29,659 (StdOutErrLog$2:print:51) 31336114 [SynchronizeTask10] ERROR org.dataone.cn.ldap.LDAPService - Could not set up TLS connection, using non-secure communication
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) javax.naming.InterruptedNamingException: Interrupted during LDAP operation
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.Connection.readReply(Connection.java:488)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapClient.extendedOp(LdapClient.java:1204)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapCtx.extendedOperation(LdapCtx.java:3207)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.cn.ldap.LDAPService.getContext(LDAPService.java:79)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.service.cn.impl.v2.ReserveIdentifierService.lookupReservedIdentifiers(ReserveIdentifierService.java:478)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.service.cn.impl.v2.ReserveIdentifierService.expireEntries(ReserveIdentifierService.java:412)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.service.cn.impl.v2.ReserveIdentifierService.hasReservation(ReserveIdentifierService.java:299)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.cn.batch.synchronization.NodeCommSyncObjectFactory$1.hasReservation(NodeCommSyncObjectFactory.java:155)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.cn.batch.synchronization.tasks.V2TransferObjectTask.processNewObject(V2TransferObjectTask.java:418)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.cn.batch.synchronization.tasks.V2TransferObjectTask.processTask(V2TransferObjectTask.java:380)
[ERROR] 2016-02-16 11:18:29,661 (StdOutErrLog$2:print:51) at org.dataone.cn.batch.synchronization.tasks.V2TransferObjectTask.call(V2TransferObjectTask.java:197)
[ERROR] 2016-02-16 11:18:29,662 (StdOutErrLog$2:print:51) at org.dataone.cn.batch.synchronization.tasks.V2TransferObjectTask.call(V2TransferObjectTask.java:108)
[ERROR] 2016-02-16 11:18:29,662 (StdOutErrLog$2:print:51) at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[ERROR] 2016-02-16 11:18:29,662 (StdOutErrLog$2:print:51) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[ERROR] 2016-02-16 11:18:29,662 (StdOutErrLog$2:print:51) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[ERROR] 2016-02-16 11:18:29,662 (StdOutErrLog$2:print:51) at java.lang.Thread.run(Thread.java:745)

[ERROR] 2016-02-16 12:00:22,956 (StdOutErrLog$2:print:51) 33849411 [pool-11-thread-4] ERROR org.dataone.cn.ldap.ServiceMethodRestrictionsAccess - Problem search Nodes for Nodelist
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) javax.naming.InterruptedNamingException: Interrupted during LDAP operation; remaining name 'dc=org'
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.Connection.readReply(Connection.java:488)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:640)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapClient.search(LdapClient.java:563)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapCtx.doSearch(LdapCtx.java:1986)
ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapCtx.searchAux(LdapCtx.java:1848)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.ldap.LdapCtx.c_search(LdapCtx.java:1773)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(ComponentDirContext.java:386)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:356)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:339)
[ERROR] 2016-02-16 12:00:22,957 (StdOutErrLog$2:print:51) at javax.naming.directory.InitialDirContext.search(InitialDirContext.java:267)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.cn.ldap.ServiceMethodRestrictionsAccess.getServiceMethodRestrictionList(ServiceMethodRestrictionsAccess.java:126)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.impl.v2.NodeRegistryService.listNodes(NodeRegistryService.java:117)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.replication.auditor.v1.strategy.CoordinatingNodeReplicaAuditingStrategy.auditCNodeReplicas(CoordinatingNodeReplicaAuditingStrategy.java:114)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.replication.auditor.v1.strategy.CoordinatingNodeReplicaAuditingStrategy.auditPid(CoordinatingNodeReplicaAuditingStrategy.java:104)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.replication.auditor.v1.strategy.CoordinatingNodeReplicaAuditingStrategy.auditPids(CoordinatingNodeReplicaAuditingStrategy.java:86)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.replication.auditor.v1.task.CoordinatingNodeReplicaAuditTask.call(CoordinatingNodeReplicaAuditTask.java:55)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at org.dataone.service.cn.replication.auditor.v1.task.CoordinatingNodeReplicaAuditTask.call(CoordinatingNodeReplicaAuditTask.java:39)
[ERROR] 2016-02-16 12:00:22,958 (StdOutErrLog$2:print:51) at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[ERROR] 2016-02-16 12:00:22,959 (StdOutErrLog$2:print:51) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[ERROR] 2016-02-16 12:00:22,959 (StdOutErrLog$2:print:51) at java.lang.Thread.run(Thread.java:745)
ERROR] 2016-02-16 12:00:22,960 (CoordinatingNodeReplicaAuditingStrategy:auditCNodeReplicas:228) UNable to get node list from node registry service
org.dataone.service.exceptions.ServiceFailure: Interrupted during LDAP operation
at org.dataone.cn.ldap.ServiceMethodRestrictionsAccess.getServiceMethodRestrictionList(ServiceMethodRestrictionsAccess.java:158)
at org.dataone.service.cn.impl.v2.NodeRegistryService.listNodes(NodeRegistryService.java:117)
at org.dataone.service.cn.replication.auditor.v1.strategy.CoordinatingNodeReplicaAuditingStrategy.auditCNodeReplicas(CoordinatingNodeReplicaAuditingStrategy.java:114)
at org.dataone.service.cn.replication.auditor.v1.strategy.CoordinatingNodeReplicaAuditingStrategy.auditPid(CoordinatingNodeReplicaAuditingStrategy.java:104)

Associated revisions

Revision 17872
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Trying to determine why downstream projects are not testing out appropriately. In doing so, i have updated the tests associated with ApacheDSSuiteRunner. Upgraded to the latest version of ApacheDS. The upgrade led to several other modifications since ApacheDS not longer assumes 'localhost' is the default address for its Transport Layer, rather its gets the default from InetAddress.getLocalHost().getHostName() and in the future will use its own abstraction: Network.LOOPBACK_HOSTNAME.

Revision 17872
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Trying to determine why downstream projects are not testing out appropriately. In doing so, i have updated the tests associated with ApacheDSSuiteRunner. Upgraded to the latest version of ApacheDS. The upgrade led to several other modifications since ApacheDS not longer assumes 'localhost' is the default address for its Transport Layer, rather its gets the default from InetAddress.getLocalHost().getHostName() and in the future will use its own abstraction: Network.LOOPBACK_HOSTNAME.

Revision 17873
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Trying to determine why downstream projects are not testing out appropriately. In doing so, i have updated the tests associated with ApacheDSSuiteRunner. Upgraded to the latest version of ApacheDS. The upgrade led to several other modifications since ApacheDS not longer assumes 'localhost' is the default address for its Transport Layer, rather its gets the default from InetAddress.getLocalHost().getHostName() and in the future will use its own abstraction: Network.LOOPBACK_HOSTNAME.

Revision 17873
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Trying to determine why downstream projects are not testing out appropriately. In doing so, i have updated the tests associated with ApacheDSSuiteRunner. Upgraded to the latest version of ApacheDS. The upgrade led to several other modifications since ApacheDS not longer assumes 'localhost' is the default address for its Transport Layer, rather its gets the default from InetAddress.getLocalHost().getHostName() and in the future will use its own abstraction: Network.LOOPBACK_HOSTNAME.

Revision 17930
Added by Robert Waltz over 8 years ago

refs #7644

Added documentation about how to generate ApacheDS schema from OpenLDAP Schema. Regenerated the dataone-schema to test it out. Added access control at the dc=org level and reduced the number of rule sets.

Revision 17930
Added by Robert Waltz over 8 years ago

refs #7644

Added documentation about how to generate ApacheDS schema from OpenLDAP Schema. Regenerated the dataone-schema to test it out. Added access control at the dc=org level and reduced the number of rule sets.

Revision 17931
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Added in apache commons pool to manage LDAP context connections. Refactored LdapService to remove from the base class instantiation of dircontext.

Added in unit tests for pooling mechanism

Revision 17931
Added by Robert Waltz over 8 years ago

refs #7644

LDAP Resource contention leads to InterruptedNamingException. Added in apache commons pool to manage LDAP context connections. Refactored LdapService to remove from the base class instantiation of dircontext.

Added in unit tests for pooling mechanism

Revision 17932
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Abstracted NodeRegistryService interface from implementation, hopefully for use in future rewrites that will decouple dependency on the NodeRegistry implementation. Added in NodeFacade that acts as a public abstraction layer over the LDAP DAO classes.

Revision 17932
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Abstracted NodeRegistryService interface from implementation, hopefully for use in future rewrites that will decouple dependency on the NodeRegistry implementation. Added in NodeFacade that acts as a public abstraction layer over the LDAP DAO classes.

Revision 17933
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Abstracted NodeRegistryService interface from implementation, hopefully for use in future rewrites that will decouple dependency on the NodeRegistry implementation. Added in NodeFacade that acts as a public abstraction layer over the LDAP DAO classes.

Revision 17933
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Abstracted NodeRegistryService interface from implementation, hopefully for use in future rewrites that will decouple dependency on the NodeRegistry implementation. Added in NodeFacade that acts as a public abstraction layer over the LDAP DAO classes.

Revision 17934
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Refactored Unit Testing to use ApacheDS.

Revision 17934
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Refactored Unit Testing to use ApacheDS.

Revision 17935
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Refactored Unit Testing to use ApacheDS. Added in additional test from earlier commit that I did not catch in an update.

Revision 17935
Added by Robert Waltz over 8 years ago

refs #7644

Refactored library to use pooling of ldap contexts. Refactored Unit Testing to use ApacheDS. Added in additional test from earlier commit that I did not catch in an update.

Revision 17936
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism. added in Facade to include synchronization only LDAP interactions. Moved and renamed service interfaces.

Revision 17936
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism. added in Facade to include synchronization only LDAP interactions. Moved and renamed service interfaces.

Revision 17937
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism. added in Facade to include log aggregation only LDAP interactions.

Revision 17937
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism. added in Facade to include log aggregation only LDAP interactions.

Revision 17938
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17938
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17939
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17939
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17940
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17940
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17941
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17941
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17942
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17942
Added by Robert Waltz over 8 years ago

refs #7644

Updated to use new LDAP pooling mechanism.

Revision 17943
Added by Robert Waltz over 8 years ago

refs #7644

increased a couple default pooling property values

Revision 17943
Added by Robert Waltz over 8 years ago

refs #7644

increased a couple default pooling property values

Revision 17945
Added by Robert Waltz over 8 years ago

refs #7644

wrong package name

Revision 17945
Added by Robert Waltz over 8 years ago

refs #7644

wrong package name

Revision 17946
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts

Revision 17946
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts

Revision 17947
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close is being called but threads and connections are not actually closed.

Revision 17947
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close is being called but threads and connections are not actually closed.

Revision 17948
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Changing default values to test connection close problem.

Revision 17948
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Changing default values to test connection close problem.

Revision 17949
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. forcing a close, even though documentation states is not necessary.

Revision 17949
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. forcing a close, even though documentation states is not necessary.

Revision 17952
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Try to suppress the running of idle object eviction threads.

Revision 17952
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Try to suppress the running of idle object eviction threads.

Revision 17953
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close tls connection before closing dirContext.

Revision 17953
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close tls connection before closing dirContext.

Revision 17955
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close tls connection after closing dirContext. Getting a lot of communication exceptions.

Revision 17955
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Close tls connection after closing dirContext. Getting a lot of communication exceptions.

Revision 17956
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Still getting Comm exceptions due to TLS close. Try to remove listener that prints out the message before closing TLS.

Revision 17956
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. Still getting Comm exceptions due to TLS close. Try to remove listener that prints out the message before closing TLS.

Revision 17957
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. was not removing entries from a map after connections closed. changed info level log messages to debug.

Revision 17957
Added by Robert Waltz over 8 years ago

refs #7644

something is spawning off threads creating ldap dir contexts. was not removing entries from a map after connections closed. changed info level log messages to debug.

Revision 17959
Added by Robert Waltz over 8 years ago

refs #7644

Missed a spot in SessionAuthorizationFilterStrategy

Revision 17959
Added by Robert Waltz over 8 years ago

refs #7644

Missed a spot in SessionAuthorizationFilterStrategy

Revision 17960
Added by Robert Waltz over 8 years ago

refs #7644

Refactored for use of Apache Directory Server Tests.

Revision 17960
Added by Robert Waltz over 8 years ago

refs #7644

Refactored for use of Apache Directory Server Tests.

Revision 17961
Added by Robert Waltz over 8 years ago

refs #7644

Cleaning up a few imports

Revision 17961
Added by Robert Waltz over 8 years ago

refs #7644

Cleaning up a few imports

Revision 18044
Added by Robert Waltz over 8 years ago

refs #7644

verification method was not reporting when the LDAP connection has become stale.

Revision 18044
Added by Robert Waltz over 8 years ago

refs #7644

verification method was not reporting when the LDAP connection has become stale.

Revision 18052
Added by Robert Waltz over 8 years ago

refs #7644

Oops. completed NodeRegistryLogAggregationServiceImpl by calling respective methods in NodeRegistryLogAggregationFacade.

Revision 18052
Added by Robert Waltz over 8 years ago

refs #7644

Oops. completed NodeRegistryLogAggregationServiceImpl by calling respective methods in NodeRegistryLogAggregationFacade.

Revision 18053
Added by Robert Waltz over 8 years ago

refs #7644

setAggregateLogs back to true even if an exception has occurred. In other words, keep retrying a failing member node until log aggregation is turned off manually.

Revision 18053
Added by Robert Waltz over 8 years ago

refs #7644

setAggregateLogs back to true even if an exception has occurred. In other words, keep retrying a failing member node until log aggregation is turned off manually.

Revision 18054
Added by Robert Waltz over 8 years ago

refs #7644

Fixing import statements. Formatting

Revision 18054
Added by Robert Waltz over 8 years ago

refs #7644

Fixing import statements. Formatting

History

#1 Updated by Robert Waltz over 8 years ago

  • Target version set to CCI-2.2.0

#2 Updated by Robert Waltz over 8 years ago

  • % Done changed from 0 to 30
  • Status changed from New to In Progress

#3 Updated by Robert Waltz over 8 years ago

  • % Done changed from 30 to 50
  • Status changed from In Progress to Testing

#4 Updated by Robert Waltz over 8 years ago

  • Status changed from Testing to Closed
  • % Done changed from 50 to 100

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 14.8 MB)