Bug #1160

Replication fails because of a null error

Added by Robert Waltz almost 14 years ago. Updated almost 14 years ago.

Status: Closed
Priority: Normal
Assignee: Chad Berkley
Category: Metacat
Target version:
Start date:
Due date:
% Done: 100%
Milestone: CCI-0.5
Product Version: *
Story Points:
Sprint:

History

#1 Updated by Robert Waltz almost 14 years ago

  • Status changed from New to In Progress
  • Category set to Metacat
  • Assignee set to Chad Berkley
  • Milestone set to CCI-0.5

I am having problems with replication. Replicated documents show up in the /knb/documents directory with the IDs originally assigned by the replicator machine, and I find the entries in the xml_documents table too, yet the replicant machine throws these 'null' errors. So I don't know where Metacat failed to write the doc to the database. I also receive a document list when I hit knb/d1/object, but when I query for any single object using /knb/d1/object/XXX, I receive a DataONE 404 XML error.

Here are some errors I received in my log file:
knb 2010-12-11T23:58:48: [ERROR]: ReplicationHandler.handleSingleXMLDocument - 32 Failed to write xml doc autogen.201034511152359.1 into xml_documents from cn-rpw-orc.communalware.net/knb/servlet/replication because null
knb 2010-12-11T23:58:48: [ERROR]: ReplicationHandler.handleSingleXMLDocument - Failed to write doc autogen.201034511152359.1 into db because null
knb 2010-12-11T23:58:48: [ERROR]: ReplicationHandler.handleDocList - error to handle update doc in xml_documents in time replicationReplicationHandler.handleSingleXMLDocument - generic exception writing Replication: null
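To list which documents hit this error, the log lines above can be scanned for the "Failed to write doc ... into db because ..." message. The following is a minimal sketch in Python, assuming every such line has the same shape as the one quoted above; the pattern and the function name failed_docs are illustrative and not part of Metacat.

```python
import re

# Assumed line shape, taken from the "Failed to write doc ... into db" entry above.
FAILED_DOC = re.compile(
    r"^knb (?P<timestamp>\S+): \[ERROR\]: "
    r"ReplicationHandler\.handleSingleXMLDocument"
    r" - Failed to write doc (?P<docid>\S+) into db because (?P<reason>.+)$"
)


def failed_docs(lines):
    """Return (timestamp, docid, reason) for each matching failed-write line."""
    results = []
    for line in lines:
        match = FAILED_DOC.match(line.strip())
        if match:
            results.append(
                (match["timestamp"], match["docid"], match["reason"])
            )
    return results
```

Lines that do not match the assumed shape, such as the "Failed to write xml doc" entry, are skipped rather than reported.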

#2 Updated by Dave Vieglais almost 14 years ago

  • Target version changed from Sprint-2010.50 to Sprint-2011.01

#3 Updated by Chad Berkley almost 14 years ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Closed

I got replication running between cn-orc-1 and cn-unm-1. There were two problems with the setup:

1) The admin servlet process had not been completed.
2) The server name in the admin servlet must match the server name in replControl.html. The server name was set to cn-orc-1, but replication was set to go to cn-orc-1.dataone.org (same on cn-unm-1). The names must match exactly or the SSL handshake will be rejected.
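The exact-match requirement in item 2 can be checked before replication is enabled. This is a small sketch in Python, not Metacat code: the function check_server_name is hypothetical, and it assumes both values have already been read by hand from the admin servlet and from replControl.html.

```python
def check_server_name(configured_name, replication_target):
    """Return None when the names are identical, else a short explanation."""
    if configured_name == replication_target:
        return None
    # A short host name against its fully qualified form is the mismatch
    # described above (cn-orc-1 vs cn-orc-1.dataone.org).
    if replication_target.startswith(configured_name + "."):
        return (
            f"'{configured_name}' is a short form of '{replication_target}'; "
            "use the same fully qualified name in both places"
        )
    return f"'{configured_name}' and '{replication_target}' differ"
```

For example, check_server_name("cn-orc-1", "cn-orc-1.dataone.org") returns the short-form explanation, while two identical names return None.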

I tested with the integration test. I inserted a document into cn-unm-1 and saw it replicate to cn-orc-1 about 10-15 seconds later with the message:
knb 2011-01-03T19:36:12: [INFO]: replicate D1GUID:knb:testid:20113113535255:D1SCIMETADATA:autogen.20113113535632.1:

I verified that the document knb:testid:20113113535255 now exists on both nodes.
