Bug #3065: cn.listObjects slicing broken in production - Infrastructure - DataONE Tasks

Bug #3065

cn.listObjects slicing broken in production

Added by Rob Nahf over 12 years ago. Updated about 12 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Ben Leinfelder

Category:

Metacat

Target version:

Sprint-2012.50-Block.6.4

Start date:

2012-12-02

Due date:

2013-01-05

% Done:

100%

Milestone:

CCI-1.1

Description

When retrieving the full list of objects 'page at a time' using slicing (start and count parameters), I get duplicate entries, and by inference, am not getting all identifiers.

For example, in one run against cn-orc-1 there were 101247 identifiers, but when put into a hash table only 89635 unique keys existed. In another test run, the number of unique keys (identifiers) was 95760. The phenomenon was noticed in Java and confirmed with a Perl script running curl. When I pull the objectList in one gulp, there are 101247 unique identifiers.

Most likely retrieval from the database table does not guarantee the same order on repeat calls.
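
The suspected cause can be reproduced in a small simulation. This is only a sketch: `list_objects` is a hypothetical in-memory stand-in for the service call (not the DataONE client API), and the shuffle is an assumed model of a database scan that returns rows in a different order on each request.

```python
import random

def list_objects(store, start, count, rng=None):
    """Hypothetical stand-in for listObjects(start, count).

    When rng is given, rows are shuffled before slicing to mimic a
    query with no guaranteed ordering between calls.
    """
    rows = list(store)
    if rng is not None:
        rng.shuffle(rows)
    return rows[start:start + count]

def page_through(store, page_size, rng=None):
    """Collect every page, the way a paging client would."""
    seen = []
    start = 0
    total = len(store)
    while start < total:
        seen.extend(list_objects(store, start, page_size, rng))
        start += page_size
    return seen

ids = [f"pid-{i}" for i in range(1000)]

# Unstable order: the right number of entries comes back, but many repeat.
unstable = page_through(ids, 100, random.Random(42))
print(len(unstable), len(set(unstable)))

# Stable order: every identifier is seen exactly once.
stable = page_through(ids, 100)
print(len(stable), len(set(stable)))
```

As in the report, the total count matches while the number of unique keys falls short, so some identifiers are never returned.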

parseObjectList.pl - the perl script that checks id uniqueness (1.1 KB) Rob Nahf, 2012-07-13 08:55

parseORCresults.txt - results from a run against cn-orc-1 (469 KB) Rob Nahf, 2012-07-13 08:55

History

#1 Updated by Rob Nahf over 12 years ago

Subject changed from cn.listObjects slicing broken to cn.listObjects slicing broken in production
File parseObjectList.pl added
File parseORCresults.txt added
Category set to Metacat
Assignee set to Ben Leinfelder

#2 Updated by Ben Leinfelder over 12 years ago

Status changed from New to In Progress

Now ordering the results from the db by identifier.
This is fine for a static system, but when objects are continually being added, there's really no guarantee that the slicing won't then include additional identifiers in the results and throw off the original paging counts.

This will be included in the Metacat 2.0.2 release (currently in RC testing)
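
The caveat about a changing system can be illustrated with a small sketch. The identifiers and the `list_objects` helper are hypothetical; the only assumption is that results are sorted by identifier and sliced by offset.

```python
import bisect

def list_objects(sorted_ids, start, count):
    """Hypothetical slice over a list kept sorted by identifier."""
    return sorted_ids[start:start + count]

# Ten identifiers, pid-0010 .. pid-0019, already in sorted order.
ids = [f"pid-{i:04d}" for i in range(10, 20)]

page1 = list_objects(ids, 0, 5)

# A new object arrives between requests and sorts before all others,
# which shifts every later row one position to the right.
bisect.insort(ids, "pid-0001")

page2 = list_objects(ids, 5, 5)
page3 = list_objects(ids, 10, 5)

seen = page1 + page2 + page3
duplicates = sorted({pid for pid in seen if seen.count(pid) > 1})
print(duplicates)            # the id straddling the page boundary repeats
print("pid-0001" in seen)    # the new object is never returned
```

Sorting makes the order repeatable, but an insert that lands before the current offset still produces one duplicate and hides the new entry from the client.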

#3 Updated by Rob Nahf over 12 years ago

If the database table contained a sequence number field, additional entries would be tacked onto the end of the list, and an iterating listObjects routine could pick up those new entries in its final iteration. The downside compared to the current fix is that clients are more likely to want to sort after retrieval.
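
A minimal sketch of that idea, assuming a single node that assigns the sequence numbers; the row layout and the `list_objects` helper are hypothetical:

```python
def list_objects(rows, start, count):
    """Hypothetical slice over rows kept in sequence-number order."""
    return rows[start:start + count]

# Each row is (sequence number, identifier); new rows only ever append.
rows = [(seq, f"pid-{seq}") for seq in range(1, 11)]

collected = []
start = 0
page_size = 4
added = False
while True:
    page = list_objects(rows, start, page_size)
    if not page:
        break
    collected.extend(page)
    start += len(page)
    if not added:
        # An object is registered while the client is still paging.
        rows.append((11, "pid-new"))
        added = True

print(len(collected), len(set(collected)))
print(collected[-1])
```

Because earlier offsets never shift, no page overlaps another, and the late arrival shows up in the last non-empty page.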

#4 Updated by Ben Leinfelder over 12 years ago

Position deleted (3)
Target version changed from Sprint-2012.27-Block.4.2 to Sprint-2012.29-Block.4.3
Position set to 1

#5 Updated by Ben Leinfelder over 12 years ago

We'd then need a way for all CNs to have the same sequence number for each pid since we never know which CN will actually return the next batch of objects.

Rob Nahf wrote:

If the database table contained a sequence number field, additional entries would be tacked onto the end of the list, and an iterating listObjects routine could pick up those new entries in its final iteration. The downside compared to the current fix is that clients are more likely to want to sort after retrieval.

#6 Updated by Ben Leinfelder over 12 years ago

Other than for diagnostic purposes, is CN.listObjects() ever used in a manner that requires exactly consistent paging? I can't think of another product or project that actually utilizes that service method. For the MN, sure, synchronization calls it all day long. But on the CN?

#7 Updated by Chris Jones over 12 years ago

Target version changed from Sprint-2012.29-Block.4.3 to Sprint-2012.37-Block.5.3

#8 Updated by Dave Vieglais over 12 years ago

Milestone changed from None to CCI-1.1
Due date set to 2012-09-22
Start date set to 2012-09-09
Remaining hours set to 0.0

Is this still an issue?

#9 Updated by Dave Vieglais over 12 years ago

Due date changed from 2012-09-22 to 2012-10-27
Target version changed from Sprint-2012.37-Block.5.3 to Sprint-2012.41-Block.6.1

#10 Updated by Ben Leinfelder over 12 years ago

Status changed from In Progress to Testing

Needs to be tested on the dev CNs.

#11 Updated by Ben Leinfelder over 12 years ago

Due date changed from 2012-10-27 to 2012-11-10
Target version changed from Sprint-2012.41-Block.6.1 to Sprint-2012.44-Block.6.2

#12 Updated by Ben Leinfelder about 12 years ago

Added DB-based slicing to Metacat 2.0.5 so that it is not done in memory.

I think the final thing needed for this is an artificial id on each row of SM so that the slice order is guaranteed even when new content is added (always to the end of the list).
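
As a rough illustration of database-side slicing over an artificial row id, here is a sketch using SQLite in memory. The table name `sm`, the column names, and the query are assumptions made for the example; they are not taken from the Metacat schema or its actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'seq' is the artificial id: it grows with every insert, so new rows
# always sort after existing ones.
conn.execute(
    "CREATE TABLE sm ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "guid TEXT UNIQUE NOT NULL)"
)
conn.executemany(
    "INSERT INTO sm (guid) VALUES (?)",
    [(f"pid-{i}",) for i in (5, 3, 9, 1, 7)],
)

def list_objects(conn, start, count):
    """Slice in the database instead of loading all rows into memory."""
    cur = conn.execute(
        "SELECT guid FROM sm ORDER BY seq LIMIT ? OFFSET ?",
        (count, start),
    )
    return [row[0] for row in cur]

first = list_objects(conn, 0, 3)
conn.execute("INSERT INTO sm (guid) VALUES (?)", ("pid-0",))
second = list_objects(conn, 3, 3)

print(first)
print(second)
```

Ordering by `seq` rather than by guid means "pid-0" lands at the end of the second page even though it would sort first alphabetically. This holds only within one database; it does not address keeping the ids consistent across CNs.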

#13 Updated by Ben Leinfelder about 12 years ago

Target version changed from Sprint-2012.44-Block.6.2 to Sprint-2012.50-Block.6.4
Due date changed from 2012-12-07 to 2013-01-05

#14 Updated by Ben Leinfelder about 12 years ago

Status changed from Testing to Closed

Given the limits of our current CN architecture, we have done all we can to improve the performance of listObjects slicing. For paged requests that occur in rapid succession on a perfectly in-sync, unchanging system, the current paging mechanism will work (sorted by guid). But we have a volatile, imperfect system of multiple CNs on our hands and cannot guarantee consistent listObjects slicing.

Please see the new task #3468 for a continuation of this issue.
