Bug #3065

cn.listObjects slicing broken in production

Added by Rob Nahf almost 9 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee: Ben Leinfelder
Category: Metacat
Start date: 2012-12-02
Due date: 2013-01-05
% Done: 100%
Milestone: CCI-1.1
Product Version: *
Story Points:
Sprint:

Description

When retrieving the full list of objects 'page at a time' using slicing (start and count parameters), I get duplicate entries, and by inference, am not getting all identifiers.

For example, in one run against cn-orc-1, there were 101247 identifiers, but when put into a hash table, only 89635 unique keys existed. In another test run, the number of unique keys (identifiers) was 95760. The phenomenon was noticed in Java and confirmed with a Perl script running curl. When I pull the objectList in one gulp, there are 101247 unique identifiers.

Most likely retrieval from the database table does not guarantee the same order on repeat calls.
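
The check described above can be sketched roughly as follows. This is a minimal illustration, not the attached Perl script: `count_paged_ids` and `make_backend` are hypothetical names, and the "unstable" backend just reshuffles an in-memory list on every call to stand in for a query that returns rows in no guaranteed order.

```python
import random


def count_paged_ids(fetch_page, total, page_size):
    """Page through a listing and return (ids_seen, unique_ids)."""
    seen = []
    for start in range(0, total, page_size):
        seen.extend(fetch_page(start, page_size))
    return len(seen), len(set(seen))


def make_backend(ids, stable, seed=0):
    """Return a hypothetical fetch_page(start, count) over ids.

    With stable=False the rows are reordered on every call, which
    imitates a database query that has no ORDER BY clause.
    """
    rng = random.Random(seed)

    def fetch_page(start, count):
        rows = sorted(ids) if stable else rng.sample(ids, len(ids))
        return rows[start:start + count]

    return fetch_page


ids = [f"pid-{n:04d}" for n in range(1000)]
stable_total, stable_unique = count_paged_ids(make_backend(ids, True), len(ids), 100)
unstable_total, unstable_unique = count_paged_ids(make_backend(ids, False), len(ids), 100)

print("stable order:  ", stable_total, "seen,", stable_unique, "unique")
print("unstable order:", unstable_total, "seen,", unstable_unique, "unique")
```

With a stable order the two counts agree; with an order that changes between calls, the number of identifiers seen stays the same while the number of unique keys drops, which matches the symptom reported here.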

parseObjectList.pl - the perl script that checks id uniqueness (1.1 KB) Rob Nahf, 2012-07-13 08:55

parseORCresults.txt - results from a run against cn-orc-1 (469 KB) Rob Nahf, 2012-07-13 08:55


Related issues

Related to Infrastructure - Bug #3128: Production CN listObjects slow response Closed 2012-08-13 2013-01-05
Related to Infrastructure - Task #3465: Improve Metacat listObject slicing New 2012-12-07

History

#1 Updated by Rob Nahf almost 9 years ago

  • Subject changed from cn.listObjects slicing broken to cn.listObjects slicing broken in production
  • File parseObjectList.pl added
  • File parseORCresults.txt added
  • Category set to Metacat
  • Assignee set to Ben Leinfelder

#2 Updated by Ben Leinfelder almost 9 years ago

  • Status changed from New to In Progress

Now ordering the results from the db by identifier.
This is fine for a static system, but when objects are continually being added, there's really no guarantee that the slicing won't then include additional identifiers in the results and throw off the original paging counts.

This will be included in the Metacat 2.0.2 release (currently in RC testing)
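
To illustrate the caveat in this comment, here is a small sketch using an in-memory SQLite database. The `objects` table and the `list_objects` helper are assumptions made up for the example and do not reflect Metacat's actual schema or code. Ordering by identifier makes repeat calls agree, but an object added between two page requests shifts every later offset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (pid TEXT PRIMARY KEY)")  # hypothetical table
conn.executemany("INSERT INTO objects VALUES (?)", [("b",), ("d",), ("f",), ("h",)])


def list_objects(conn, start, count):
    """Return one slice, ordered by identifier so repeat calls agree."""
    rows = conn.execute(
        "SELECT pid FROM objects ORDER BY pid LIMIT ? OFFSET ?", (count, start)
    )
    return [pid for (pid,) in rows]


page1 = list_objects(conn, 0, 2)                  # ['b', 'd']
conn.execute("INSERT INTO objects VALUES ('a')")  # object added mid-iteration
page2 = list_objects(conn, 2, 2)                  # ['d', 'f']

print(page1, page2)
```

The new identifier sorts ahead of the first page, so 'd' comes back twice and 'a' is never returned to a client that keeps advancing `start`.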

#3 Updated by Rob Nahf almost 9 years ago

If the database table contained a sequence number field, then additional entries would be tacked onto the end of the list, and it would be possible for an iterating listObjects routine to pick up those new entries with the final iteration. The downside compared to the current fix is that clients are more likely to want to sort after retrieval.
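
A rough sketch of the sequence-number idea, again with an in-memory SQLite table. The `objects_seq` table, its `seq` column, and `list_by_sequence` are hypothetical names for illustration only. Because rows are ordered by insertion sequence, an object added mid-iteration lands at the end and is picked up by a later slice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: seq grows with every insert, so new rows sort last.
conn.execute(
    "CREATE TABLE objects_seq ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, pid TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO objects_seq (pid) VALUES (?)", [("b",), ("d",), ("f",), ("h",)]
)


def list_by_sequence(conn, start, count):
    """Return one slice ordered by insertion sequence, not by identifier."""
    rows = conn.execute(
        "SELECT pid FROM objects_seq ORDER BY seq LIMIT ? OFFSET ?", (count, start)
    )
    return [pid for (pid,) in rows]


pages = [list_by_sequence(conn, 0, 2)]
conn.execute("INSERT INTO objects_seq (pid) VALUES ('a')")  # added mid-iteration
pages.append(list_by_sequence(conn, 2, 2))
pages.append(list_by_sequence(conn, 4, 2))  # final slice picks up the new row

all_pids = [pid for page in pages for pid in page]
print(pages)
print(sorted(all_pids))  # clients that want identifier order must sort themselves
```

No identifier is duplicated or skipped in this run, at the cost of the result no longer arriving in identifier order.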

#4 Updated by Ben Leinfelder almost 9 years ago

  • Position deleted (3)
  • Target version changed from Sprint-2012.27-Block.4.2 to Sprint-2012.29-Block.4.3
  • Position set to 1

#5 Updated by Ben Leinfelder almost 9 years ago

We'd then need a way for all CNs to have the same sequence number for each pid since we never know which CN will actually return the next batch of objects.

Rob Nahf wrote:

If the database table contained a sequence number field, then additional entries would be tacked onto the end of the list, and it would be possible for an iterating listObjects routine to pick up those new entries with the final iteration. The downside compared to the current fix is that clients are more likely to want to sort after retrieval.

#6 Updated by Ben Leinfelder almost 9 years ago

Other than for diagnostic purposes, is CN.listObjects() ever used in a manner that requires exactly consistent paging? I can't think of another product or project that actually utilizes that service method. For the MN, sure, synchronization calls it all day long. But on the CN?

#7 Updated by Chris Jones almost 9 years ago

  • Target version changed from Sprint-2012.29-Block.4.3 to Sprint-2012.37-Block.5.3

#8 Updated by Dave Vieglais over 8 years ago

  • Milestone changed from None to CCI-1.1
  • Due date set to 2012-09-22
  • Start date set to 2012-09-09
  • Remaining hours set to 0.0

Is this still an issue?

#9 Updated by Dave Vieglais over 8 years ago

  • Due date changed from 2012-09-22 to 2012-10-27
  • Target version changed from Sprint-2012.37-Block.5.3 to Sprint-2012.41-Block.6.1

#10 Updated by Ben Leinfelder over 8 years ago

  • Status changed from In Progress to Testing

Needs to be tested on the dev CNs.

#11 Updated by Ben Leinfelder over 8 years ago

  • Due date changed from 2012-10-27 to 2012-11-10
  • Target version changed from Sprint-2012.41-Block.6.1 to Sprint-2012.44-Block.6.2

#12 Updated by Ben Leinfelder over 8 years ago

Added DB-based slicing to Metacat 2.0.5 so that it is not done in memory.

I think the final thing needed for this is an artificial id on each row of SM so that the slice order is guaranteed even when new content is added (always to the end of the list).
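
For comparison, the difference between in-memory slicing and DB-based slicing might look like the sketch below. It uses SQLite and a hypothetical `sm` table with an artificial `row_id`; none of these names are taken from Metacat. Both functions return the same slice, but only the second lets the database discard rows outside the requested window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the system metadata table, with an artificial row id.
conn.execute("CREATE TABLE sm (row_id INTEGER PRIMARY KEY, guid TEXT)")
conn.executemany(
    "INSERT INTO sm (guid) VALUES (?)", [(f"guid-{n}",) for n in range(10)]
)


def slice_in_memory(conn, start, count):
    """Fetch every row, then slice the full list in application memory."""
    rows = conn.execute("SELECT guid FROM sm ORDER BY row_id").fetchall()
    return [guid for (guid,) in rows[start:start + count]]


def slice_in_db(conn, start, count):
    """Let the database return only the requested window of rows."""
    rows = conn.execute(
        "SELECT guid FROM sm ORDER BY row_id LIMIT ? OFFSET ?", (count, start)
    ).fetchall()
    return [guid for (guid,) in rows]


print(slice_in_memory(conn, 3, 4))
print(slice_in_db(conn, 3, 4))
```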

#13 Updated by Ben Leinfelder over 8 years ago

  • Target version changed from Sprint-2012.44-Block.6.2 to Sprint-2012.50-Block.6.4
  • Due date changed from 2012-12-07 to 2013-01-05

#14 Updated by Ben Leinfelder over 8 years ago

  • Status changed from Testing to Closed

Given the limits of our current CN architecture, we have done all we can to improve the performance of listObject slicing. For paged requests that occur in rapid succession on a perfectly in-sync, unchanging system, the current paging mechanism will work (sorted by guid). But we have a volatile, imperfect system of multiple CNs on our hands and cannot guarantee consistent listObject slicing.

Please see the new task #3468 for a continuation of this issue.
