Story #7939

Indexing is too slow, especially with large packages

Added by Dave Vieglais about 8 years ago. Updated almost 7 years ago.

Status: Rejected
Priority: High
Assignee: -
Category: d1_indexer
Target version:
Start date: 2016-11-25
Due date:
% Done: 30%
Story Points:

Description

The indexing process appears far too slow to keep up with content additions and changes. Performance seems to have improved since the version 2.3 upgrade, which added support for multiple indexing threads, but it still falls far short of what is needed to keep the index reasonably current.

In particular, it appears that large resource maps such as those provided by the ARCTIC node are very slow to evaluate.

Some optimization may be possible without major refactoring of the indexing process.

A few possible options:

  1. Check that changes to properties such as ownership do not trigger a full reindex of the package. If only permissions change, there is no need to reindex the entire package, since the other properties are unchanged. This should already be in place, since content is immutable and only the mutable metadata fields need updating.

  2. Dedicate a single thread to resource map processing, expanding to more threads when there is no backlog of other content. This would allow efficient processing of content on which the resource map indexing may depend.

  3. Refactor the index so that resource maps may be processed independently, without the need for all other objects to be loaded and processed.

  4. Refactor the indexing of resource maps so that a partially processed resource map is persisted, allowing processing to continue as content becomes available rather than starting from scratch each time.
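
A minimal sketch of the idea in option 4, not the d1_indexer implementation. All names here (PartialResourceMap, ResourceMapTracker, add_map, object_available) are hypothetical, and the in-memory dictionaries stand in for whatever durable store would actually hold the partial state. The point is only that each newly available object triggers incremental work on the maps waiting for it, instead of a full re-evaluation of every map.

```python
from dataclasses import dataclass, field


@dataclass
class PartialResourceMap:
    """Hypothetical checkpoint of one resource map's processing state."""
    map_id: str
    members: frozenset                          # identifiers the map aggregates
    indexed: set = field(default_factory=set)   # members already processed

    @property
    def pending(self):
        # Members that have not been processed yet.
        return self.members - self.indexed

    @property
    def complete(self):
        return not self.pending


class ResourceMapTracker:
    """Keeps partial state so each arrival does only incremental work.

    Assumption: plain dicts are used for illustration; a real indexer
    would persist this state so it survives restarts.
    """

    def __init__(self):
        self._maps = {}         # map_id -> PartialResourceMap
        self._waiting = {}      # member id -> set of map_ids waiting on it
        self._available = set() # identifiers already indexed

    def add_map(self, map_id, members):
        state = PartialResourceMap(map_id, frozenset(members))
        self._maps[map_id] = state
        for member in state.members:
            if member in self._available:
                state.indexed.add(member)
            else:
                self._waiting.setdefault(member, set()).add(map_id)
        return state

    def object_available(self, pid):
        """Record that pid is indexed; return ids of maps this completes."""
        self._available.add(pid)
        completed = []
        for map_id in self._waiting.pop(pid, set()):
            state = self._maps[map_id]
            state.indexed.add(pid)
            if state.complete:
                completed.append(map_id)
        return sorted(completed)
```

With this shape, the cost of an arriving object is proportional to the number of maps waiting on it rather than to the total size of every affected package, which is where large resource maps would benefit most.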


Related issues

Related to Infrastructure - Story #8175: Solr crashing with out of memory error Closed 2017-09-05

History

#1 Updated by Dave Vieglais about 8 years ago

  • Target version changed from CCI-2.3.1 to CCI-2.4.0

#2 Updated by Dave Vieglais over 7 years ago

  • Category set to d1_indexer

#3 Updated by Dave Vieglais over 7 years ago

  • Related to Story #8175: Solr crashing with out of memory error added

#4 Updated by Dave Vieglais almost 7 years ago

  • Sprint set to Infrastructure backlog

#5 Updated by Dave Vieglais almost 7 years ago

  • % Done changed from 0 to 30
  • Status changed from New to In Progress

#6 Updated by Dave Vieglais almost 7 years ago

  • Status changed from In Progress to Rejected
