Task #7771
Story #7769: Improve the performance on solr index
Use multiple threads to index objects
100%
Description
Worked on generating the index on multiple threads. The sequential loop:
while ((nextTask = getNextTask()) != null) {
    process(nextTask);
}
becomes:
while ((nextTask = getNextTask()) != null) {
    executor.submit(nextTask);
}
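The switch from a sequential loop to submitted tasks could look roughly like the sketch below. It is not the processor's actual code: the class name ParallelIndexer, the method runAll, the fixed-size pool and the one-minute shutdown timeout are all assumptions made for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the index processor's task loop.
class ParallelIndexer {
    // Drains the queue, submits every task to a fixed-size pool,
    // waits for the pool to finish, and returns the number of tasks submitted.
    static int runAll(Queue<Runnable> tasks, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        int submitted = 0;
        Runnable nextTask;
        while ((nextTask = tasks.poll()) != null) {
            executor.submit(nextTask);
            submitted++;
        }
        executor.shutdown(); // accept no new tasks, let queued ones finish
        try {
            // The one-minute limit is an arbitrary choice for this sketch.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return submitted;
    }
}
```

Tasks submitted this way run in no guaranteed order, which is what exposes the resource map race described next.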
Fortunately, there are no shared class variables (none of them are static), so we don’t need to lock them.
Handling resource maps has a race issue:
R1: s1 documents d1
R2: s1 documents d2
At the beginning, the solr index of s1 has no documents or resourceMap values.
Sequential:
After processing R1, the solr index of s1 contains:
documents d1
resourceMap R1
After processing R2, the solr index of s1 contains:
documents d1
documents d2
resourceMap R1
resourceMap R2
Concurrent:
1. The threads handling R1 and R2 both read a copy of the s1 index record without any documents or resourceMap information.
2. Thread 1, handling R1, finishes first and sends its result to the solr server:
documents d1
resourceMap R1
3. Thread 2, handling R2, finishes later and sends its result to the solr server, overwriting what thread 1 wrote. So the eventual result will be:
documents d2
resourceMap R2
Wrong!
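The lost update can be reproduced deterministically by replaying the concurrent steps in a single thread. The sketch below uses an in-memory map as a toy stand-in for the solr index; the class LostUpdateDemo and its methods are hypothetical, and the assumption that each write replaces the whole record for s1 is taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: the "index" maps an object id to its list of field values.
class LostUpdateDemo {
    // Read step: return a private copy of the current record (empty if none).
    static List<String> read(Map<String, List<String>> index, String id) {
        List<String> current = index.get(id);
        return current == null ? new ArrayList<String>() : new ArrayList<String>(current);
    }

    // Replays steps 1 to 3 of the concurrent case and returns the final record of s1.
    static List<String> interleaved() {
        Map<String, List<String>> index = new HashMap<String, List<String>>();

        // Step 1: both workers read s1 before either has written.
        List<String> copy1 = read(index, "s1");
        List<String> copy2 = read(index, "s1");

        // Each worker adds the fields from its own resource map.
        copy1.add("documents d1");
        copy1.add("resourceMap R1");
        copy2.add("documents d2");
        copy2.add("resourceMap R2");

        // Step 2: worker 1 writes the whole record.
        index.put("s1", copy1);
        // Step 3: worker 2 writes the whole record, replacing worker 1's fields.
        index.put("s1", copy2);

        return index.get("s1"); // [documents d2, resourceMap R2]
    }
}
```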
Handle resource map objects sequentially? No.
Proposed Solution:
1. Maintain a set containing the ids of the relevant objects (e.g. s1 and d1) while a resource map is being processed.
2. Before processing a resource map, check whether any of its relevant ids are in the set. If so, wait and try again later (up to a maximum number of attempts); otherwise, put those ids in the set and start processing.
3. When the processing is done, remove those ids from the set.
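The three steps above can be sketched as a small guard object. This is only an illustration under stated assumptions, not the implementation that was committed: the class RelatedIdGuard, its method names, and the maxAttempts and waitMillis parameters are hypothetical.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard that tracks the object ids currently being processed.
class RelatedIdGuard {
    private final Set<String> inProgress = new HashSet<String>();

    // Step 2: claim all ids at once, or none if any id is already claimed.
    synchronized boolean tryAcquire(Collection<String> ids) {
        for (String id : ids) {
            if (inProgress.contains(id)) {
                return false;
            }
        }
        inProgress.addAll(ids);
        return true;
    }

    // Step 3: release the ids once processing is done.
    synchronized void release(Collection<String> ids) {
        inProgress.removeAll(ids);
    }

    // Retries up to maxAttempts, sleeping waitMillis between attempts.
    // Returns true if the work ran, false if it gave up or was interrupted.
    boolean runExclusive(Collection<String> ids, Runnable work,
                         int maxAttempts, long waitMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryAcquire(ids)) {
                try {
                    work.run();
                } finally {
                    release(ids); // always release, even if the work throws
                }
                return true;
            }
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

In the example from this ticket, R1 would claim {s1, d1} and R2 would ask for {s1, d2}; R2 is refused while s1 is held and retries later. The sketch uses a plain HashSet behind synchronized methods because the check and the insert for several ids have to happen as one step. A concurrent set on its own makes each individual call thread-safe but not the combined check-then-add.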
Options for the set: ConcurrentSkipListSet vs. HashSet + lock vs. HashSet + synchronized
Related issues
History
#1 Updated by Jing Tao over 8 years ago
- Status changed from New to In Progress
- Category set to d1_cn_index_processor
- % Done changed from 0 to 30
#2 Updated by Dave Vieglais about 8 years ago
- % Done changed from 30 to 100
- Status changed from In Progress to Closed
#3 Updated by Rob Nahf over 7 years ago
- Related to Story #8172: investigate atomic updates for some solr updates added
#4 Updated by Rob Nahf over 7 years ago
- Related to Story #8173: add checks for retrograde systemMetadata changes added