Task #8775

Make taxonomic rank fields in Solr index non-case-sensitive

Added by Bryce Mecum over 2 years ago. Updated over 2 years ago.

Status: In Progress
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 2019-03-11
Due date: 2019-03-12
% Done: 30%
Milestone: None
Product Version: *
Story Points:
Sprint:

Description

In the current system, EML documents with taxonomic coverage get indexed into fields such as species if they contain XML such as:

...snip
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>Species</taxonRankName>
    <taxonRankValue>Some species</taxonRankValue>
...snip

The field values are extracted using the XPath in https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml:

//taxonomicClassification/taxonRankValue[../taxonRankName="Species"]/text()

We ran into a case where the taxonRankName had been entered as 'species' instead of 'Species'. We decided that the XPath is too restrictive: its case sensitivity is needless and surprising. Relaxing the match should cost only a negligible amount of indexing performance.

  • Change all EML taxonomy fields to also match the lowercase form of each taxonomic rank
  • Check over other indexing field definitions related to taxonomy to make sure the above change is consistent
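
As a rough illustration of the first item, the sketch below emulates the strict and the relaxed rank match in Python. It is not the index processor's actual configuration: the EML fragment, the rank_values helper, and its accepted_forms parameter are all hypothetical names made up for this example. In XPath 1.0 the relaxed match could plausibly be written by adding `or ../taxonRankName="species"` to the predicate, but the exact expression used in the fix is not shown in this ticket.

```python
import xml.etree.ElementTree as ET

# Hypothetical EML fragment; the first rank name is deliberately lowercase.
EML_SNIPPET = """
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>species</taxonRankName>
    <taxonRankValue>Some species</taxonRankValue>
  </taxonomicClassification>
  <taxonomicClassification>
    <taxonRankName>Genus</taxonRankName>
    <taxonRankValue>Some genus</taxonRankValue>
  </taxonomicClassification>
</taxonomicCoverage>
"""


def rank_values(root, rank, accepted_forms=None):
    """Collect taxonRankValue texts whose sibling taxonRankName is accepted.

    By default both the capitalized and the lowercase form of `rank` are
    accepted, mirroring the change proposed in this ticket. Passing
    accepted_forms={"Species"} reproduces the current strict behaviour.
    """
    if accepted_forms is None:
        accepted_forms = {rank.capitalize(), rank.lower()}
    values = []
    # iter() also visits nested taxonomicClassification elements.
    for classification in root.iter("taxonomicClassification"):
        name = classification.findtext("taxonRankName")
        value = classification.findtext("taxonRankValue")
        if name in accepted_forms and value is not None:
            values.append(value)
    return values


root = ET.fromstring(EML_SNIPPET)
strict = rank_values(root, "species", accepted_forms={"Species"})
relaxed = rank_values(root, "species")
```

With this fragment, the strict match finds nothing for the lowercase 'species' entry, while the relaxed match picks up 'Some species'. Forms such as 'SPECIES' would still be missed, which is consistent with the item's wording of matching the lowercase form in addition to the capitalized one.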

Associated revisions

Revision 19516
Added by Bryce Mecum over 2 years ago

Make taxonomic rank fields in Solr index non-case-sensitive (refs #8775)

History

#1 Updated by Bryce Mecum over 2 years ago

  • % Done changed from 0 to 30
  • Status changed from New to In Progress

Committed this fix, along with a backing regression test, in r19516.

I didn't mark this as done yet because I wasn't sure if this needs to get targeted for an upcoming CCI release.
