Task #8775
Make taxonomic rank fields in Solr index non-case-sensitive
30%
Description
In the current system, EML documents with taxonomic coverage are indexed into fields such as species if they contain XML such as:
...snip
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>Species</taxonRankName>
    <taxonRankValue>Some species</taxonRankValue>
...snip
The field values are extracted using the XPath in https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml:
//taxonomicClassification/taxonRankValue[../taxonRankName="Species"]/text()
We ran into a case where the taxonRankName had been entered as 'species' instead of 'Species', and we decided that the XPath is too restrictive: the strictness is needless and surprising. Loosening the match should cause only a negligible decrease in performance.
- Change all EML taxonomy fields to also match the lowercase form of each taxonomic rank
- Check over other indexing field definitions related to taxonomy to make sure the above change is consistent
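A minimal sketch of the proposed loosening, assuming the change is expressed as an XPath `or` over the capitalized and lowercase rank names. The class name `TaxonRankXPathSketch`, the helper methods, and the sample document are hypothetical and are not taken from d1_cn_index_processor; the actual fix in the repository may be written differently. The `ANY_CASE` expression is an alternative not proposed in this ticket: it uses the XPath 1.0 `translate()` function to fold ASCII case so that mixed-case spellings such as 'SPECIES' match too.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TaxonRankXPathSketch {

    // The strict expression quoted in the description above.
    static final String STRICT =
        "//taxonomicClassification/taxonRankValue"
        + "[../taxonRankName=\"Species\"]/text()";

    // Assumed form of the fix: also accept the lowercase rank name.
    static final String BOTH_CASES =
        "//taxonomicClassification/taxonRankValue"
        + "[../taxonRankName=\"Species\" or ../taxonRankName=\"species\"]/text()";

    // Alternative (not from the ticket): fold ASCII case with translate().
    static final String ANY_CASE =
        "//taxonomicClassification/taxonRankValue"
        + "[translate(../taxonRankName, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',"
        + " 'abcdefghijklmnopqrstuvwxyz')='species']/text()";

    // Evaluate an XPath expression against an XML string and collect
    // the text of every matching node.
    static List<String> extract(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression,
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    // Hypothetical, heavily reduced stand-in for an EML taxonomic coverage block.
    static String sample(String rankName) {
        return "<taxonomicCoverage><taxonomicClassification>"
            + "<taxonRankName>" + rankName + "</taxonRankName>"
            + "<taxonRankValue>Some species</taxonRankValue>"
            + "</taxonomicClassification></taxonomicCoverage>";
    }

    public static void main(String[] args) throws Exception {
        // Compare how each expression treats a lowercase rank name.
        String lowercaseDoc = sample("species");
        System.out.println("strict:     " + extract(lowercaseDoc, STRICT));
        System.out.println("both cases: " + extract(lowercaseDoc, BOTH_CASES));
        System.out.println("any case:   " + extract(lowercaseDoc, ANY_CASE));
    }
}
```

The `or` form keeps the expression close to the original and only admits the two spellings named in this ticket, while the `translate()` form accepts any ASCII casing at the cost of a longer, less readable expression.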
Associated revisions
Make taxonomic rank fields in Solr index non-case-sensitive (refs #8775)
Make taxonomic rank fields in Solr index non-case-sensitive (refs #8775)
History
#1 Updated by Bryce Mecum over 5 years ago
- % Done changed from 0 to 30
- Status changed from New to In Progress
Committed this fix, with a backing regression test, in r19516.
I didn't mark this as done yet because I wasn't sure if this needs to get targeted for an upcoming CCI release.