Bug #3514
CN solr search returns the 'text' field
100%
Description
The 'text' field in the CN Solr index is used to facilitate searching, but it is an unstructured amalgam of science metadata, is usually very long, and is probably not very useful to clients.
For example:
https://cn-dev.test.dataone.org/cn/v1/query/solr/?q=text:*&rows=1
Now that Solr search is being opened up as a public method, it would be better to make the 'text' field non-returnable by default. (Many clients will not specify return fields, and so will receive a lot of content.)
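To illustrate what specifying return fields looks like from the client side, here is a minimal sketch that builds a query URL with an explicit 'fl' parameter. The base URL and the 'text:*' query come from the example above; the helper name `build_query_url` and the field names `id` and `title` are assumptions made for illustration only.

```python
from urllib.parse import urlencode

# Endpoint taken from the example in this ticket.
BASE_URL = "https://cn-dev.test.dataone.org/cn/v1/query/solr/"


def build_query_url(query, fields, rows=1):
    """Return a query URL that asks Solr for only the named fields.

    `fields` is a list of field names; "id" and "title" below are
    assumed names, not confirmed against the CN schema.
    """
    params = {"q": query, "fl": ",".join(fields), "rows": rows}
    # Keep ':', '*' and ',' unescaped so the URL reads like the ticket's example.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


url = build_query_url("text:*", ["id", "title"])
print(url)
```

A client that builds its queries this way never receives the long 'text' value unless it adds that field to the list itself.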
History
#1 Updated by Dave Vieglais about 12 years ago
The config for specifying the default field list is in d1-index-solrconfig.xml:
…
*,score
The * returns all fields and should be replaced with an explicit list of the fields that should be returned.
I think this might be preferable to making the field non-returnable, as there may be a use case for taking advantage of the returned full text when fine-tuning the indexing process.
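As a rough sketch of the idea, the explicit list could be derived from the schema's field names by dropping 'text' and keeping 'score'. The helper `default_field_list` is hypothetical, and every field name in `schema_fields` other than 'text' is an assumed placeholder rather than the actual CN schema.

```python
def default_field_list(all_fields, excluded=("text",)):
    """Build an explicit 'fl' value naming every field except the excluded ones.

    'score' is appended to match the existing '*,score' default.
    """
    kept = [name for name in all_fields if name not in excluded]
    return ",".join(kept + ["score"])


# Assumed field names, for illustration only.
schema_fields = ["id", "title", "text", "resourceMap", "documents", "documentedBy"]
fl_value = default_field_list(schema_fields)
print(fl_value)
```

Because the field is only left out of the default list, a query can still ask for it by naming 'text' in its own 'fl' parameter.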
#2 Updated by Skye Roseboom almost 12 years ago
Currently it is necessary to return the text field, due to our strategy for modeling relationships in the index.
When an ORE document is indexed, the index record of each referenced document must be updated with the ORE information (the resourceMap, documents, and documentedBy fields). The full text field cannot be regenerated from the index data; it must be created from the original document text. So if the full text field is not returned as part of the index record, the original science metadata documents must be re-indexed off disk, rather than just writing the records back with updated ORE data.
Given the above, Dave's solution looks preferable and should still allow the indexing strategy to continue to work with updated query strings.
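The dependency described above can be shown with a simplified in-memory model. This is only a sketch: a plain dictionary stands in for the Solr index, and `fetch` and `add_resource_map` are hypothetical helpers, not the actual indexing code.

```python
import copy


def fetch(index, doc_id, fields=None):
    """Return a copy of a record, limited to `fields` when a set is given."""
    record = index[doc_id]
    if fields is None:
        return copy.deepcopy(record)
    return {k: copy.deepcopy(v) for k, v in record.items() if k in fields}


def add_resource_map(index, doc_id, ore_id, fields=None):
    """Read a record back from the index, add ORE info, and overwrite it."""
    record = fetch(index, doc_id, fields)
    record.setdefault("resourceMap", []).append(ore_id)
    index[doc_id] = record  # full overwrite of the stored record


index = {"meta1": {"id": "meta1", "text": "full science metadata text"}}

# The text field is returned, so it survives the write-back.
add_resource_map(index, "meta1", "ore1")
with_text = copy.deepcopy(index["meta1"])

# The text field is not returned, so the rewritten record loses it.
add_resource_map(index, "meta1", "ore2", fields={"id", "resourceMap"})
without_text = index["meta1"]

print("text" in with_text, "text" in without_text)
```

In the second case the only way to restore the text would be to re-index the original document, which is the cost this comment is trying to avoid.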
#3 Updated by Skye Roseboom almost 12 years ago
- Due date changed from 2013-01-23 to 2013-03-02
- Target version changed from 2013.2-Block.1.1 to 2013.8-Block.1.4
#4 Updated by Skye Roseboom almost 12 years ago
- Due date changed from 2013-03-02 to 2013-03-16
- Target version changed from 2013.8-Block.1.4 to 2013.10-Block.2.1
#5 Updated by Skye Roseboom almost 12 years ago
- Milestone changed from None to CCI-1.2
#6 Updated by Skye Roseboom almost 12 years ago
- Target version changed from 2013.10-Block.2.1 to 2013.14-Block.2.3
- Due date changed from 2013-03-16 to 2013-04-13
#7 Updated by Skye Roseboom almost 12 years ago
- Status changed from New to In Progress
#8 Updated by Skye Roseboom almost 12 years ago
- Status changed from In Progress to Testing
Updated deployment and test resources for the CN search index so that the search handler now contains a default 'fl' (field list) parameter. It lists all fields that are returnable except the (full) 'text' field.
Installed in the cn-dev environment.
#9 Updated by Skye Roseboom over 11 years ago
- Status changed from Testing to Closed
Deployed to all three prod CNs. The text field is only returned if it is listed in the fl parameter.