Double XML-escaping possible: values are encoded on both SolrField creation and SolrElementField serialization
XML escaping can be performed in two places: when field values are prepared for inclusion in a SolrDoc, and when the SolrDoc is serialized. The boolean controlling escaping in the former is set to false, but it is not coordinated with downstream serialization.
Escaping matters only for the serialized XML, so keep it in the serialize method and remove the option to turn it on or off; it should always be done.
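To illustrate the failure mode, here is a minimal sketch of what happens when the same value is escaped at both locations. The class `DoubleEscapeDemo` and its `escapeXml` helper are hypothetical stand-ins written for this example, not code from the SolrField classes; the helper handles only the five markup characters.

```java
// Hypothetical demo class; not part of the SolrField code base.
class DoubleEscapeDemo {

    // Simplified stand-in for an XML escaper: handles only the five markup characters.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "R&D <draft>";
        String once = escapeXml(raw);    // escaped once, e.g. at serialization
        String twice = escapeXml(once);  // escaped again: the '&' of each entity is re-escaped
        System.out.println(once);        // R&amp;D &lt;draft&gt;
        System.out.println(twice);       // R&amp;amp;D &amp;lt;draft&amp;gt;
    }
}
```

A reader that unescapes the doubly escaped value once gets back `R&amp;D &lt;draft&gt;` rather than the original text, which is why escaping at a single point (serialization) is the safer design.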
#1 Updated by Rob Nahf almost 4 years ago
- Status changed from In Progress to Testing
- % Done changed from 30 to 50
Removed optional escaping from the SolrField, MergeSolrField, and CommonBaseSolrField classes, and removed the isEscapeXml and setEscapeXml methods from SolrField and SolrElementField. Illegal characters are now escaped in every case, using the StringEscapeUtils.escapeXml11 method.
I researched the interaction with CDATA and confirmed that characters inside CDATA sections must follow the same character-escaping rules as characters outside CDATA, except for the five original markup characters [<>&'"].
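As a rough sketch of the CDATA point: the markup characters can pass through untouched, but code points outside the XML 1.1 `Char` production are still not allowed. The class `CdataCharFilter` and its methods are hypothetical names for this illustration, and dropping the offending characters is an assumption made here for simplicity. It does not reproduce what escapeXml11 does in full, and it does not handle the `]]>` terminator.

```java
// Hypothetical helper; not part of the SolrField code base.
class CdataCharFilter {

    // True if the code point is within the XML 1.1 Char production:
    // [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Drops code points that XML 1.1 cannot represent (NUL, unpaired surrogates,
    // U+FFFE/U+FFFF); leaves < > & ' " as they are, since CDATA does not need them escaped.
    static String dropIllegalChars(String s) {
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints()
         .filter(CdataCharFilter::isXml11Char)
         .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```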