Bug #7886
double xml-escaping allowed by encoding the value upon SolrField creation and SolrElementField serialization
100%
Description
There are two effective locations where xml-escaping can be performed: when preparing field values for inclusion in a SolrDoc, and when serializing the SolrDoc. Although the boolean controlling escaping is set to false at the former location, it is not coordinated with downstream serialization, so a value can end up escaped twice.
Escaping only matters for the serialized XML, so keep escaping in the serialize method and remove the option to turn it on or off. It should always be done.
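To illustrate the failure mode described above, here is a minimal sketch. It is not the project's code: `EscapeSketch`, `escapeXml`, and `serialize` are hypothetical names, and `escapeXml` is a hand-written stand-in that handles only the five markup characters, where the real code uses a library escaper. It shows what happens when a value is escaped once at field creation and again at serialization, compared with escaping only at serialization.

```java
class EscapeSketch {

    // Hypothetical stand-in for a library XML escaper; covers only the
    // five markup characters, nothing else.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical serializer: always escapes, with no flag to disable it.
    // Callers are expected to pass the raw, unescaped value.
    static String serialize(String name, String rawValue) {
        return "<field name=\"" + escapeXml(name) + "\">"
                + escapeXml(rawValue) + "</field>";
    }

    public static void main(String[] args) {
        String raw = "Smith & Jones";

        // Hazard: value escaped at creation, then escaped again on output.
        System.out.println(serialize("author", escapeXml(raw)));

        // Intended behavior: raw value stored, escaped exactly once on output.
        System.out.println(serialize("author", raw));
    }
}
```

The first line printed contains `Smith &amp;amp; Jones` (double-escaped), the second `Smith &amp; Jones`. Keeping a single, unconditional escape in the serializer means no caller can produce the first form by setting flags inconsistently.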
Associated revisions
fixes #7886: removed potential source of double xml-escaping, and the ability to control whether or not escaping will happen upon serialization - there is no good reason NOT to escape illegal XML characters.
History
#1 Updated by Rob Nahf over 8 years ago
- Status changed from In Progress to Testing
- % Done changed from 30 to 50
removed optional escaping from SolrField, MergeSolrField, and CommonBaseSolrField classes, and also the isEscapeXml and setEscapeXml methods from SolrField and SolrElementField. Escaping illegal characters is done in every case now, using the StringEscapeUtils.escapeXml11 method.
I researched interactions with CDATA and confirmed that characters inside CDATA must follow the same character-escaping rules as characters outside CDATA, except for the original 5 markup characters [<>&'"].
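The CDATA observation can be checked with a small sketch that uses the JDK's built-in XML parser. Note the assumptions: `CdataCheck` and `parses` are hypothetical names, and the test documents carry no XML declaration, so they are parsed as XML 1.0 rather than the XML 1.1 rules that escapeXml11 targets. The parser may also print its own fatal-error messages to stderr for the rejected inputs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class CdataCheck {

    // Returns true if the string is accepted as a well-formed XML document,
    // false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Covers SAXParseException for not-well-formed input.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The five markup characters are literal text inside CDATA.
        System.out.println(parses("<f><![CDATA[a < b & \"c\" 'd' > e]]></f>"));

        // A character that is illegal in XML stays illegal inside CDATA.
        System.out.println(parses("<f><![CDATA[a\u0001b]]></f>"));

        // Outside CDATA, an unescaped '<' in text is rejected.
        System.out.println(parses("<f>a < b</f>"));
    }
}
```

Under these assumptions the three checks report accepted, rejected, and rejected: wrapping a value in CDATA relaxes the rules for markup characters only, not for illegal characters.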
#2 Updated by Rob Nahf over 8 years ago
- Status changed from Testing to Closed
- % Done changed from 50 to 100
Applied in changeset d1-python:d1_python|r18298.