[DUG] Strange Error in Android

Jolyon Smith jsmith at deltics.co.nz
Fri Dec 6 14:59:56 NZDT 2013


It's a bug alright - in the CheckEncoding function in Xml.XMLDoc.

I love the way the author of that function naively assumes that the 
complete encoding declaration is exactly as long as the encoding value 
plus 12 characters.  That's 9 characters for "encoding=" plus 2 for the 
attribute value quotes and an extra one for an assumed trailing space.

It then chops out a chunk of the XML declaration to remove the encoding 
declaration, but due to the incorrect naive assumptions this results in 
an invalid XML declaration.  e.g. if the original XML declaration was:

<?xml version="1.0" encoding = "UTF-8" ?>

Then it becomes:

<?xml version="1.0"8" ?>
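To make the arithmetic concrete, here is a minimal Python sketch of that 
kind of fixed-length removal.  It is not the Delphi source: the function 
name and the exact starting offset (the space in front of "encoding") 
are my assumptions, chosen only because they reproduce the broken output 
quoted above.

```python
def naive_strip_encoding(decl: str, enc_value: str) -> str:
    """Hypothetical model of the faulty logic, not the real CheckEncoding."""
    pos = decl.find("encoding")
    if pos < 1:
        return decl
    # Assumed: removal starts at the space in front of "encoding" and
    # covers len(value) + 12 characters (9 for 'encoding=', 2 for the
    # quotes, 1 for a single adjacent space).
    start = pos - 1
    count = len(enc_value) + 12
    return decl[:start] + decl[start + count:]


compact = '<?xml version="1.0" encoding="UTF-8" ?>'
spaced = '<?xml version="1.0" encoding = "UTF-8" ?>'

print(naive_strip_encoding(compact, "UTF-8"))  # <?xml version="1.0" ?>
print(naive_strip_encoding(spaced, "UTF-8"))   # <?xml version="1.0"8" ?>
```

The two extra spaces around the "=" push the last two characters of the 
declaration outside the fixed-size window, which is why the stray 8" is 
left behind.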

Sadly this isn't even a new bug.  The exact same naive mistake exists 
even in Delphi 2006 (and possibly earlier, but I don't have an older 
version to check right now).  I suspect that the problem stems from a 
misunderstanding of the formal specification of an attribute name/value 
pair in the XML specification which identifies it as:

   Attribute ::= Name Eq AttValue

Implying at first glance that there is and can be no white-space around 
the Eq (=).  Except that if you follow the link for the Eq production, 
this is defined as

   Eq ::= S? '=' S?

i.e. "Eq" means "an equal sign optionally surrounded by any amount of 
white space"
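For comparison, a whitespace-tolerant removal might look something like 
the Python sketch below.  The pattern is my own transcription of the 
EncodingDecl, Eq and EncName productions from the XML 1.0 specification, 
the names are hypothetical, and it is meant to be applied to the XML 
declaration only, not to the whole document.  It is not a proposed patch 
for Xml.XMLDoc.

```python
import re

S = r"[ \t\r\n]"                       # one XML white-space character
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"  # EncName production

# EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'")
ENCODING_DECL = re.compile(
    rf"""{S}+encoding{S}*={S}*(?:"{ENC_NAME}"|'{ENC_NAME}')"""
)


def strip_encoding(decl: str) -> str:
    """Remove the encoding declaration, however it is spaced or quoted."""
    return ENCODING_DECL.sub("", decl, count=1)


print(strip_encoding('<?xml version="1.0" encoding = "UTF-8" ?>'))
print(strip_encoding("<?xml version='1.0' encoding = 'UTF-8' ?>"))
```

Both calls leave a well-formed declaration containing only the version 
attribute, because the match is driven by the grammar rather than by a 
character count.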


The reason it doesn't occur if you load the XML via a file is that the 
code path for loading the XML varies according to whether it is loaded 
from a file, a stringlist, a simple string or a stream (which is a huge 
WTF in itself) and it appears to be only the simple string code path 
which performs this CheckEncoding() song and dance.

To add to the complication, in Delphi 2006 (and probably all pre-Unicode 
versions) the code path for a String differs again from that for a 
WideString.  A String (ANSIString) is loaded via the stream mechanism, 
so in those pre-Unicode versions only a formally declared WideString 
(DOMString) triggers the problem.

Frankly, this code is a mess and it's a miracle that it hasn't caused 
problems and been fixed long before now.


Jeremy Coulter wrote:
> <?xml version='1.0' encoding = 'UTF-8' ?>

