[DUG] Upgrading to XE - Unicode strings questions
John Bird
johnkbird at paradise.net.nz
Tue Nov 23 15:35:47 NZDT 2010
My main remaining question is the best way to handle code that up to now
looked like:
  for i := 1 to Length(string1) do
  begin
    DoSomethingWithOneChar(string1[i]);
  end;
If I got the gist correctly, string1[i] is one UTF-16 code unit (a
WideChar), and length(string1) is the number of those code units, not the
number of characters: a character outside the Basic Multilingual Plane
takes two elements (a surrogate pair). This is gonna be confusing!
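A minimal sketch of one way to make that loop surrogate-aware, assuming
Delphi 2009 or later where string is a UTF-16 UnicodeString. The helper
names (IsHighSurrogateUnit, IsLowSurrogateUnit, UnitsAt, CodePointCount)
are made up for illustration, and DoSomethingWithOneChar is assumed here
to take a string rather than a Char so it can receive a whole pair. It
only deals with surrogate pairs, not with combining sequences.

```pascal
program CodePointLoop;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: True if C is the first half of a surrogate pair.
function IsHighSurrogateUnit(C: Char): Boolean;
begin
  Result := (Ord(C) >= $D800) and (Ord(C) <= $DBFF);
end;

// Hypothetical helper: True if C is the second half of a surrogate pair.
function IsLowSurrogateUnit(C: Char): Boolean;
begin
  Result := (Ord(C) >= $DC00) and (Ord(C) <= $DFFF);
end;

// Hypothetical helper: number of string elements (1 or 2) used by the
// character that starts at Index.
function UnitsAt(const S: string; Index: Integer): Integer;
begin
  if (Index < Length(S)) and IsHighSurrogateUnit(S[Index]) and
     IsLowSurrogateUnit(S[Index + 1]) then
    Result := 2
  else
    Result := 1;
end;

// Hypothetical helper: counts code points instead of string elements.
function CodePointCount(const S: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(S) do
  begin
    Inc(i, UnitsAt(S, i));
    Inc(Result);
  end;
end;

// Stand-in for the routine in the original loop; here it receives the
// whole character as a string of one or two elements.
procedure DoSomethingWithOneChar(const Ch: string);
begin
  Writeln('elements in this character: ', Length(Ch));
end;

var
  s: string;
  i, n: Integer;
begin
  // 'A', U+1D11E (stored as the pair D834 DD1E), 'B'
  s := 'A' + #$D834#$DD1E + 'B';

  Writeln(Length(s));          // 4 string elements
  Writeln(CodePointCount(s));  // 3 code points

  i := 1;
  while i <= Length(s) do
  begin
    n := UnitsAt(s, i);
    DoSomethingWithOneChar(Copy(s, i, n));
    Inc(i, n);
  end;
end.
```

For text that stays inside the Basic Multilingual Plane the original for
loop gives the same result, so this only matters where such characters
can actually turn up.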
Other comments:
Comment 1 - I saw quite a few commentators say that they in general approved
of the way that the unicode had been implemented - everything that was ansi
string before is now unicode consistently throughout the whole language and
IDE, and in the main the only code that needs altering is where Delphi is
communicating outside the standard language, i.e.:
- DLL calls
- SaveToFile and LoadFromFile and other file access - even here smart
  defaults have been put in to retain expected behaviour.
- Sending strings to COM/TCP etc. - you might need to convert to the
  encoding the other end expects.
- Database fields - usually handled by making sure the right encoding is
  sent.
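A rough sketch of what being explicit at those boundaries can look like,
assuming Delphi 2009 or later. TEncoding and the SaveToFile/LoadFromFile
overloads that take an encoding are part of the RTL from that version;
the file name and the commented-out DLL routine are invented for the
example.

```pascal
program EncodingBoundaries;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Lines: TStringList;
  Msg: string;
  Utf8Bytes: TBytes;
  AnsiMsg: AnsiString;
begin
  Msg := 'Caf' + #$00E9;  // "Cafe" with an acute accent on the e

  // File access: name the encoding instead of relying on the default.
  Lines := TStringList.Create;
  try
    Lines.Add(Msg);
    Lines.SaveToFile('example.txt', TEncoding.UTF8);   // hypothetical file
    Lines.LoadFromFile('example.txt', TEncoding.UTF8);
  finally
    Lines.Free;
  end;

  // Sending over TCP etc.: convert to the bytes the other end expects.
  Utf8Bytes := TEncoding.UTF8.GetBytes(Msg);
  Writeln(Length(Msg));        // 4 string elements
  Writeln(Length(Utf8Bytes));  // 5 bytes, the accented e takes two

  // DLL that still expects PAnsiChar: convert explicitly. Characters
  // outside the system code page are lost in this step.
  AnsiMsg := AnsiString(Msg);
  // SomeLegacyDllCall(PAnsiChar(AnsiMsg));  // hypothetical import
end.
```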
Comment 2 - The worst inconveniences are for those who have already tried to
do some Unicode-type processing using WideChar and the functions that went
with it. Undoing these changes is usually the best way to cater for
Unicode. Also some of the routines introduced then have horribly
confusing names, like AnsiPos, which despite its name searches wide-character
strings and is still what should be used for searching. It seems to me that
some routines with identical behaviour should be introduced - e.g. called
UnicodePos(.....) - just so that those who are new to Unicode can use at
least a consistently named set of tools. I would probably make routines
named like this, which I would use just to be clear.
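Such a wrapper could be as small as this. UnicodePos is the name proposed
above and is not an RTL routine; the sketch assumes Delphi 2009 or later,
where AnsiPos in SysUtils takes ordinary (Unicode) string parameters.

```pascal
program UnicodeNames;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical alias: same behaviour as AnsiPos, clearer name.
function UnicodePos(const SubStr, S: string): Integer;
begin
  Result := AnsiPos(SubStr, S);
end;

begin
  Writeln(UnicodePos('ode', 'Unicode'));  // 5
  Writeln(UnicodePos('xyz', 'Unicode'));  // 0, not found
end.
```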
Comment 3 - I see a few people arguing that there should have been a
compiler switch to allow compiling to ansistring or unicode string
depending on the compiler switch, to ease converting people to D2009/XE.
There are merits either way on this - in the long term, if everyone is going
to have to live in a Unicode world then it's probably better to bite the
bullet and be made to convert code, as eventually you cannot escape it. In
such a case a simpler compiler and VCL is a big advantage. This is sort
of related to being able to cross compile to 64 bit, iPhone, Android -
whatever way makes it easy to have these forward-looking options. The
quite stark reality is that in 5 years it looks like much, but not all,
commercial software will be running on Windows; it's likely to be a mix of
Web/iPhone/Android/GoogleOS/MacOS, so the forward portability of compiling
Delphi for different environments is way more important than whether it
should be able to do Strings as AnsiString.
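There is no switch that turns string back into AnsiString, but the
compiler does define the UNICODE symbol from Delphi 2009 on, so code that
has to build on both sides of the change can branch on it. A small
sketch:

```pascal
program WhichString;

{$APPTYPE CONSOLE}

begin
{$IFDEF UNICODE}
  // Delphi 2009 and later: string = UnicodeString, Char = WideChar
  Writeln('Unicode string, SizeOf(Char) = ', SizeOf(Char));
{$ELSE}
  // Delphi 2007 and earlier: string = AnsiString, Char = AnsiChar
  Writeln('Ansi string, SizeOf(Char) = ', SizeOf(Char));
{$ENDIF}
end.
```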
Comment 4 - Has anyone at Embarcadero considered two ways to go cross
platform? Option A is to go for a native compiler for each OS -
best if it can be done. Option B is the Java route - compile to intermediate
code for a Delphi Virtual Machine which can run interpreted with a runtime
on many OSes. Could be called the Delphi Virtual VCL Machine. The reason
why this might be a good way to go is that Delphi was originally designed as
a teaching language - i.e. a formally very strongly typed and formally well
structured language - so it could be about the best candidate around for
generalised compiling and a simple cross platform runtime. Also, with
Java now owned by Oracle there are questions over whether it has such a
bright future, and there is room for another similar approach. DotNet is a
similar idea too, but will only ever really be Windows. It might not matter
too much if a Delphi Virtual Machine is slower, as long as it's portable.
[But I digress - The last point is way off topic for Unicode however]
Comment and question 5 - What is the status of Free Pascal/Lazarus wrt
Unicode? Does Delphi XE code port to Free Pascal or not? It's an issue
to consider as well.