[DUG] Unicode [redux]
Jolyon Smith
jsmith at deltics.co.nz
Fri Oct 16 08:58:56 NZDT 2009
> You're fast putting me off upgrading.
I think all I'm doing is voicing the reasons that many people are *already* put off upgrading.
Instead of threatening to withdraw upgrade rights I think Embarcadero could improve upgrade take-up by simply offering a free copy of Delphi 2007 to everyone upgrading to Delphi 2010 from D2006 or earlier.
> Looking at Delphi 6 I see the problem. AnsiUpperCase is defined as
> function AnsiUpperCase(const S: string): string;
>
> when it should have been defined as
> function AnsiUpperCase(const S: Ansistring): Ansistring;
It *was* defined as ANSIUppercase(ANSIString) - remember that "String" back then was synonymous with ANSIString.
The mistake was created when they *left* it as String in Delphi 2009. At that point it should have been changed to ANSIString explicitly, not left as String which is now *UnicodeString*. By not changing it, they did actually change it. If you follow me.
The reason they left it as String was presumably that people calling ANSIUppercase() are doing so because they are working with MBCS strings, so ANSIUppercase() needs (if you are thinking that way) to behave correctly with extended character sets - such as Unicode.
So if you fall into that group of people who have a historical use and experience of ANSIUppercase() then having it behave as a Unicode function makes sense.
But if you're a newbie - either to Delphi or to extended character sets - then what it does is create confusion.
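To make the difference concrete, here is a minimal sketch, assuming Delphi 2009 or later on Windows, where String is UnicodeString. The program name and the test string are just for illustration. I'd expect both comparisons to come out true: Uppercase() only touches 'a'..'z', while ANSIUppercase() also converts the accented character.

```pascal
program UpperCaseDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  S: string; // UnicodeString in Delphi 2009 and later (assumed here)
begin
  // "cafe" with an e-acute, written as a code point to avoid source-encoding issues
  S := 'caf' + #$00E9;

  // UpperCase converts only 'a'..'z', so U+00E9 should be left unchanged
  Writeln(UpperCase(S) = 'CAF' + #$00E9);

  // AnsiUpperCase is locale-aware, so U+00E9 should become U+00C9
  Writeln(AnsiUpperCase(S) = 'CAF' + #$00C9);
end.
```

So despite its name, the "ANSI" routine is the one that does the Unicode-aware work here.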
> I don't see how Uppercase(UnicodeString) or Uppercase(WideString)
> makes any sense?
To my mind it's quite simple:
Uppercase(UnicodeString) - should behave as ANSIUppercase() does (i.e. MBCS/Unicode support)
Uppercase(ANSIString) - should behave as Uppercase() does (i.e. simple ASCII case conversion)
Uppercase(WideString) - is needed simply to enable you to work with WideStrings without incurring unnecessary conversions from UnicodeString to WideString on the way in and out.
ANSIUpperCase() should be deprecated. People working with MBCS ANSI strings are now better off working with UnicodeString. The same probably holds for just about all ANSI() string routines I think (although I've not done an exhaustive investigation into them so there may be some that still make sense).
In fact, imho, *all* the various string routines scattered around SysUtils, StrUtils and the new Character unit should be deprecated and a properly thought out framework of string support units put together to replace them.
(But please, none of those ridiculous .NET-alike class-contained function libraries - Delphi supports first-class functions and doesn't need cumbersome "container" classes.)
> I'm also wondering what the {$H-} compiler directive does now?
> Does it turn a Unicode string into a ShortString?
The answer appears to be: nothing (anymore).
With "Long Strings By Default" = FALSE ({$H-}):
S: String[1]; >> element size = 1
S: String; >> element size = 2
Which isn't what I'd expect, or is it what I *should* expect?
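For anyone who wants to repeat the check, something along these lines should do it. This is only a sketch: the program and variable names are made up, and I'm deliberately not stating the output, since that is exactly what is in question. It just prints the element size of each declaration under {$H-}.

```pascal
program ElementSizeCheck;

{$APPTYPE CONSOLE}
{$H-} // "Long Strings By Default" = FALSE

var
  A: String[1]; // explicit short string
  B: String;    // generic string type, whatever {$H-} makes of it
begin
  A := 'x';
  B := 'x';

  // Size in bytes of a single element of each string
  Writeln('String[1] element size: ', SizeOf(A[1]));
  Writeln('String element size:    ', SizeOf(B[1]));
end.
```

Compiling the same source once with {$H-} and once with {$H+} shows whether the directive changes anything for the plain String declaration.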
Incidentally, the help for the $H directive in Delphi 2010 still says:
"By default {$H+}, Delphi defines the generic string type to be the long AnsiString."
(headslap)