LowerCase vs. UnicodeLowerCase

LacaK
Hi,

I have this simple program (on Windows):

uses
   Classes, SysUtils;

var
   US: UnicodeString;

begin
   US := 'ÁÉÍ';
   writeln(LowerCase(US)); // prints ÁÉÍ
   writeln(UnicodeLowerCase(US)); // prints áéí
end.

Why does the first call, LowerCase(), do nothing while the second works
as expected?

Looking at the sources, there is a UnicodeString version of LowerCase(),
and in both cases widestringmanager.LowerUnicodeStringProc is called ...

Aha ... (I got it while writing this email)
There is a Lowercase(Const S : UnicodeString) : UnicodeString; in both
System and SysUtils!
And LowerCase in SysUtils works only for A-Z, while the one in System
goes through the widestringmanager.

So
   writeln(System.LowerCase(US)); // prints áéí

Now I understand what happens, but isn't it a bit confusing?
(Since a lot of programs have SysUtils in their uses clause, the SysUtils
version overrides the System version?)

-Laco.

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
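
For reference, the three calls from the post above can be put side by side with unit-qualified names. This is a minimal sketch, not code from the thread: the program name, the {$codepage utf8} directive (which assumes the source file is saved as UTF-8) and the conditional cwstring unit (assumed to be needed on Unix-like systems to install a widestring manager) are additions. The results noted in the comments are the ones reported in the post for Windows; they may differ with other FPC versions or console code pages.

```pascal
program LowerCaseDemo;   // hypothetical program name

{$mode objfpc}{$H+}
{$codepage utf8}         // assumption: this source file is saved as UTF-8

uses
  {$ifdef unix}cwstring,{$endif}  // assumption: provides a widestring manager on Unix
  SysUtils;

var
  US: UnicodeString;

begin
  US := 'ÁÉÍ';

  // Unit-qualified calls do not depend on the order of the uses clause.
  WriteLn(SysUtils.LowerCase(US));  // reported in the thread: only A..Z are converted
  WriteLn(System.LowerCase(US));    // reported in the thread: uses the widestring manager
  WriteLn(UnicodeLowerCase(US));    // reported in the thread: uses the widestring manager
end.
```

Qualifying the call makes the choice explicit, whichever units happen to be listed in the uses clause.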

Re: LowerCase vs. UnicodeLowerCase

Graeme Geldenhuys-6
On 2016-10-05 08:11, LacaK wrote:
> Now I understand what happens, but isn't it a bit confusing?

Yup, I would agree, and if possible, one should be removed.

Regards,
  Graeme

Re: LowerCase vs. UnicodeLowerCase

Virgo Pärna
In reply to this post by LacaK
On Wed, 5 Oct 2016 09:11:51 +0200, LacaK <[hidden email]> wrote:
>    writeln(LowerCase(US)); // prints ÁÉÍ
>    writeln(UnicodeLowerCase(US)); // prints áéí
>
> Why first LowerCase() does nothing while second works as expected?
>

SysUtils.LowerCase is only supposed to work on ASCII characters.
From the manual:
-----------------------------------------------------------------------
LowerCase returns the lowercase equivalent of S. Ansi characters are not
taken into account, only ASCII codes below 127 are converted. It is
completely equivalent to the lowercase function of the system unit, and
is provided for compatibility only.
-----------------------------------------------------------------------
But that last sentence suggests that System.LowerCase should work the
same way. But its manual entry says:
-----------------------------------------------------------------------
Lowercase returns the lowercase version of its argument C. If its
argument is a string, then the complete string is converted to
lowercase. The type of the returned value is the same as the type of the
argument.
-----------------------------------------------------------------------
So, which version is the correct one? Delphi's LowerCase also only
works on ASCII characters.

--
Virgo Pärna
[hidden email]
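
To make "only ASCII codes are converted" concrete, here is a rough sketch of what an ASCII-only conversion does to a UnicodeString. AsciiLowerCase is a hypothetical helper written for illustration; it is not the RTL implementation of SysUtils.LowerCase, only an approximation of the documented behaviour.

```pascal
program AsciiLowerSketch;   // hypothetical program name

{$mode objfpc}{$H+}

// Hypothetical helper: lowercases A..Z only and leaves every other
// code unit (including accented letters) untouched.
function AsciiLowerCase(const S: UnicodeString): UnicodeString;
var
  I, C: Integer;
begin
  Result := S;
  for I := 1 to Length(Result) do
  begin
    C := Ord(Result[I]);
    if (C >= Ord('A')) and (C <= Ord('Z')) then
      Result[I] := WideChar(C + 32);   // 'A'..'Z' -> 'a'..'z'
  end;
end;

var
  US, R: UnicodeString;

begin
  US := 'AbC';
  US := US + WideChar($00C1);   // append U+00C1 (capital A with acute)
  R := AsciiLowerCase(US);
  WriteLn(Copy(R, 1, 3));       // prints abc
  WriteLn(Ord(R[4]));           // prints 193: U+00C1 was left unchanged
end.
```

A locale-aware conversion would map U+00C1 to U+00E1 instead, which is the difference the two manual entries describe.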

Re: LowerCase vs. UnicodeLowerCase

LacaK
One problem is that the documentation of SysUtils.LowerCase is not
correct with regard to System.LowerCase
http://www.freepascal.org/docs-html/rtl/sysutils/lowercase.html :
"... It is completely equivalent to the *lowercase function of the
system unit*"

But another problem, which I have pointed out, is that the UnicodeString
version of SysUtils.LowerCase hides System.LowerCase,
while SysUtils.LowerCase converts only A..Z but System.LowerCase uses the
widestringmanager and so also converts accented characters.
So System.LowerCase is a superset of SysUtils.LowerCase, but in most
programs it is hidden by SysUtils.

What seems logical to me is the opposite: the System unit has the basic
functionality, which is extended by SysUtils.
Now the System unit has an extended capability that is hidden by the
basic functionality of SysUtils ;-)

-Laco.

Re: LowerCase vs. UnicodeLowerCase

Santiago A.
In reply to this post by Graeme Geldenhuys-6
On 05/10/2016 at 9:23, Graeme Geldenhuys wrote:
> On 2016-10-05 08:11, LacaK wrote:
>> Now I understand what happens, but isn't it a bit confusing?
> Yup, I would agree, and if possible, one should be removed.

Yes and no.

Yes, probably in this case one of them should be removed, but the reason
why it's confusing is still there.

If you declare two functions with an identical name in the same unit, you
get a "redeclared" error, unless you add the overload directive.
If you declare two functions with an identical name in different units,
the declaration in the later "used" unit overrides (hides) the earlier
one without any warning.

This behavior sometimes leads to unexpected compiler errors that stop you
saying "what the...#@&?" for some time. Sometimes minutes, sometimes
until you get an answer from a forum. Handle types are a common
case. And compiler errors are the nice case: if there are no compiler
errors because both declarations are compatible, you get unexpected
behavior that drives you nuts, as in this case.

I think that "automatic overriding" has been a wrong design since the
first Turbo Pascal and should be fixed. The need to override system
functions like memory managers is a corner case to handle, not a reason
to leave the unexpected hiding of declarations unsolved.


--
Regards

Santiago A.
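
The silent hiding described above can be reproduced in a single file. In this sketch the program's own scope stands in for "the unit listed last"; the program name and the local UpperCase function are hypothetical and exist only for illustration. The local function is deliberately declared without the overload directive.

```pascal
program HidingSketch;   // hypothetical program name

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical function with the same name as SysUtils.UpperCase.
// It is declared without `overload`, so it hides the unit's version
// for unqualified calls, and the compiler gives no warning.
function UpperCase(const S: string): string;
begin
  Result := '<' + S + '>';
end;

var
  S: string;

begin
  S := 'abc';
  WriteLn(UpperCase(S));            // prints <abc>: the nearest declaration wins
  WriteLn(SysUtils.UpperCase(S));   // prints ABC: the qualified name reaches the hidden one
end.
```

Both calls compile cleanly, which is exactly why this kind of mix-up is hard to spot.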

Re: LowerCase vs. UnicodeLowerCase

Marco van de Voort
In our previous episode, Santiago A. said:
> I think that "automatic overriding" is a wrong design from the first
> turbo pascal and should be fixed. The need of overriding system
> functions like memory managers is a corner case to treat, not a reason
> to not solve the unexpected hide of declarations.

Well, the problem mainly exists because you don't have any control over
importing symbols from a unit other than "all".

A different route would be to fix that (like Modula2, or "qualified"
importing as in, again, M2 but also GPC).

If you only use a few functions from an unit with many identifiers (like
the various OS headers units e.g.  windows), you can go to qualified for
from.. importing.

Re: LowerCase vs. UnicodeLowerCase

Santiago A.
On 11/10/2016 at 10:03, Marco van de Voort wrote:

> In our previous episode, Santiago A. said:
>> I think that "automatic overriding" is a wrong design from the first
>> turbo pascal and should be fixed. The need of overriding system
>> functions like memory managers is a corner case to treat, not a reason
>> to not solve the unexpected hide of declarations.
> Well, the problem mainly exists because you don't have any control over
> importing symbols from an unit other than "all".
>
> A different route would to fix that (like Modula2 or "qualified" importing
> like in again M2 but also GPC).
My two cents:

Whenever there is a conflict, an ambiguity, you must fully qualify the
identifier, otherwise the compiler will complain.
For the special cases, when you need to hide the declaration, you could
use the directive "override" for global declarations (types, functions,
procedures, consts, vars), so the compiler knows that it must ignore the
previous declaration if the identifier is not fully qualified. Just like
it does now.

For compatibility, you could add a compiler switch, e.g.
{$CHECK_AMBIGUITY ON/OFF}

--
Regards

Santiago A.

Re: LowerCase vs. UnicodeLowerCase

Marcos Douglas B. Santos
On Tue, Oct 11, 2016 at 7:21 AM, Santiago A. <[hidden email]> wrote:

> My two cents:
>
> Whenever there is a conflict, an ambiguity, you must full qualify the
> identifier otherwise the compiler will complain.
> For the special cases, when you need to hide the declaration, you could
> use the directive "override" for global declarations (types, functions,
> procedures, const, vars) , so the compiler knows that it must ignore
> previous declaration if the identifier is not full qualified. Just like
> it does now.
>
> For compatibility issues, you could add a compiler check, i.e.
> {$CHECK_AMBIGUITY ON/OFF}

You're right, indeed that's a good idea.

But what I have done, for years, is import my units in order of
priority, for example:
<fpc units>
<laz units>
<3rd units>
<my libs units>
<my domain units>
...and so on.

The compiler will resolve an identifier by searching this list from
bottom to top.

Best regards,
Marcos Douglas

Re: LowerCase vs. UnicodeLowerCase

Santiago A.
On 11/10/2016 at 14:15, Marcos Douglas wrote:
>
> But what I do, for years, is import my units using an order by
That's what everybody does, otherwise you run into trouble.

The problem is when the order changes unnoticed.
The problem is when you use non-standard packages and you don't know
what's inside, and there is a function or type that hides a well-known
function.

Fortunately it doesn't happen often, but when it happens, it can
keep you hunting phantoms for a long time. And that's what I think
should be solved.

(In Delphi) Once I came across a unit with a procedure
move(Source:TDataxxx;var Dest:TDataxxx), and I was using the standard
move(var source,dest;count:integer); the compiler reported mismatched
argument types. I wasted 15 minutes checking the help etc., scratching
my head over the same error. I've read several times in blogs questions
about errors related to THandle, because it's not unusual for a unit to
declare a type THandle. In fact, it's logical that different units use
the same names for similar concepts.

Being aware of the order is a workaround that works, but it doesn't mean
that relying on the order is the best idea.


--
Regards

Santiago A.
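
The Move anecdote above can be sketched in one file. The record type TData, the two-parameter Move and the program name are hypothetical stand-ins for the third-party unit in the story; here the program's own scope plays the role of that unit. The commented-out call is the one that would presumably be rejected, since the unqualified name now refers to the two-parameter procedure.

```pascal
program MoveHidingSketch;   // hypothetical program name

{$mode objfpc}{$H+}

type
  TData = record            // hypothetical record type
    Value: Integer;
  end;

// Hypothetical procedure sharing its name with System.Move.
// Declared without `overload`, it hides System.Move for unqualified calls.
procedure Move(const Source: TData; var Dest: TData);
begin
  Dest := Source;
end;

var
  A, B: TData;
  Buf1, Buf2: array[0..3] of Byte;

begin
  A.Value := 42;
  Move(A, B);                 // resolves to the procedure declared above
  WriteLn(B.Value);           // prints 42

  FillChar(Buf1, SizeOf(Buf1), 7);
  // Move(Buf1, Buf2, SizeOf(Buf1));   // assumption: rejected, wrong parameters for the local Move
  System.Move(Buf1, Buf2, SizeOf(Buf1));   // the qualified name reaches the standard routine
  WriteLn(Buf2[3]);           // prints 7
end.
```

The error message in such a case points at the call site, not at the declaration that caused the hiding, which matches the head-scratching described above.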

Re: LowerCase vs. UnicodeLowerCase

Marcos Douglas B. Santos
On Tue, Oct 11, 2016 at 11:57 AM, Santiago A. <[hidden email]> wrote:
> Being aware of the order is a workaround that works, but it doesn't mean
> that relying on the order is the best idea.

Of course, no doubt about it.

Regards,
Marcos Douglas

Re: LowerCase vs. UnicodeLowerCase

Marco van de Voort
In reply to this post by Santiago A.
In our previous episode, Santiago A. said:
> >
> > A different route would to fix that (like Modula2 or "qualified" importing
> > like in again M2 but also GPC).
> My two cents:
>
> Whenever there is a conflict, an ambiguity, you must full qualify the
> identifier otherwise the compiler will complain.

I meant: If an identifier is common, you can then force it to be qualified in the
original unit, and solve it for all cases.