optimization for strlicomp()

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

optimization for strlicomp()

Free Pascal - General mailing list

function strlicomp(str1,str2 : pwidechar;l : SizeInt) : SizeInt;
  var
   counter: sizeint;
   c1, c2: char;
  begin
    counter := 0;
    if l=0 then
      begin
        strlicomp := 0;
        exit;
      end;
    repeat
      c1:=simplewideupcase(str1[counter]);
      c2:=simplewideupcase(str2[counter]);
      if (c1=#0) or (c2=#0) then break;
      inc(counter);
    until (c1<>c2) or (counter>=l);
    strlicomp:=ord(c1)-ord(c2);
  end;

suggestions:

a)
      if (c1=#0) or (c2=#0) then break;
->
 if c1=#0 then break;
 if c2=#0 then break;
 
b)
embed upcase to avoid CALL
      c1:=simplewideupcase(str1[counter]);
      c2:=simplewideupcase(str2[counter]);
->
 c1:= str1[counter];
 c2:= str2[counter];
 if (c1>='a') and (c1<='z') then dec(c1, 32);
 if (c2>='a') and (c2<='z') then dec(c2, 32);

c) not sure..
we have 2 same diff's
c1<>c2
and
ord(c1)-ord(c2)

Alexey Torgashin

_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: optimization for strlicomp()

Michael Van Canneyt


On Sun, 31 May 2020, Alexey Tor. via fpc-pascal wrote:

> function strlicomp(str1,str2 : pwidechar;l : SizeInt) : SizeInt;
>   var
>    counter: sizeint;
>    c1, c2: char;
>   begin
>     counter := 0;
>     if l=0 then
>       begin
>         strlicomp := 0;
>         exit;
>       end;
>     repeat
>       c1:=simplewideupcase(str1[counter]);
>       c2:=simplewideupcase(str2[counter]);
>       if (c1=#0) or (c2=#0) then break;
>       inc(counter);
>     until (c1<>c2) or (counter>=l);
>     strlicomp:=ord(c1)-ord(c2);
>   end;
>
> suggestions:
>
> a)
>       if (c1=#0) or (c2=#0) then break;
> ->
>  if c1=#0 then break;
>  if c2=#0 then break;
Why do you think this is faster ? If boolean shortcut evaluation is used, it
amounts to the same.

>
> b)
> embed upcase to avoid CALL
>       c1:=simplewideupcase(str1[counter]);
>       c2:=simplewideupcase(str2[counter]);
> ->
>  c1:= str1[counter];
>  c2:= str2[counter];
>  if (c1>='a') and (c1<='z') then dec(c1, 32);
>  if (c2>='a') and (c2<='z') then dec(c2, 32);
I think the correct solution is to use a correct widestring upcase, not the simple
one.

>
> c) not sure..
> we have 2 same diff's
> c1<>c2
> and
> ord(c1)-ord(c2)

That can be changed, but the gain is almost zero as the second one is
outside of the loop anyway..

Michael.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: optimization for strlicomp()

Marco van de Voort-2

Op 2020-06-01 om 09:22 schreef Michael Van Canneyt:
>
> I think the correct solution is to use a correct widestring upcase,
> not the simple
> one.
>
I think he means to have a simple shortcut for some known <127
codepoints, to cut widestring uppercase for long trivial routines. It's
not a bad idea IMHO.


_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: optimization for strlicomp()

Jonas Maebe-3
In reply to this post by Michael Van Canneyt
On 01/06/2020 09:22, Michael Van Canneyt wrote:

>
>
> On Sun, 31 May 2020, Alexey Tor. via fpc-pascal wrote:
>
>> b)
>> embed upcase to avoid CALL
>>       c1:=simplewideupcase(str1[counter]);
>>       c2:=simplewideupcase(str2[counter]);
>> ->
>>  c1:= str1[counter];
>>  c2:= str2[counter];
>>  if (c1>='a') and (c1<='z') then dec(c1, 32);
>>  if (c2>='a') and (c2<='z') then dec(c2, 32);
>
> I think the correct solution is to use a correct widestring upcase, not
> the simple one.

No, strlicomp is defined as not taking internationalisation into
account:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.StrLIComp


Jonas
_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: optimization for strlicomp()

Benito van der Zander
In reply to this post by Free Pascal - General mailing list
Hi,

I had a custom case-insensitive compare function, and it was very slow.

Then I benchmarked it and noticed, case-insensitiveness is rarely needed in practice. 

Then I changed it to something like:


      c1:=str1[counter];
      c2:=str2[counter];
      if c1 <> c2 then begin
          c1:=simplewideupcase(c1);
          c2:=simplewideupcase(c2);
      end;

and it was like 10 times faster

Best,
Benito 
On 31.05.20 14:18, Alexey Tor. via fpc-pascal wrote:

function strlicomp(str1,str2 : pwidechar;l : SizeInt) : SizeInt;
  var
   counter: sizeint;
   c1, c2: char;
  begin
    counter := 0;
    if l=0 then
      begin
        strlicomp := 0;
        exit;
      end;
    repeat
      c1:=simplewideupcase(str1[counter]);
      c2:=simplewideupcase(str2[counter]);
      if (c1=#0) or (c2=#0) then break;
      inc(counter);
    until (c1<>c2) or (counter>=l);
    strlicomp:=ord(c1)-ord(c2);
  end;

suggestions:

a)
      if (c1=#0) or (c2=#0) then break;
->
 if c1=#0 then break;
 if c2=#0 then break;
 
b)
embed upcase to avoid CALL
      c1:=simplewideupcase(str1[counter]);
      c2:=simplewideupcase(str2[counter]);
->
 c1:= str1[counter];
 c2:= str2[counter];
 if (c1>='a') and (c1<='z') then dec(c1, 32);
 if (c2>='a') and (c2<='z') then dec(c2, 32);

c) not sure..
we have 2 same diff's
c1<>c2
and
ord(c1)-ord(c2)

Alexey Torgashin

_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

_______________________________________________
fpc-pascal maillist  -  [hidden email]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal