string concatenation speed

string concatenation speed

Marc Santhoff
Hi,

In an application of mine, a lot of string splitting and
re-concatenation occurs. Since it uses masses of ANSI strings, this is
a performance problem.

The strings are stored in and loaded from TStringLists, concatenated
by simply using '+', and split with the function copy().

What I'd like to know is: what are the fastest ways of handling ANSI
strings?

Since I'm not good at reading assembler code I have to ask ... and yes,
profiling is planned but not for tomorrow. ;)

TIA,
Marc



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: string concatenation speed

Marco van de Voort
> In an application of mine, a lot of string splitting and
> re-concatenation occurs. Since it uses masses of ANSI strings, this is
> a performance problem.
>
> The strings are stored in and loaded from TStringLists, concatenated
> by simply using '+', and split with the function copy().

You could try to avoid repeated setlengths. However this would require
two passes, something like

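tot:=0;
for i:=0 to strlst.count-1 do  inc(tot,length(strlst[i]));
setlength(targetstring,tot);  // allocate the result once
j:=1;
for i:=0 to strlst.count-1 do
  begin
    s:=strlst[i];  // fetch the item once; s is a local ansistring
    if length(s)>0 then  // s[1] is not valid for an empty string
      move(s[1],targetstring[j],length(s));  // move(source,dest,count)
    inc(j,length(s));
  end;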
 
> What I'd like to know is: what are the fastest ways of handling ANSI
> strings?

- avoid copying of strings if possible
- operate on PChar level in speed-critical parts.
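
As a rough illustration of what "PChar level" can mean, here is a minimal sketch that walks a string through a pointer instead of indexing it. CountChar is a hypothetical helper made up for this example, not something from the application discussed here, and whether it actually beats plain indexing should be measured.

```pascal
program PCharScan;
{$mode objfpc}{$H+}

// Hypothetical helper: counts how often Sep occurs in S by walking a PChar
// from the first character to one past the last.
function CountChar(const S: AnsiString; Sep: Char): Integer;
var
  P, PEnd: PChar;
begin
  Result := 0;
  if S = '' then
    Exit;
  P := PChar(S);           // pointer to the first character, no copy made
  PEnd := P + Length(S);   // one past the last character
  while P < PEnd do
  begin
    if P^ = Sep then
      Inc(Result);
    Inc(P);
  end;
end;

begin
  // Count the field separators in a sample CSV line.
  WriteLn(CountChar('1;2;"red";"blue"', ';'));
end.
```

The same walking pattern can be used to find field boundaries in a CSV line without creating intermediate strings.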

> Since I'm not good at reading assembler code I have to ask ... and yes,
> profiling is planned but not for tomorrow. ;)

Posting more code would help too.


Re: string concatenation speed

Michael Van Canneyt
In reply to this post by Marc Santhoff


On Tue, 21 Jun 2005, Marc Santhoff wrote:

> Hi,
>
> In an application of mine, a lot of string splitting and
> re-concatenation occurs. Since it uses masses of ANSI strings, this is
> a performance problem.
>
> The strings are stored in and loaded from TStringLists, concatenated
> by simply using '+', and split with the function copy().
>
> What I'd like to know is: what are the fastest ways of handling ANSI
> strings?

It's hard to say in general; you would have to supply some example code.

If you're doing things like
  For I:=X to Y do
   S:=S+L[i];  // S string, L list.

then it might be better to do

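  Len:=0;
  For I:=X to Y do
    Inc(Len,Length(L[i]));  // total length of all strings
  SetLength(S,Len);         // allocate the result once
  P:=1;
  For I:=X to Y do
    begin
    T:=L[i];                // fetch each list item only once
    Len:=Length(T);
    if Len>0 then           // T[1] is not valid for an empty string
      Move(T[1],S[P],Len);
    inc(P,Len)
    end;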

This will avoid a lot of calls to uniquestring, get/setlength etc.
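
For reference, the two-pass idea above can be written as a small self-contained program. This is only a sketch: ConcatAll is a name chosen for the example, and it joins the whole list rather than a range X..Y.

```pascal
program TwoPassConcat;
{$mode objfpc}{$H+}

uses
  Classes;

// Hypothetical helper: joins all items of L using a single allocation.
function ConcatAll(L: TStrings): AnsiString;
var
  I, Len, P: Integer;
  T: AnsiString;
begin
  // Pass 1: add up the lengths so the result is allocated only once.
  Len := 0;
  for I := 0 to L.Count - 1 do
    Inc(Len, Length(L[I]));
  SetLength(Result, Len);

  // Pass 2: copy each item to its position in the result.
  P := 1;
  for I := 0 to L.Count - 1 do
  begin
    T := L[I];               // fetch the item once
    Len := Length(T);
    if Len > 0 then          // T[1] is not valid for an empty string
    begin
      Move(T[1], Result[P], Len);
      Inc(P, Len);
    end;
  end;
end;

var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    L.Add('abc');
    L.Add('');
    L.Add('de');
    WriteLn(ConcatAll(L));   // prints abcde
  finally
    L.Free;
  end;
end.
```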

Also, keep in mind that getting the I-th string from a list is an expensive operation.

So it is better to do
   T:=L[i];
   S:=Copy(T,X,Length(T)-X);
than to do
   S:=Copy(L[i],X,Length(L[i])-X);

The first option will call GetString only once, the second will call it twice.

This is the kind of thing you should pay attention to.

Michael.


Re: string concatenation speed

Marc Santhoff
On Tuesday, 21.06.2005, at 19:46 +0200, Michael Van Canneyt wrote:

>
> On Tue, 21 Jun 2005, Marc Santhoff wrote:
>
> > Hi,
> >
> > In an application of mine, a lot of string splitting and
> > re-concatenation occurs. Since it uses masses of ANSI strings, this is
> > a performance problem.
> >
> > The strings are stored in and loaded from TStringLists, concatenated
> > by simply using '+', and split with the function copy().
> >
> > What I'd like to know is: what are the fastest ways of handling ANSI
> > strings?
>
> It's hard to say in general; you would have to supply some example code.
>
> If you're doing things like
>   For I:=X to Y do
>    S:=S+L[i];  // S string, L list.

Somewhat similar to that:

if (sl[4] <> '') then
  if ((sl[4][1] = '"') AND (sl[4][length(sl[4])] = '"'))
    then BaseColor := copy(sl[4],2,length(sl[4])-2)
    else BaseColor := sl[4];

where BaseColor is a field of a class-type object, sl is a TStringList.
I have to look for double quotes here and cut them out if necessary.

Sequences like this are heavily used in loading CSV files. This takes
rather long on a 'small' cpu (Geode 300MHz).

> then it might be better to do
>
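>   Len:=0;
>   For I:=X to Y do
>     Inc(Len,Length(L[i]));  // total length of all strings
>   SetLength(S,Len);         // allocate the result once
>   P:=1;
>   For I:=X to Y do
>     begin
>     T:=L[i];                // fetch each list item only once
>     Len:=Length(T);
>     if Len>0 then           // T[1] is not valid for an empty string
>       Move(T[1],S[P],Len);
>     inc(P,Len)
>     end;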
>
> This will avoid a lot of calls to uniquestring, get/setlength etc.

Another snippet, especially for concatenation, is this:

result := IntToStr(ID) + SEP +
                IntToStr(ID_Customer) + SEP +
                QT + Treatment + QT + SEP +
                DateToStr(Date) + SEP +
                QT + BaseColor + QT + SEP +
... and so on for approx. 15 fields

I think I'll try your technique here ...
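
Applied to a row of fields, the preallocation technique might look like the sketch below. Join is a hypothetical helper written for this example; it takes fields that are already converted and quoted, and it has not been measured against the plain '+' version.

```pascal
program JoinFields;
{$mode objfpc}{$H+}

// Hypothetical helper: joins Fields with Sep between them, allocating the
// result once instead of growing it with every '+'.
function Join(const Fields: array of AnsiString; Sep: Char): AnsiString;
var
  I, Len, P: Integer;
begin
  Result := '';
  if Length(Fields) = 0 then
    Exit;

  // Total length: one separator between each pair of fields, plus the fields.
  Len := High(Fields);
  for I := 0 to High(Fields) do
    Inc(Len, Length(Fields[I]));
  SetLength(Result, Len);

  P := 1;
  for I := 0 to High(Fields) do
  begin
    if I > 0 then
    begin
      Result[P] := Sep;
      Inc(P);
    end;
    Len := Length(Fields[I]);
    if Len > 0 then          // an empty field contributes nothing
    begin
      Move(Fields[I][1], Result[P], Len);
      Inc(P, Len);
    end;
  end;
end;

begin
  WriteLn(Join(['17', '42', '"red"'], ';'));   // prints 17;42;"red"
end.
```

With about 15 fields per row, this replaces a chain of intermediate strings with one allocation per row, although IntToStr and DateToStr still create their own temporaries.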

> Also, keep in mind that getting the I-th string from a list is an expensive operation.
>
> So it is better to do
>    T:=L[i];
>    S:=Copy(T,X,Length(T)-X);
> than to do
>    S:=Copy(L[i],X,Length(L[i])-X);
>
> The first option will call GetString only once, the second will call it twice.

This will help me, I'm sure. :)

Thank you so far,
Marc




Re: string concatenation speed

Marc Santhoff
In reply to this post by Marco van de Voort
On Tuesday, 21.06.2005, at 19:21 +0200, Marco van de Voort wrote:

> > In an application of mine, a lot of string splitting and
> > re-concatenation occurs. Since it uses masses of ANSI strings, this is
> > a performance problem.
> >
> > The strings are stored in and loaded from TStringLists, concatenated
> > by simply using '+', and split with the function copy().
>
> You could try to avoid repeated setlengths. However this would require
> two passes, something like
>
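> tot:=0;
> for i:=0 to strlst.count-1 do  inc(tot,length(strlst[i]));
> setlength(targetstring,tot);  // allocate the result once
> j:=1;
> for i:=0 to strlst.count-1 do
>   begin
>     s:=strlst[i];  // fetch the item once; s is a local ansistring
>     if length(s)>0 then  // s[1] is not valid for an empty string
>       move(s[1],targetstring[j],length(s));  // move(source,dest,count)
>     inc(j,length(s));
>   end;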
>  

So in general, reserving buffer space, copying all strings in, and
finally setting the length would be the way to go here, if I understand
correctly.

> Posting more code would help too.

See my other mail.

Thanks,
Marc




Re: string concatenation speed

Marco van de Voort
>
> So in general, reserving buffer space, copying all strings in, and
> finally setting the length would be the way to go here, if I understand
> correctly.

Yes, and avoid repeated use of stringlist[x]. First assign to a local var,
then use for length() and normal use, as Michael already noted.
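
A minimal sketch of that advice, applied to the quote-stripping case from earlier in the thread. StripQuotes is a hypothetical helper, and it differs slightly from the posted code: it requires at least two characters before treating the first and last as a quote pair, and it returns an empty string for empty input instead of leaving the target untouched.

```pascal
program StripQuotesDemo;
{$mode objfpc}{$H+}

uses
  Classes;

// Hypothetical helper: removes one pair of surrounding double quotes.
// The caller passes the list item once, so the list getter runs only once.
function StripQuotes(const T: AnsiString): AnsiString;
var
  Len: Integer;
begin
  Len := Length(T);          // length is read once and reused
  if (Len >= 2) and (T[1] = '"') and (T[Len] = '"') then
    Result := Copy(T, 2, Len - 2)
  else
    Result := T;
end;

var
  sl: TStringList;
  BaseColor: AnsiString;
begin
  sl := TStringList.Create;
  try
    sl.Add('"red"');
    BaseColor := StripQuotes(sl[0]);   // sl[0] is fetched exactly once
    WriteLn(BaseColor);                // prints red
  finally
    sl.Free;
  end;
end.
```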
 


Re: string concatenation speed

Leonhard Holz
In reply to this post by Marc Santhoff
Hi,

>
> if (sl[4] <> '') then
> if ((sl[4][1] = '"') AND (sl[4][length(sl[4])] = '"'))
> then BaseColor := copy(sl[4],2,length(sl[4])-2) else BaseColor :=
> sl[4];

You can tweak this a little by storing sl[4] and length(sl[4]) in local
vars, but the whole approach tends to be slow. If you want it faster,
don't allocate memory ("copy" & ":=" on the stack) and don't move string
data. This could be done by "translating" BaseColor to an int or so.
In either case you should not use the StringList - read out BaseColor
(and the other fields) directly from the source, skipping the " while reading.

> Another snippet, especially for concatenation, is this:
>
> result := IntToStr(ID) + SEP +
> IntToStr(ID_Customer) + SEP +
> QT + Treatment + QT + SEP +
> DateToStr(Date) + SEP +
> QT + BaseColor + QT + SEP +
> ... and so on for approx. 15 fields

Same as above - try to avoid the concatenation. What do you do with
result? If you write it to a file, write it directly. If you echo it
somewhere, echo it directly. If you pass it to another function, make it
  a record with pointers to the string data.
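
"Write it directly" could look roughly like the sketch below, which sends each field straight to a text file variable without building a row string first. WriteRow and the sample values are made up for this example, SEP and QT are assumed to be single characters, and only four of the fields are shown.

```pascal
program DirectWrite;
{$mode objfpc}{$H+}

const
  SEP = ';';   // assumed separator
  QT  = '"';   // assumed quote character

// Hypothetical helper: writes one CSV row field by field, so no
// intermediate concatenated string is built.
procedure WriteRow(var F: Text; ID, ID_Customer: Integer;
  const Treatment, BaseColor: AnsiString);
begin
  WriteLn(F, ID, SEP, ID_Customer, SEP,
          QT, Treatment, QT, SEP,
          QT, BaseColor, QT);
end;

begin
  // Standard output is used here; a file opened with Assign and Rewrite
  // can be passed in the same way.
  WriteRow(Output, 1, 17, 'polish', 'red');   // prints 1;17;"polish";"red"
end.
```

Text files are buffered by the runtime, so many small Write calls do not normally translate into many small system calls.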

Leo


Re: string concatenation speed

Marc Santhoff
On Tuesday, 21.06.2005, at 21:19 +0200, Leonhard Holz wrote:
> Hi,

Hi Leo,

> >
> > if (sl[4] <> '') then
> > if ((sl[4][1] = '"') AND (sl[4][length(sl[4])] = '"'))
> > then BaseColor := copy(sl[4],2,length(sl[4])-2) else BaseColor :=
> > sl[4];
>
> You can tweak this a little by storing sl[4] and length(sl[4]) in local
> vars, but the whole approach tends to be slow. If you want it faster,
> don't allocate memory ("copy" & ":=" on the stack) and don't move string
> data. This could be done by "translating" BaseColor to an int or so.
> In either case you should not use the StringList - read out BaseColor
> (and the other fields) directly from the source, skipping the " while reading.

I'm trying local temporary vars next, time measuring code is already in.

> > Another snippet, especially for concatenation, is this:
> >
> > result := IntToStr(ID) + SEP +
> > IntToStr(ID_Customer) + SEP +
> > QT + Treatment + QT + SEP +
> > DateToStr(Date) + SEP +
> > QT + BaseColor + QT + SEP +
> > ... and so on for approx. 15 fields
>
> Same as above - try to avoid the concatenation. What do you do with
> result? If you write it to a file, write it directly. If you echo it
> somewhere, echo it directly. If you pass it to another function, make it
>   a record with pointers to the string data.

Since I'm lazy and love the TStringList since Delphi 1 I do not write at
all myself but let the stringlist do it (saveToFile / loadFromFile). ;)

But I have to validate my habits anew in this case.

Thank you, I stand corrected,
Marc




Re: string concatenation speed

Luiz Americo Pereira Camara-2
In reply to this post by Michael Van Canneyt
Michael Van Canneyt wrote:

>
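>   Len:=0;
>   For I:=X to Y do
>     Inc(Len,Length(L[i]));  // total length of all strings
>   SetLength(S,Len);         // allocate the result once
>   P:=1;
>   For I:=X to Y do
>     begin
>     T:=L[i];                // fetch each list item only once
>     Len:=Length(T);
>     if Len>0 then           // T[1] is not valid for an empty string
>       Move(T[1],S[P],Len);
>     inc(P,Len)
>     end;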
>

This is the behavior you get using the TStrings.Text property.

See the TStrings.GetTextStr function.
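
A small usage sketch of that property follows. One caveat, based on the default behavior of TStrings as far as it is known here: Text puts a line ending after each item, so the result is not byte-identical to a plain concatenation of the items.

```pascal
program TextProperty;
{$mode objfpc}{$H+}

uses
  Classes;

var
  L: TStringList;
  S: AnsiString;
begin
  L := TStringList.Create;
  try
    L.Add('abc');
    L.Add('de');
    S := L.Text;   // all items in one string, each followed by a line ending
    Write(S);
  finally
    L.Free;
  end;
end.
```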

Luiz
