# fast integer multiplication

13 messages
Open this post in threaded view
|

## fast integer multiplication

 Hi, Suppose I have the following code: var    a,b: dword;    c: qword; begin    a := 10000000;    b := 20000000;    c := a * b;    writeln(c); end. Now, although c is large enough to contain the result only the lower dword is filled. I can force correct results by using c := qword(a) * b, but the slow fpc_mul_qword is used. The code generated for the above sample is: # [16] c:=a*b;         movl U_P\$PROJECT1_A,%edx         movl U_P\$PROJECT1_B,%eax         mull %edx         movl \$0,%edx         movl %eax,U_P\$PROJECT1_C         movl %edx,U_P\$PROJECT1_C+4 What I want is the above code, but without the "movl \$0,%edx" instruction. Is there a way to do this (wihtout using fpc_mul_qword). Regards, Vincent. _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 Vincent Snijders wrote: > Hi, > > Suppose I have the following code: > > var >   a,b: dword; >   c: qword; > > begin >   a := 10000000; >   b := 20000000; >   c := a * b; >   writeln(c); > end. > > Now, although c is large enough to contain the result only the lower > dword is filled. I can force correct results by using c := qword(a) * b, > but the slow fpc_mul_qword is used. > > The code generated for the above sample is: > # [16] c:=a*b; >     movl    U_P\$PROJECT1_A,%edx >     movl    U_P\$PROJECT1_B,%eax >     mull    %edx >     movl    \$0,%edx >     movl    %eax,U_P\$PROJECT1_C >     movl    %edx,U_P\$PROJECT1_C+4 > > What I want is the above code, but without the "movl \$0,%edx" > instruction. Is there a way to do this (wihtout using fpc_mul_qword). Only assembler for now. Any suggestions how the compiler could be told to generate such code? _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 Florian Klaempfl a écrit : > Vincent Snijders wrote: > > >>Hi, >> >>Suppose I have the following code: >> >>var >>  a,b: dword; >>  c: qword; >> >>begin >>  a := 10000000; >>  b := 20000000; >>  c := a * b; >>  writeln(c); >>end. >> >>Now, although c is large enough to contain the result only the lower >>dword is filled. I can force correct results by using c := qword(a) * b, >>but the slow fpc_mul_qword is used. >> >>The code generated for the above sample is: >># [16] c:=a*b; >>    movl    U_P\$PROJECT1_A,%edx >>    movl    U_P\$PROJECT1_B,%eax >>    mull    %edx >>    movl    \$0,%edx >>    movl    %eax,U_P\$PROJECT1_C >>    movl    %edx,U_P\$PROJECT1_C+4 >> >>What I want is the above code, but without the "movl \$0,%edx" >>instruction. Is there a way to do this (wihtout using fpc_mul_qword). > > > Only assembler for now. Any suggestions how the compiler could be told > to generate such code? A small function (like the one I asked for a few weeks ago :-) that would be inlined by the compiler? Currently I use this one function UI32x32To64(A,B: Longword): QWord; assembler; register; nostackframe; asm      mull %edx end; It is fast but certainly much less than if it were inlined. MM _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 In reply to this post by Florian Klaempfl-2 Florian Klaempfl a écrit : > Vincent Snijders wrote: > > >>Hi, >> >>Suppose I have the following code: >> >>var >>  a,b: dword; >>  c: qword; >> >>begin >>  a := 10000000; >>  b := 20000000; >>  c := a * b; >>  writeln(c); >>end. >> >>Now, although c is large enough to contain the result only the lower >>dword is filled. I can force correct results by using c := qword(a) * b, >>but the slow fpc_mul_qword is used. >> >>The code generated for the above sample is: >># [16] c:=a*b; >>    movl    U_P\$PROJECT1_A,%edx >>    movl    U_P\$PROJECT1_B,%eax >>    mull    %edx >>    movl    \$0,%edx >>    movl    %eax,U_P\$PROJECT1_C >>    movl    %edx,U_P\$PROJECT1_C+4 >> >>What I want is the above code, but without the "movl \$0,%edx" >>instruction. Is there a way to do this (wihtout using fpc_mul_qword). > > > Only assembler for now. Any suggestions how the compiler could be told > to generate such code? Since you ask the question, I suppose you cannot simply suppress the "movl \$0,%edx" line generation [*]. What about using a compiler directive, something like {\$EXTENDEDMUL ON/OFF} for instance? [*] Note that because of this line, currently FPC has not a "coherent" behaviour. If you write x := y*z with x: Word and y,z: Byte or with x: Longword and y,z: Word, the resulting value x is the exact multiplication result and not the result truncated to the size of the multiplicand and multiplicator type (like it is with x: QWord and y,z: Longword). MM _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 In reply to this post by Florian Klaempfl-2 Florian Klaempfl wrote: >Vincent Snijders wrote: > >   > >>Hi, >> >>Suppose I have the following code: >> >>var >>  a,b: dword; >>  c: qword; >> >>begin >>  a := 10000000; >>  b := 20000000; >>  c := a * b; >>  writeln(c); >>end. >> >>Now, although c is large enough to contain the result only the lower >>dword is filled. I can force correct results by using c := qword(a) * b, >>but the slow fpc_mul_qword is used. >> >>The code generated for the above sample is: >># [16] c:=a*b; >>    movl    U_P\$PROJECT1_A,%edx >>    movl    U_P\$PROJECT1_B,%eax >>    mull    %edx >>    movl    \$0,%edx >>    movl    %eax,U_P\$PROJECT1_C >>    movl    %edx,U_P\$PROJECT1_C+4 >> >>What I want is the above code, but without the "movl \$0,%edx" >>instruction. Is there a way to do this (wihtout using fpc_mul_qword). >>     >> > >Only assembler for now. Any suggestions how the compiler could be told >to generate such code? >   > IMHO it should automatically optimize expressions like:   qword(a) * b where a and b are dwords _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 In reply to this post by Marcel Martin > Florian Klaempfl a écrit : >> Vincent Snijders wrote: >> >> >>>Hi, >>> >>>Suppose I have the following code: >>> >>>var >>>  a,b: dword; >>>  c: qword; >>> >>>begin >>>  a := 10000000; >>>  b := 20000000; >>>  c := a * b; >>>  writeln(c); >>>end. >>> >>>Now, although c is large enough to contain the result only the lower >>>dword is filled. I can force correct results by using c := qword(a) * b, >>>but the slow fpc_mul_qword is used. >>> >>>The code generated for the above sample is: >>># [16] c:=a*b; >>>    movl    U_P\$PROJECT1_A,%edx >>>    movl    U_P\$PROJECT1_B,%eax >>>    mull    %edx >>>    movl    \$0,%edx >>>    movl    %eax,U_P\$PROJECT1_C >>>    movl    %edx,U_P\$PROJECT1_C+4 >>> >>>What I want is the above code, but without the "movl \$0,%edx" >>>instruction. Is there a way to do this (wihtout using fpc_mul_qword). No. At the time a*b is calculated there is nothing known about the size of the result. >> Only assembler for now. Any suggestions how the compiler could be told >> to generate such code? > > Since you ask the question, I suppose you cannot simply > suppress the "movl \$0,%edx" line generation [*]. What about > using a compiler directive, something like {\$EXTENDEDMUL ON/OFF} > for instance? I don't like this. This means we have to add directives for all kind of special cpu features to improve some special corner cases. _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 Peter Vreman wrote: >>Florian Klaempfl a ?crit : >> >>>Vincent Snijders wrote: >>> >>> >>> >>>>Hi, >>>> >>>>Suppose I have the following code: >>>> >>>>var >>>> a,b: dword; >>>> c: qword; >>>> >>>>begin >>>> a := 10000000; >>>> b := 20000000; >>>> c := a * b; >>>> writeln(c); >>>>end. >>>> >>>>Now, although c is large enough to contain the result only the lower >>>>dword is filled. I can force correct results by using c := qword(a) * b, >>>>but the slow fpc_mul_qword is used. >>>> >>>>The code generated for the above sample is: >>>># [16] c:=a*b; >>>>   movl    U_P\$PROJECT1_A,%edx >>>>   movl    U_P\$PROJECT1_B,%eax >>>>   mull    %edx >>>>   movl    \$0,%edx >>>>   movl    %eax,U_P\$PROJECT1_C >>>>   movl    %edx,U_P\$PROJECT1_C+4 >>>> >>>>What I want is the above code, but without the "movl \$0,%edx" >>>>instruction. Is there a way to do this (wihtout using fpc_mul_qword). > > > No. At the time a*b is calculated there is nothing known about the size of > the result. a*b knows that it should do a qword multiplication and it can check if both operands are dword. It could remove the type casts then and do a mulll. > > > >>>Only assembler for now. Any suggestions how the compiler could be told >>>to generate such code? >> >>Since you ask the question, I suppose you cannot simply >>suppress the "movl \$0,%edx" line generation [*]. What about >>using a compiler directive, something like {\$EXTENDEDMUL ON/OFF} >>for instance? > > > I don't like this. This means we have to add directives for all kind of > special cpu features to improve some special corner cases. Me neither. _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 In reply to this post by Peter Vreman Peter Vreman a écrit :  >Marcel Martin wrote : >>Since you ask the question, I suppose you cannot simply >>suppress the "movl \$0,%edx" line generation [*]. What about >>using a compiler directive, something like {\$EXTENDEDMUL ON/OFF} >>for instance? > > > I don't like this. This means we have to add directives for all kind of > special cpu features to improve some special corner cases. But it is a special case because FPC makes it special. As I wrote yesterday, when you write Word := Byte * Byte, what you get is the exact result (not the result mod 2^8), when you write Longword := Word * Word, what you get is the exact result (not the Result mod 2^16), but when you write QWord := Longword * Longword, there, what you get is not (always) the exact result but the result mod 2^32. This is FPC which makes special this case by setting the product high part to 0 (what it doesn't do with other cases). Being said that, I agree with you that adding compiler directives is not a good thing [*]. Here, it would be sufficient to suppress the generation of the line "movl \$0,%edx". Since you do not do so, I suppose there would be some side effects I don't see at the moment, but it would be the simplest solution. [*] If QWord := Longword * Longword gave the exact result, it would allow me to suppress a lot of compiler directives :-) MM _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 On 29 Jul 2005, at 11:36, Marcel Martin wrote: > But it is a special case because FPC makes it special. No, it's simply a consequence of the fact that Pascal performs all   calculations using the native integer type by default. Jonas _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|

## Re: fast integer multiplication

 In reply to this post by Marcel Martin >>>Since you ask the question, I suppose you cannot simply >>>suppress the "movl \$0,%edx" line generation [*]. What about >>>using a compiler directive, something like {\$EXTENDEDMUL ON/OFF} >>>for instance? >> >> I don't like this. This means we have to add directives for all kind of >> special cpu features to improve some special corner cases. > > But it is a special case because FPC makes it special. As I wrote > yesterday, when you write Word := Byte * Byte, what you get is > the exact result (not the result mod 2^8), when you write > Longword := Word * Word, what you get is the exact result (not the > Result mod 2^16), but when you write QWord := Longword * Longword, > there, what you get is not (always) the exact result but the result > mod 2^32. This is FPC which makes special this case by setting the > product high part to 0 (what it doesn't do with other cases). > > Being said that, I agree with you that adding compiler directives > is not a good thing [*]. Here, it would be sufficient to suppress > the generation of the line "movl \$0,%edx". Since you do not do so, > I suppose there would be some side effects I don't see at the > moment, but it would be the simplest solution. > > [*] If QWord := Longword * Longword gave the exact result, it would > allow me to suppress a lot of compiler directives :-) The rules are the same as with int64:=longint*longint. Below is a test. The result is -2 on both fpc and delphi. var   e : int64;   i,j : longint; begin   i:=\$7fffffff;   j:=2;   e:=i*j;   writeln(e); end. _______________________________________________ fpc-pascal maillist  -  [hidden email] http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Open this post in threaded view
|