Optimizer in 2.0

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Optimizer in 2.0

Adrian Veith
Hi,

I am newbie with fpc (but not with pascal, which I use more than 20y now).
We have lot of existing delphi code, which some of it, I want to port to
new platforms - and fpc looks like the right tool for it. But I am
concerned about the speed. I did some basic benchmarks and it seems,
that the optimizers has no effect in the 2.0 compiler (or the code even
get's slower).

With the 1.9.8 compiler, the same code executes 4 times quicker - what
happend (or what do I wrong) ?

For optimizing I use the -OG3rp3 switch  - is it still valid ?

Thanks,

Adrian.



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Optimizer in 2.0

Florian Klämpfl
Adrian Veith wrote:

> Hi,
>
> I am newbie with fpc (but not with pascal, which I use more than 20y now).
> We have lot of existing delphi code, which some of it, I want to port to
> new platforms - and fpc looks like the right tool for it. But I am
> concerned about the speed. I did some basic benchmarks and it seems,
> that the optimizers has no effect in the 2.0 compiler (or the code even
> get's slower).

This shouldn't be the case in general, can you give examples or post the
benchmark? And don't try with useless code, we don't care about optimziations
which test only useless code which never happens in real programs.

>
> With the 1.9.8 compiler, the same code executes 4 times quicker - what
> happend (or what do I wrong) ?
>
> For optimizing I use the -OG3rp3 switch  - is it still valid ?
>
> Thanks,
>
> Adrian.
>
>
>
> _______________________________________________
> fpc-pascal maillist  -  [hidden email]
> http://lists.freepascal.org/mailman/listinfo/fpc-pascal


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Optimizer in 2.0

Adrian Veith
It shows with useless code like simple nested for .. to loops, but also
with some more useful code like the attached RSA_Angriff from the C'T
magazine.

My results are with:

fpc -Sd -OG3rp3 -XX RSA_Angriff_D5.dpr

5266 ms

with:

fpc -Sd -XX RSA_Angriff_D5.dpr

4844 ms

The unoptimized code is faster than the optimized code.

(BTW. I choosed an example where the Delphi compiler is really worse ;-)
: 65953 ms)

Same picture with the whet.pas (whether or not the whetstone benchmark
is useful) from your source distribution.

With optimization on:  7393.72 MIPS

With optimization off:  8368.20 MIPS

cheers,

Adrian.

Florian Klaempfl schrieb:

>Adrian Veith wrote:
>
>  
>
>>Hi,
>>
>>I am newbie with fpc (but not with pascal, which I use more than 20y now).
>>We have lot of existing delphi code, which some of it, I want to port to
>>new platforms - and fpc looks like the right tool for it. But I am
>>concerned about the speed. I did some basic benchmarks and it seems,
>>that the optimizers has no effect in the 2.0 compiler (or the code even
>>get's slower).
>>    
>>
>
>This shouldn't be the case in general, can you give examples or post the
>benchmark? And don't try with useless code, we don't care about optimziations
>which test only useless code which never happens in real programs.
>
>  
>
>>With the 1.9.8 compiler, the same code executes 4 times quicker - what
>>happend (or what do I wrong) ?
>>
>>For optimizing I use the -OG3rp3 switch  - is it still valid ?
>>
>>Thanks,
>>
>>Adrian.
>>
>>
>>
>>_______________________________________________
>>fpc-pascal maillist  -  [hidden email]
>>http://lists.freepascal.org/mailman/listinfo/fpc-pascal
>>    
>>
>
>
>_______________________________________________
>fpc-pascal maillist  -  [hidden email]
>http://lists.freepascal.org/mailman/listinfo/fpc-pascal
>
>  
>

program RSA_Angriff_D5;
(* Benchmarkversion testet Laufzeitverhalten primitiver Datentypen.
 *  - Konsolenprogramm
 *  - Auswahl zwischen bitorientierter Routine mit aufw?ndiger Rekursion DoPrim2()
 *    und BruteForce-Routine DoPrimBruteForce() schlicht durch Auskommentieren in main()
 *  - zu zerlegende Zahl als BigNumber definiert.
 *
 *  Anmerkung: DoPrimBruteForce() findet auch kleine Primfaktoren,
 *             DoPrim2() ignoriert diese, da RSA-Paare immer gro?e Primfaktoren haben
 *             (thats a bug, but also a feature)
*)

{$APPTYPE CONSOLE}
uses
        SysUtils,
        Windows;

const
  BigNumber: int64 = 34571237124234713;

procedure DoPrimBruteForce(NumberToCrunch: int64);
var i, j, UBound: int64; UBoundC: Comp;
begin
  i := 3;
  // ohne den Umweg ?ber eine Gleitkommazahl
  // schafft Delphi das nicht
  UBoundC := NumberToCrunch;
        UBound := Round(Sqrt(UBoundC));
        while i < UBound do
  begin
    if NumberToCrunch mod i = 0 then
    begin
      j := NumberToCrunch div i;
      Writeln(i, ' * ', j, ' = ', j * i);
    end;
    Inc(i,2);
  end;
end;

procedure DoPrim2(NumberToCrunch: int64);
var UBound, z: int64;

  procedure Prim(
    n: Integer;   // Bitmaske bis m-te Stelle, also 2^m - 1
    m: Integer;   // Stelle (Zweierpotenz)
    i,            // 1. Kandidat f?r Primzahl
    j,            // 2. Kandidat f?r Primzahl
    res: int64);   // bisher auf m Stellen synthetisiertes Produkt
  var
    product: int64;
    z0: int64;
    m1, n1: Integer;
    im, jm: int64;
  begin
                product := i * j;   // ?berlauf? dann Ende der Rekursion

    // Rekursion bricht ab, wenn
    //   - 1. Kandidat gr??er als Wurzel der Zahl
    //   - Produkt zu gro?
    //   - m wegen ?berlauf negativ wird
    if (i < UBound) and (product < z) and (m > 0) then
    begin
                                z0 := z and n;   // Ausfiltern der relevanten Stellen
                                m1 := m shl 1;     // n?chste Stelle
                                n1 := n or m1;     // Bitmaske erweitern

        // Tiefensuche rekursiv!
        // Es gibt vier m?gliche F?lle, je zwei pro Kandidat
        // zwei davon gehen in die Rekursion
        // relevant sind nur m Stellen

                                if (res and n) = z0        // +0, +0 (i, j unver?ndert)
                                        then Prim(n1, m1, i, j, res);

                                im := i or m;     // m-tes Bit der Kandidaten setzen
                                jm := j or m;
        // Testen ob die letzten m Stellen des Kandidatenprodukts stimmen
        res := im * j;
        if (res and n) = z0    // +1, +0 (m-tes Bit von i gesetzt)
          then Prim(n1, m1, im, j, res);
        res := jm * i;         // +0, +1 (m-tes Bit von j gesetzt)
        if (res and n) = z0
                                        then Prim(n1, m1, i, jm, res);
        res := im * jm;        // +1, +1 (m-tes Bit von i,j gesetzt)
                                if (res and n) = z0
                                        then Prim(n1, m1, im, jm, res);
    end else
    begin // Rekursion bricht ab. Stimmt das Produkt etwa?
                        if (i < UBound) and (product = z)
        then Writeln(i, ' * ', j, ' = ', j * i);
    end;
  end; // Prim
var UBoundC: Comp;
begin
  z := NumberToCrunch;
  // Wurzelkriterium: ohne den Umweg ?ber eine Gleitkommazahl
  // schafft Delphi das nicht
  UBoundC := NumberToCrunch;
  UBound := Round(Sqrt(UBoundC));
  Prim(3, 2, 1, 1, 1);              // Start der Rekursion
end;  // DoPrim2


var STime: Cardinal;
begin
  Writeln('Delphi: Zerlege ',BigNumber);
  STime := GetTickCount;
  if (ParamCount = 0) or (AnsiUpperCase(ParamStr(1)) <> '-R') then
  begin
    Writeln('Brute Force (Aufruf mit -R = rekursives Verfahren)');
    DoPrimBruteForce(BigNumber);
  end else
  begin
    Writeln('Rekursiver Ansatz (Aufruf ohne -R = Brute Force)');
    DoPrim2(BigNumber);
  end;

        STime := GetTickCount-STime;
        Writeln('Dauer (msec): ', STime);
        Write('Weiter mit ENTER...');
        Readln;
end.



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Optimizer in 2.0

Peter Vreman
> It shows with useless code like simple nested for .. to loops, but also
> with some more useful code like the attached RSA_Angriff from the C'T
> magazine.
>
> My results are with:
>
> fpc -Sd -OG3rp3 -XX RSA_Angriff_D5.dpr
>
> 5266 ms
>
> with:
>
> fpc -Sd -XX RSA_Angriff_D5.dpr
>
> 4844 ms
>
> The unoptimized code is faster than the optimized code.

Try without regvars. It is a known issue that register variables don't
produce better code on intel cpus because it increase register pressure.






_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Optimizer in 2.0

Florian Klämpfl
In reply to this post by Adrian Veith
Adrian Veith wrote:

> It shows with useless code like simple nested for .. to loops, but also
> with some more useful code like the attached RSA_Angriff from the C'T
> magazine.
>
> My results are with:
>
> fpc -Sd -OG3rp3 -XX RSA_Angriff_D5.dpr
>
> 5266 ms
>
> with:
>
> fpc -Sd -XX RSA_Angriff_D5.dpr
>
> 4844 ms
>
> The unoptimized code is faster than the optimized code.

For me the version compiled with -OG3rp3 is the fastest (AthlonXP). Just
curious, what CPU do you use?

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Optimizer in 2.0

Adrian Veith
Florian Klaempfl schrieb:

>Adrian Veith wrote:
>  
>
>>It shows with useless code like simple nested for .. to loops, but also
>>with some more useful code like the attached RSA_Angriff from the C'T
>>magazine.
>>
>>My results are with:
>>
>>fpc -Sd -OG3rp3 -XX RSA_Angriff_D5.dpr
>>
>>5266 ms
>>
>>with:
>>
>>fpc -Sd -XX RSA_Angriff_D5.dpr
>>
>>4844 ms
>>
>>The unoptimized code is faster than the optimized code.
>>    
>>
>
>For me the version compiled with -OG3rp3 is the fastest (AthlonXP). Just
>curious, what CPU do you use?
>  
>
Intel Pentium M 730.

I will do the test on a Athlon XP as well. Maybe the picture changes.



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal