EpikTimer v1.0.1 released

Re: EpikTimer v1.0.1 released

Michael Schnell
On 05/28/2014 01:47 PM, Reimar Grabowski wrote:
> Where is the f****** problem?

Supposedly none (as I already stated in the Lazarus list).

Thanks !
-Michael




(What I'd like to prevent is that, once again, we win the award for the
best way to discourage widespread use of community-based code:
  - someone asks "what can I use to ...."
  - (s)he gets the name of a perfectly working thingy
  - (s)he searches for it on the internet
  - what (s)he finds is an outdated release and documentation that does
not easily give away whether the thingy is usable for exactly this purpose
  - (s)he takes pains to install it
  - it does not work
  - frustration, even though the current release would have been a
perfect fit.
)


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Marco van de Voort
On 05/28/2014 02:00 PM, Marco van de Voort wrote:
>
> In that case some attention points:
> - help implementing and testing fine grained timings on *nix. Now it only has a special
>    case for linux.
> - Seems high precision is not used on anything but x86.
> - Is rdtsc safe for CPUs that can vary clock of cores independently like
>    Core Mono? What if the process changed  CPU to a different clocked core?

While I of course see your point, besides high resolution, another
benefit of EpikTimer (at least on x86) is low-overhead access to a time
source. And this is what makes it useful for my needs.

The current version on x86 (at least on 32-bit Linux) already uses
the appropriate ASM instruction: perfect (though it supposedly needs some
testing regarding the safety issue you state).

I suppose on any arch Linux vDSO could be used to provide both lowest
possible overhead and highest possible resolution. So this is what I
will try to do some day soon.

-Michael


Re: EpikTimer v1.0.1 released

Henry Vermaak
In reply to this post by Marco van de Voort
On Wed, May 28, 2014 at 02:00:06PM +0200, Marco van de Voort wrote:
> > In fact I do want the best possible stuff and not a fork. I am just
> > trying to help (as I would like to use it in the said current project).
>
> In that case some attention points:
> - help implementing and testing fine grained timings on *nix. Now it only has a special
>   case for linux.
> - Seems high precision is not used on anything but x86.
> - Is rdtsc safe for CPUs that can vary clock of cores independently like
>   Core Mono? What if the process changed  CPU to a different clocked core?

- The rdtsc instruction needs to be protected from out of order
  execution.  Some people use cpuid, which is expensive.  It looks like
  the linux kernel uses mfence or lfence/mfence depending on CPU type.
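A minimal sketch of the fenced read described above, assuming GCC or Clang inline assembly on x86. The function name `read_cycles_fenced` is hypothetical, and whether `lfence` alone is a sufficient barrier depends on the CPU (the kernel chooses between fences for that reason), so treat this as an illustration rather than a vetted implementation. On other architectures the sketch simply falls back to `clock_gettime()`.

```c
#include <stdint.h>
#include <time.h>

/* Read a timestamp while keeping the read from being reordered with
 * the surrounding instructions. On non-x86 targets this falls back to
 * CLOCK_MONOTONIC in nanoseconds. */
static uint64_t read_cycles_fenced(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ __volatile__("lfence\n\t"   /* earlier instructions finish first */
                         "rdtsc\n\t"    /* counter is returned in EDX:EAX */
                         "lfence"       /* later instructions do not start early */
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

Without the fences, the CPU may execute `rdtsc` before the code being measured has finished (or after the following code has started), which skews short measurements.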

Re: EpikTimer v1.0.1 released

Michael Schnell
On 05/28/2014 04:26 PM, Henry Vermaak wrote:
>
> - The rdtsc instruction needs to be protected from out of order
>    execution.  Some people use cpuid, which is expensive.  It looks like
>    the linux kernel uses mfence or lfence/mfence depending on CPU type.
>

... meaning the current version is incorrect ?!?!?

What would be the effect of ignoring this?

Things like this are why I'd rather use vDSO.

-Michael

Re: EpikTimer v1.0.1 released

Henry Vermaak
In reply to this post by Marco van de Voort
On Wed, May 28, 2014 at 02:00:06PM +0200, Marco van de Voort wrote:
> - Is rdtsc safe for CPUs that can vary clock of cores independently like
>   Core Mono? What if the process changed  CPU to a different clocked core?

I've read that on recent CPUs, the TSC is unaffected by the actual clock
rate of the CPU.  On Linux, the TSC gets calibrated and the
synchronisation is tested, which may result in the TSC clock source
being marked as unstable and disabled.  In this case, it will fall back
to using other clock sources (HPET is next in line on my computer).
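To see which clock source the kernel ended up selecting, one can read the sysfs entry it exposes. The sketch below assumes a Linux system with sysfs mounted at `/sys`; the helper name `current_clocksource` is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Copy the name of the kernel's active clock source into buf.
 * Returns 0 on success, -1 if the sysfs file cannot be read
 * (non-Linux system, restricted container, ...). */
static int current_clocksource(char *buf, size_t len)
{
    const char *path =
        "/sys/devices/system/clocksource/clocksource0/current_clocksource";
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(buf, (int)len, f);
    fclose(f);
    if (ok == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

A typical result is `tsc` on machines where the calibration succeeded, or `hpet` / `acpi_pm` where the TSC was rejected. The sibling file `available_clocksource` lists the candidates.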

Henry

Re: EpikTimer v1.0.1 released

Marco van de Voort
In our previous episode, Henry Vermaak said:
> On Wed, May 28, 2014 at 02:00:06PM +0200, Marco van de Voort wrote:
> > - Is rdtsc safe for CPUs that can vary clock of cores independently like
> >   Core Mono? What if the process changed  CPU to a different clocked core?
>
> I've read that on recent CPUs, the TSC is unaffected by the actual clock
> rate of the CPU.

Yes, on Nehalem and newer afaik the clock is on the uncore. But that is
still too new a requirement to assume for general purpose code like RTL and
FCL, if you agree to clip real old stuff.

Both the Core2 generations (Conroe and Wolfdale) are still too common.

> On linux, The TSC gets calibrated and the synchronisation is tested, which
> may result in the TSC clock source being marked as unstable and disabled.
> In this case, it will fall back to using other clock sources (HPET is next
> in line on my computer).

I assume the same system underlies QueryPerformanceCounter and family on
Windows. But that means you need to use OS timing functions, and not ASM.

Re: EpikTimer v1.0.1 released

Henry Vermaak
On Wed, May 28, 2014 at 05:41:08PM +0200, Marco van de Voort wrote:
> In our previous episode, Henry Vermaak said:
> > On linux, The TSC gets calibrated and the synchronisation is tested, which
> > may result in the TSC clock source being marked as unstable and disabled.
> > In this case, it will fall back to using other clock sources (HPET is next
> > in line on my computer).
>
> I assume the same system underlies queryperformancecounter and family on
> Windows. But that means you need to use OS timing functions, and not ASM.

I assumed that, too.  All the clocksource calibration and selection
happens at startup, so by the time you call
clock_gettime()/QueryPerformanceCounter() it knows whether to query the
TSC/HPET/whatever.

Blindly making assumptions about TSC stability can get you into trouble.
Microsoft advises against this, too:

http://msdn.microsoft.com/en-gb/library/windows/desktop/ee417693%28v=vs.85%29.aspx

Henry

Re: EpikTimer v1.0.1 released

Henry Vermaak
In reply to this post by Michael Schnell
On Wed, May 28, 2014 at 04:31:53PM +0200, Michael Schnell wrote:
> On 05/28/2014 04:26 PM, Henry Vermaak wrote:
> >
> >- The rdtsc instruction needs to be protected from out of order
> >   execution.  Some people use cpuid, which is expensive.  It looks like
> >   the linux kernel uses mfence or lfence/mfence depending on CPU type.
> >
>
> ... meaning the current version of incorrect ?!?!?

Indeed.  What's worse is that it _always_ uses the TSC on x86, without
knowing whether it's actually a reliable clock source for the particular
hardware configuration.

I can only recommend never using this component; just use an ifdef to
call QueryPerformanceCounter()/clock_gettime() based on OS.

> Things like this are why I'd rather use vDSO.

Calling the vDSO will certainly make things faster.  I don't know what
the overhead of QueryPerformanceCounter() is.

Henry

Re: EpikTimer v1.0.1 released

Marco van de Voort
In reply to this post by Marco van de Voort
In our previous episode, Marco van de Voort said:
> still to new a requirement to assume for general purpose code like RTL and
> FCL, if you agree to clip real old stuff.

_EVEN_ if you agree to clip real old stuff (read: anything pre-Core2 Intel
like Pentium D and Core Duo)

Re: EpikTimer v1.0.1 released

Graeme Geldenhuys-6
In reply to this post by Michael Schnell
On 28/05/14 09:32, Michael Schnell wrote:
> In fact I do want the best possible stuff and not a fork. I am just
> trying to help (as I would like to use it in the said current project).

Then fork it on GitHub and start publishing your changes. I'll gladly
review the suggestions and merge what works.

Regards,
  Graeme


Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Henry Vermaak
On 05/28/2014 06:09 PM, Henry Vermaak wrote:
> an ifdef to call QueryPerformanceCounter()/clock_gettime() based on OS.
>> Things like this are why I'd rather use vDSO.
> Calling the vDSO will certainly make things faster.
For me, the point is that with vDSO, the Linux infrastructure will
handle the dirty stuff that is involved with such low-level, heavily
arch-dependent things. It is supposed to work "out of the box", even if
for certain archs no support from the hardware is available, and to provide
the best possible support by the Kernel software.

For C programmers, this is automatically in place, as libC and the
Kernel are done appropriately. fpc (supposedly for good reasons) is
determined to reduce libC dependence as much as possible. Hence the rtl
needs to provide arch-dependent low-level stuff internally.

As it is provided directly by the Linux Kernel, vDSO is a way to take
advantage of Kernel provided arch-independence without using libC.
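As a rough illustration that the vDSO is handed to every process by the kernel itself, the sketch below looks up its load address in the auxiliary vector and checks for the ELF magic. It uses glibc's `getauxval()` purely for brevity; a libC-free runtime would have to read the `AT_SYSINFO_EHDR` entry from the auxiliary vector on the initial stack and then parse the ELF symbol table to find the time functions, which is not shown here. The function name `vdso_present` is hypothetical.

```c
#include <string.h>

#ifdef __linux__
#include <elf.h>        /* AT_SYSINFO_EHDR, ELFMAG, SELFMAG */
#include <sys/auxv.h>   /* getauxval() */
#endif

/* Return 1 if the kernel mapped a vDSO image into this process, else 0. */
static int vdso_present(void)
{
#ifdef __linux__
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return 0;
    /* The vDSO is an ordinary ELF shared object, so it starts with "\177ELF". */
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0;
#else
    return 0;
#endif
}
```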

-Michael

Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Marco van de Voort
On 05/28/2014 05:41 PM, Marco van de Voort wrote:
> . But that means you need to use OS timing functions, and not ASM.

Meaning either syscalls or vDSO.

As Linux syscalls do a usermode->Kernelmode->usermode switch, they
introduce a huge overhead.
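The overhead difference can be measured rather than assumed. The sketch below, which assumes Linux with glibc on a mainstream architecture, times `clock_gettime()` called through libc (which normally goes through the vDSO) against the same request forced through a real kernel entry with `syscall()`. The function names are made up, and the numbers will vary by machine and by the active clock source.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      /* for syscall() */
#endif
#include <time.h>
#ifdef __linux__
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Difference between two timespecs in nanoseconds. */
static double elapsed_ns(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) * 1e9 + (double)(b.tv_nsec - a.tv_nsec);
}

/* Average cost of one clock_gettime() call through libc (vDSO where available). */
static double avg_ns_libc(int iterations)
{
    struct timespec start, end, tmp;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &tmp);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ns(start, end) / iterations;
}

/* Average cost when every call enters the kernel; on non-Linux
 * systems this simply repeats the libc measurement. */
static double avg_ns_syscall(int iterations)
{
#ifdef __linux__
    struct timespec start, end, tmp;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &tmp);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ns(start, end) / iterations;
#else
    return avg_ns_libc(iterations);
#endif
}
```

If the kernel has marked the TSC unstable and fallen back to a source such as HPET, the vDSO path may itself end up in a syscall, and the two figures will then be close.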

In Windows I suppose syscalls usually are not done directly by the rtl,
but function calls to a Kernel dll are done, so that that
Windows-provided dll might decide to stay in usermode if possible.

-Michael

Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Henry Vermaak
On 05/28/2014 06:03 PM, Henry Vermaak wrote:
> Blindly making assumptions about TSC stability can get you into
> trouble. Microsoft advises against this, too:
> http://msdn.microsoft.com/en-gb/library/windows/desktop/ee417693%28v=vs.85%29.aspx


As my target is Linux, this does not help with the implementation. But
the warning not to use TSC holds anyway!

-Michael

Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Graeme Geldenhuys-6
On 05/29/2014 01:17 AM, Graeme Geldenhuys wrote:
> Then fork it on Github and start publishing your changes. I'll gladly
> review the suggestions and merge it what works.
OK. (Of course only after I have done as much testing as possible; in fact I
can't do nearly enough.)

This discussion shows that the arch-dependent stuff in the
library/package really should be part of the rtl. Otherwise there is not much
chance to finally provide decent / manageable / stable arch-independent
functionality to the users.

-Michael

Re: EpikTimer v1.0.1 released

Sven Barth-2
In reply to this post by Michael Schnell

On 02.06.2014 10:38, "Michael Schnell" <[hidden email]> wrote:
> In Windows I suppose syscalls usually are not done directly by the rtl, but functions calls to a Kernel dll are done, so that that Windows.provide dll might decide to stay in usermode if possible.

On Windows the QueryPerformanceCounter/Frequency calls always do a syscall, because the corresponding code is implemented in the HAL which is only accessible from kernel mode.

Regards,
Sven



Re: EpikTimer v1.0.1 released

Sven Barth-2
In reply to this post by Michael Schnell


On 02.06.2014 10:38, "Michael Schnell" <[hidden email]> wrote:
> In Windows I suppose syscalls usually are not done directly by the rtl, but functions calls to a Kernel dll are done, so that that Windows.provide dll might decide to stay in usermode if possible.

Addendum: yes, the RTL calls the core DLLs of the Win32 subsystem like kernel32.dll, but they are just that: the core DLLs of the Win32 subsystem. They don't implement any core OS functionality like hardware/device management, because that is done by the NT kernel below it which is called by the Win32 DLLs if needed using the ntdll.dll which acts as a gateway to the kernel (some calls will do a Syscall then, for example hardware related functions, while others like functions to handle list structures will stay in the current mode). That's basically a remnant of the microkernel nature of the NT OS.

Regards,
Sven



Re: EpikTimer v1.0.1 released

Michael Schnell
On 06/02/2014 11:06 AM, Sven Barth wrote:

>
>
> Addendum: yes, the RTL calls the core DLLs of the Win32 subsystem like
> kernel32.dll, but they are just that: the core DLLs of the Win32
> subsystem. They don't implement any core OS functionality like
> hardware/device management, because that is done by the NT kernel
> below it which is called by the Win32 DLLs if needed using the
> ntdll.dll which acts as a gateway to the kernel (some calls will do a
> Syscall then, for example hardware related functions, while others
> like functions to handle list structures will stay in the current
> mode). That's basically a remnant of the microkernel nature of the NT OS.
>
>

This is rather similar to what I assumed.

Hence, if (as your other post suggests) that Kernel dll switches to
Kernel mode to access the TSC on x86 archs while this is not necessary,
that is a fault of Windows and not a fault of fpc, and hence we can't
help it. As we found, "blindly" reading the TSC is not a good idea, so a really
fast rtl function would need to detect the CPU sub-arch and act
appropriately. I understand that this is close to impossible.
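One piece of that detection is at least available from CPUID: bit 8 of EDX in leaf 0x80000007 advertises an "invariant TSC" that ticks at a constant rate regardless of frequency scaling. The sketch below assumes GCC or Clang (for `<cpuid.h>`); the function name `has_invariant_tsc` is made up. Note that this flag alone says nothing about synchronisation between sockets or about virtualised environments, which is part of why the decision is usually left to the OS.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* __get_cpuid(), shipped with GCC and Clang */
#endif

/* Return 1 if the CPU advertises an invariant TSC, 0 otherwise
 * (including on non-x86 targets). */
static int has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 8) & 1u);   /* EDX bit 8: invariant TSC */
#else
    return 0;
#endif
}
```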

-Michael

Re: EpikTimer v1.0.1 released

Marco van de Voort
In reply to this post by Michael Schnell
In our previous episode, Michael Schnell said:
> On 05/28/2014 05:41 PM, Marco van de Voort wrote:
> > . But that means you need to use OS timing functions, and not ASM.
>
> Meaning either syscalls or vDSO.
>
> As in Linux syscalls do a usermode->Kernelmode->usermode switch, they
> introduce a huger overhead.

Maybe, but is that relevant? We were talking about precision, not speed.
 
> In Windows I suppose syscalls usually are not done directly by the rtl,

No. Windows calls kernel32/user32, which then mostly calls ntdll.dll functions
to do the actual syscalls afaik. And a lot more is userland on Windows.


Re: EpikTimer v1.0.1 released

Michael Schnell
On 06/02/2014 11:49 AM, Marco van de Voort wrote:
> Maybe, but is that relevant? We were talking about precision, not speed.
I have been talking about overhead (speed) all the time. That is my
reason for discussing the issue. But regarding time measurement, of course
speed and precision are closely related, so IMHO it makes sense to
handle both aspects together.

-Michael

Re: EpikTimer v1.0.1 released

Michael Schnell
In reply to this post by Marco van de Voort
On 06/02/2014 11:49 AM, Marco van de Voort wrote:
>
>> In Windows I suppose syscalls usually are not done directly by the rtl,
> No. Windows calls kernel32/user32, which then mostly calls nt.dll functions
I feel free to translate this to: "In Windows, the fpc rtl calls
kernel32/user32, ..."

Which is _exactly_ what I said.

-Michael