Single threaded application on multicore CPU


LacaK
Hi *,

this question may be a bit off-topic here, but I am sure there are
experts here who know the answer ;-)

I have a simple Lazarus/FPC application (with no explicit threads) which
does intensive calculations (local thresholding with a big window size)
on an image that is stored in memory as a 2D byte array.
(So it only does memory accesses and some integer calculations.)

When I run this application and look at Task Manager or Resource
Monitor, I see that all 4 cores "are used" (at least the performance
graphs show usage, or in other words activity in the graphs increases).
Total CPU usage is <= 25% (which suggests that only 1 of the 4 cores is
used).

Why is this? I would expect a single-threaded application to use only
one core, so I would expect activity on only one core, not on all four.
(I know that in theory the CPU can switch a single thread between cores,
but I doubt that this is the case, as switching has extra cost... or is
it?)
(Btw: when I set affinity to only one core, that core is at 100% and the
others are at 0%, as expected.)

Thanks

-Laco.

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: Single threaded application on multicore CPU

Torsten Bonde Christiansen
On 2016-06-22 09:02, LacaK wrote:

> (I know that in theory the CPU can switch a single thread between cores,
> but I doubt that this is the case, as switching has extra cost... or is
> it?)

Nailed it right there....

All modern CPUs automatically swap running threads (unless affinity is
set) to other cores when there is a high load. This is done for thermal
reasons, since running on a single core creates a local hot spot on the
die, which is undesirable and also not really cost-efficient.

> (Btw: When I set affinity to only one core, then this core is 100% and
> others are 0% as expected)
>

-Torsten.
Re: Single threaded application on multicore CPU

el_es
On 22/06/16 08:07, Torsten Bonde Christiansen wrote:

> All modern CPU automatically swaps running threads (unless affinity is set) to other cores in case there is a high load. This is done from a

^^^^^^^^^^^^^^^^^ s/CPU/OS Kernel ;)

> heat perspective, since running on a single core will make a local hot spot on the die - a thing which is not preferred and also not really
> cost-efficient.

-L.

Re: Single threaded application on multicore CPU

LacaK
In reply to this post by Torsten Bonde Christiansen

> Nailed it right there....
>
> All modern CPUs automatically swap running threads (unless affinity is
> set) to other cores when there is a high load. This is done for thermal
> reasons, since running on a single core creates a local hot spot on the
> die, which is undesirable and also not really cost-efficient.

Thanks. That makes sense to me.
But moving a thread from one core to another has some nonzero cost,
doesn't it?
(So I wonder whether the CPU also does that when the thread finishes
fairly quickly ... so the CPU temperature does not rise that much.)

-Laco.

Re: Single threaded application on multicore CPU

Graeme Geldenhuys-6
In reply to this post by LacaK
On 2016-06-22 08:02, LacaK wrote:
> I know that in theory CPU can switch single thread between cores

As Torsten said, that is exactly what happens. It is not “just in
theory”, it happens all the time. I see that constantly on my i7 CPU
with long running processes.
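
To illustrate why all four graphs show activity while the total stays at
about 25%, here is a minimal Python sketch of a toy model. It assumes
that time is cut into equal slices, that exactly one always-busy thread
exists, and that the scheduler moves it to a random core with some
probability per slice. The function name `simulate` and the parameter
`migrate_prob` are made up for this example and do not describe how any
real scheduler decides.

```python
import random


def simulate(num_cores=4, num_slices=10_000, migrate_prob=0.1, seed=0):
    """Return per-core utilisation for one always-busy thread.

    Toy model only: each time slice the thread may be moved to a
    randomly chosen core (hypothetical policy, for illustration).
    """
    rng = random.Random(seed)
    busy = [0] * num_cores      # busy slices counted per core
    core = 0                    # the thread starts on core 0
    for _ in range(num_slices):
        if rng.random() < migrate_prob:
            core = rng.randrange(num_cores)
        busy[core] += 1         # the thread is busy in every slice
    return [b / num_slices for b in busy]


per_core = simulate()
total = sum(per_core) / len(per_core)
print("per-core load:", [f"{u:.0%}" for u in per_core])
print(f"total load:    {total:.0%}")

# Pinned case: the thread never migrates.
print("pinned:       ", [f"{u:.0%}" for u in simulate(migrate_prob=0.0)])
```

With migration enabled every core reports some load, yet the average
over the four cores is still 25%. With `migrate_prob=0.0` the model
reproduces the pinned case (one core at 100%, the others at 0%).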

As for the cost of moving it between cores, I have no clue, but I would
imagine that switching between cores in a single CPU is not that
expensive (the cores are so close together and tightly integrated)
compared to switching between physical CPUs. I don’t have a multi-CPU
system to give actual observations, but that’s my theory. ;-)

Regards,
  Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp
Re: Single threaded application on multicore CPU

Stephen Chrzanowski
In reply to this post by LacaK
Everything has a cost.  But swapping CPU threads isn't as costly as a fried CPU.  Keeping the CPU cool at all costs is better than having a hotspot on the die which COULD damage the heat sink.

The computing cost of swapping CPUs is probably close to zero.  Your CPU only has so much on-die memory that it has to push things out CONSTANTLY to on-board RAM, so there may be times when your CPU (not the operating system) has zero knowledge of your application.  When your OS hands the work back to the CPU, the OS will look at the state of each CPU (heat, load, in use, etc.) and decide which CPU gets the task.


Re: Single threaded application on multicore CPU

Mattias Gaertner
On Wed, 22 Jun 2016 07:41:04 -0400
Stephen Chrzanowski <[hidden email]> wrote:

> Everything has a cost.  But swapping CPU threads isn't as costly as a fried
> CPU.  Keeping the CPU cool at all costs is better than having a hotspot on
> the die which COULD damage the heat sink.

On my Linux it does not swap the CPU and I can't find any hard data that
Windows is swapping because of hotspot problems, but I do find many
pages about pinning processes to cores.
Please provide a link for the hotspot claims.

 
> The computing cost of swapping CPUs is probably close to zero.  Your CPU
> only has so much on-die memory that it has to push things out CONSTANTLY to
> on board RAM, so there may be a time when your CPU (Not the Operating
> System) has zero knowledge of your application.  When your OS takes the
> information back to the CPU, the OS will look at the particularities of CPU
> (Heat, load, in use, etc) and specify which CPU will get the task.

Swapping a process costs about 1,000-100,000 ns on an idle system (only
one busy thread), mostly depending on caches. So even if Windows
swapped a hundred times per second, the overhead would be at most about
1%.
To make sure, pin your process to a CPU and compare.
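
The arithmetic behind that estimate, as a small Python sketch. The 100
migrations per second and the 1,000 to 100,000 ns range are the figures
from this message; the function name `migration_overhead` is only for
illustration.

```python
NS_PER_SECOND = 1_000_000_000


def migration_overhead(migrations_per_second, cost_ns):
    """Fraction of each second of run time spent on core migrations."""
    return migrations_per_second * cost_ns / NS_PER_SECOND


# Lower and upper bound of the quoted per-migration cost.
for cost_ns in (1_000, 100_000):
    share = migration_overhead(100, cost_ns)
    print(f"{cost_ns:>7} ns per migration -> {share:.4%} of run time")
```

At 100 migrations per second the overhead therefore lies between 0.01%
and 1% of the run time, depending on how much cached data has to be
reloaded on the new core.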

Mattias
Re: Single threaded application on multicore CPU

Graeme Geldenhuys-6
On 2016-06-22 13:14, Mattias Gaertner wrote:
> On my Linux it does not swap the CPU and I can't find any hard data that
> Windows is swapping because of hotspot problems,

It’s not just a Windows thing. On my FreeBSD system it swaps between CPU
cores too. I see this with any long-running process that drives up the
CPU load, e.g. while compiling with FPC, running unit tests, compiling
INF help etc. All of these are single-threaded applications. Long-running
processes that sit idle are not moved between CPU cores (or so it seems).

I have an Intel i7-3770K with 4 cores (8 logical cores if you enable
hyper-threading).

I monitored the running process using “htop” and “Mate System Monitor”.

Regards,
  Graeme

Re: Single threaded application on multicore CPU

Mattias Gaertner
On Wed, 22 Jun 2016 13:36:08 +0100
Graeme Geldenhuys <[hidden email]> wrote:

> On 2016-06-22 13:14, Mattias Gaertner wrote:
> > On my Linux it does not swap the CPU and I can't find any hard data that
> > Windows is swapping because of hotspot problems,  
>
> It’s not just a Windows thing. On my FreeBSD system...

I don't doubt that it is swapping. I would like to know why it is
swapping. Is it avoiding hotspots, or for longevity, or turbo boost, or
system processes, or security, or ....
Related: What is the downside of pinning the process to a cpu?

Mattias
Re: Single threaded application on multicore CPU

Graeme Geldenhuys-6
On 2016-06-22 14:01, Mattias Gaertner wrote:
> I don't doubt that it is swapping. I would like to know why it is
> swapping.

Indeed a good question, and something I've wondered about before. My
answer is I don't know. :)

When I googled it before (some years ago), the most frequent answer
I got was that the scheduler is more intelligent than the computer user,
so trust what it is doing.

The other frequent answer I got had to do with logical cores versus
physical cores (i.e. hyper-threading). Apparently the physical cores get
priority over the logical ones, and the logical ones only have
about 30% of the processing power of the physical ones. Once again the
scheduler load-balances the processes for good overall performance and
power consumption.

How true all this is - I honestly don't know. But from what I remember,
various websites listed these answers.

Regards,
  Graeme

Re: Single threaded application on multicore CPU

José Mejuto
In reply to this post by Mattias Gaertner
On 22/06/2016 at 15:01, Mattias Gaertner wrote:
> I don't doubt that it is swapping. I would like to know why it is
> swapping. Is it avoiding hotspots, or for longevity, or turbo boost, or
> system processes, or security, or ....
> Related: What is the downside of pinning the process to a cpu?

Hello,

AFAIK it is mostly about heat and overall system responsiveness: a hot
core can enter thermal throttling, which lowers its performance while
the computing power of the other core(s) goes unused. Also, there are
more threads than cores. When a thread finishes its time slice and other
threads are scheduled for that core, the thread is moved directly to
another core that is already free; if no core is free, it is queued on
the least loaded core, which gives it a better chance of running
sooner.
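
The placement rule described above can be written down as a few lines of
Python. This is only a toy restatement of that paragraph, not the
algorithm of any real kernel; `pick_core` and `run_queue_lengths` (the
number of other threads waiting on each core) are hypothetical names
chosen for this sketch.

```python
def pick_core(run_queue_lengths, current_core):
    """Choose a core for a thread whose time slice has just ended.

    run_queue_lengths[i] is the number of OTHER threads waiting on
    core i (hypothetical bookkeeping, for illustration only).
    """
    # Nobody else is waiting on this core: the thread can stay.
    if run_queue_lengths[current_core] == 0:
        return current_core
    # Otherwise prefer a core that is completely free.
    for core, waiting in enumerate(run_queue_lengths):
        if waiting == 0:
            return core
    # No free core: queue on the least loaded one.
    return min(range(len(run_queue_lengths)),
               key=lambda core: run_queue_lengths[core])


print(pick_core([0, 1, 2, 3], current_core=0))  # stays where it is
print(pick_core([2, 0, 1, 3], current_core=0))  # moves to a free core
print(pick_core([2, 3, 1, 4], current_core=0))  # least loaded core
```

Even in this simple form a busy thread ends up wandering across cores
whenever other threads become runnable on the core it last used.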

After all, this is a statistical game: on average you get better system
performance (across all threads) if you swap and mix the threads across
all cores. Better overall system performance is the gain, and lower
performance for any given thread is the price ;)


Re: Single threaded application on multicore CPU

Marc Santhoff-2
In reply to this post by Mattias Gaertner
On Wed, 2016-06-22 at 15:01 +0200, Mattias Gaertner wrote:

> On Wed, 22 Jun 2016 13:36:08 +0100
> Graeme Geldenhuys <[hidden email]> wrote:
>
> > On 2016-06-22 13:14, Mattias Gaertner wrote:
> > > On my Linux it does not swap the CPU and I can't find any hard data that
> > > Windows is swapping because of hotspot problems,  
> >
> > It’s not just a Windows thing. On my FreeBSD system...
>
> I don't doubt that it is swapping. I would like to know why it is
> swapping. Is it avoiding hotspots, or for longevity, or turbo boost, or
> system processes, or security, or ....
> Related: What is the downside of pinning the process to a cpu?

AFAIK it is mostly a matter of the OS's scheduler implementation. E.g. on
FreeBSD you can select between at least two variants (sched_ule,
sched_4bsd).

A short overview of what that is all about can be found here:

http://www.freebsd.org/cgi/man.cgi?query=sched_4bsd&sektion=4&apropos=0&manpath=FreeBSD+10.3-RELEASE+and+Ports

http://www.freebsd.org/cgi/man.cgi?query=sched_ule&apropos=0&sektion=0&manpath=FreeBSD+10.3-RELEASE+and+Ports&arch=default&format=html

Abstracting the scheduler from the physical number of CPUs is a good
thing, at least from an OS writer's point of view. If that were not the
case, everyone running that OS would have to compile or at least install
a specially built kernel...

The decision not to pin processes to CPUs by default sounds logical to
me. Otherwise, process number CPU-count + 1 would have to wait
immediately.

The other way round, switching a process from CPU to CPU happens
automatically, I think. If there is some form of queue of processes
that are ready to run, they will get distributed to whichever CPU is
free to execute them at that (very small) moment. Being able to pin
processes to CPUs is the logical consequence of such a design.
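
For anyone who wants to try pinning and compare, as suggested earlier in
the thread, here is a rough Python sketch. It assumes a platform where
Python exposes `os.sched_setaffinity` (Linux does; on Windows one would
use Task Manager or the Win32 `SetProcessAffinityMask` call instead).
`busy_work` is a made-up stand-in for the real calculation, and a single
timing run like this is noisy, so treat the numbers as a hint only.

```python
import os
import time


def busy_work(n=2_000_000):
    """Stand-in CPU-bound loop (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def timed(label):
    """Run the workload once and report the wall-clock time."""
    start = time.perf_counter()
    busy_work()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return elapsed


def compare_pinned_vs_free():
    """Time the workload unpinned, then pinned to a single core."""
    if not hasattr(os, "sched_setaffinity"):
        print("CPU affinity API not available on this platform")
        return None
    original = os.sched_getaffinity(0)   # 0 means the current process
    free = timed("free to migrate")
    try:
        # Pin to one of the cores this process is allowed to use.
        os.sched_setaffinity(0, {min(original)})
        pinned = timed("pinned to one core")
    finally:
        os.sched_setaffinity(0, original)  # restore the old mask
    return free, pinned


if __name__ == "__main__":
    compare_pinned_vs_free()
```

If the two timings are about the same, migration is not costing the
application anything measurable, which is what the earlier estimate in
this thread would predict.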

After all, I am only guessing, too... ;)

Marc

