performance when resizing a dynamic array

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

performance when resizing a dynamic array

Graeme Geldenhuys-6
Hi,

If I use an array to hold a list of say Integers. Is there any serious
performance penalty for resizing (growing) the dynamic array as I add
items. My point being, I don't know the number of elements I'll need
beforehand.

Regards,
  Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Martin Schreiber-2
On 12/04/2016 11:28 AM, Graeme Geldenhuys wrote:
> Hi,
>
> If I use an array to hold a list of say Integers. Is there any serious
> performance penalty for resizing (growing) the dynamic array as I add
> items. My point being, I don't know the number of elements I'll need
> beforehand.
>
The problem with enlarging big dynamic arrays is that AFAIK there always
is a move operation of the whole existing data. Libc realloc() on the
other hand only relocates the pointer of a virtual memory block if possible.

Martin
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Martin Schreiber-2
On 12/04/2016 11:39 AM, Martin Schreiber wrote:

> On 12/04/2016 11:28 AM, Graeme Geldenhuys wrote:
>> Hi,
>>
>> If I use an array to hold a list of say Integers. Is there any serious
>> performance penalty for resizing (growing) the dynamic array as I add
>> items. My point being, I don't know the number of elements I'll need
>> beforehand.
>>
> The problem with enlarging big dynamic arrays is that AFAIK there always
> is a move operation of the whole existing data.

That happens with every reallocmem() with FPC memory manager so using a
getmem() block instead of a dynamic array has no advantage in this
regard. Dynamic arrays additionally call fillchar(). If fillchar() is
not necessary for unmanaged items the MSEgui functions
AllocUnInitedArray()/ReAllocUnInitedArray() can be used instead.

Martin

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Graeme Geldenhuys-6
On 2016-12-04 11:30, Martin Schreiber wrote:
> That happens with every reallocmem() with FPC memory manager so using a
> getmem() block instead of a dynamic array has no advantage in this

Maybe a good old linked list implementation is the best performer then.
Back to the Pascal of the 80's and 90's. :)

Regards,
  Graeme

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

dmitry boyarintsev
In reply to this post by Graeme Geldenhuys-6
On Sun, Dec 4, 2016 at 5:28 AM, Graeme Geldenhuys <[hidden email]> wrote:
If I use an array to hold a list of say Integers. Is there any serious
performance penalty for resizing (growing) the dynamic array as I add
items. My point being, I don't know the number of elements I'll need
beforehand.
 
Power of 2 increments then (less re-allocations, less memory fragmentation). 
You'll also need to store the actual size somewhere next to the dynamic array, since you can no longer rely on length()

thanks,
Dmitry

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Adriaan van Os-2
In reply to this post by Martin Schreiber-2
Martin Schreiber wrote:

> On 12/04/2016 11:28 AM, Graeme Geldenhuys wrote:
>> Hi,
>>
>> If I use an array to hold a list of say Integers. Is there any serious
>> performance penalty for resizing (growing) the dynamic array as I add
>> items. My point being, I don't know the number of elements I'll need
>> beforehand.
>>
> The problem with enlarging big dynamic arrays is that AFAIK there always
> is a move operation of the whole existing data. Libc realloc() on the
> other hand only relocates the pointer of a virtual memory block if possible.

It depends on the memory manager you use, e.g. if you use the system memory manager instead of fpc.

Linux has mremap <http://man7.org/linux/man-pages/man2/mremap.2.html> which makes an efficient
realloc possible (see the source code for malloc in glibc).

Mac OS X has no mremap but at least it can do a realloc, either in-place or by using vm_copy. The
Mach kernel has vm_remap, but Apple is not smart enough to use it (as of OS X 10.8.5).

Windows neither has a mremap nor a realloc, which implies that on every resize a new block has to
be allocated, with all bytes physically moved to the new location. The Windows memory manager is
absurdly inefficient <http://www.mgroeber.de/misc/windows_heap.html>.

Because of this, it may even be fastest to execute your code twice, the first time just to know the
number of elements of the array.

> Maybe a good old linked list implementation is the best performer then.
> Back to the Pascal of the 80's and 90's.  :)

There is nothing 80's or 90's about intelligent and advanced data structures. Every data structure
has its built-in limitations when it comes to speed and one should choose the right one based on
its characteristics <http://microbizz.nl/difftrees.pdf>.

Regards,

Adriaan van Os

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Graeme Geldenhuys-6
In reply to this post by dmitry boyarintsev
On 2016-12-04 14:10, Dmitry Boyarintsev wrote:
> You'll also need to store the actual size somewhere next to the dynamic
> array, since you can no longer rely on length()


How I'm longing for Java now. Java has a 101 different container classes
as standard. And if that was not enough, you can now mix in generics
based containers too. From Unique lists, sorted lists, filtered sets
(which stay in sync with the original list it was based from) etc. Both
Delphi and FPC are seriously lacking in the container department, though
it seems generics are helping out a lot nowadays.

Regards,
  Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Graeme Geldenhuys-6
In reply to this post by Adriaan van Os-2
On 2016-12-04 14:48, Adriaan van Os wrote:
> It depends on the memory manager you use,...

Wow, that's some interesting information there. Thanks for sharing.


> There is nothing 80's or 90's about intelligent and advanced data structures.

I simply meant, back in the day when I learned Turbo Pascal, link-lists
were all the rage. It was also a common test question Pascal exams -
implementing a link-list or double-linked-list.

These days you don't hear much about linked lists, but that could simply
be because they are hidden behind other classes.


Regards,
  Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Marco van de Voort
In reply to this post by Graeme Geldenhuys-6
In our previous episode, Graeme Geldenhuys said:
> If I use an array to hold a list of say Integers. Is there any serious
> performance penalty for resizing (growing) the dynamic array as I add
> items. My point being, I don't know the number of elements I'll need
> beforehand.

Yes, ordered insertion is always a pain at high counts. But under
50000-100000 don't worry too much.

See also this thread:
http://forum.lazarus-ide.org/index.php/topic,34348.msg226312.html#msg226312

(search for my post with "One can make the TStringlist problem also very visible with the
website")

I mostly use my generic version of the lightcontainer (which is basically an
array based map using multiple arrays instead of one), it is in the
benchmark above, also with int64/tdatetime etc as keys.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Jürgen Hestermann
In reply to this post by Graeme Geldenhuys-6
Am 2016-12-04 um 14:09 schrieb Graeme Geldenhuys:
 > On 2016-12-04 11:30, Martin Schreiber wrote:
 >> That happens with every reallocmem() with FPC memory manager so using a
 >> getmem() block instead of a dynamic array has no advantage in this
 > Maybe a good old linked list implementation is the best performer then.
 > Back to the Pascal of the 80's and 90's. :)

I think that it's not worth the hassle to use linked lists.
Whenever I thought that it would speed up things (i.e. inserting)
I always went back to dynamic arrays because of the following:

1.) Linked lists need a lot more memory (depending on how it is linked).

2.) You cannot see the number of elements directly in linked lists
(or you need to keep track of this which can be an annoying and error prone task).

3.) With dynamic arrays you can easily pick the n-th element without having
to run through the whole list from the beginning (or end).
Sorting is much easier with an array.
It keeps code simpler in any respect.

4.) When performance is an issue (when you have *very* many elements
(millions) *and* the size of each element is much larger than the
size of a pointer) then I always use an array of pointers (to data).
Moving around elements (i.e. when sorting) then only involves moving pointers.
Of course, you need to new() each element but then the allocated data is fixed
and only the pointer needs to be moved in case of changes.
Deleting (or inserting) can be done with move:

-------------------------------------------------------------------
procedure DeleteIndexInFArray(const index      : SizeInt;
                                     var FArray : FileArrayTyp);

begin // DeleteIndexInFArray
if index<High(FArray) then
    move(FArray[index+1],
         FArray[index],
         Sizeof(FArray[index])*(High(FArray)-index));
SetLength(FArray,Length(FArray)-1);
-------------------------------------------------------------------

This way on average you move Length(FArray)/2*sizeof(pointer) bytes when
deleting (or inserting) and this with a very efficient function (move).

I love dynamic arrays and I never found that performance was an issue.

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Martin Schreiber-2
On 12/05/2016 10:52 AM, Jürgen Hestermann wrote:
>
> I love dynamic arrays and I never found that performance was an issue.
>
Agreed.

Martin
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Adriaan van Os-2
Martin Schreiber wrote:
> On 12/05/2016 10:52 AM, Jürgen Hestermann wrote:
>> I love dynamic arrays and I never found that performance was an issue.
>>
> Agreed.

I find this big nonsene, but everybody his opinion. Take the time to study and profile the issue in
detail or keep using dynamic arrays.

Regards,

Adriaan van Os

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

David Emerson
In reply to this post by Graeme Geldenhuys-6
Graeme Geldenhuys wrote:
> Martin Schreiber wrote:
>> That happens with every reallocmem() with FPC memory manager so using a
>> getmem() block instead of a dynamic array has no advantage in this
>
> Maybe a good old linked list implementation is the best performer then.
> Back to the Pascal of the 80's and 90's. :)

I created a list class template, using macros, to deal with this and
other issues. Would be happy to share the code if you'd like it.

~David.


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: performance when resizing a dynamic array

Adriaan van Os-2
David Emerson wrote:
> Graeme Geldenhuys wrote:
>> Martin Schreiber wrote:
>>> That happens with every reallocmem() with FPC memory manager so using a
>>> getmem() block instead of a dynamic array has no advantage in this
>>
>> Maybe a good old linked list implementation is the best performer then.
>> Back to the Pascal of the 80's and 90's. :)

Note that, when using trees and linked lists, using a pooled memory manager manager may speed up
things. Something like the pooledmm in the FCL.

Regards,

Adriaan van Os

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal