TimSort

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

TimSort

Mattias Gaertner

 

Is there already a TimSort implementation in fpc?

 

http://en.wikipedia.org/wiki/Timsort

 

Mattias

 


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

michael.vancanneyt


On Fri, 20 May 2011, Mattias Gaertner wrote:

>  
> Is there already a TimSort implementation in fpc?
>  
> http://en.wikipedia.org/wiki/Timsort


Not to my knowledge.

Michael.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Marco van de Voort
In our previous episode, [hidden email] said:
> > ?
> > Is there already a TimSort implementation in fpc?
> > ?
> > http://en.wikipedia.org/wiki/Timsort
>
> Not to my knowledge.

One reference implementation in the article (the goolge one) is
"GPL-with-classpath-exception" licensed, the other Python one seems to be
PSF.

While the PSF doesn't seem to be particularly evil, I think the first step
would be to find a version with a more compatible license.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

etrusco
On Fri, May 20, 2011 at 9:26 AM, Marco van de Voort <[hidden email]> wrote:

> In our previous episode, [hidden email] said:
>> > ?
>> > Is there already a TimSort implementation in fpc?
>> > ?
>> > http://en.wikipedia.org/wiki/Timsort
>>
>> Not to my knowledge.
>
> One reference implementation in the article (the goolge one) is
> "GPL-with-classpath-exception" licensed, the other Python one seems to be
> PSF.
>
> While the PSF doesn't seem to be particularly evil, I think the first step
> would be to find a version with a more compatible license.

Maybe this one?
http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/TimSort.java?view=co
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

José Mejuto
In reply to this post by Mattias Gaertner
Hello FPC-Pascal,

Friday, May 20, 2011, 12:21:43 PM, you wrote:

MG> Is there already a TimSort implementation in fpc?
MG> http://en.wikipedia.org/wiki/Timsort

Why is TimSort specially interesting to you ?

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Uffe Kousgaard
"José Mejuto" <[hidden email]> wrote in
message

> Why is TimSort specially interesting to you ?

If it is the best all-purpose sorting algorithm and now the standard in
several other programming languages, it should be obvious why it is also
worth looking at for pascal / delphi developers.



_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Mattias Gaertner
In reply to this post by José Mejuto

 

 

"José Mejuto" <[hidden email]> hat am 24. Mai 2011 um 11:59 geschrieben:

> Hello FPC-Pascal,
>
> Friday, May 20, 2011, 12:21:43 PM, you wrote:
>
> MG> Is there already a TimSort implementation in fpc?
> MG> http://en.wikipedia.org/wiki/Timsort
>
> Why is TimSort specially interesting to you ?

I need a fast stable sort, so multiple sorts work as expected (contrary to QuickSort).
TimSort is a candidate.

Mattias


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re[2]: [fpc-pascal] TimSort

José Mejuto
Hello FPC-Pascal,

Tuesday, May 24, 2011, 1:06:43 PM, you wrote:

 >> Why is TimSort specially interesting to you ?
MG>  I need a fast stable sort, so multiple sorts work as expected (contrary to
MG> QuickSort).
MG>  TimSort is a candidate.

That's exactly the answer I was looking for, the stability need ;)
I'll try to write an implementation to expand my sorters collection :)
but the algo seems quite "confuse" and plenty of "tricks".

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Re[2]: [fpc-pascal] TimSort

Mattias Gaertner

 

 

"José Mejuto" <[hidden email]> hat am 24. Mai 2011 um 18:07 geschrieben:

> Hello FPC-Pascal,
>
> Tuesday, May 24, 2011, 1:06:43 PM, you wrote:
>
>  >> Why is TimSort specially interesting to you ?
> MG>  I need a fast stable sort, so multiple sorts work as expected (contrary to
> MG> QuickSort).
> MG>  TimSort is a candidate.
>
> That's exactly the answer I was looking for, the stability need ;)
> I'll try to write an implementation to expand my sorters collection :)
> but the algo seems quite "confuse" and plenty of "tricks".

It uses a lot of tricks, but it seems to me that Tim explained all the important things.
It would be nice, if thetimsort unit has the same license as the FCL (modified LGPL-2).
Maybe you can ask him for permission.
The c implementation uses some ugly macros, but the rest should be easily translated to pascal.


Mattias


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Marco van de Voort
In reply to this post by Uffe Kousgaard
In our previous episode, Uffe Kousgaard said:
> > Why is TimSort specially interesting to you ?
>
> If it is the best all-purpose sorting algorithm and now the standard in
> several other programming languages, it should be obvious why it is also
> worth looking at for pascal / delphi developers.

A quick look at wikipedia will show that timsort has a disadvantage too. It
needs up to N records memory, not just Log(n) records like e.g. Quicksort.

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Mattias Gaertner

 

 

Marco van de Voort <[hidden email]> hat am 24. Mai 2011 um 18:53 geschrieben:

> In our previous episode, Uffe Kousgaard said:
> > > Why is TimSort specially interesting to you ?
> >
> > If it is the best all-purpose sorting algorithm and now the standard in
> > several other programming languages, it should be obvious why it is also
> > worth looking at for pascal / delphi developers.
>
> A quick look at wikipedia will show that timsort has a disadvantage too. It
> needs up to N records memory, not just Log(n) records like e.g. Quicksort.

It *can* be implemented to need only log(n). But the current fpc implementation of QuickSort seems to need O(n) memory on the stack.

TimSort needs up to N/2 pointer, which is better than a simple MergeSort.

Mattias


_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Re[2]: [fpc-pascal] TimSort

Vincent Snijders-3
In reply to this post by José Mejuto
2011/5/24 José Mejuto <[hidden email]>:

> Hello FPC-Pascal,
>
> Tuesday, May 24, 2011, 1:06:43 PM, you wrote:
>
>  >> Why is TimSort specially interesting to you ?
> MG>  I need a fast stable sort, so multiple sorts work as expected (contrary to
> MG> QuickSort).
> MG>  TimSort is a candidate.
>
> That's exactly the answer I was looking for, the stability need ;)
> I'll try to write an implementation to expand my sorters collection :)
> but the algo seems quite "confuse" and plenty of "tricks".

I suggest you start from the explanation on
http://svn.python.org/projects/python/trunk/Objects/listsort.txt

The same text be found on other places too.

I more or less explains the algorithm, which you can implement without
copyright restrictions, don't you think?

Vincent
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re[4]: [fpc-pascal] TimSort

José Mejuto
In reply to this post by Mattias Gaertner
Hello FPC-Pascal,

Tuesday, May 24, 2011, 6:31:56 PM, you wrote:

MG>  It uses a lot of tricks, but it seems to me that Tim explained all the
MG> important things.
MG>  It would be nice, if thetimsort unit has the same license as the FCL (modified
MG> LGPL-2).
MG>  Maybe you can ask him for permission.
MG>  The c implementation uses some ugly macros, but the rest should be easily
MG> translated to pascal.

I'm now implementing from scratch, maybe it will not be the "fastest"
implementation but I hope that at least works :) After partial
implemenation it looks like a combination of insertion sort for small
runs and mergesort to combine the already sorted runs.

Anyway it looks to need O(n) auxiliary storage, where O could be a
pointer.

Implementing from paper needs permission to licensing ? Is the algo
copyrighted/patented ?

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: Re[4]: [fpc-pascal] TimSort

Jonas Maebe-2

On 24 May 2011, at 19:53, José Mejuto wrote:

> Implementing from paper needs permission to licensing ?

Normally not.

> Is the algo
> copyrighted/patented ?

Algorithms cannot be copyrighted. Whether or not it's patented I don't know (while the individual sorting algorithms have been known for a long time, the combination may well be considered patentable by some patent office), but
a) that's unlikely, given that it was developed specifically for a free software project (unless of course someone else already patented it beforehand, and the author didn't know that)
b) pretty much any useful program you can write probably infringes on several different software patents anyway (e.g. anything using COM interfaces for cross-process communication, but that's a discussion for the fpc-other list)

Anyway: go ahead.


Jonas_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Mattias Gaertner
In reply to this post by José Mejuto
On Tue, 24 May 2011 19:53:12 +0200
José Mejuto <[hidden email]> wrote:

> Hello FPC-Pascal,
>
> Tuesday, May 24, 2011, 6:31:56 PM, you wrote:
>
> MG>  It uses a lot of tricks, but it seems to me that Tim explained all the
> MG> important things.
> MG>  It would be nice, if thetimsort unit has the same license as the FCL (modified
> MG> LGPL-2).
> MG>  Maybe you can ask him for permission.
> MG>  The c implementation uses some ugly macros, but the rest should be easily
> MG> translated to pascal.
>
> I'm now implementing from scratch, maybe it will not be the "fastest"
> implementation but I hope that at least works :) After partial
> implemenation it looks like a combination of insertion sort for small
> runs and mergesort to combine the already sorted runs.

Yes, and some more tricks. You should read
http://svn.python.org/projects/python/trunk/Objects/listsort.txt

If you do only the insertion+mergesort, then you have only implemented
a long known improved mergesort variant and you can not call it
TimSort. Of course then you don't need to ask Tim for permission.

 
> Anyway it looks to need O(n) auxiliary storage, where O could be a
> pointer.

Read Tim's text: N/2 pointer, so on a 32bit system 2*N byte.

 
> Implementing from paper needs permission to licensing ? Is the algo
> copyrighted/patented ?

IMO it is simply good manner to ask Tim.
He spent quite some time in testing and tuning and explained the
algorithm in detail, so I guess he will be happy to see his fame
spreading.


Mattias

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re[2]: [fpc-pascal] TimSort

José Mejuto
Hello FPC-Pascal,

Tuesday, May 24, 2011, 8:25:07 PM, you wrote:

MG> If you do only the insertion+mergesort, then you have only implemented
MG> a long known improved mergesort variant and you can not call it
MG> TimSort. Of course then you don't need to ask Tim for permission.

Well, in first "try" I'm implementing the Insert+Merge, them refine
with galloping and some other conditions :)

>> Anyway it looks to need O(n) auxiliary storage, where O could be a
>> pointer.
MG> Read Tim's text: N/2 pointer, so on a 32bit system 2*N byte.

I had read it, but I do not understand it :( specially the "2*N byte"
because in the paper he states that it requieres much more, specially
in auxiliary storage for merge sort, not counting the stack for the
runs or the auxiliary copy for one insertion sort pointer.

Anyway I'll read it again, maybe something escapes my english skills
;)
 
>> Implementing from paper needs permission to licensing ? Is the algo
>> copyrighted/patented ?
MG> IMO it is simply good manner to ask Tim.
MG> He spent quite some time in testing and tuning and explained the
MG> algorithm in detail, so I guess he will be happy to see his fame
MG> spreading.

I'll e-mail him anyway. Netiquete :)

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Marco van de Voort
In reply to this post by Mattias Gaertner
In our previous episode, Mattias Gaertner said:
>  > A quick look at wikipedia will show that timsort has a disadvantage too. It
>  > needs up to N records memory, not just Log(n) records like e.g. Quicksort.
>  It *can* be implemented to need only log(n). But the current fpc implementation
> of QuickSort seems to need O(n) memory on the stack.

Hmm. Then that should be fixed.
 
>  TimSort needs up to N/2 pointer, which is better than a simple MergeSort.

For heavier sorting I usually use heapsort and quicksort routines that date
back to my M2 days

Usually heapsort since it is quite fast for already sorted collections.
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re: TimSort

Mattias Gaertner
On Wed, 25 May 2011 09:02:46 +0200 (CEST)
[hidden email] (Marco van de Voort) wrote:

> In our previous episode, Mattias Gaertner said:
> >  > A quick look at wikipedia will show that timsort has a disadvantage too. It
> >  > needs up to N records memory, not just Log(n) records like e.g. Quicksort.
> >  It *can* be implemented to need only log(n). But the current fpc implementation
> > of QuickSort seems to need O(n) memory on the stack.
>
> Hmm. Then that should be fixed.

It must choose the smaller half for the tail call.

 
> >  TimSort needs up to N/2 pointer, which is better than a simple MergeSort.
>
> For heavier sorting I usually use heapsort and quicksort routines that date
> back to my M2 days
>
> Usually heapsort since it is quite fast for already sorted collections.

Yes, but Heapsort is not stable too.

Mattias
_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re[2]: [fpc-pascal] Re: TimSort

José Mejuto
In reply to this post by Marco van de Voort
Hello FPC-Pascal,

Wednesday, May 25, 2011, 9:02:46 AM, you wrote:

MvdV> For heavier sorting I usually use heapsort and quicksort routines that date
MvdV> back to my M2 days
MvdV> Usually heapsort since it is quite fast for already sorted collections.

Instead HeapSort you can use SmoothSort which is around 10-20% faster
in almost all conditions (sorted or not) and have the same memory and
best/worst case constraints.

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
Reply | Threaded
Open this post in threaded view
|

Re[2]: [fpc-pascal] TimSort

José Mejuto
In reply to this post by Mattias Gaertner
Hello FPC-Pascal,

Tuesday, May 24, 2011, 8:25:07 PM, you wrote:

MG> IMO it is simply good manner to ask Tim.
MG> He spent quite some time in testing and tuning and explained the
MG> algorithm in detail, so I guess he will be happy to see his fame
MG> spreading.

Do you know the email address of Tim Peters ?

--
Best regards,
 José

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
12