A simplified implementation of quick sort

Choosing the pivot at random makes little difference

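The first method picks a random pivot so that the sequence is split as evenly as possible. In practice, however, a collection that needs sorting is usually in no particular order already, so there is no need to generate a random number for the pivot. An element at a fixed position works just as well: it handles the vast majority of cases and simplifies the logic. That gives the second method, the simple quick sort (quick_sort_simple).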

Judging from the measurements, performance is almost identical with or without a random pivot.
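To see why a fixed pivot is usually good enough, it can help to look at recursion depth instead of wall-clock time. The sketch below is only an illustration and is not part of the original benchmark: `depth_first_pivot` is a hypothetical helper that mirrors the first-element split described above and reports how deep the recursion goes. On shuffled input the depth stays small. On already-sorted input it grows by one level per element, which is the case a random pivot protects against.

```python
import random


def depth_first_pivot(lst):
    """Recursion depth of a quick sort that always uses lst[0] as the pivot.

    Hypothetical helper for illustration; it only measures depth and does
    not return the sorted list.
    """
    if len(lst) == 0:
        return 0
    pivot = lst[0]
    lesser = [x for x in lst[1:] if x <= pivot]
    greater = [x for x in lst[1:] if x > pivot]
    return 1 + max(depth_first_pivot(lesser), depth_first_pivot(greater))


random.seed(0)  # fixed seed so repeated runs use the same shuffle
shuffled = random.sample(range(500), 500)
ordered = list(range(500))

print(depth_first_pivot(shuffled))  # small compared with 500
print(depth_first_pivot(ordered))   # 500: one level per element
```

The list sizes are kept at 500 so that the sorted case stays below Python's default recursion limit.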

With many duplicate elements, time complexity degrades to O(n^2)

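Most textbooks say that the time complexity of quicksort degrades to O(n^2) in specific cases such as input that is already sorted. In fact, most textbook implementations also degrade to O(n^2) whenever the collection being sorted contains many equal elements. The root cause is that most partition implementations put every element equal to the pivot on the same side of the pivot, so a run of equal values shrinks by only one element per recursive call.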
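The effect of putting equal elements on one side can be checked by counting how many elements the partition steps look at in total. This is a minimal sketch under stated assumptions: `partition_two_way` and `scanned` are hypothetical helper names, and the split uses the first element as pivot with `<=` on the left side.

```python
def partition_two_way(lst):
    """Split around lst[0]; elements equal to the pivot go to the left side."""
    pivot = lst[0]
    lesser = [x for x in lst[1:] if x <= pivot]
    greater = [x for x in lst[1:] if x > pivot]
    return lesser, pivot, greater


def scanned(lst):
    """Total number of elements examined by all partition steps."""
    if len(lst) == 0:
        return 0
    lesser, _, greater = partition_two_way(lst)
    # Each partition step looks at every element except the pivot.
    return len(lst) - 1 + scanned(lesser) + scanned(greater)


print(partition_two_way([7] * 8))        # ([7, 7, 7, 7, 7, 7, 7], 7, [])
print(scanned([4, 2, 6, 1, 3, 5, 7]))    # 10: balanced splits
print(scanned([4] * 7))                  # 21: 6 + 5 + 4 + 3 + 2 + 1
```

With seven equal values the count is n(n-1)/2 = 21, the quadratic pattern, while seven distinct values that split evenly need only 10.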

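The third implementation, quick_sort_quick, is meant to solve exactly this problem. With many duplicate elements it is tens of times faster (37ms versus 994ms in the run below).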
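The same idea of keeping elements equal to the pivot out of both recursive calls can also be done in place, without building new lists. The sketch below is an alternative shown for comparison, not the implementation used in this post: it uses the well-known three-way ("Dutch national flag") partition, and `quick_sort_3way_inplace` is a hypothetical name.

```python
def quick_sort_3way_inplace(a, lo=0, hi=None):
    """Sort list a in place using a three-way partition around a[lo]."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[lo]
    # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot.
    lt, i, gt = lo, lo + 1, hi
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    # Elements equal to the pivot now sit in a[lt:gt+1] and are left alone.
    quick_sort_3way_inplace(a, lo, lt - 1)
    quick_sort_3way_inplace(a, gt + 1, hi)


data = [1, 2, 1, 1, 1, 3, 4, 7, 9, 11, 12, 1, 1, 1] * 3
expected = sorted(data)
quick_sort_3way_inplace(data)
print(data == expected)  # True
```

Because each call removes every copy of its pivot value, the recursion depth on duplicate-heavy input is bounded by the number of distinct values rather than by the list length.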

====== sort 80*500 numbers ======  
quick QS:       2014-11-18 10:08:34.281000
my_quick_sort:   2014-11-18 10:08:35.425000
default sort:    2014-11-18 10:08:36.576000
after sort:      2014-11-18 10:08:36.579000
====== sort 21*300 numbers with many duplicated ======  
quick QS:       2014-11-18 10:08:36.579000 37ms # with many duplicate elements, the third method is an order of magnitude faster than the other two
my_quick_sort:   2014-11-18 10:08:36.616000 994ms
default sort:    2014-11-18 10:08:37.610000 1ms
after sort:      2014-11-18 10:08:37.611000
====== compare with random or without random ======  
simple QS:       2014-11-18 10:08:37.611000
my_quick_sort:   2014-11-18 10:08:38.758000
default sort:    2014-11-18 10:08:39.908000
after sort:      2014-11-18 10:08:39.909000
====== sort 90000 numbers without duplicated ======  
quick QS:        2014-11-18 10:08:39.916000
simple QS:       2014-11-18 10:08:40.285000
my_quick_sort:   2014-11-18 10:08:40.661000
default sort:    2014-11-18 10:08:41.016000
after sort:      2014-11-18 10:08:41.018000

 


# Sorting 10,000 shuffled numbers:
simple QS:       2014-11-17 16:53:03.450000 38ms
my_quick_sort:   2014-11-17 16:53:03.488000 35ms
default sort:    2014-11-17 16:53:03.523000
after sort:      2014-11-17 16:53:03.524000

# Sorting 10,000 sorted numbers:
simple QS:       2014-11-17 16:53:03.524000 35ms
my_quick_sort:   2014-11-17 16:53:03.559000 35ms
default sort:    2014-11-17 16:53:03.594000
after sort:      2014-11-17 16:53:03.594000

# 90,000 sorted numbers:
simple QS:       2014-11-17 16:57:12.885000 367ms
my_quick_sort:   2014-11-17 16:57:13.252000 352ms
default sort:    2014-11-17 16:57:13.604000 1ms
after sort:      2014-11-17 16:57:13.605000
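
The elapsed times in the listings above come from subtracting consecutive timestamps by hand. A small helper can report the elapsed milliseconds directly. This is a sketch only, and `time_call` is a hypothetical name that does not appear in the original script. It sorts a copy of the data, so one measurement does not hand a pre-sorted list to the next one.

```python
import time


def time_call(func, data):
    """Run func on a copy of data; return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = func(list(data))  # copy, so the caller's list is left untouched
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


result, ms = time_call(sorted, [3, 1, 2])
print(result, "%.3f ms" % ms)
```

Any of the sort functions in this post could be passed in place of `sorted`, since each takes a list and returns the sorted list.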

 

#-------------------------------------------------------------------------------
# Name:        quick sort
# Purpose:
#
# Author:      ScottGu<gu.kai.66@gmail.com, 150316990@qq.com>
#
# Created:     04/09/2013
# Copyright:   (c) ScottGu<gu.kai.66@gmail.com, 150316990@qq.com> 2013
# Licence:     <your licence>
#-------------------------------------------------------------------------------
from datetime import datetime
from random import randint, shuffle
import sys

# The two-way variants recurse once per duplicate on the duplicate-heavy
# input below (several thousand levels), so raise the default limit.
sys.setrecursionlimit(10000)


def quick_sort(lst):
    """Method 1: random pivot; elements equal to the pivot go left."""
    if len(lst) == 0:
        return lst
    pivot_idx = randint(0, len(lst) - 1)
    pivot = lst[pivot_idx]
    # Skip the pivot itself by index so that its duplicates are kept.
    lesser = quick_sort([val for idx, val in enumerate(lst)
                         if val <= pivot and idx != pivot_idx])
    greater = quick_sort([x for x in lst if x > pivot])
    return lesser + [pivot] + greater


def quick_sort_simple(lst):
    """Method 2: first element as pivot; elements equal to the pivot go left."""
    if len(lst) == 0:
        return lst
    pivot = lst[0]
    # Recurse with the same fixed-pivot strategy at every level.
    lesser = quick_sort_simple([x for x in lst[1:] if x <= pivot])
    greater = quick_sort_simple([x for x in lst[1:] if x > pivot])
    return lesser + [pivot] + greater


def quick_sort_quick(lst):
    """Method 3: first element as pivot, three-way split (<, ==, >)."""
    if len(lst) == 0:
        return lst
    pivot = lst[0]
    lesser = quick_sort_quick([x for x in lst[1:] if x < pivot])
    greater = quick_sort_quick([x for x in lst[1:] if x > pivot])
    # Elements equal to the pivot need no further sorting.
    equal = [x for x in lst[1:] if x == pivot]
    return lesser + [pivot] + equal + greater


def main():
    # The values 20..100 (81 distinct values), each repeated 500 times.
    lst = list(range(20, 101)) * 500
    ls2 = [1, 2, 1, 1, 1, 3, 4, 7, 9, 11, 12, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    ls2 = ls2 * 300

    print('====== sort 80*500 numbers ======  ')
    print('quick QS:       ' + str(datetime.now()))
    quick_sort_quick(lst)
    print('my_quick_sort:   ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('default sort:    ' + str(datetime.now()))
    lst.sort()  # sorts in place: lst stays sorted for the third section
    print('after sort:      ' + str(datetime.now()))

    print('====== sort 21*300 numbers with many duplicated ======  ')
    print('quick QS:       ' + str(datetime.now()))
    quick_sort_quick(ls2)
    print('my_quick_sort:   ' + str(datetime.now()))
    quick_sort_simple(ls2)
    print('default sort:    ' + str(datetime.now()))
    ls2.sort()
    print('after sort:      ' + str(datetime.now()))

    print('====== compare with random or without random ======  ')
    print('simple QS:       ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('my_quick_sort:   ' + str(datetime.now()))
    quick_sort(lst)
    print('default sort:    ' + str(datetime.now()))
    lst.sort()
    print('after sort:      ' + str(datetime.now()))

    print('====== sort 90000 numbers without duplicated ======  ')
    lst = list(range(90000))
    # Shuffle first: on sorted input the fixed-pivot variants recurse once
    # per element, which would exceed the recursion limit set above.
    shuffle(lst)

    print('quick QS:        ' + str(datetime.now()))
    quick_sort_quick(lst)
    print('simple QS:       ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('my_quick_sort:   ' + str(datetime.now()))
    quick_sort(lst)
    print('default sort:    ' + str(datetime.now()))
    lst.sort()
    print('after sort:      ' + str(datetime.now()))


if __name__ == '__main__':
    main()

 

posted @ 2014-11-17 16:54  ScottGu