Algorithm Study: Radix Sort

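1. Use cases for radix sort

Sorting a list of words in English dictionary order, e.g. a, alice, bob ...

Radix sort is not a general-purpose algorithm: it applies only when each key can be split into positions (digits or letters) drawn from a small, fixed set of values.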

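2. Definition

Radix sort is a kind of "distribution sort", and is also referred to as "bucket sort" or bin sort.

As the name suggests, it uses part of the information in each key to distribute the elements to be sorted into "buckets", and the sorting is achieved through this distribution.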

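Radix sort is a stable sort. Its time complexity is O(n log_r(m)), where r is the radix in use and m is the number of heaps; reading m as the number of distinct key values, log_r(m) is the number of positions per key and therefore the number of distribute/collect passes. In some cases radix sort is more efficient than other stable sorting algorithms.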
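As a rough illustration of the log_r(m) factor, the sketch below counts how many base-r positions a key can have. It assumes non-negative integer keys and assumes m is read as the largest key value; the function name count_passes is made up for this example.

```python
def count_passes(max_key, radix):
    """Number of base-`radix` positions (and so passes) needed for keys up to max_key."""
    passes = 1
    # Each division by the radix strips one position from the key
    while max_key >= radix:
        max_key //= radix
        passes += 1
    return passes

print(count_passes(999, 10))  # 3
print(count_passes(255, 16))  # 2
```

Each pass touches all n keys once, which is where the n in the formula comes from.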

3. Example

Input: ["banana","apple","orange","ape","he"]

Output: ["ape","apple","banana","he","orange"]
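For checking results, Python's built-in sorted can serve as a reference. This is not radix sort, only a baseline, and it assumes all words are lowercase ASCII so that code point order matches dictionary order.

```python
words = ["banana", "apple", "orange", "ape", "he"]

# Built-in comparison sort, used only as the expected answer
expected = sorted(words)
print(expected)  # ['ape', 'apple', 'banana', 'he', 'orange']
```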

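4. Variants of radix sort

The two variants are implemented differently but produce the same result: least significant digit first (LSD, low position first) and most significant digit first (MSD, high position first).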

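5. Radix sort, LSD (low position first), on letters

1) Idea

    When sorting words, every word is a combination of the same 26 basic letters.

    Use each letter as a bucket: whenever the letter at the position being examined matches a bucket's letter, the word goes into that bucket.

    List to sort: ["banana","apple","orange","ape","he"]

    Precondition: pad every word to the length of the longest word (not by actually padding it, but by treating each missing position as "a"). As a consequence, a word and the same word followed only by "a"s (for example "ab" and "aba") land in the same bucket on every pass and keep their input order.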
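A minimal sketch of a single distribute-and-collect pass under the padding rule described here. The helper names bucket_index and one_pass are chosen for this illustration, and the sketch assumes words contain only the letters a-z.

```python
def bucket_index(word, idx):
    """Bucket number (0-25) for the letter at idx; a missing position counts as 'a'."""
    if idx >= len(word):
        return 0
    return ord(word[idx].lower()) - ord('a')

def one_pass(words, idx):
    """Distribute words into 26 buckets by the letter at idx, then collect A to Z."""
    buckets = [[] for _ in range(26)]
    for word in words:
        buckets[bucket_index(word, idx)].append(word)
    # Appending preserves arrival order inside each bucket, which keeps the pass stable
    return [word for bucket in buckets for word in bucket]

words = ["banana", "apple", "orange", "ape", "he"]
print(one_pass(words, 5))  # ['banana', 'apple', 'ape', 'he', 'orange']
```

Stability of each pass matters: it is what lets later passes on more significant positions keep the order established by earlier ones.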

2) Walkthrough:

    ["banana","apple","orange","ape","he"]    max_length = 6

Take the sixth letter, idx=5:

    banana: banana[5] = a

    apple + "a": apple[5] ==> index out of range, so bucket_index = 0 (the word goes into the "a" bucket)

    orange: orange[5] = e

    ape + "aaa": ape[5] ==> index out of range, so bucket_index = 0 (the word goes into the "a" bucket)

    he + "aaaa": he[5] ==> index out of range, so bucket_index = 0 (the word goes into the "a" bucket)

.......

Take the third letter, idx=2:

    banana: banana[2] = n

    apple + "a": apple[2] = p

    .......

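step1: create 26 buckets and distribute the words by their last letter (idx=5)

    A: banana apple ape he

    D:

    E: orange

    X:

    Y:

    Z:

step2: collect the buckets in order from A to Z: banana apple ape he orange

step3: create 26 buckets and distribute the words by their second-to-last letter (idx=4)

    A: ape he

    D:

    E: apple

    G: orange

    ...

    N: banana

    X:

    Y:

    Z:

step4: collect the buckets in order: ape he apple orange banana

step5: create 26 buckets and distribute the words by their third-to-last letter (idx=3)

    A: ape, he, banana

    G:

    L: apple

    ...

    N: orange

    X:

    Y:

    Z:

step6: collect the buckets in order: ape he banana apple orange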

...
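The passes in this walkthrough can be reproduced with a small tracing sketch. The function name trace_lsd is invented for this example; it follows the same rule of treating a missing position as "a" and assumes words made only of the letters a-z.

```python
import string

def trace_lsd(words):
    """Run LSD passes from the last position to the first, recording each collected list."""
    words = list(words)
    max_length = max(len(w) for w in words)
    history = []
    for idx in range(max_length - 1, -1, -1):
        buckets = {letter: [] for letter in string.ascii_lowercase}
        for w in words:
            letter = w[idx].lower() if idx < len(w) else 'a'
            buckets[letter].append(w)
        # Collect from a to z
        words = [w for letter in string.ascii_lowercase for w in buckets[letter]]
        non_empty = {k: v for k, v in buckets.items() if v}
        print("idx=%d buckets=%s" % (idx, non_empty))
        history.append(list(words))
    return history

history = trace_lsd(["banana", "apple", "orange", "ape", "he"])
print(history[-1])  # ['ape', 'apple', 'banana', 'he', 'orange']
```

The first three entries of history correspond to step2, step4 and step6.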

3) Code

# Radix sort: LSD (least significant position first), letters
def bucketSort(arr):
    # Nothing to sort
    if not arr:
        return arr
    # The longest word determines how many distribute/collect passes are needed
    max_length = len(max(arr, key=len))
    print(max_length)
    # Start at the last letter position and move left, one pass per position
    idx = max_length - 1
    while idx >= 0:
        buckets = [[] for _ in range(26)]
        for word in arr:
            # A word that is too short for this position goes into the first ("a") bucket
            if idx >= len(word):
                bucket_idx = 0
            else:
                # Otherwise the letter word[idx] selects the bucket: a -> 0, ..., z -> 25
                bucket_idx = ord(word[idx].lower()) - ord('a')
            buckets[bucket_idx].append(word)
        j = 0
        # Collect: write the buckets back into arr in order from a to z
        for bucket in buckets:
            for word in bucket:
                arr[j] = word
                j += 1
        print("arr=%s" % arr)
        idx -= 1
    # Return the list (sorted in place) so the caller can print it
    return arr
arr = ["banana","apple","orange","ape","he","application","bat","object","able"]
print(bucketSort(arr))

'''
Output:
11
arr=['banana', 'apple', 'orange', 'ape', 'he', 'bat', 'object', 'able', 'application']
arr=['banana', 'apple', 'orange', 'ape', 'he', 'bat', 'object', 'able', 'application']
arr=['banana', 'apple', 'orange', 'ape', 'he', 'bat', 'object', 'able', 'application']
arr=['banana', 'apple', 'orange', 'ape', 'he', 'bat', 'object', 'able', 'application']
arr=['banana', 'apple', 'orange', 'ape', 'he', 'bat', 'object', 'able', 'application']
arr=['banana', 'apple', 'ape', 'he', 'bat', 'able', 'application', 'orange', 'object']
arr=['ape', 'he', 'bat', 'able', 'object', 'apple', 'orange', 'application', 'banana']
arr=['ape', 'he', 'bat', 'banana', 'able', 'object', 'apple', 'application', 'orange']
arr=['he', 'orange', 'ape', 'object', 'able', 'banana', 'apple', 'application', 'bat']
arr=['banana', 'bat', 'object', 'able', 'he', 'ape', 'apple', 'application', 'orange']
arr=['able', 'ape', 'apple', 'application', 'banana', 'bat', 'he', 'object', 'orange']
['able', 'ape', 'apple', 'application', 'banana', 'bat', 'he', 'object', 'orange']
'''
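Because a missing position is treated as "a", words such as "ab" and "aba" fall into the same bucket on every pass, so their final order depends on the input order. One possible variation, not part of the code above, reserves an extra bucket 0 for "no letter at this position". The name lsd_sort_with_pad_bucket is made up for this sketch, which assumes words contain only the letters a-z.

```python
def lsd_sort_with_pad_bucket(words):
    """LSD radix sort with 27 buckets: bucket 0 = position missing, 1..26 = a..z."""
    words = list(words)  # work on a copy, leave the input untouched
    if not words:
        return words
    max_length = max(len(w) for w in words)
    for idx in range(max_length - 1, -1, -1):
        buckets = [[] for _ in range(27)]
        for w in words:
            if idx >= len(w):
                b = 0  # shorter words sort before any real letter
            else:
                b = ord(w[idx].lower()) - ord('a') + 1
            buckets[b].append(w)
        words = [w for bucket in buckets for w in bucket]
    return words

print(lsd_sort_with_pad_bucket(["aba", "ab", "a"]))  # ['a', 'ab', 'aba']
```

The cost is one more bucket per pass; the benefit is that a prefix always sorts before the longer word, as in a dictionary.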

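6. Radix sort, MSD (high position first), on letters

1) Idea

Step 1: starting at idx=0, distribute the words into buckets by letter

    A: apple, ape, able ====> able, ape, apple

    B: banana

        able, ape, apple, banana

Step 2: apply MSD bucketing to the words in bucket A, with idx+1 = 1

    Words taken from bucket A: apple, ape, able

    A:

    B: able

    P: apple, ape ====> ape, apple

        able, ape, apple

Step 3: bucket the words in bucket P (apple, ape), with idx+1 = 2

    E: ape

    P: apple

    Every bucket now holds a single word, so start merging the buckets

        ape, apple

LSD: repeated distribute and collect passes over the whole list

MSD: a divide-and-conquer approach, in which each bucket is sorted recursively on the next position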
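The three bucketing steps of the MSD walkthrough can be reproduced with a small grouping sketch. The helper group_by_letter is hypothetical and only shows the "divide" half of the idea: it groups words by the letter at one position and lists the groups in alphabetical order.

```python
def group_by_letter(words, idx):
    """Group words by the letter at idx (missing position counts as 'a'), keys in a-z order."""
    groups = {}
    for w in words:
        letter = w[idx].lower() if idx < len(w) else 'a'
        groups.setdefault(letter, []).append(w)
    return dict(sorted(groups.items()))

print(group_by_letter(["apple", "ape", "able", "banana"], 0))
# {'a': ['apple', 'ape', 'able'], 'b': ['banana']}
print(group_by_letter(["apple", "ape", "able"], 1))
# {'b': ['able'], 'p': ['apple', 'ape']}
print(group_by_letter(["apple", "ape"], 2))
# {'e': ['ape'], 'p': ['apple']}
```

Each group with more than one word becomes a smaller subproblem at idx+1, and concatenating the solved groups in key order gives the sorted list.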

2) Code

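# Radix sort: MSD (most significant position first), letters
def bucketSortMsd(arr, idx):
    # An empty or single-word list is already in order
    if len(arr) <= 1:
        return arr
    max_length = len(max(arr, key=len))
    # Every position has been examined, so the remaining words tie
    if idx >= max_length:
        return arr
    buckets = [[] for _ in range(26)]
    for word in arr:
        # A word that is too short is treated as having "a" at this position
        if idx >= len(word):
            bucket_idx = 0
        else:
            bucket_idx = ord(word[idx].lower()) - ord('a')
        buckets[bucket_idx].append(word)
    arr = []
    for bucket in buckets:
        # Only buckets holding more than one word need sorting on the next position
        if len(bucket) > 1:
            arr += bucketSortMsd(bucket, idx + 1)
        else:
            arr += bucket
    return arr
arr = ["banana","apple","ape","he"]
print(bucketSortMsd(arr,0))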

'''
Output:
['ape', 'apple', 'banana', 'he']
'''
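A possible variation on the MSD code, offered only as a sketch: within one recursive group all words share the same first idx letters, so any word whose length equals idx is exactly that shared prefix. Such words can be emitted first without further bucketing, which also puts a prefix ahead of the longer word ("ab" before "aba"). The name msd_sort is made up here, and words are assumed to contain only the letters a-z.

```python
def msd_sort(words, idx=0):
    """MSD radix sort; words that end at idx are output before any longer word."""
    if len(words) <= 1:
        return list(words)
    # Words with no letter at idx equal the shared prefix, so they are already in place
    finished = [w for w in words if len(w) <= idx]
    buckets = [[] for _ in range(26)]
    for w in words:
        if len(w) > idx:
            buckets[ord(w[idx].lower()) - ord('a')].append(w)
    result = finished
    for bucket in buckets:
        # Recurse on the next position; empty and single-word buckets return immediately
        result += msd_sort(bucket, idx + 1)
    return result

print(msd_sort(["aba", "ab", "a", "ab"]))  # ['a', 'ab', 'ab', 'aba']
```

Recursion only continues on words that still have letters left, so it stops on its own once the longest word in a group is exhausted.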

posted @ 2020-12-08 16:13 hqq的进阶日记
