Median Maintenance: the median-finding problem
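Problem description:
Given a sequence of numbers i arriving in arbitrary order, be able to report the number that sits in the middle by size, that is, the median.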
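Algorithm description:
The usual approach is insertion sort: keep the numbers sorted, and the median is then found at the middle index. Insertion sort takes O(n^2) time on average over n numbers (O(n) for each newly inserted number) before the median can be looked up, so this is not efficient.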
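For comparison, the sorted-list approach described above can be sketched roughly as follows. This is only an illustration: the class name `SortedListMedian` and its method names are hypothetical and do not come from the original post, and the standard-library `bisect.insort` stands in for a hand-written insertion step.

```python
import bisect


class SortedListMedian:
    """Baseline: keep every value in a sorted list (hypothetical helper name)."""

    def __init__(self):
        self.items = []

    def insert(self, item):
        # binary search finds the slot quickly, but shifting the tail of
        # the list to make room still costs O(n) per insertion
        bisect.insort(self.items, item)

    def get_median(self):
        n = len(self.items)
        if n == 0:
            raise IndexError("no items")
        mid = n // 2
        if n % 2 == 1:
            return self.items[mid]
        # even count: average of the two middle values
        return (self.items[mid - 1] + self.items[mid]) / 2.0


baseline = SortedListMedian()
for value in [5, 4, 3, 2, 1, 6]:
    baseline.insert(value)
print(baseline.get_median())  # 3.5
```

Reading the median is cheap here; the cost lies in `insert`, which is what the heap-based approach aims to reduce.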
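The approach taken here is to use a data structure, the heap, which brings the cost of each insertion down to O(log n); the median can then be read from the heap tops in O(1).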
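Two heaps, a max heap and a min heap, hold the two halves of the integer sequence i. They must satisfy the following conditions:
1. Size condition
The max heap may hold either exactly one more element than the min heap or the same number of elements; otherwise the heaps are adjusted.
2. Order condition
The max heap holds the smaller half of the values.
The min heap holds the larger half of the values.
The largest value in the max heap must be less than or equal to the smallest value in the min heap; otherwise the heaps are adjusted.
In other words, the median is either the top of the max heap (odd total count) or the average of the tops of the max heap and the min heap (even total count).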
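The two conditions above can be spelled out as a small check. This is a minimal sketch, assuming the contents of the two heaps are given as plain, non-empty lists; the function names `check_invariants` and `median_from_halves` are hypothetical, and `max()`/`min()` stand in for reading the heap tops.

```python
def check_invariants(lower, upper):
    """Return True if the two halves satisfy the size and order conditions.

    lower stands for the contents of the max heap, upper for the min heap.
    Plain lists are used here only to keep the illustration short.
    """
    size_ok = len(lower) - len(upper) in (0, 1)
    order_ok = not lower or not upper or max(lower) <= min(upper)
    return size_ok and order_ok


def median_from_halves(lower, upper):
    """Read the median off the two tops, assuming the invariants hold."""
    if len(lower) > len(upper):
        return max(lower)                    # odd total: top of the max heap
    return (max(lower) + min(upper)) / 2.0   # even total: average of the tops


print(check_invariants([1, 2, 3], [4, 5]))       # True
print(median_from_halves([1, 2, 3], [4, 5]))     # 3
print(median_from_halves([1, 2, 3], [4, 5, 6]))  # 3.5
print(check_invariants([1, 2, 5], [4, 6]))       # False, 5 > 4 breaks the order condition
```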
The code is as follows:
class MyHeap:
    # heap type
    MAX_HEAP = 1
    MIN_HEAP = 0

    def __init__(self, type=MAX_HEAP, arr=None):
        self.type = type
        # if initialized directly from an array, build the heap bottom-up
        if arr is not None:
            self.data = arr[:]
            length = len(arr)
            # index of the last non-leaf node
            begin = length // 2 - 1
            for i in range(begin, -1, -1):
                self.__heapify(i)
        else:
            self.data = []

    def __before(self, a, b):
        # True if a must sit above b in this kind of heap
        if self.type == self.MAX_HEAP:
            return a > b
        return a < b

    def __heapify(self, i):
        # sift the item at index i down until neither child beats it
        length = len(self.data)
        while True:
            left = self.__leftChild(i)
            right = self.__rightChild(i)
            extreme = i
            if left < length and self.__before(self.data[left], self.data[extreme]):
                extreme = left
            if right < length and self.__before(self.data[right], self.data[extreme]):
                extreme = right
            if extreme == i:
                break
            self.__swap(i, extreme)
            i = extreme

    def __siftUp(self, i):
        # move the item at index i up while it beats its parent
        while i > 0:
            parent = (i - 1) // 2
            if not self.__before(self.data[i], self.data[parent]):
                break
            self.__swap(i, parent)
            i = parent

    def insert(self, item):
        # append as the last leaf, then restore the heap property upward;
        # inserting at index 0 would shift every item and break the heap
        self.data.append(item)
        self.__siftUp(len(self.data) - 1)

    def delete(self, index):
        # move the last leaf into the hole, then repair in both directions
        last = len(self.data) - 1
        self.__swap(index, last)
        self.data.pop()
        if index < last:
            self.__siftUp(index)
            self.__heapify(index)

    def pop(self):
        # pop the extreme value, whether it is the max or the min
        self.__swap(0, len(self.data) - 1)
        extreme = self.data.pop()
        self.__heapify(0)
        return extreme

    # override the getitem method of the MyHeap class,
    # so you can use [] to get a value by index
    def __getitem__(self, index):
        if len(self.data) == 0:
            raise IndexError("no items")
        return self.data[index]

    # override the len method of the MyHeap class,
    # so you can call len(heap) to get the size of the heap
    def __len__(self):
        return len(self.data)

    def __swap(self, i, j):
        self.data[i], self.data[j] = self.data[j], self.data[i]

    # array indexes start from zero
    def __leftChild(self, i):
        return 2 * i + 1

    def __rightChild(self, i):
        return 2 * i + 2

    # override the repr method of the MyHeap class,
    # so you can print a readable view of the heap
    def __repr__(self):
        return str(self.data)


class MedianMaintain:
    def __init__(self):
        self.maxHeap = MyHeap(MyHeap.MAX_HEAP)
        self.minHeap = MyHeap(MyHeap.MIN_HEAP)
        # the total number of items in both heaps
        self.N = 0

    def insert(self, item):
        # to obey the size condition: if the total number before insertion
        # is even, insert the new item into the max heap, then adjust
        if self.N % 2 == 0:
            self.maxHeap.insert(item)
            self.N += 1
            if len(self.minHeap) == 0:
                return
            # to obey the order condition, the largest item in the max heap
            # must be less than or equal to the smallest item in the min
            # heap; if not, swap the two tops
            if self.maxHeap[0] > self.minHeap[0]:
                toMin = self.maxHeap.pop()
                toMax = self.minHeap.pop()
                self.maxHeap.insert(toMax)
                self.minHeap.insert(toMin)
        else:
            # to obey the size condition: if the total number before insertion
            # is odd, insert the new item into the max heap, then pop its
            # extreme value and insert that into the min heap
            self.maxHeap.insert(item)
            toMin = self.maxHeap.pop()
            self.minHeap.insert(toMin)
            self.N += 1

    def getMedian(self):
        # if the total size is even, the median is the average of the roots
        # of the max heap and the min heap
        if self.N % 2 == 0:
            return (self.maxHeap[0] + self.minHeap[0]) / 2.0
        else:
            # if the total size is odd, the median is the root of the max heap
            return self.maxHeap[0]

    def __repr__(self):
        return "max heap: " + str(self.maxHeap) + '\n' + "min heap: " + str(self.minHeap)


if __name__ == "__main__":
    medianMaintain = MedianMaintain()
    medianMaintain.insert(5)
    medianMaintain.insert(4)
    medianMaintain.insert(3)
    medianMaintain.insert(2)
    medianMaintain.insert(1)
    medianMaintain.insert(6)
    print(medianMaintain)
    print(medianMaintain.getMedian())  # 3.5
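As a cross-check on the idea, the same two-heap scheme can also be sketched with Python's standard `heapq` module in place of a hand-written heap. This is an alternative illustration, not the code from the post: the class name `HeapqMedian` and its attribute names are hypothetical, and because `heapq` only provides a min heap, the max heap is simulated here by storing negated values.

```python
import heapq
import random
import statistics


class HeapqMedian:
    """Two-heap median tracker built on heapq (hypothetical class name)."""

    def __init__(self):
        self.lower = []   # max heap of the smaller half, stored as negated values
        self.upper = []   # min heap of the larger half

    def insert(self, item):
        # push through the lower half so that the order condition holds,
        # then move its largest value to the upper half
        moved = -heapq.heappushpop(self.lower, -item)
        heapq.heappush(self.upper, moved)
        # size condition: lower holds the same number of items as upper,
        # or exactly one more
        if len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def get_median(self):
        if not self.lower:
            raise IndexError("no items")
        if len(self.lower) > len(self.upper):
            return -self.lower[0]
        return (-self.lower[0] + self.upper[0]) / 2.0


random.seed(0)
tracker = HeapqMedian()
seen = []
for _ in range(200):
    value = random.randint(-50, 50)
    tracker.insert(value)
    seen.append(value)
    # statistics.median uses the same odd/even definition of the median
    assert tracker.get_median() == statistics.median(seen)
```

The loop at the end compares the running median against `statistics.median` after every insertion, which is a convenient way to test any implementation of this scheme on random input.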
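Author: btchenguang
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be placed prominently on the article page; otherwise the right to pursue legal liability is reserved.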
