LeetCode #274 H-Index
Question
Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
Example:
Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, her h-index is 3.
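To make the definition concrete, here is a minimal brute-force sketch that tests it directly. The function name h_index_by_definition is made up for illustration and is not part of the LeetCode template. It assumes the usual reading of the definition: the answer is the largest h such that at least h papers have at least h citations.

```python
from typing import List


def h_index_by_definition(citations: List[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations."""
    n = len(citations)
    # Try candidates from largest to smallest; h can never exceed n.
    for h in range(n, -1, -1):
        papers_with_at_least_h = sum(1 for c in citations if c >= h)
        if papers_with_at_least_h >= h:
            return h
    return 0  # not reached: h = 0 always qualifies inside the loop


print(h_index_by_definition([3, 0, 6, 1, 5]))  # 3
```

This runs in O(n^2) time, so it is only useful as a reference to check faster solutions against.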
Sort + scan: O(n log n)
Following the calculation method given on the wiki (see the reference links):
First we order the values of f from the largest to the lowest value. Then, we look for the last position in which f is greater than or equal to the position (we call h this position).
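In other words, sort in descending order and scan for the last position where citations[i] >= i; h is then that i. (Positions here are 1-based; with 0-based array indices the test is actually citations[i] >= i+1, and h = i+1.)
Why does this work? An example, with the sorted citations on the first row and their 1-based positions on the second: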
6 5 3 1 0
1 2 3 4 5
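This shows that the first three values satisfy citations[i] >= i, and because the list is in descending order each of them is at least citations[3] >= 3. The last two values satisfy citations[i] < i; the smallest such i is 4, so citations[4] < 4, and by the ordering both of them are <= 3.
Of course, comparing directly according to the definition in the problem statement also works, with no increase in time complexity, but it makes the later improvement harder to build on.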
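The sort-and-scan idea above can be sketched as follows. This is one possible implementation rather than the only one; the name h_index_sorted is made up for illustration, and the code uses the 0-based check citations[i] >= i+1.

```python
from typing import List


def h_index_sorted(citations: List[int]) -> int:
    """Sort in descending order, then find the last position that still qualifies."""
    ordered = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ordered):
        # i is 0-based, so the 1-based position is i + 1.
        if c >= i + 1:
            h = i + 1
        else:
            # Values only get smaller from here on, so no later position can qualify.
            break
    return h


print(h_index_sorted([3, 0, 6, 1, 5]))  # 3
```

The sort dominates the cost, giving O(n log n) overall; the scan itself is O(n).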
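Bucket sort: O(n)
Bucket sort brings this down to O(n). The notable property of this problem is that h always lies in [0, n], so any citation count above n can be treated as n and n+1 buckets are enough. That is what makes bucket sort applicable.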
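from typing import List

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        length = len(citations)
        freq_list = [0 for i in range(length + 1)]

        # First pass: freq_list[i] = number of papers cited exactly i times.
        # Counts above length are clamped into the last bucket.
        for i in range(length):
            if citations[i] > length:
                index = length
            else:
                index = citations[i]
            freq_list[index] += 1

        # Second pass, from high to low: turn exact counts into
        # "cited at least i times" and stop at the first i that qualifies.
        last = 0
        for i in range(length, -1, -1):
            freq_list[i] += last
            last = freq_list[i]
            if freq_list[i] >= i:
                return i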
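The key to bucket sort is defining a mapping from values to buckets; radix sort with base 10, for example, uses the mapping f(x) = x mod 10. Here the bucket is defined as:
freq_list[i]: the number of papers cited at least i times
Building freq_list takes two passes: the first counts how many papers are cited exactly i times, and the second turns those counts into how many papers are cited at least i times.
Note that if x papers are cited at least 3 times, then the number y of papers cited at least 2 times equals x plus the number of papers cited exactly 2 times, i.e. y = x + freq_list[i], where freq_list[i] still holds the exact count for i = 2. The second pass can therefore be done in a single sweep from high to low.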
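To see the two passes on the example input, here is a small sketch that keeps the exact counts and the "at least" counts in separate lists. The helper name bucket_counts and the two list names are made up for illustration; the solution above reuses a single freq_list instead.

```python
from typing import List, Tuple


def bucket_counts(citations: List[int]) -> Tuple[List[int], List[int]]:
    """Return (exact, at_least) bucket lists of length n + 1."""
    n = len(citations)
    # Pass 1: exact[i] = papers cited exactly i times (counts above n go to bucket n).
    exact = [0] * (n + 1)
    for c in citations:
        exact[min(c, n)] += 1
    # Pass 2: at_least[i] = at_least[i + 1] + exact[i], swept from high to low.
    at_least = exact[:]
    for i in range(n - 1, -1, -1):
        at_least[i] += at_least[i + 1]
    return exact, at_least


exact, at_least = bucket_counts([3, 0, 6, 1, 5])
print(exact)     # [1, 1, 0, 1, 0, 2]
print(at_least)  # [5, 4, 3, 3, 2, 2]

# The h-index is the largest i whose bucket holds at least i papers.
h = max(i for i, count in enumerate(at_least) if count >= i)
print(h)  # 3
```

Both passes touch each bucket once, so time and extra space are O(n).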
References:
https://en.wikipedia.org/wiki/H-index
https://www.cnblogs.com/zmyvszk/p/5619051.html
