[Python] Powerful Ultimate Binary Search Template. Solved many problems

>> Intro


 

Binary search is easy to understand conceptually. It splits the search space into two halves, keeps the half that may contain the target, and throws away the half that cannot contain it. The search space therefore halves at every step until we find the target, which reduces the search time from linear O(n) to logarithmic O(log n). When it comes to implementation, though, it is rather difficult to write bug-free code in just a few minutes. Some of the most common questions include:

 

  • When to exit the loop? Should we use left < right or left <= right as the while loop condition?
  • How to initialize the boundary variables left and right?
  • How to update the boundary? How to choose the appropriate combination from left = mid, left = mid + 1 and right = mid, right = mid - 1?

 

A rather common misunderstanding of binary search is that the technique can only be used in simple scenarios like "Given a sorted array, find a specific value in it". As a matter of fact, it can be applied to much more complicated situations.

 

After a lot of practice on LeetCode, I've made a powerful binary search template and solved many Hard problems by just slightly tweaking it. I'll share the template with you guys in this post. I don't want to just show off the code and leave. Most importantly, I want to share the logical thinking: how to apply this general template to all sorts of problems. Hopefully, after reading this post, people won't be pissed off any more when LeetCoding: "This problem could be solved with binary search! Why didn't I think of that before!"

 


>> Most Generalized Binary Search


 

Suppose we have a search space. It could be an array, a range, etc. Usually it's sorted in ascending order. For most tasks, we can transform the requirement into the following generalized form:


 

Minimize k, s.t. condition(k) is True

 

The following code is the most generalized binary search template:


 

def binary_search(search_space) -> int:
    def condition(value) -> bool:
        pass  # problem-specific; must go False, ..., False, True, ..., True over the search space

    left, right = min(search_space), max(search_space)  # could be [0, n], [1, n] etc. Depends on problem
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left

 

What's really nice about this template is that, for most binary search problems, we only need to modify three parts after copy-pasting it, and we no longer need to worry much about corner cases and bugs in the code:

 

  • Correctly initialize the boundary variables left and right to specify the search space. Only one rule: set up the boundary to include all possible elements;
  • Decide the return value. Is it return left or return left - 1? Remember this: after exiting the while loop, left is the minimal k satisfying the condition function;
  • Design the condition function. This is the most difficult and most beautiful part. It needs lots of practice.
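
The three decisions above can be sketched as one small reusable helper. This is only a minimal sketch and not part of the original template: the name min_satisfying and its condition parameter are assumptions made for illustration, and it presumes that condition is monotone (False, ..., False, True, ..., True) and True at right.

```python
from typing import Callable


def min_satisfying(left: int, right: int, condition: Callable[[int], bool]) -> int:
    """Return the smallest k in [left, right] with condition(k) True.

    Assumes condition is monotone on [left, right] and condition(right) is True.
    """
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid       # mid may be the answer, keep it in the range
        else:
            left = mid + 1    # mid is too small, discard it
    return left


# Hypothetical usage: smallest k whose square is at least 50.
smallest = min_satisfying(0, 50, lambda k: k * k >= 50)
```

Passing the boundaries and the condition as arguments keeps the loop itself untouched, which mirrors the idea that only those parts change from problem to problem.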

 

Below I'll show you guys how to apply this powerful template to many LeetCode problems.


 


>> Basic Application


278. First Bad Version [Easy]

 

You are a product manager and currently leading a team to develop a new product. Since each version is developed based on the previous version, all the versions after a bad version are also bad. Suppose you have n versions [1, 2, ..., n] and you want to find out the first bad one, which causes all the following ones to be bad. You are given an API bool isBadVersion(version) which will return whether version is bad.


 

Example:


 

Given n = 5, and version = 4 is the first bad version.

call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true

Then 4 is the first bad version. 

 

First, we initialize left = 1 and right = n to include all possible values. Then we notice that we don't even need to design the condition function. It's already given by the isBadVersion API. Finding the first bad version is equivalent to finding the minimal k satisfying isBadVersion(k) is True. Our template can fit in very nicely:


 

class Solution:
    def firstBadVersion(self, n) -> int:
        left, right = 1, n
        while left < right:
            mid = left + (right - left) // 2
            if isBadVersion(mid):
                right = mid
            else:
                left = mid + 1
        return left

69. Sqrt(x) [Easy]

 

Implement int sqrt(int x). Compute and return the square root of x, where x is guaranteed to be a non-negative integer. Since the return type is an integer, the decimal digits are truncated and only the integer part of the result is returned.


 

Example:


 

Input: 4
Output: 2
Input: 8
Output: 2

 

Easy one. First we search for the minimal k satisfying the condition k^2 > x; then k - 1 is the answer to the question. Notice that I set right = x + 1 instead of right = x to deal with special inputs like x = 0 and x = 1: for those, the minimal k with k^2 > x is x + 1, which would fall outside a search space ending at x.

 

def mySqrt(x: int) -> int:
    left, right = 0, x + 1
    while left < right:
        mid = left + (right - left) // 2
        if mid * mid > x:
            right = mid
        else:
            left = mid + 1
    return left - 1  # `left` is the minimum k value, `k - 1` is the answer
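
As a quick sanity check of the right = x + 1 choice, the sketch below compares a renamed copy of the same function (my_sqrt, a name chosen here for illustration) with math.isqrt from the standard library (Python 3.8+), including the edge cases x = 0 and x = 1.

```python
import math


def my_sqrt(x: int) -> int:
    left, right = 0, x + 1
    while left < right:
        mid = left + (right - left) // 2
        if mid * mid > x:
            right = mid
        else:
            left = mid + 1
    return left - 1  # left is the minimal k with k * k > x


results = [my_sqrt(x) for x in (0, 1, 4, 8)]
matches_isqrt = all(my_sqrt(x) == math.isqrt(x) for x in range(1000))
```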

35. Search Insert Position [Easy]

 

Given a sorted array and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order. You may assume no duplicates in the array.


 

Example:


 

Input: [1,3,5,6], 5
Output: 2
Input: [1,3,5,6], 2
Output: 1

 

A very classic application of binary search. We are looking for the minimal k satisfying nums[k] >= target, and we can just copy-paste our template. Notice that our solution is correct regardless of whether the input array nums has duplicates. Also notice that the input target might be larger than all elements in nums and therefore needs to be placed at the end of the array. That's why we should initialize right = len(nums) instead of right = len(nums) - 1. Since mid is always strictly less than right, nums[mid] never reads out of bounds.

 

class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        left, right = 0, len(nums)
        while left < right:
            mid = left + (right - left) // 2
            if nums[mid] >= target:
                right = mid
            else:
                left = mid + 1
        return left
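
For comparison, the standard library function bisect.bisect_left computes the same leftmost insertion point. The sketch below uses a standalone copy of the solution (renamed search_insert for illustration) so that the two can be checked side by side, including a target larger than every element.

```python
from bisect import bisect_left
from typing import List


def search_insert(nums: List[int], target: int) -> int:
    left, right = 0, len(nums)
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] >= target:
            right = mid
        else:
            left = mid + 1
    return left


nums = [1, 3, 5, 6]
positions = [search_insert(nums, t) for t in (5, 2, 7, 0)]
same_as_bisect = all(search_insert(nums, t) == bisect_left(nums, t) for t in range(-1, 9))
```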

>> Advanced Application


 

The above problems are quite easy to solve, because they already give us the array to be searched. We know at first glance that we should use binary search. More often, however, the search space and the search target are not so readily available. Sometimes we won't even realize that the problem should be solved with binary search -- we might just turn to dynamic programming or DFS and get stuck for a very long time.

 

As for the question "When can we use binary search?", my answer is: if we can discover some kind of monotonicity, for example, if condition(k) is True then condition(k + 1) is True, then we can consider binary search.
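
To make the monotonicity idea concrete, the sketch below evaluates a condition over a whole search space and checks that the results look like False, ..., False, True, ..., True. The helper is_monotone and both sample conditions are assumptions for illustration only; a real solution never enumerates the search space like this.

```python
from typing import Callable, Iterable


def is_monotone(condition: Callable[[int], bool], search_space: Iterable[int]) -> bool:
    """True if condition never flips from True back to False over the search space."""
    seen_true = False
    for k in search_space:
        if condition(k):
            seen_true = True
        elif seen_true:
            return False  # a False after a True: binary search would be unsafe
    return True


# k * k > 10 is monotone on [0, 10]; k % 3 == 0 is not.
monotone_example = is_monotone(lambda k: k * k > 10, range(0, 11))
non_monotone_example = is_monotone(lambda k: k % 3 == 0, range(0, 11))
```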

 

1011. Capacity To Ship Packages Within D Days [Medium]

 

A conveyor belt has packages that must be shipped from one port to another within D days. The i-th package on the conveyor belt has a weight of weights[i]. Each day, we load the ship with packages on the conveyor belt (in the order given by weights). We may not load more weight than the maximum weight capacity of the ship.


 

Return the least weight capacity of the ship that will result in all the packages on the conveyor belt being shipped within D days.


 

Example :


 

Input: weights = [1,2,3,4,5,6,7,8,9,10], D = 5
Output: 15
Explanation: 
A ship capacity of 15 is the minimum to ship all the packages in 5 days like this:
1st day: 1, 2, 3, 4, 5
2nd day: 6, 7
3rd day: 8
4th day: 9
5th day: 10

Note that the cargo must be shipped in the order given, so using a ship of capacity 14 and splitting the packages into parts like (2, 3, 4, 5), (1, 6, 7), (8), (9), (10) is not allowed. 

 

Binary search probably would not come to mind when we first meet this problem. We might automatically treat weights as the search space and then realize we've entered a dead end after wasting lots of time. In fact, we are looking for the minimal one among all feasible capacities. We dig out the monotonicity of this problem: if we can successfully ship all packages within D days with capacity m, then we can definitely ship them all with any capacity larger than m. Now we can design a condition function, let's call it feasible: given an input capacity, it returns whether it's possible to ship all packages within D days. It can run in a greedy way: if there's still room for the current package, we load it onto the ship today; otherwise we wait for the next day to load it. If the total days needed exceeds D, we return False, otherwise we return True.

 

Next, we need to initialize our boundary correctly. Obviously capacity should be at least max(weights), otherwise the ship couldn't carry the heaviest package. On the other hand, capacity need not be more than sum(weights), because with that capacity we can ship all packages in just one day.

 

Now we've got all we need to apply our binary search template:


 

def shipWithinDays(weights: List[int], D: int) -> int:
    def feasible(capacity) -> bool:
        days = 1
        total = 0
        for weight in weights:
            total += weight
            if total > capacity:  # too heavy, wait for the next day
                total = weight
                days += 1
                if days > D:  # cannot ship within D days
                    return False
        return True

    left, right = max(weights), sum(weights)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left
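
To see the monotonicity on the example, the sketch below counts the days the greedy loading needs for a few capacities. The function days_needed is a hypothetical helper introduced here (the original feasible only returns a boolean), and it assumes capacity >= max(weights).

```python
from typing import List


def days_needed(weights: List[int], capacity: int) -> int:
    """Greedy count of days when packages are loaded in order, up to `capacity` per day.

    Assumes capacity >= max(weights), so every package fits on some day.
    """
    days, total = 1, 0
    for weight in weights:
        total += weight
        if total > capacity:  # current package does not fit, ship it the next day
            total = weight
            days += 1
    return days


weights = list(range(1, 11))
days_by_capacity = {c: days_needed(weights, c) for c in (10, 14, 15, 55)}
```

The day count never increases as the capacity grows, so "days_needed(capacity) <= D" flips from False to True exactly once, and that flip is what the binary search locates.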

410. Split Array Largest Sum [Hard]

 

Given an array which consists of non-negative integers and an integer m, you can split the array into m non-empty continuous subarrays. Write an algorithm to minimize the largest sum among these m subarrays.


 

Example:


 

Input:
nums = [7,2,5,10,8]
m = 2

Output:
18

Explanation:
There are four ways to split nums into two subarrays. The best way is to split it into [7,2,5] and [10,8], where the largest sum among the two subarrays is only 18.

 

If you take a close look, you will probably see how similar this problem is to LC 1011 above. Similarly, we can design a feasible function: given an input threshold, decide whether we can split the array into at most m subarrays such that every subarray sum is less than or equal to threshold. In this way, we discover the monotonicity of the problem: if feasible(threshold) is True, then every input larger than threshold also satisfies feasible. You can see that the solution code is exactly the same as LC 1011.

 

def splitArray(nums: List[int], m: int) -> int:        
    def feasible(threshold) -> bool:
        count = 1
        total = 0
        for num in nums:
            total += num
            if total > threshold:
                total = num
                count += 1
                if count > m:
                    return False
        return True

    left, right = max(nums), sum(nums)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid     
        else:
            left = mid + 1
    return left

 

But we probably would have doubts: it's true that left returned by our solution is the minimal value satisfying feasible, but how can we know that we can split the original array to actually get this subarray sum? For example, let's say nums = [7,2,5,10,8] and m = 2. We have 4 different ways to split the array, giving 4 different largest subarray sums:

25: [[7], [2,5,10,8]]
23: [[7,2], [5,10,8]]
18: [[7,2,5], [10,8]]
24: [[7,2,5,10], [8]]

Only 4 values. But our search space [max(nums), sum(nums)] = [10, 32] has many more than just 4 values. That is, no matter how we split the input array, we cannot get most of the values in our search space.

 

Let's say k is the minimal value satisfying the feasible function. We can prove the correctness of our solution by contradiction. Assume that no subarray's sum is equal to k, that is, every subarray formed while running feasible(k) has a sum less than k. The variable total inside feasible keeps track of the sum of the current subarray. If our assumption is correct, then whenever total does not trigger the if-clause if total > threshold, it is at most k - 1. So running feasible(k - 1) makes exactly the same decisions at every step: a total that stayed within k also stays within k - 1, and a total that exceeded k exceeds k - 1 as well. Therefore feasible(k - 1) must have the same output as feasible(k), which is True. But we already know that k is the minimal value satisfying feasible, so feasible(k - 1) has to be False, which is a contradiction. So our assumption is incorrect, and some subarray sum is exactly k. Now we've proved that our algorithm is correct.
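
The claim can also be checked empirically on the example. The sketch below compares a standalone copy of the binary search solution with a brute force that tries every split; the names split_array and brute_force are assumptions for illustration, and the brute force is only practical for tiny inputs.

```python
from itertools import combinations
from typing import List


def split_array(nums: List[int], m: int) -> int:
    def feasible(threshold: int) -> bool:
        count, total = 1, 0
        for num in nums:
            total += num
            if total > threshold:
                total = num
                count += 1
                if count > m:
                    return False
        return True

    left, right = max(nums), sum(nums)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left


def brute_force(nums: List[int], m: int) -> int:
    """Try every way to cut nums into exactly m non-empty contiguous parts."""
    best = sum(nums)
    for cuts in combinations(range(1, len(nums)), m - 1):
        bounds = (0, *cuts, len(nums))
        largest = max(sum(nums[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, largest)
    return best


nums = [7, 2, 5, 10, 8]
answers = [(split_array(nums, m), brute_force(nums, m)) for m in range(1, 6)]
```

Every value the binary search returns here coincides with a largest subarray sum that some real split achieves, which is what the proof asserts.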

 


875. Koko Eating Bananas [Medium]

 

Koko loves to eat bananas. There are N piles of bananas, the i-th pile has piles[i] bananas. The guards have gone and will come back in H hours. Koko can decide her bananas-per-hour eating speed of K. Each hour, she chooses some pile of bananas, and eats K bananas from that pile. If the pile has less than K bananas, she eats all of them instead, and won't eat any more bananas during this hour.


 

Koko likes to eat slowly, but still wants to finish eating all the bananas before the guards come back. Return the minimum integer K such that she can eat all the bananas within H hours.


 

Example :


 

Input: piles = [3,6,7,11], H = 8
Output: 4
Input: piles = [30,11,23,4,20], H = 5
Output: 30
Input: piles = [30,11,23,4,20], H = 6
Output: 23

 

Very similar to LC 1011 and LC 410 mentioned above. Let's design a feasible function: given an input speed, determine whether Koko can finish all bananas within H hours at that hourly eating speed. Obviously, the lower bound of the search space is 1, and the upper bound is max(piles): since Koko can only eat from one pile each hour, any speed above max(piles) still takes one hour per pile and gains nothing.

 

def minEatingSpeed(piles: List[int], H: int) -> int:
    def feasible(speed) -> bool:
        # return sum(math.ceil(pile / speed) for pile in piles) <= H  # slower        
        return sum((pile - 1) // speed + 1 for pile in piles) <= H  # faster

    left, right = 1, max(piles)
    while left < right:
        mid = left  + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left
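
The expression (pile - 1) // speed + 1 is an integer ceiling division. The small sketch below checks that it agrees with math.ceil(pile / speed) for positive integers and evaluates the eating time for the first example; the helper name hours_needed is an assumption for illustration.

```python
import math
from typing import List


def hours_needed(piles: List[int], speed: int) -> int:
    """Total hours to finish all piles at `speed` bananas per hour (one pile per hour)."""
    return sum((pile - 1) // speed + 1 for pile in piles)


# The integer form matches math.ceil for positive piles and speeds.
same_as_ceil = all(
    (pile - 1) // speed + 1 == math.ceil(pile / speed)
    for pile in range(1, 200)
    for speed in range(1, 200)
)

hours_at_3 = hours_needed([3, 6, 7, 11], 3)
hours_at_4 = hours_needed([3, 6, 7, 11], 4)
```

With H = 8, speed 3 is too slow and speed 4 is just enough, which matches the expected output of 4.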

1482. Minimum Number of Days to Make m Bouquets [Medium]

 

Given an integer array bloomDay, an integer m and an integer k. We need to make m bouquets. To make a bouquet, you need to use k adjacent flowers from the garden. The garden consists of n flowers, the ith flower will bloom in the bloomDay[i] and then can be used in exactly one bouquet. Return the minimum number of days you need to wait to be able to make m bouquets from the garden. If it is impossible to make m bouquets return -1.


 

Examples:


 

Input: bloomDay = [1,10,3,10,2], m = 3, k = 1
Output: 3
Explanation: Let's see what happened in the first three days. x means flower bloomed and _ means flower didn't bloom in the garden.
We need 3 bouquets each should contain 1 flower.
After day 1: [x, _, _, _, _]   // we can only make one bouquet.
After day 2: [x, _, _, _, x]   // we can only make two bouquets.
After day 3: [x, _, x, _, x]   // we can make 3 bouquets. The answer is 3.
Input: bloomDay = [1,10,3,10,2], m = 3, k = 2
Output: -1
Explanation: We need 3 bouquets each has 2 flowers, that means we need 6 flowers. We only have 5 flowers so it is impossible to get the needed bouquets and we return -1.

 

Now that we've solved three advanced problems above, this one should be pretty easy to do. The monotonicity of this problem is very clear: if we can make m bouquets after waiting for d days, then we can definitely finish that as well if we wait for more than d days.


 

def minDays(bloomDay: List[int], m: int, k: int) -> int:
    def feasible(days) -> bool:
        bouquets, flowers = 0, 0
        for bloom in bloomDay:
            if bloom > days:
                flowers = 0  # not bloomed yet: the adjacent run is broken
            else:
                bouquets += (flowers + 1) // k
                flowers = (flowers + 1) % k
        return bouquets >= m

    if len(bloomDay) < m * k:
        return -1
    left, right = 1, max(bloomDay)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left
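
The counting inside feasible can be traced on the first example. The sketch below is an equivalent, more explicit rewrite of that counting logic (an assumption for illustration, with the hypothetical name bouquets_on_day), returning how many bouquets can be made on a given day.

```python
from typing import List


def bouquets_on_day(bloom_day: List[int], k: int, day: int) -> int:
    """Number of bouquets of k adjacent bloomed flowers available on `day`."""
    bouquets, flowers = 0, 0
    for bloom in bloom_day:
        if bloom > day:
            flowers = 0          # gap: the adjacent run is broken
        else:
            flowers += 1
            if flowers == k:     # a full run of k adjacent flowers
                bouquets += 1
                flowers = 0
    return bouquets


bloom_day = [1, 10, 3, 10, 2]
counts = [bouquets_on_day(bloom_day, 1, day) for day in (1, 2, 3, 10)]
pairs_on_last_day = bouquets_on_day(bloom_day, 2, 10)
```

The count never decreases as the day grows, which is the monotonicity the binary search relies on.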

668. Kth Smallest Number in Multiplication Table [Hard]

 

Nearly everyone has used the Multiplication Table. But could you find out the k-th smallest number quickly from the multiplication table? Given the height m and the length n of an m * n Multiplication Table, and a positive integer k, you need to return the k-th smallest number in this table.

 

Example :


 

Input: m = 3, n = 3, k = 5
Output: 3
Explanation: 
The Multiplication Table:
1	2	3
2	4	6
3	6	9

The 5-th smallest number is 3 (1, 2, 2, 3, 3).

 

For Kth-Smallest problems like this, what comes to mind first is a heap. Usually we can maintain a min-heap and just pop the top of the heap k times. However, that doesn't work out in this problem. We don't have every single number in the Multiplication Table; we only have the height and the length of the table. If we are to apply the heap method, we need to explicitly calculate these m * n values and save them to a heap. The time complexity and space complexity of this process are both O(mn), which is quite inefficient. This is where binary search comes in. Remember we said that designing the condition function is the most difficult part? In order to find the k-th smallest value in the table, we can design an enough function: given an input num, determine whether there are at least k values in the table less than or equal to num. The minimal num satisfying the enough function is the answer we're looking for. Recall that the key to binary search is discovering monotonicity. In this problem, if num satisfies enough, then of course any value larger than num satisfies it too. This monotonicity is the foundation of our binary search algorithm.

 

Let's consider the search space. Obviously the lower bound should be 1, and the upper bound should be the largest value in the Multiplication Table, which is m * n, so the search space is [1, m * n]. The overwhelming advantage of the binary search solution over the heap solution is that it doesn't need to explicitly calculate all numbers in the table. All it needs is to pick one value out of the search space and apply the enough function to it, to determine whether we should keep the left half or the right half of the search space. In this way, the binary search solution only requires constant space, much better than the heap solution.

 

Next let's consider how to implement enough function. It can be observed that every row in the Multiplication Table is just multiples of its index. For example, all numbers in 3rd row [3,6,9,12,15...] are multiples of 3. Therefore, we can just go row by row to count the total number of entries less than or equal to input num. Following is the complete solution.


 

def findKthNumber(m: int, n: int, k: int) -> int:
    def enough(num) -> bool:
        count = 0
        for val in range(1, m + 1):  # count row by row
            add = min(num // val, n)
            if add == 0:  # early exit
                break
            count += add
        return count >= k                

    left, right = 1, n * m
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left 

 

In LC 410 above, we had the doubt "Is the result from binary search actually a subarray sum?". Here we have a similar doubt: "Is the result from binary search actually in the Multiplication Table?". The answer is yes, and we can again apply proof by contradiction. Denote num as the minimal input that satisfies the enough function. Let's assume that num is not in the table, which means that for every row val in [1, m], either num is not divisible by val (num % val > 0), or the quotient num // val is larger than n. In the first case num // val equals (num - 1) // val; in the second case both quotients are at least n, so the min still yields n. Either way, changing the input from num to num - 1 doesn't have any effect on the expression add = min(num // val, n). So enough(num - 1) would also return True, same as enough(num). But we already know num is the minimal input satisfying enough, so enough(num - 1) has to be False. Contradiction! The opposite of our original assumption is true: num is actually in the table.
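
For small tables the claim can be checked directly. The sketch below compares a compact standalone copy of the binary search (without the early exit) with a brute force that builds and sorts the whole table; the names find_kth_number and brute_force are assumptions for illustration.

```python
def find_kth_number(m: int, n: int, k: int) -> int:
    def enough(num: int) -> bool:
        # Row `val` holds val, 2*val, ..., n*val, so min(num // val, n) entries are <= num.
        return sum(min(num // val, n) for val in range(1, m + 1)) >= k

    left, right = 1, m * n
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left


def brute_force(m: int, n: int, k: int) -> int:
    """Materialize the table, sort it, and read off the k-th smallest entry."""
    table = sorted(i * j for i in range(1, m + 1) for j in range(1, n + 1))
    return table[k - 1]


all_match = all(
    find_kth_number(m, n, k) == brute_force(m, n, k)
    for m in range(1, 7)
    for n in range(1, 7)
    for k in range(1, m * n + 1)
)
example = find_kth_number(3, 3, 5)
```

Since the brute force can only return values that appear in the table, agreement on every input means the binary search result is always a table entry for these sizes.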

 


719. Find K-th Smallest Pair Distance [Hard]

 

Given an integer array, return the k-th smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.


 

Example :


 

Input:
nums = [1,3,1]
k = 1
Output: 0 
Explanation:
Following are all the pairs. The 1st smallest distance pair is (1,1), and its distance is 0.
(1,3) -> 2
(1,1) -> 0
(3,1) -> 2

 

Very similar to LC 668 above; both are about finding the Kth-Smallest. Just like LC 668, we can design an enough function: given an input distance, determine whether there are at least k pairs whose distances are less than or equal to distance. We can sort the input array and use two pointers (a fast pointer and a slow pointer, together pointing at a pair) to scan it. Both pointers start from the leftmost end. While the pair currently pointed at has a distance less than or equal to distance, we move the fast pointer forward. When it stops, every element between the two pointers forms a valid pair with the slow pointer's element (since the array is already sorted), so we add them to the count and move the slow pointer forward. By the time the slow pointer reaches the rightmost end, we finish our scan and check whether the total count is at least k. Here is the implementation:

 

def enough(distance) -> bool:  # two pointers
    count, i, j = 0, 0, 0
    while i < n or j < n:
        while j < n and nums[j] - nums[i] <= distance:  # move fast pointer
            j += 1
        count += j - i - 1  # count pairs
        i += 1  # move slow pointer
    return count >= k
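
The two-pointer count used by enough can be checked against a brute-force count over all pairs. This is a standalone sketch: count_pairs_within and count_pairs_brute are hypothetical names, and the sample array is made up for illustration.

```python
from itertools import combinations
from typing import List


def count_pairs_within(sorted_nums: List[int], distance: int) -> int:
    """Count pairs (i < j) with sorted_nums[j] - sorted_nums[i] <= distance."""
    n = len(sorted_nums)
    count, j = 0, 0
    for i in range(n):
        # j only moves forward: it stops at the first index too far from i
        while j < n and sorted_nums[j] - sorted_nums[i] <= distance:
            j += 1
        count += j - i - 1  # indices i + 1 .. j - 1 pair with i
    return count


def count_pairs_brute(nums: List[int], distance: int) -> int:
    return sum(1 for a, b in combinations(nums, 2) if abs(a - b) <= distance)


nums = sorted([1, 3, 1, 6, 4])
counts = [count_pairs_within(nums, d) for d in range(0, 6)]
brute = [count_pairs_brute(nums, d) for d in range(0, 6)]
```

Because the fast pointer never moves backwards, one call costs O(n) after sorting, instead of the O(n^2) of the brute force.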

 

Obviously, our search space should be [0, max(nums) - min(nums)]. Now we are ready to copy-paste our template:


 

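def smallestDistancePair(nums: List[int], k: int) -> int:
    def enough(distance) -> bool:  # two pointers, as described above
        count, i, j = 0, 0, 0
        while i < n or j < n:
            while j < n and nums[j] - nums[i] <= distance:  # move fast pointer
                j += 1
            count += j - i - 1  # count pairs
            i += 1  # move slow pointer
        return count >= k

    nums.sort()
    n = len(nums)
    left, right = 0, nums[-1] - nums[0]
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left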

1201. Ugly Number III [Medium]

 

Write a program to find the n-th ugly number. Ugly numbers are positive integers which are divisible by a or b or c.


 

Example :


 

Input: n = 3, a = 2, b = 3, c = 5
Output: 4
Explanation: The ugly numbers are 2, 3, 4, 5, 6, 8, 9, 10... The 3rd is 4.
Input: n = 4, a = 2, b = 3, c = 4
Output: 6
Explanation: The ugly numbers are 2, 3, 4, 6, 8, 9, 10, 12... The 4th is 6.

 

Nothing special. Still finding the Kth-Smallest. We need to design an enough function: given an input num, determine whether there are at least n ugly numbers less than or equal to num. Since a number may be divisible by more than one of a, b and c, simply adding num // a, num // b and num // c would count it more than once. We use the inclusion-exclusion principle with least common multiples, computed with the help of the greatest common divisor, to avoid counting duplicates.

 

import math


def nthUglyNumber(n: int, a: int, b: int, c: int) -> int:
    def enough(num) -> bool:
        total = num//a + num//b + num//c - num//ab - num//ac - num//bc + num//abc
        return total >= n

    ab = a * b // math.gcd(a, b)
    ac = a * c // math.gcd(a, c)
    bc = b * c // math.gcd(b, c)
    abc = a * bc // math.gcd(a, bc)
    left, right = 1, 10 ** 10
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left
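
The inclusion-exclusion count inside enough can be verified against a direct count. In the sketch below, count_ugly and count_ugly_brute are hypothetical helper names introduced for illustration; the least common multiples are computed from math.gcd exactly as in the solution.

```python
import math


def count_ugly(num: int, a: int, b: int, c: int) -> int:
    """How many integers in [1, num] are divisible by a, b or c (inclusion-exclusion)."""
    ab = a * b // math.gcd(a, b)        # lcm(a, b)
    ac = a * c // math.gcd(a, c)        # lcm(a, c)
    bc = b * c // math.gcd(b, c)        # lcm(b, c)
    abc = a * bc // math.gcd(a, bc)     # lcm(a, b, c)
    return (num // a + num // b + num // c
            - num // ab - num // ac - num // bc
            + num // abc)


def count_ugly_brute(num: int, a: int, b: int, c: int) -> int:
    return sum(1 for v in range(1, num + 1) if v % a == 0 or v % b == 0 or v % c == 0)


counts_match = all(
    count_ugly(num, 2, 3, 4) == count_ugly_brute(num, 2, 3, 4) for num in range(1, 100)
)
up_to_6 = count_ugly(6, 2, 3, 4)
```

For a = 2, b = 3, c = 4 there are four ugly numbers up to 6 (2, 3, 4, 6), so 6 is the smallest num with a count of at least 4, matching the second example.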

1283. Find the Smallest Divisor Given a Threshold [Medium]

 

Given an array of integers nums and an integer threshold, we will choose a positive integer divisor and divide all the array by it and sum the result of the division. Find the smallest divisor such that the result mentioned above is less than or equal to threshold.


 

Each result of division is rounded to the nearest integer greater than or equal to that element. (For example: 7/3 = 3 and 10/2 = 5). It is guaranteed that there will be an answer.


 

Example :


 

Input: nums = [1,2,5,9], threshold = 6
Output: 5
Explanation: We can get a sum to 17 (1+2+5+9) if the divisor is 1. 
If the divisor is 4 we can get a sum to 7 (1+1+2+3) and if the divisor is 5 the sum will be 5 (1+1+1+2). 

 

After so many problems introduced above, this one should be a piece of cake. We don't even need to bother to design a condition function, because the problem has already told us explicitly what condition we need to satisfy.


 

def smallestDivisor(nums: List[int], threshold: int) -> int:
    def condition(divisor) -> bool:
        return sum((num - 1) // divisor + 1 for num in nums) <= threshold

    left, right = 1, max(nums)
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left

>> End

 

Wow, thank you so much for making it to the end! Really appreciate that. As you can see from the Python code above, the solutions all look very similar to each other. That's because I copy-pasted my own template all the time. No exception. This is strong evidence of my template's power and adaptability. I believe everyone can pick up this binary search template to solve many problems. All we need is more practice to build up our ability to discover the monotonicity of a problem and to design a beautiful condition function.

 

Hope this helps.


 


posted @ 2022-04-13 16:12 ka2uha