480. Sliding Window Median
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, so the median is the mean of the two middle values.
Examples:
[2,3,4], the median is 3
[2,3], the median is (2 + 3) / 2 = 2.5
Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position. Your job is to output the median array for each window in the original array.
For example, given nums = [1,3,-1,-3,5,3,6,7] and k = 3:
Window position                Median
---------------                ------
[1  3  -1] -3  5  3  6  7        1
 1 [3  -1  -3] 5  3  6  7       -1
 1  3 [-1  -3  5] 3  6  7       -1
 1  3  -1 [-3  5  3] 6  7        3
 1  3  -1  -3 [5  3  6] 7        5
 1  3  -1  -3  5 [3  6  7]       6
Therefore, return the median sliding window as [1,-1,-1,3,5,6].
Note: You may assume k is always valid, i.e. k is always smaller than the input array's size for a non-empty array.
https://leetcode.com/problems/sliding-window-median/discuss/96348/Java-solution-using-two-PriorityQueues
Almost the same idea as Find Median from Data Stream: https://leetcode.com/problems/find-median-from-data-stream/
1. Use two heaps to store the numbers: maxHeap for numbers smaller than the current median, minHeap for numbers greater than or equal to the current median. A small trick I used is to always keep the size of minHeap equal to the size of maxHeap (when the count of numbers is even) or one element larger (when the count is odd). The current median then becomes very easy to calculate.
2. Keep adding numbers from the right side of the sliding window and removing numbers from its left side, and keep adding the current median to the result.
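Before looking at the heap-based solutions, it can help to pin down the problem statement with a brute-force baseline. The following is only a sketch for cross-checking faster solutions: it copies and sorts every window. The function name medianSlidingWindowBruteForce is made up for illustration and does not come from the linked posts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Brute-force reference (illustrative name, not from the original post):
// copy each window, sort it, and read off the median.
// Roughly O(n * k log k) time, so only suitable for checking small inputs.
std::vector<double> medianSlidingWindowBruteForce(const std::vector<int>& nums, int k) {
    // Assumption from the problem statement: k is valid for a non-empty array.
    assert(k >= 1 && static_cast<std::size_t>(k) <= nums.size());
    std::vector<double> medians;
    for (std::size_t start = 0; start + k <= nums.size(); ++start) {
        std::vector<int> window(nums.begin() + start, nums.begin() + start + k);
        std::sort(window.begin(), window.end());
        if (k % 2 == 1) {
            medians.push_back(window[k / 2]);
        } else {
            // Convert to double before adding so large values cannot overflow int.
            medians.push_back((static_cast<double>(window[k / 2 - 1]) +
                               static_cast<double>(window[k / 2])) * 0.5);
        }
    }
    return medians;
}
```

Calling it with nums = [1,3,-1,-3,5,3,6,7] and k = 3 reproduces the table in the problem statement.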
Approach #2 Two Heaps! (Lazy Removal) [Accepted]
Intuition
The idea is the same as Approach #3 from 295. Find Median From Data Stream. The only additional requirement is removing the outgoing elements from the window.
Since the window elements are stored in heaps, deleting elements that are not at the top of the heaps is a pain.
Some languages (like Java) provide implementations of the PriorityQueue class that allow for removing arbitrarily placed elements. Generally, such removal is not efficient (Java's PriorityQueue.remove(Object) takes linear time), nor is its availability in other languages assured.
Assuming that only the tops of heaps (and by extension the PriorityQueue class) are accessible, we need to find a way to efficiently invalidate and remove elements that are moving out of the sliding window.
At this point, an important thing to notice is that if the two heaps are balanced, only the tops of the heaps are actually needed to find the medians. This means that as long as we can somehow keep the heaps balanced, we can afford to keep some extraneous elements in them.
Thus, we can use hash-tables to keep track of invalidated elements. Once they reach the heap tops, we remove them from the heaps. This is the lazy removal technique.
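The lazy removal technique on its own can be sketched as a small wrapper around a single heap. This is a minimal illustration under assumptions: the class LazyMaxHeap and its member names are hypothetical and do not appear in the original solution, which inlines the same bookkeeping with one shared hash_table for both heaps.

```cpp
#include <cassert>
#include <queue>
#include <unordered_map>

// Hypothetical helper (not part of the original solution): a max-heap of ints
// in which any stored value can be "removed" lazily.
class LazyMaxHeap {
public:
    void push(int value) {
        heap_.push(value);
        ++valid_;
    }

    // Mark one occurrence of `value` as invalid. The caller must guarantee
    // that a valid copy of `value` is currently stored in this heap.
    void invalidate(int value) {
        assert(valid_ > 0);
        ++pending_[value];
        --valid_;
        prune();
    }

    // Largest valid element. Precondition: validSize() > 0.
    int top() const {
        assert(valid_ > 0);
        return heap_.top();
    }

    // Number of valid elements; the underlying heap may physically hold more.
    int validSize() const { return valid_; }

private:
    // Physically pop invalidated values once they surface at the top.
    void prune() {
        while (!heap_.empty()) {
            auto it = pending_.find(heap_.top());
            if (it == pending_.end()) break;
            if (--it->second == 0) pending_.erase(it);
            heap_.pop();
        }
    }

    std::priority_queue<int> heap_;
    std::unordered_map<int, int> pending_;  // value -> invalidated copies still in heap_
    int valid_ = 0;
};
```

An invalidated value that is buried in the heap costs nothing until it reaches the top, at which point it is popped with an ordinary O(log n) heap operation.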
An immediate challenge at this point is balancing the heaps while keeping extraneous elements. This is done by actually moving some elements to the heap which has extraneous elements, from the other heap. This cancels out the effect of having extraneous elements and maintains the invariant that the heaps are balanced.
NOTE: When we talk about keeping the heaps balanced, we are not referring to the actual heap sizes. We are only concerned with valid elements, so balancing the heaps refers to the count of such elements.
Algorithm
- Two priority queues:
  - A max-heap lo to store the smaller half of the numbers
  - A min-heap hi to store the larger half of the numbers
- A hash-map or hash-table hash_table for keeping track of invalid numbers. It holds the count of the occurrences of all such numbers that have been invalidated and yet remain in the heaps.
- The max-heap lo is allowed to store at most one more element than the min-heap hi. Hence, if we have processed $k$ elements:
  - If $k = 2n + 1 \quad (n \in \mathbb{Z})$, then lo is allowed to hold $n + 1$ elements, while hi can hold $n$ elements.
  - If $k = 2n \quad (n \in \mathbb{Z})$, then both heaps are balanced and hold $n$ elements each.
  This gives us the nice property that when the heaps are perfectly balanced, the median can be derived from the tops of both heaps. Otherwise, the top of the max-heap lo holds the legitimate median.
  NOTE: As mentioned before, when we are talking about keeping the heaps balanced, the actual sizes of the heaps are irrelevant. Only the count of valid elements in both heaps matters.
- Keep a balance factor. It indicates three situations:
  - balance = 0: Both heaps are balanced or nearly balanced.
  - balance < 0: lo needs more valid elements. Elements from hi are moved to lo.
  - balance > 0: hi needs more valid elements. Elements from lo are moved to hi.
- Inserting an incoming number in_num:
  - If in_num is less than or equal to the top element of lo, then it can be inserted into lo. However, this unbalances hi (hi has fewer valid elements now). Hence balance is incremented.
  - Otherwise, in_num must be added to hi. Now lo is unbalanced, hence balance is decremented.
- Lazy removal of an outgoing number out_num:
  - If out_num is present in lo, then invalidating this occurrence will unbalance lo itself. Hence balance must be decremented.
  - If out_num is present in hi, then invalidating this occurrence will unbalance hi itself. Hence balance must be incremented.
  - We increment the count of this element in hash_table.
  - Once an invalid element reaches the top of either heap, we remove it and decrement its count in hash_table.
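The balance factor and the median rule from the steps above can be sketched as two small helpers. This is only an illustration: the names slideBalance and medianFromTops and the type aliases are assumptions made for this sketch, and both comparisons are taken against the top of lo as it stands before the incoming number is inserted.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

using MaxHeap = std::priority_queue<int>;
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Balance factor for one slide of the window (illustrative helper).
// lo_top is the top of lo before in_num is inserted.
int slideBalance(int out_num, int in_num, int lo_top) {
    int balance = 0;
    balance += (out_num <= lo_top) ? -1 : +1;  // out_num leaves lo (-1) or hi (+1)
    balance += (in_num <= lo_top) ? +1 : -1;   // in_num enters lo (+1) or hi (-1)
    return balance;                            // always -2, 0 or +2
}

// Median of a window of size k from the heap tops, assuming the valid
// elements are balanced as described (lo holds the extra one when k is odd).
double medianFromTops(const MaxHeap& lo, const MinHeap& hi, int k) {
    assert(!lo.empty());
    if (k % 2 == 1) return lo.top();
    assert(!hi.empty());
    // Convert to double before adding so large values cannot overflow int.
    return (static_cast<double>(lo.top()) + static_cast<double>(hi.top())) * 0.5;
}
```

Because the balance factor is the sum of two terms of magnitude one, it can only be -2, 0 or +2. Moving a single element from one heap to the other shifts the difference in valid counts by two, which is why one move per slide is enough to restore the balance.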
C++ code
#include <functional>
#include <queue>
#include <unordered_map>
#include <vector>
using namespace std;

vector<double> medianSlidingWindow(vector<int>& nums, int k) {
    vector<double> medians;
    // guard against inputs outside the stated constraints
    if (k <= 0 || k > (int)nums.size())
        return medians;

    unordered_map<int, int> hash_table;                  // value -> invalidated copies still in the heaps
    priority_queue<int> lo;                              // max heap
    priority_queue<int, vector<int>, greater<int> > hi;  // min heap

    int i = 0;  // index of current incoming element being processed

    // initialize the heaps
    while (i < k)
        lo.push(nums[i++]);
    for (int j = 0; j < k / 2; j++) {
        hi.push(lo.top());
        lo.pop();
    }

    while (true) {
        // get median of current window
        medians.push_back(k & 1 ? lo.top() : ((double)lo.top() + (double)hi.top()) * 0.5);

        if (i >= (int)nums.size())
            break;  // break if all elements processed

        int out_num = nums[i - k],  // outgoing element
            in_num = nums[i++],     // incoming element
            balance = 0;            // balance factor

        // number `out_num` exits window
        balance += (out_num <= lo.top() ? -1 : 1);
        hash_table[out_num]++;

        // number `in_num` enters window
        if (!lo.empty() && in_num <= lo.top()) {
            balance++;
            lo.push(in_num);
        } else {
            balance--;
            hi.push(in_num);
        }

        // re-balance heaps
        if (balance < 0) {  // `lo` needs more valid elements
            lo.push(hi.top());
            hi.pop();
            balance++;
        }
        if (balance > 0) {  // `hi` needs more valid elements
            hi.push(lo.top());
            lo.pop();
            balance--;
        }

        // remove invalid numbers that should be discarded from heap tops
        while (!lo.empty() && hash_table[lo.top()]) {
            hash_table[lo.top()]--;
            lo.pop();
        }
        while (!hi.empty() && hash_table[hi.top()]) {
            hash_table[hi.top()]--;
            hi.pop();
        }
    }

    return medians;
}
Java code using two priority queues (remove() takes O(n))
import java.util.Comparator;
import java.util.PriorityQueue;

public class Solution {
    // minHeap holds numbers >= current median, maxHeap holds numbers below it
    PriorityQueue<Integer> minHeap = new PriorityQueue<Integer>();
    PriorityQueue<Integer> maxHeap = new PriorityQueue<Integer>(
        new Comparator<Integer>() {
            public int compare(Integer i1, Integer i2) {
                return i2.compareTo(i1);
            }
        }
    );

    public double[] medianSlidingWindow(int[] nums, int k) {
        int n = nums.length - k + 1;
        if (n <= 0) return new double[0];
        double[] result = new double[n];

        for (int i = 0; i <= nums.length; i++) {
            if (i >= k) {
                // window [i - k, i) is complete: record its median, then drop its leftmost number
                result[i - k] = getMedian();
                remove(nums[i - k]);
            }
            if (i < nums.length) {
                add(nums[i]);
            }
        }

        return result;
    }

    private void add(int num) {
        if (num < getMedian()) {
            maxHeap.add(num);
        } else {
            minHeap.add(num);
        }
        rebalance();
    }

    private void remove(int num) {
        // PriorityQueue.remove(Object) is a linear-time operation
        if (num < getMedian()) {
            maxHeap.remove(num);
        } else {
            minHeap.remove(num);
        }
        rebalance();
    }

    // keep minHeap the same size as maxHeap, or exactly one element larger
    private void rebalance() {
        if (maxHeap.size() > minHeap.size()) {
            minHeap.add(maxHeap.poll());
        }
        if (minHeap.size() - maxHeap.size() > 1) {
            maxHeap.add(minHeap.poll());
        }
    }

    private double getMedian() {
        if (maxHeap.isEmpty() && minHeap.isEmpty()) return 0;

        if (maxHeap.size() == minHeap.size()) {
            return ((double)maxHeap.peek() + (double)minHeap.peek()) / 2.0;
        } else {
            return (double)minHeap.peek();
        }
    }
}
posted on 2018-11-08 02:19 by 猪猪🐷