LeetCode 692: Top K Frequent Words

I solved this with a priority queue (pq), but the problem is actually very similar to LC642, which can be solved with a trie.

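import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.PriorityQueue;

class Solution {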
    class Node {
        String word;
        int times;
        public Node(String word, int times){
            this.word = word;
            this.times = times;
        }
    }
    public List<String> topKFrequent(String[] words, int k) {
        // I don't need to quickly update the words array, so a pq is fine here; if new words kept arriving, the pq would be hard to maintain
        PriorityQueue<Node> pq =  new PriorityQueue<>(words.length, (a, b) -> (a.times== b.times? a.word.compareTo(b.word): b.times - a.times));
        HashMap<String, Integer> map = new HashMap<>();
        for (String word: words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }
        for (String key: map.keySet()) {
            pq.offer(new Node(key, map.get(key)));
        }
        List<String> list = new ArrayList<>();
        while (k > 0 && !pq.isEmpty()) {
            list.add(pq.poll().word);
            k--;
        }
        return list;
    }
}

How can it be solved with a trie?
First, let's think about why we would want to solve this problem with a trie.
In the real world, the given words are not static; there will be many updates later. If we rebuild a pq on every update, we repeat a lot of work, which is time consuming. With a trie, updating one word only means building a new path (if the word is new) or updating the times attribute on the node where the word ends. Either way, an update costs O(L) for a word of length L, no matter how many words are stored. The code below buckets words by frequency, with one trie per bucket; a preorder walk of a trie yields its words in lexicographic order, so the cost of getting the top k depends on how many trie nodes the walk has to visit.

public List<String> topKFrequent(String[] words, int k) {
        // calculate frequency of each word
        Map<String, Integer> freqMap = new HashMap<>();
        for(String word : words) {
            freqMap.put(word, freqMap.getOrDefault(word, 0) + 1);
        }
        // build the buckets
        TrieNode[] count = new TrieNode[words.length + 1];
        for(String word : freqMap.keySet()) {
            int freq = freqMap.get(word);
            if(count[freq] == null) {
                count[freq] = new TrieNode();
            }
            addWord(count[freq], word);
        }
        // get k frequent words
        List<String> list = new LinkedList<>();
        for(int f = count.length - 1; f >= 1 && list.size() < k; f--) {
            if(count[f] == null) continue;
            getWords(count[f], list, k);
        }
        return list;
    }
    
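    private void getWords(TrieNode node, List<String> list, int k) {
        // stop as soon as k words are collected; without this check, sibling
        // branches visited after the list is full would keep appending words
        if(node == null || list.size() >= k) return;
        if(node.word != null) {
            list.add(node.word);
        }
        // children are visited in a-z order, so words come out lexicographically
        for(int i = 0; i < 26 && list.size() < k; i++) {
            if(node.next[i] != null) {
                getWords(node.next[i], list, k);
            }
        }
    }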
    
    private boolean addWord(TrieNode root, String word) {
        TrieNode curr = root;
        for(char c : word.toCharArray()) {
            if(curr.next[c - 'a'] == null) {
                curr.next[c - 'a'] = new TrieNode();
            }
            curr = curr.next[c - 'a'];
        }
        curr.word = word;
        return true;
    }
    
    class TrieNode {
        TrieNode[] next;
        String word;
        TrieNode() {
            this.next = new TrieNode[26];
            this.word = null;
        }
    }
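
The update argument made earlier (build a new path for a new word, or bump the times attribute on an existing node) can be sketched as a single trie that stores a count on each word-ending node. This is only an illustration, not the solution above: the class name CountingTrie and the methods add and topK are hypothetical, and, like the code above, it assumes words contain only lowercase a-z. To keep the sketch short, topK walks the whole trie and keeps the best k words in a small heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: a trie whose word-ending nodes carry a count.
class CountingTrie {
    private static final class Node {
        final Node[] next = new Node[26];
        String word;   // non-null only on the node where a stored word ends
        int times;     // how many times that word has been added
    }

    private final Node root = new Node();

    // Adds one occurrence of word (assumes lowercase a-z); walks one path, O(word.length()).
    void add(String word) {
        Node curr = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (curr.next[i] == null) {
                curr.next[i] = new Node();
            }
            curr = curr.next[i];
        }
        curr.word = word;
        curr.times++;
    }

    // Returns up to k words, highest count first, ties in lexicographic order.
    List<String> topK(int k) {
        // Min-heap: the weakest candidate (lowest count, then largest word) sits on top.
        PriorityQueue<Node> heap = new PriorityQueue<>((a, b) ->
                a.times != b.times ? Integer.compare(a.times, b.times)
                                   : b.word.compareTo(a.word));
        collect(root, heap, k);
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().word);   // weakest first
        }
        Collections.reverse(result);        // strongest first
        return result;
    }

    // Visits every node and keeps at most k candidates in the heap.
    private void collect(Node node, PriorityQueue<Node> heap, int k) {
        if (node.word != null) {
            heap.offer(node);
            if (heap.size() > k) {
                heap.poll();
            }
        }
        for (Node child : node.next) {
            if (child != null) {
                collect(child, heap, k);
            }
        }
    }
}
```

With this shape, a new word arriving later is just another add call; nothing has to be rebuilt. The trade-off in this sketch is that topK visits every node, so it suits workloads with many updates and relatively few queries.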

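There is a third way to do this: remember that the quickselect algorithm is used to find the kth element in an unsorted list.
Now imagine that we want to order the list of strings by frequency. First we calculate the frequencies, and then we use quickselect to find the kth element, then the (k-1)th, and so on down to the 1st. (Why not start with the 1st? Think about the quickselect process: once the kth element is in place, every element that ranks before it is already on its left, so each later selection only has to search that smaller prefix.)
Quickselect is based on quicksort, but instead of recursing into both partitions it recurses into only one. That makes a single selection O(n) on average (O(n^2) in the worst case), so the first selection over all n distinct words dominates, and the remaining k-1 selections only touch the first k elements.
Quicksort-style partitioning is not stable, though, which means that if we compare by frequency alone, strings with the same count may not come out in lexicographic order. Comparing by frequency first and by the word itself second avoids this.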
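
A minimal sketch of the quickselect idea, under some assumptions: the class name QuickSelectTopK and the helpers select, partition and before are hypothetical names chosen for illustration, and the pivot is simply the last element of the range (a random pivot is the usual way to make the quadratic worst case unlikely). Ties are broken by comparing the words themselves, so equal counts come out in lexicographic order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: select the kth word first, then the (k-1)th, ... down to the 1st.
class QuickSelectTopK {
    static List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> freq = new HashMap<>();
        for (String w : words) {
            freq.merge(w, 1, Integer::sum);
        }
        String[] keys = freq.keySet().toArray(new String[0]);
        k = Math.min(k, keys.length);

        // The first pass searches the whole array; every later pass only
        // searches the prefix to the left of the element just placed.
        int hi = keys.length - 1;
        for (int target = k - 1; target >= 0; target--) {
            select(keys, freq, 0, hi, target);
            hi = target - 1;
        }
        return new ArrayList<>(Arrays.asList(keys).subList(0, k));
    }

    // Rearranges a[lo..hi] so that a[target] holds the element that belongs there.
    private static void select(String[] a, Map<String, Integer> freq, int lo, int hi, int target) {
        while (lo < hi) {
            int p = partition(a, freq, lo, hi);
            if (p == target) {
                return;
            }
            if (p < target) {
                lo = p + 1;   // continue in the right part only
            } else {
                hi = p - 1;   // continue in the left part only
            }
        }
    }

    // Lomuto partition with a[hi] as pivot; returns the pivot's final index.
    private static int partition(String[] a, Map<String, Integer> freq, int lo, int hi) {
        String pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (before(a[i], pivot, freq)) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    // True if x ranks ahead of y: higher count first, then lexicographic order.
    private static boolean before(String x, String y, Map<String, Integer> freq) {
        int fx = freq.get(x);
        int fy = freq.get(y);
        return fx != fy ? fx > fy : x.compareTo(y) < 0;
    }

    private static void swap(String[] a, int i, int j) {
        String t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Because the distinct words never compare as equal under before, the instability of the partition step does not affect the result. Sorting the first k elements after the first selection would be an alternative to the repeated selections.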

posted @ 2020-05-04 10:45  EvanMeetTheWorld