Most Common Word

2018-10-26 00:32:05

Problem description:

Solution:

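Method 1: Trie

For finding the most frequently occurring string, the first approach that comes to mind is a Trie, so I implemented it with a Trie first. The code is rather long. It does get accepted, but for an Easy problem like this a solution should in principle not need more than 50 lines.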

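import java.util.HashSet;
import java.util.Set;

public class MostCommonWord {
    // One node per character; the node where a word ends stores that word and its count.
    class TrieNode {
        public TrieNode[] next = new TrieNode[26]; // children for 'a'..'z'
        public int cnt = 0;                        // occurrences of the non-banned word ending here
        public String word = null;                 // the word ending at this node, if any
    }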

    public String mostCommonWord(String paragraph, String[] banned) {
        int[] maxCnt = new int[1];
        String[] res = new String[1];
        TrieNode root = buildTrie(paragraph, banned);
        helper(root, maxCnt, res);
        return res[0];
    }

    // Depth-first walk over the trie, keeping the word with the highest count seen so far.
    private void helper(TrieNode root, int[] maxCnt, String[] res) {
        if (root.cnt > maxCnt[0]) {
            maxCnt[0] = root.cnt;
            res[0] = root.word;
        }
        for (int i = 0; i < 26; i++) {
            if (root.next[i] != null) helper(root.next[i], maxCnt, res);
        }
    }

    private TrieNode buildTrie(String s, String[] banned) {
        Set<Character> set = new HashSet<>();
        Set<String> b = new HashSet<>();
        for (String i : banned) b.add(i);
        set.add(' ');
        set.add('!');
        set.add('?');
        set.add('\'');
        set.add(',');
        set.add(';');
        set.add('.');
        TrieNode root = new TrieNode();
        // Append a trailing space so that the last word is always terminated by a separator.
        String lowS = s.toLowerCase() + ' ';
        char[] chs= lowS.toCharArray();
        for (int i = 0; i < chs.length; i++) {
            while (i < chs.length && set.contains(chs[i])) i++;
            TrieNode cur = root;
            for (int j = i; j < chs.length; j++) {
                if (set.contains(chs[j])) {
                    cur.word = lowS.substring(i, j);
                    if (!b.contains(cur.word)) cur.cnt++;
                    i = j;
                    break;
                }
                if (cur.next[chs[j] - 'a'] == null) cur.next[chs[j] - 'a'] = new TrieNode();
                cur = cur.next[chs[j] - 'a'];
            }
        }
        return root;
    }

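    public static void main(String[] args) {
        // Sample input with "hit" banned; "ball" occurs twice, so this prints "ball".
        String paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.";
        System.out.println(new MostCommonWord().mostCommonWord(paragraph, new String[]{"hit"}));
    }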
}

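Method 2: split

As an Easy problem it is bound to have a simple solution, although it is still a little tricky. Here the regex-based replaceAll function is used to turn the punctuation characters into " ", after which we split on spaces and count the words.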

    public String mostCommonWord(String paragraph, String[] banned) {
        String[] strs = paragraph.replaceAll("[!?',;.]", " ").toLowerCase().split(" ");
        Map<String, Integer> map = new HashMap<>();
        Set<String> set = new HashSet<>();
        for (String i : banned) set.add(i);
        set.add(""); // split(" ") yields empty strings between adjacent spaces; skip them like banned words
        for (String s : strs) {
            if (!set.contains(s)) {
                int cnt = map.getOrDefault(s, 0);
                map.put(s, ++cnt);
            }
        }
        int maxCnt = 0;
        String res = "";
        for (String s : map.keySet()) {
            if (map.get(s) > maxCnt) {
                maxCnt = map.get(s);
                res = s;
            }
        }
        return res;
    }
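
The reason the empty string has to be skipped becomes visible when the tokenization step is run on its own. The sketch below is only an illustration, not the author's code: the class name SplitSketch and the helper tokenize are hypothetical, and splitting on the regex "[^a-z]+" is a swapped-in variation that assumes the input contains only letters, spaces and punctuation, as in this problem.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SplitSketch {
    // Hypothetical helper: lower-case first, then split on runs of non-letters,
    // so consecutive separators (", ") produce no empty token in the middle.
    static String[] tokenize(String paragraph) {
        return paragraph.toLowerCase().split("[^a-z]+");
    }

    static String mostCommonWord(String paragraph, String[] banned) {
        Set<String> skip = new HashSet<>(Arrays.asList(banned));
        skip.add(""); // a paragraph that starts with a separator still yields one leading ""
        Map<String, Integer> counts = new HashMap<>();
        String best = "";
        int bestCount = 0;
        for (String word : tokenize(paragraph)) {
            if (skip.contains(word)) continue;
            int c = counts.merge(word, 1, Integer::sum); // returns the updated count
            if (c > bestCount) {
                bestCount = c;
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Replace-then-split as in Method 2: "ball, the" becomes "ball  the",
        // and splitting on a single space leaves an empty token in the middle.
        System.out.println(Arrays.toString("ball, the".replaceAll("[!?',;.]", " ").split(" ")));
        // Splitting on runs of non-letters avoids that empty token.
        System.out.println(Arrays.toString(tokenize("ball, the")));
        System.out.println(mostCommonWord(
                "Bob hit a ball, the hit BALL flew far after it was hit.", new String[]{"hit"}));
    }
}
```

Counting and tracking the maximum in the same loop also removes the second pass over the map, at the cost of relying on the answer being unique.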

 

posted @ 2018-10-26 00:53  hyserendipity