LeetCode 472: Concatenated Words
Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words.
A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.
Solution 1: DP
//The general idea is simple: check every word to see whether it can be built by concatenating other words in the list. The check itself is done with DP.
//The tricky part is making sure a word is never counted as a concatenation of itself. To avoid that, we presort the words array by length and then build wordDict while we check each word. Why does this work? Because a shorter word can be part of a longer word, but never the other way around. The benefit is that when we check the current word, it is not in wordDict yet, so it cannot match itself.
//It is a clever trick.
//However, when the words array is long, this version gets TLE (Time Limit Exceeded).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Solution {
    public List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        List<String> wordDict = new ArrayList<>(); // List.contains() is a linear scan, which is what makes this version slow
        Arrays.sort(words, (a, b) -> (a.length() - b.length())); // presort words by length
        for (int i = 0; i < words.length; i++) {
            if (helper(words[i], wordDict)) { // a shorter word can be part of a longer word, but not vice versa
                res.add(words[i]);
            }
            wordDict.add(words[i]); // the key trick: wordDict and the result list are built in the same pass, so a word is added only after it has been checked
        }
        return res;
    }

    public boolean helper(String s, List<String> wordDict) { // checks whether s can be built entirely from words in wordDict
        if (wordDict.isEmpty()) {
            return false;
        }
        // note the length of the dp array: dp[i] means the first i chars of s can be built from words in wordDict,
        // so dp[i] = dp[j] && wordDict.contains(s.substring(j, i)) for some j < i
        boolean[] dp = new boolean[s.length() + 1];
        dp[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 0; j < i; j++) {
                if (dp[j] && wordDict.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[s.length()];
    }
}
This solution is simple, but it keeps getting TLE on inputs with a very long array of words.
After looking at Solution 2, I changed wordDict from a List to a HashSet. That version was accepted, but its running time was still poor.
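The HashSet variant mentioned above is not shown in the post. A minimal sketch of what it could look like, assuming the only change is the container type (the class name `ConcatenatedWordsDp`, the method name `canForm`, and the extra empty-string guard are illustrative and not from the original):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ConcatenatedWordsDp {
    static List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        Set<String> wordDict = new HashSet<>(); // hash lookup instead of a linear List.contains() scan
        String[] sorted = words.clone();        // leave the caller's array untouched
        Arrays.sort(sorted, (a, b) -> Integer.compare(a.length(), b.length()));
        for (String word : sorted) {
            if (canForm(word, wordDict)) {
                res.add(word);
            }
            wordDict.add(word); // added only after the check, so a word never matches itself
        }
        return res;
    }

    // dp[i] is true when the first i chars of s can be built from words in wordDict.
    static boolean canForm(String s, Set<String> wordDict) {
        if (wordDict.isEmpty() || s.isEmpty()) {
            return false; // the empty-string guard is an extra assumption, not in the original
        }
        boolean[] dp = new boolean[s.length() + 1];
        dp[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 0; j < i; j++) {
                if (dp[j] && wordDict.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[s.length()];
    }

    public static void main(String[] args) {
        // prints [catdog]
        System.out.println(findAllConcatenatedWordsInADict(new String[]{"cat", "dog", "catdog"}));
    }
}
```

The DP loops are unchanged, so the number of substring checks per word stays quadratic in the word length; only the cost of each lookup should drop.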
Solution 2: DFS + HashSet
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

class Solution {
    public List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        if (words == null || words.length == 0) return res;
        HashSet<String> set = new HashSet<>();
        for (String word : words) {
            if (word.equals("")) continue; // with this line commented out, the code is still accepted
            set.add(word);
        }
        for (String word : words) {
            if (word.equals("")) continue;
            if (dfs(set, word, 0)) {
                res.add(word);
            }
        }
        return res;
    }

    // dfs checks whether word can be built from items in set.
    // count is the number of words already cut off the front. It stops a word from matching itself as a whole,
    // which is forbidden because a concatenated word must consist of at least two words.
    public boolean dfs(HashSet<String> set, String word, int count) {
        if (count > 0 && set.contains(word)) return true;
        int n = word.length();
        // i < n - 1 would be enough: when i == n - 1 the prefix is the whole word and word.substring(i + 1) is "",
        // which is never in the set, so that last iteration always fails
        for (int i = 0; i < n; i++) {
            if (set.contains(word.substring(0, i + 1)) && dfs(set, word.substring(i + 1), count + 1)) {
                return true;
            }
        }
        return false;
    }
}
I tried to rework Solution 2 along the lines of Solution 1, hoping for a faster running time. It was accepted, but there was not much improvement.
//This is the reworked Solution 2, which is now very similar to Solution 1. I had thought about removing the count parameter from the dfs function before, but that only works if the words are presorted, which is what this version does.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class Solution {
    public List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        if (words == null || words.length == 0) return res;
        HashSet<String> set = new HashSet<>();
        // the set is no longer pre-filled here; each word is added right after it has been checked
        Arrays.sort(words, (a, b) -> (a.length() - b.length()));
        for (String word : words) {
            if (word.equals("")) continue;
            if (dfs(set, word)) {
                res.add(word);
            }
            set.add(word);
        }
        return res;
    }

    // dfs checks whether word can be built from items in set.
    // No count parameter is needed: the set only holds words processed earlier,
    // so the word being checked can never match itself as a whole.
    public boolean dfs(HashSet<String> set, String word) {
        if (set.contains(word)) return true;
        int n = word.length();
        // i < n - 1 would be enough: when i == n - 1 the prefix is the whole word, which is not in the set
        // (otherwise we would already have returned above), so that last iteration always fails
        for (int i = 0; i < n; i++) {
            if (set.contains(word.substring(0, i + 1)) && dfs(set, word.substring(i + 1))) {
                return true;
            }
        }
        return false;
    }
}
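One possible reason the sorted DFS version is not much faster is that the same suffix may be explored again and again. The sketch below is not from the original post. It keeps the presort idea, but recurses on a start index instead of building substrings for the recursion, and remembers start positions that are already known to fail. The names `ConcatenatedWordsMemo` and `failed` are hypothetical, and it assumes the input has no duplicates, as the problem states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ConcatenatedWordsMemo {
    static List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        Set<String> set = new HashSet<>();
        String[] sorted = words.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a.length(), b.length()));
        for (String word : sorted) {
            if (word.isEmpty()) continue;
            // failed[i] == true means the suffix starting at i cannot be split into set words
            if (dfs(set, word, 0, new boolean[word.length()])) {
                res.add(word);
            }
            set.add(word); // added after the check, so the whole word never matches itself
        }
        return res;
    }

    // Returns true if word.substring(start) can be split into one or more words from set.
    static boolean dfs(Set<String> set, String word, int start, boolean[] failed) {
        if (start == word.length()) return true; // everything before start was matched
        if (failed[start]) return false;         // already explored this suffix without success
        for (int end = start + 1; end <= word.length(); end++) {
            if (set.contains(word.substring(start, end)) && dfs(set, word, end, failed)) {
                return true;
            }
        }
        failed[start] = true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println(findAllConcatenatedWordsInADict(new String[]{"aaa", "a", "b", "aa"}));
    }
}
```

With the memo array each start position of a word is fully explored at most once. Whether this helps noticeably on the LeetCode test set has not been measured here.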
Solution 3: Trie + DFS
The so-called DFS here is a traversal of the trie.
The general idea is: build a trie from all the words, then walk each word through the trie again, restarting from the root every time a complete word (isWord is true) is reached. If the word can be consumed this way using more than one complete word, it is a valid answer.
//This is the third solution, which uses a trie and DFS.
import java.util.ArrayList;
import java.util.List;

class Solution {
    class TrieNode {
        TrieNode[] children;
        boolean isWord;

        public TrieNode() {
            children = new TrieNode[128]; // one slot per ASCII character
        }
    }

    public List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> res = new ArrayList<>();
        if (words == null || words.length == 0) {
            return res;
        }
        TrieNode root = new TrieNode();
        buildTrie(words, root);
        for (String word : words) {
            if (countWords(word.toCharArray(), 0, root, 0)) {
                res.add(word);
            }
        }
        return res;
    }

    // checks whether chars[index..] can be split into complete words so that the whole path uses more than one word;
    // count is the number of complete words already matched before index
    private boolean countWords(char[] chars, int index, TrieNode root, int count) {
        TrieNode cur = root;
        int n = chars.length;
        for (int i = index; i < n; i++) { // start at index, because chars[0..index-1] have already been matched by earlier words
            if (cur.children[chars[i]] == null) { // the current node has no child for this character,
                return false;                      // so this path cannot be built from other words
            }
            if (cur.children[chars[i]].isWord) {
                if (i == n - 1) { // a complete word ends exactly at the end of the input
                    // count works like the number of cuts made so far: it starts at 0,
                    // so count >= 1 means the path consists of at least two words
                    return count >= 1;
                }
                if (countWords(chars, i + 1, root, count + 1)) { // cut here and match the rest from the root, starting at i + 1
                    return true;
                }
            }
            cur = cur.children[chars[i]]; // no cut (or the cut failed): move one level down
        }
        return false;
    }

    private void buildTrie(String[] words, TrieNode root) {
        for (String word : words) {
            char[] chars = word.toCharArray();
            TrieNode cur = root;
            for (char c : chars) {
                if (cur.children[c] == null) {
                    cur.children[c] = new TrieNode();
                }
                cur = cur.children[c];
            }
            cur.isWord = true; // mark the end of a complete word
        }
    }
}
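The trie above reserves 128 child slots per node. If the words contain only lowercase letters 'a' to 'z' (an assumption, the post does not state it), 26 slots are enough. A compact sketch of the same trie walk under that assumption, with the hypothetical class name `ConcatenatedWordsTrie`:

```java
import java.util.ArrayList;
import java.util.List;

class ConcatenatedWordsTrie {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: only 'a'..'z' occur
        boolean isWord;
    }

    static List<String> findAllConcatenatedWordsInADict(String[] words) {
        Node root = new Node();
        for (String word : words) {
            if (word.isEmpty()) continue; // skip empty strings so the root is never marked as a word
            Node cur = root;
            for (char c : word.toCharArray()) {
                int k = c - 'a';
                if (cur.children[k] == null) cur.children[k] = new Node();
                cur = cur.children[k];
            }
            cur.isWord = true;
        }
        List<String> res = new ArrayList<>();
        for (String word : words) {
            if (countWords(word, 0, root, 0)) res.add(word);
        }
        return res;
    }

    // cuts = number of complete words already matched before position index
    static boolean countWords(String word, int index, Node root, int cuts) {
        Node cur = root;
        for (int i = index; i < word.length(); i++) {
            cur = cur.children[word.charAt(i) - 'a'];
            if (cur == null) return false;                 // no word continues along this path
            if (cur.isWord) {
                if (i == word.length() - 1) return cuts >= 1; // last word ends here: need at least one earlier word
                if (countWords(word, i + 1, root, cuts + 1)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] words = {"cat", "cats", "catsdogcats", "dog", "dogcatsdog", "hippopotamuses", "rat", "ratcatdogcat"};
        // prints [catsdogcats, dogcatsdog, ratcatdogcat]
        System.out.println(findAllConcatenatedWordsInADict(words));
    }
}
```

Unlike the sorted solutions, this one keeps the results in input order, because the words are never reordered.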
