187. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]


https://leetcode.com/problems/repeated-dna-sequences/discuss/53855/7-lines-simple-Java-O(n)

This is essentially a sliding window of size 10. A HashSet stores only unique elements, so when add() returns false (the set rejects the element), that 10-letter substring has already been seen before and is therefore a repeat.


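import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        HashSet<String> set = new HashSet<>(); // every 10-letter window seen so far
        HashSet<String> res = new HashSet<>(); // use a set to deduplicate the repeats
        for (int i = 0; i + 9 < s.length(); i++) { // last window starts at s.length() - 10
            String ten = s.substring(i, i + 10);
            if (!set.add(ten)) { // add() returns false if ten was already in the set
                res.add(ten);
            }
        }
        return new ArrayList<>(res); // return the repeats (res), not everything seen (set)
    }
}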
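
To check the idea against the example above, here is a minimal standalone sketch. The class name RepeatedDnaDemo and the method name findRepeated are made up for illustration, and it swaps in a LinkedHashSet for the result (not what the solution above uses) so that repeats come back in the order they are first detected; a plain HashSet gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the original solution.
class RepeatedDnaDemo {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();           // all windows seen so far
        Set<String> repeated = new LinkedHashSet<>(); // assumption: keep first-detected order
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window is already present
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
        // prints [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

Each window costs one substring and one hash lookup, so for a fixed window length of 10 the loop does a constant amount of work per position. Inputs shorter than 10 characters never enter the loop and return an empty list.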

 

posted on 2018-11-08 02:17  猪猪🐷
