[Original] leetCodeOj --- Repeated DNA Sequences Solution Report

Original problem:

https://oj.leetcode.com/problems/repeated-dna-sequences/

 

Problem statement:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

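Approach:

The general idea is to iterate over every substring of length 10, record the number of occurrences of each distinct substring in a hash table, and finally output the substrings that occur more than once.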
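As a baseline for comparison, the direct version of this idea can be sketched as below, using the substring itself as the hash key. This is only an illustrative sketch: the name findRepeatedNaive and the len parameter are hypothetical and do not appear in the original solution, and a substring is reported as soon as its count reaches 2 rather than in a final pass.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical baseline: count every window of `len` characters in a
// string-keyed hash map. Each step copies and hashes a whole substring.
std::vector<std::string> findRepeatedNaive(const std::string& s, std::size_t len = 10) {
    assert(len > 0);
    std::vector<std::string> res;
    if (s.size() <= len) return res;  // at most one window, so nothing can repeat

    std::unordered_map<std::string, int> count;
    for (std::size_t i = 0; i + len <= s.size(); ++i) {
        std::string window = s.substr(i, len);
        // Report each repeated substring exactly once, at its second occurrence
        if (++count[window] == 2) {
            res.push_back(window);
        }
    }
    return res;
}
```

Calling findRepeatedNaive on the example string from the problem statement should yield the two sequences listed there.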

 

The key question is how to make this faster.

 

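My approach: map A, C, G, T to 1, 2, 3, 4, then convert each substring into one large integer. To simplify the arithmetic, the leftmost character of the string is the lowest digit. That is a bit vague, so here are a couple of examples: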

AACG = 3211

CGTA = 1432
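The encoding can be checked with a small sketch like the one below. The names charToDigit, encode and decode are hypothetical helpers for illustration, not the functions used in the full solution; returning 0 or '?' for unexpected input is also an assumption made here.

```cpp
#include <cassert>
#include <string>

// A=1, C=2, G=3, T=4; 0 marks an unexpected character (assumption).
long long charToDigit(char c) {
    switch (c) {
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        case 'T': return 4;
        default:  return 0;
    }
}

// Encode with the leftmost character as the lowest decimal digit.
long long encode(const std::string& s) {
    assert(s.size() <= 18);  // 18 decimal digits always fit in a 64-bit long long
    long long value = 0;
    long long power = 1;
    for (char c : s) {
        value += charToDigit(c) * power;
        power *= 10;
    }
    return value;
}

// Decode by peeling off the lowest digit first, which restores left-to-right order.
std::string decode(long long value) {
    static const char letters[] = "?ACGT";
    std::string s;
    while (value > 0) {
        int d = static_cast<int>(value % 10);
        s += (d <= 4) ? letters[d] : '?';
        value /= 10;
    }
    return s;
}
```

With the mapping above, encode("AACG") gives 3211 and encode("CGTA") gives 1432. Because no valid character maps to 0, decoding never loses a digit.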

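Then compute the integer value of the first substring and add it to the map. From there, the value of each following substring is obtained as (current value / 10, using integer division) + (digit of the next character × 1,000,000,000).

This way the hash is computed over an integer instead of a string, and moving to the next substring takes only a few arithmetic operations, which avoids the cost of building and hashing a new string at every step.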
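The sliding update can be sketched for an arbitrary window length as below. The function windowValues and its len parameter are hypothetical, introduced only for illustration. For len = 10 the multiplier is 10^9, i.e. one billion.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A=1, C=2, G=3, T=4; 0 marks an unexpected character (assumption).
long long charToDigit(char c) {
    switch (c) {
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        case 'T': return 4;
        default:  return 0;
    }
}

// Hypothetical helper: integer value of every window of `len` characters,
// with the leftmost character of each window as the lowest digit.
std::vector<long long> windowValues(const std::string& s, int len) {
    assert(len >= 1 && len <= 18);
    std::vector<long long> values;
    const std::size_t n = static_cast<std::size_t>(len);
    if (s.size() < n) return values;

    // Encode the first window digit by digit.
    long long value = 0;
    long long power = 1;
    for (std::size_t i = 0; i < n; ++i) {
        value += charToDigit(s[i]) * power;
        if (i + 1 < n) power *= 10;  // power ends up as 10^(len-1)
    }
    values.push_back(value);

    // Slide: integer division drops the lowest digit (the leftmost character),
    // then the new character becomes the highest digit.
    for (std::size_t i = n; i < s.size(); ++i) {
        value = value / 10 + charToDigit(s[i]) * power;
        values.push_back(value);
    }
    return values;
}
```

For example, with len = 4 and the string "AACGT", the first window AACG is 3211, and the next window ACGT is 3211 / 10 + 4 × 1000 = 4321.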

 

Full code:

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<long long,int> dict;
        unordered_map<long long,int> :: iterator it;
        vector<string> res;
        long long flag = 1000000000;
        if (s.size() <= 10)
            return res;
        long long num = generateFirstNum(s);
        dict[num] = 1;
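        for (size_t i = 10; i < s.size(); i ++) {
            // Slide the window: drop the lowest digit, then add the new character as the highest digit
            num /= 10;
            long long now = getCharNum(s[i]);
            num += now * flag;
            // operator[] value-initializes a missing key to 0, so no explicit find is needed
            dict[num] += 1;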
        }
        for (it = dict.begin(); it != dict.end(); it ++) {
            if (it->second > 1) {
                generateRes(res,it->first);
            }
        }
        return res;
    }
    
    long long generateFirstNum(string s) {
        long long res = 0;
        long long power = 1;
        for (int i = 0; i < 10; i ++) {
            long long num = getCharNum(s[i]);
            res += num * power;
            power *= 10;
        }
        return res;
    }
    
    long long getCharNum(char s) {
        switch (s) {
            case 'A' : return 1;
            case 'C' : return 2;
            case 'G' : return 3;
            case 'T' : return 4;
            default  : return 0;  // input is assumed to contain only A, C, G, T
        }
    }
    
    char getNumChar(long long s) {
        switch (s) {
            case 1 : return 'A';
            case 2 : return 'C';
            case 3 : return 'G';
            case 4 : return 'T';
            default : return '?';  // not reached for values produced by getCharNum on valid input
        }
    }
    
    void generateRes(vector<string> &res,long long target) {
        // The lowest digit is the leftmost character, so appending digits in order restores the string
        string s;
        while (target > 0) {
            char now = getNumChar(target % 10);
            s += now;
            target /= 10;
        }
        res.push_back(s);
    }
};

  

posted on 2015-02-15 10:57  shadowmydx'sLab
