LeetCode -- Repeated DNA Sequences
題目描述:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: ACGAATTCCG. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT,
Return:
[AAAAACCCCC, CCCCCAAAAA].
就是在一個字符串s中長度為10的字串中找到出現次數大於1次的子串。
思路:
使用哈希統計s[i...i+10]出現的次數,i∈[0,n-9)。
實現代碼:
public class Solution {
public IList FindRepeatedDnaSequences(string s)
{
if(s.Length < 11){
return new List();
}
var hash = new Dictionary();
for(var i = 0;i < s.Length - 9; i++){
var t = s.Substring(i,10);
if(!hash.ContainsKey(t)){
hash.Add(t,1);
}
else{
hash[t]++;
}
}
var ret = new List();
foreach(var k in hash.Keys){
if(hash[k] > 1){
ret.Add(k);
}
}
return ret;
}
}