問題描述:
給定兩個序列(C中可認為是數組或字符串,python中可認為是list),找到二者的最大公共子序列。子序列中元素相對順序不變,且不一定連續,例如“abcdef”中,"abc","ace"都算作子序列,當然不難得出一個結論,一個長度為n的序列,子序列成分為2^n個(回想下排列組合)
遞歸解法:
對於指數復雜度問題,往往不能一步到位(直接窮舉當然是不能接受的),所以考慮是否能通過迂回的辦法,嘗試解決掉它的子問題。對於兩個序列x,y,長度分別為n,m,可以發現x,y的LCS結果可以由它的三個子問題其中之一得出:
1. LCS(x1...n-1,y1...m)
2. LCS(x1...n, y1...m-1)
3. LCS(x1...n-1,y1...m-1) + 公共尾元素
PYTHON代碼:
def lcs_len(x, y): """This function returns length of longest common sequence of x and y.""" if len(x) == 0 or len(y) == 0: return 0 xx = x[:-1] # xx = sequence x without its last element yy = y[:-1] if x[-1] == y[-1]: # if last elements of x and y are equal return lcs_len(xx, yy) + 1 else: return max(lcs_len(xx, y), lcs_len(x, yy)) def lcs_len(x, y): """This function returns length of longest common sequence of x and y.""" if len(x) == 0 or len(y) == 0: return 0 xx = x[:-1] # xx = sequence x without its last element yy = y[:-1] if x[-1] == y[-1]: # if last elements of x and y are equal return lcs_len(xx, yy) + 1 else: return max(lcs_len(xx, y), lcs_len(x, yy))
動態規劃解法 O(n^2):
顯然,遞歸操作引入了很多重復計算。動態規劃正好能解決這一問題,它的一個英文解釋非常到位:whenever the results of subproblems are needed, they have already been computed, and can simply be looked up in a table。即所有子問題的計算都能由查表來完成!先來看下代碼:
def lcs(x, y): n = len(x) m = len(y) table = dict() # a hashtable, but we'll use it as a 2D array here for i in range(n+1): # i=0,1,...,n for j in range(m+1): # j=0,1,...,m if i == 0 or j == 0: table[i, j] = 0 elif x[i-1] == y[j-1]: table[i, j] = table[i-1, j-1] + 1 else: table[i, j] = max(table[i-1, j], table[i, j-1]) # Now, table[n, m] is the length of LCS of x and y. # Let's go one step further and reconstruct # the actual sequence from DP table: def recon(i, j): if i == 0 or j == 0: return "" elif x[i-1] == y[j-1]: return recon(i-1, j-1) + str(x[i-1]) elif table[i-1, j] > table[i, j-1]: return recon(i-1, j) else: return recon(i, j-1) return recon(n, m) def lcs(x, y): n = len(x) m = len(y) table = dict() # a hashtable, but we'll use it as a 2D array here for i in range(n+1): # i=0,1,...,n for j in range(m+1): # j=0,1,...,m if i == 0 or j == 0: table[i, j] = 0 elif x[i-1] == y[j-1]: table[i, j] = table[i-1, j-1] + 1 else: table[i, j] = max(table[i-1, j], table[i, j-1]) # Now, table[n, m] is the length of LCS of x and y. # Let's go one step further and reconstruct # the actual sequence from DP table: def recon(i, j): if i == 0 or j == 0: return "" elif x[i-1] == y[j-1]: return recon(i-1, j-1) + str(x[i-1]) elif table[i-1, j] > table[i, j-1]: return recon(i-1, j) else: return recon(i, j-1) return recon(n, m)
代碼中用到了一個2D的表,table(i,j)則表示子問題(i,j)的LCS_LEN,經過分析,它的值只可能是table(i-1,j-1),table(i,j-1),table(i-1,j)之一,所以從上到下,從左到右的賦值方式不會出現table(i,j)無法賦值的情況; 當然求得LCS_LEN並不是我們的最終目的,特別是在應用中,一般都需要得到這個LCS,那麼可以通過table來求得結果(見代碼)。
動態規劃解法——優化 O(NlogN):
查了一些資料,貌似是有nlogn的解法,但需要輸入滿足一定條件,由於不具有普遍性,所以姑且認為研究它的意義不大,如果後面碰到,我們再做分析吧