Tuesday, 22 September 2015

Longest common subsequence

Problem

Given two strings find their longest common subsequence. A subsequence is a sequence which can be derived by deleting some of the elements of the original sequence. For example. ale is a subsequence of apple. gre is a subsequence of greed. If a string is subsequence of two strings, i,e it can be obtained by removing some characters from two strings then it is called a common subsequence. The problem is to find a subsequence of highest length of two given strings.

Solution

We will solve this by dynamic programming. If two strings end with the same character then we can remove that character from both the strings and compute the longest common subsequence for the rest of the string and at the end append the removed character to the result. If they don't end in the same character then one of the end characters will not be in the longest common subsequence. The longest common subsequence is maximum of LCS( A[0,n] , B[0,N-1] ) and LCS( A[0,N-1] , B[0,N] ). These two assumptions give the following formulas
LCS( a[0,n] , b[0,n] ) = a[n] + LCS( a[0,n-1] , b[0,n-1] ) when a[n] = b[n]
                          = Max( LCS( a[0,n] , b[0,n-1] ) , LCS( a[0,n-1] , b[0,n] ) when a[n] != b[n]

We solve this by memoizing the result in a m * n array. Each of the cell is computed once, so the complexity is O(mn). For two strings of length n the order is O(n^2)

Code

public class LongestCommonSubsequence
{
 public static void main(String[] args)
 {
  String a = "alfkjalfjlkj";
  String b = "ajflaklfjlaj";
  String result = findLCS(a, b);
  System.out.println(result);
 }

 private static String findLCS(String a, String b)
 {
  int[][] memo = new int[a.length() + 1][b.length() + 1];

  for (int i = a.length() - 1; i >= 0; --i)
   for (int j = b.length() - 1; j >= 0; --j)
   {
    if (a.charAt(i) == b.charAt(j))
     memo[i][j] = memo[i + 1][j + 1] + 1;
    else
     memo[i][j] = Math.max(memo[i + 1][j], memo[i][j + 1]);
   }

  int i = 0;
  int j = 0;

  StringBuffer result = new StringBuffer();
  while (i < a.length() && j < b.length())
  {
   if (a.charAt(i)==b.charAt(j))
   {
    result.append(a.charAt(i));
    i++;
    j++;
   } else if (memo[i+1][j] > memo[i][j+1])
    i++;
   else
    j++;
  }
  return result.toString();
 }
}

No comments:

Post a Comment