Learnitweb

Longest Common Subsequence

Problem Statement
You are given two strings text1 and text2. Your task is to return the length of their longest common subsequence.
A subsequence of a string is a new string that is formed by deleting some (or none) of the characters without changing the order of the remaining characters.

Example
Input:
text1 = "abcde", text2 = "ace"
Output:
3
Explanation: The longest common subsequence is "ace", and its length is 3.

Input:
text1 = "abc", text2 = "def"
Output:
0
Explanation: There are no common characters between the two strings.

Understanding the Problem

We need to find the longest sequence of characters that appears in both strings in the same order, but not necessarily continuously.

For example:

  • In "abcde" and "ace", the characters 'a', 'c', 'e' appear in both strings in the same order.
    So, the LCS (Longest Common Subsequence) = "ace".

A subsequence differs from a substring because the characters do not have to be next to each other.

Thinking Process and Approach

This is a classic dynamic programming (DP) problem.
Let’s think through the problem step by step.

  1. Suppose we compare characters from the end of both strings.
  2. If the last characters match, that character contributes 1 to the LCS.
  3. If they do not match, then we have two options:
    • Ignore the last character of text1 and compute LCS for (text1[0..i-1], text2[0..j])
    • Ignore the last character of text2 and compute LCS for (text1[0..i], text2[0..j-1])
    • Take the maximum of these two results.

This idea naturally leads to a recursive solution, but recursion will recompute many overlapping subproblems.
Hence, we use Dynamic Programming to store previously computed results.

Dynamic Programming Approach

We create a 2D table dp where:

  • dp[i][j] represents the length of the longest common subsequence between
    text1[0..i-1] and text2[0..j-1].

Recurrence Relation

If characters match:

dp[i][j] = 1 + dp[i-1][j-1]

If characters do not match:

dp[i][j] = Math.max(dp[i-1][j], dp[i][j-1])

Base Case

If either string is empty (i == 0 or j == 0),
then dp[i][j] = 0.

Step-by-Step Example

Let’s take text1 = "abcde" and text2 = "ace".

We create a dp table of size (6 x 4) (adding 1 for the empty string case).

We fill the table as follows:

i\j01 (a)2 (c)3 (e)
00000
1 (a)0111
2 (b)0111
3 (c)0122
4 (d)0122
5 (e)0123

The final answer is dp[5][3] = 3.

Java Code Implementation

public class LongestCommonSubsequence {
    public int longestCommonSubsequence(String text1, String text2) {
        int m = text1.length();
        int n = text2.length();

        int[][] dp = new int[m + 1][n + 1];

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (text1.charAt(i - 1) == text2.charAt(j - 1)) {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }

        return dp[m][n];
    }
}

Thinking Behind the Code

  1. We build a 2D array dp to record the length of the LCS up to each pair of positions (i, j).
  2. If characters match, extend the previous subsequence (+1).
  3. If not, take the longer subsequence formed by ignoring one character from either string.
  4. The bottom-right cell of the table gives the final answer.

Time and Space Complexity

  • Time Complexity: O(m × n)
    We compute each dp[i][j] once.
  • Space Complexity: O(m × n)
    We store results in a 2D matrix.

For optimization, we can reduce space to O(min(m, n)) using only two arrays, since each row depends only on the previous one.

Key Takeaways

  1. The Longest Common Subsequence problem can be solved efficiently using Dynamic Programming.
  2. The idea is to compare characters and build the result using previous computations.
  3. This technique forms the foundation for other string comparison problems like Edit Distance, Longest Palindromic Subsequence, and Sequence Alignment.