Learnitweb

Problem 72: Edit Distance

Problem Overview

You are given two strings, word1 and word2. You need to determine the minimum number of operations required to convert word1 into word2.

The allowed operations are:

  1. Insert a character.
  2. Delete a character.
  3. Replace a character.

You must find the fewest possible sequence of these operations to make the two strings identical.

Example 1

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse → rorse (replace 'h' with 'r')
rorse → rose  (remove 'r')
rose → ros    (remove 'e')

Example 2

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention → inention (remove 't')
inention → enention (replace 'i' with 'e')
enention → exention (replace 'n' with 'x')
exention → exection (replace 'n' with 'c')
exection → execution (insert 'u')

Intuition / Approach

This problem is a classic example of Dynamic Programming (DP).

Let’s define a 2D DP table dp[i][j], where:

  • dp[i][j] represents the minimum edit distance required to convert the first i characters of word1 into the first j characters of word2.

The intuition is to build this table step by step by considering one character at a time from both strings.

We can think recursively:

  • If the last characters of both words are the same, then no operation is needed for that position, and we can move to the previous characters.
  • If they are different, we can choose one of the three operations (insert, delete, replace), and each operation affects the DP state differently.

Algorithm Explanation

Let m = word1.length(), and n = word2.length().

  1. Base Cases
    • If word1 is empty, the only option is to insert all characters of word2.
      So, dp[0][j] = j.
    • If word2 is empty, we need to delete all characters of word1.
      So, dp[i][0] = i.
  2. Recursive Relation
    • If word1[i - 1] == word2[j - 1], then no change is needed:
      dp[i][j] = dp[i - 1][j - 1].
    • Otherwise, we can choose one of the three operations:
      • Insert: Add a character to word1.
        dp[i][j - 1] + 1
      • Delete: Remove a character from word1.
        dp[i - 1][j] + 1
      • Replace: Change one character.
        dp[i - 1][j - 1] + 1
    We take the minimum of these three options: dp[i][j] = min( dp[i - 1][j] + 1, // delete dp[i][j - 1] + 1, // insert dp[i - 1][j - 1] + 1 // replace )
  3. Final Answer
    The result will be in dp[m][n].

Java Code Implementation

public class EditDistance {

    public int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();

        int[][] dp = new int[m + 1][n + 1];

        // Base case initialization
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i; // delete all characters
        }
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j; // insert all characters
        }

        // Fill the dp table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    int insert = dp[i][j - 1] + 1;
                    int delete = dp[i - 1][j] + 1;
                    int replace = dp[i - 1][j - 1] + 1;
                    dp[i][j] = Math.min(insert, Math.min(delete, replace));
                }
            }
        }

        return dp[m][n];
    }

    // Example usage
    public static void main(String[] args) {
        EditDistance solver = new EditDistance();
        System.out.println(solver.minDistance("horse", "ros")); // Output: 3
        System.out.println(solver.minDistance("intention", "execution")); // Output: 5
    }
}

Complexity Analysis

  1. Time Complexity: O(m × n)
    • We compute every cell in the DP table once, and each computation takes constant time.
    • Therefore, the total time is proportional to the product of the lengths of the two words.
  2. Space Complexity: O(m × n)
    • We use a 2D table of size (m + 1) × (n + 1) to store results.
    • However, this can be optimized to O(min(m, n)) by using only two rows at a time, since each state depends only on the previous row.

Alternative Approach (Space Optimization)

You can reduce space complexity using only two 1D arrays:

public int minDistanceOptimized(String word1, String word2) {
    int m = word1.length(), n = word2.length();
    int[] prev = new int[n + 1];
    int[] curr = new int[n + 1];

    for (int j = 0; j <= n; j++) {
        prev[j] = j;
    }

    for (int i = 1; i <= m; i++) {
        curr[0] = i;
        for (int j = 1; j <= n; j++) {
            if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                curr[j] = prev[j - 1];
            } else {
                curr[j] = Math.min(
                    Math.min(prev[j], curr[j - 1]), 
                    prev[j - 1]
                ) + 1;
            }
        }
        prev = curr.clone();
    }

    return prev[n];
}

This approach yields the same result but uses only linear space.