Algorithms Lesson 25 – LCS Problem | Dataplexa

Longest Common Subsequence (LCS)

In the previous lesson, we solved Fibonacci and Coin Change problems using Dynamic Programming.

Now we move to one of the most important and most frequently asked Dynamic Programming problems: Longest Common Subsequence (LCS).

This problem tests your understanding of subproblems, optimal choices, and DP table construction.


What Is a Subsequence?

A subsequence is a sequence that appears in the same relative order, but not necessarily continuously.

Example:

String: "ABCDE"
Subsequence: "ACE"

Here, the characters are in order, but not adjacent.


What Is Longest Common Subsequence?

Given two strings, the Longest Common Subsequence is the longest sequence that appears in both strings in the same order.

Example:

String 1: "AGGTAB"
String 2: "GXTXAYB"

The LCS is: "GTAB"


Why This Problem Is Hard

A brute-force approach would try all subsequences of both strings.

The number of subsequences grows exponentially, making brute force impractical.

This is where Dynamic Programming becomes necessary.


Defining the Subproblem

Let dp[i][j] represent the length of the LCS between:

• First i characters of string A
• First j characters of string B

Our goal is to compute dp[n][m].


LCS Recurrence Relation

If the characters match:

dp[i][j] = 1 + dp[i-1][j-1]

If the characters do not match:

dp[i][j] = max(dp[i-1][j], dp[i][j-1])

This decision captures the optimal substructure.


Building the DP Table

def lcs(text1, text2):
    n = len(text1)
    m = len(text2)
    dp = [[0]*(m+1) for _ in range(n+1)]

    for i in range(1, n+1):
        for j in range(1, m+1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = 1 + dp[i-1][j-1]
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])

    return dp[n][m]

lcs("AGGTAB", "GXTXAYB")

Each cell represents a solved subproblem.


Understanding the Table Intuitively

We compare characters one by one and grow the solution gradually.

When characters match, we extend the sequence.

When they do not, we choose the best result found so far.


Time and Space Complexity

Time Complexity: O(n × m)

Space Complexity: O(n × m)

This is a huge improvement over exponential brute force.


Real-World Applications

LCS is used in:

• Version control systems (diff tools)
• DNA sequence analysis
• Text comparison tools
• Plagiarism detection


Mini Practice

Think carefully:

  • Why does dp[0][j] always remain 0?
  • What happens if we swap the strings?

Exercises

Exercise 1:
What does dp[i][j] represent?

The length of the LCS of the first i characters of one string and first j characters of the other.

Exercise 2:
Why do we take max(dp[i-1][j], dp[i][j-1])?

Because we must discard one character and choose the better result.

Exercise 3:
Is LCS the same as longest common substring?

No. Subsequence allows gaps, substring does not.

Quick Quiz

Q1. What property makes DP suitable for LCS?

Overlapping subproblems and optimal substructure.

Q2. What is the base case of the DP table?

When either string length is zero, LCS length is zero.

Longest Common Subsequence is a cornerstone Dynamic Programming problem.

In the next lesson, we will study the Longest Increasing Subsequence (LIS) and see how DP can be optimized further.