Longest Common Substring in Given Two Strings - Visualization

Visualization Player

Problem Statement

Given two strings s1 and s2, your task is to find the longest common substring shared between them.

  • A common substring is a sequence of characters that appears consecutively in both strings.
  • Return the length of the longest such substring.
  • If no common substring exists, return 0.

This problem is widely used to test your understanding of dynamic programming, particularly with 2D table approaches for string comparison.

Examples

s1 s2 Output Description
"abcde" "abfce" 2 The longest common substring is "ab" of length 2
"abcdef" "zcdemf" 3 "cde" is the longest substring present in both
"abc" "xyz" 0 No common substring; output is 0
"abcdef" "abcdef" 6 Both strings are identical; full match of length 6
"abcdxyz" "xyzabcd" 4 "abcd" is a common substring starting at different positions
"abc" "ababc" 3 "abc" is the longest substring from s1 fully found in s2
"aaa" "aa" 2 "aa" is the longest common substring
"abc" "" 0 s2 is empty; no substring possible
"" "abc" 0 s1 is empty; output is 0
"" "" 0 Both strings are empty; no substring exists

Solution

Understanding the Problem

We are given two strings: s1 and s2. Our goal is to find the longest common substring shared between them.

A substring means characters must be consecutive and appear in the same order in both strings.

For example, in s1 = "abcde" and s2 = "abfce", the common substrings are "a", "b", and "e". The longest among them is of length 1.

Step-by-Step Solution with Example

Step 1: Choose a strategy

We'll use a technique called Dynamic Programming. It helps us solve problems by breaking them down into smaller overlapping subproblems and storing their solutions.

Step 2: Initialize a 2D array

We create a 2D array dp where dp[i][j] represents the length of the longest common substring ending at s1[i-1] and s2[j-1].

This array has dimensions (len(s1)+1) x (len(s2)+1), and we initialize it with 0s.

Step 3: Define variables to track the result

We keep:

  • maxLen = 0 – to store the length of the longest substring found so far
  • endIndex = 0 – to remember where that substring ends in s1

Step 4: Loop through characters of both strings

We loop through i from 1 to len(s1), and inside that, loop through j from 1 to len(s2).

Step 5: Compare characters

If s1[i-1] == s2[j-1], it means we can extend the previous matching substring, so we do:

dp[i][j] = dp[i-1][j-1] + 1

Step 6: Update the maximum length and position

If this newly found substring is longer than any previous one, update maxLen and endIndex:

if dp[i][j] > maxLen:
    maxLen = dp[i][j]
    endIndex = i

Step 7: Extract the final result

Once we've filled the dp table, we extract the substring from s1:

longest_substring = s1[endIndex - maxLen : endIndex]

Example:

s1 = "abcdf", s2 = "abedf"

The common substrings are "ab" and "df". The longest is "ab" (length 2).

The 2D dp table helps track where the substrings match and how long they are.

Edge Cases

  • Empty strings: If either s1 or s2 is empty, return 0.
  • No common substring: If no characters match consecutively, return 0.
  • Full match: If the strings are identical, return the full string as the result.
  • Multiple same-length substrings: Return any one of them (most implementations return the one that appears first in s1).

Algorithm Steps

  1. Create a 2D array dp of size (len(s1)+1) x (len(s2)+1), initialized with 0. This will store lengths of common suffixes.
  2. Initialize two variables: maxLen = 0 to track the length of the longest common substring, and endIndex = 0 to remember its ending index in s1.
  3. Loop through i from 1 to len(s1):
  4. For each i, loop through j from 1 to len(s2):
  5. If s1[i-1] == s2[j-1], set dp[i][j] = dp[i-1][j-1] + 1.
  6. If dp[i][j] > maxLen, update maxLen and set endIndex = i.
  7. After processing, return s1[endIndex - maxLen : endIndex] as the longest common substring.

Code

Python
Java
JS
C
C++
Go
Rust
Kotlin
Swift
TS
def longest_common_substring(s1, s2):
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    max_len = 0
    end_index = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                if dp[i][j] > max_len:
                    max_len = dp[i][j]
                    end_index = i
    return s1[end_index - max_len:end_index]

# Example usage
s1 = "abcde"
s2 = "abfce"
print(longest_common_substring(s1, s2))  # Output: "abc"

Time Complexity

CaseTime ComplexityExplanation
Best CaseO(m * n)Even in the best case where a common substring is found early, the algorithm must still fill the entire dp table of size (m+1) x (n+1), where m = len(s1) and n = len(s2).
Average CaseO(m * n)On average, each character pair (s1[i], s2[j]) is compared once, and the dp table is filled accordingly, resulting in a time complexity proportional to the product of the string lengths.
Worst CaseO(m * n)In the worst case, where there are no matching substrings or all characters differ, the entire dp table still needs to be filled to determine the result.

Space Complexity

O(m * n)

Explanation: The algorithm uses a 2D dp array of size (m+1) x (n+1) to store intermediate results, leading to O(m * n) space usage. Space can be optimized to O(n) using two rolling rows, but the standard approach uses full table.


Comments

💬 Please keep your comment relevant and respectful. Avoid spamming, offensive language, or posting promotional/backlink content.
All comments are subject to moderation before being published.


Loading comments...