Longest Substring with At Most K Distinct Characters - Sliding Window - Visualization - Visualization

Problem Statement

Given a string s and an integer k, return the length of the longest substring that contains at most k distinct characters.

If k is 0 or the string is empty, return 0.

Example 1:
Input: s = "eceba", k = 2 → Output: 3
Explanation: The answer is "ece" with 2 distinct characters.

Example 2:
Input: s = "aa", k = 1 → Output: 2
Explanation: The entire string has only one distinct character.

Examples

s k Output Description
"eceba" 2 3 Longest substring with at most 2 distinct characters is "ece"
"aa" 1 2 Entire string "aa" has only one distinct character
"abcadcacacaca" 3 11 Longest substring with 3 distinct characters is "cadcacacaca"
"aabbcc" 1 2 Max substring with 1 distinct char is "aa", "bb", or "cc"
"aabbcc" 2 4 "aabb", "bbcc" are valid 4-length substrings with 2 distinct chars
"aabbcc" 3 6 Whole string has exactly 3 distinct characters
"aabbcc" 10 6 k > total unique characters; return entire string
"" 2 0 Empty string input returns 0
"abc" 0 0 k = 0 means no characters allowed; return 0
"a" 1 1 Single character string with k = 1 returns 1

Solution

Understanding the Problem

We are given a string s and an integer k. Our goal is to find the length of the longest substring of s that contains at most k distinct characters.

Think of a substring as a piece of the original string that comes in a continuous block. The constraint is that this block can have at most k different letters.

If k is 0 or if the string is empty, it means no characters can be part of the substring — so we return 0 immediately.

We will use the sliding window technique with two pointers (start and end) to maintain a valid window and a hash map to count character frequency.

Step-by-Step Solution with Example

Step 1: Initialize

We start by initializing:

  • A map called charFreq to store how many times each character appears in our window.
  • start = 0 and end will iterate through the string.
  • maxLen = 0 to track the longest valid substring.

Step 2: Traverse the String

We iterate through the string one character at a time using end. At each step:

  • Add s[end] to the charFreq map and update its count.
  • Check: if the number of distinct keys in charFreq exceeds k, the window is no longer valid.

Step 3: Shrink Window

While the window is invalid (i.e., we have more than k distinct characters), we shrink it from the left:

  • Decrease the count of s[start] in the map.
  • If the count becomes zero, remove that character from the map.
  • Move start forward by one.

Step 4: Update Max Length

At each valid window, calculate the window size (end - start + 1) and update maxLen if it's larger than the previous value.

Step 5: Example Walkthrough

Let's take s = "eceba", k = 2

Window: "e" → {'e':1} → valid → maxLen = 1
Window: "ec" → {'e':1, 'c':1} → valid → maxLen = 2
Window: "ece" → {'e':2, 'c':1} → valid → maxLen = 3
Window: "eceb" → {'e':2, 'c':1, 'b':1} → invalid → shrink
  Shrink to "ceb" → remove 'e' once → {'e':1, 'c':1, 'b':1}
  Shrink to "eb" → remove 'c' completely → {'e':1, 'b':1}
Window now valid again → maxLen remains 3
Continue to next char 'a' → invalid → shrink again...
Final result: 3 (substring "ece")

Edge Cases

  • s = "", k = 2 → Output: 0 (empty input)
  • s = "a", k = 0 → Output: 0 (k is 0, no characters allowed)
  • s = "abc", k = 3 → Output: 3 (all characters fit)
  • s = "aaaa", k = 1 → Output: 4 (only one character used)

Final Thoughts

This is a classic sliding window problem where we maintain a valid window using two pointers and a character frequency map. The key is to efficiently expand and contract the window so that it always satisfies the condition of having at most k distinct characters.

With a good understanding of this problem, you'll build intuition for many real-world string-processing tasks and interview questions.

Algorithm Steps

  1. Initialize a hash map charFreq to store frequency of characters in the current window.
  2. Use two pointers: start for the beginning of the window, and end for the current index while expanding the window.
  3. Initialize maxLen = 0 to store the maximum length of a valid substring.
  4. Traverse the string with end from 0 to s.length - 1:
  5. → Add s[end] to charFreq and increment its count.
  6. → While the size of charFreq > k, shrink the window from the left:
  7.     • Decrease the frequency of s[start] in charFreq.
  8.     • If frequency becomes 0, remove s[start] from charFreq.
  9.     • Increment start to shrink the window.
  10. → Update maxLen = max(maxLen, end - start + 1).
  11. After traversal, return maxLen as the final result.

Code

C
C++
Python
Java
JS
Go
Rust
Kotlin
Swift
TS
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define MAX_CHAR 256

int longestSubstringKDistinct(const char* s, int k) {
    if (k == 0 || s[0] == '\0') return 0;

    int freq[MAX_CHAR] = {0};
    int distinctCount = 0, maxLen = 0;
    int start = 0, end = 0, len = strlen(s);

    while (end < len) {
        if (freq[(unsigned char)s[end]] == 0) distinctCount++;
        freq[(unsigned char)s[end]]++;

        while (distinctCount > k) {
            freq[(unsigned char)s[start]]--;
            if (freq[(unsigned char)s[start]] == 0) distinctCount--;
            start++;
        }

        int windowLen = end - start + 1;
        if (windowLen > maxLen) maxLen = windowLen;
        end++;
    }
    return maxLen;
}

int main() {
    const char* s1 = "eceba";
    int k1 = 2;
    printf("Longest substring length for '%s' with at most %d distinct chars: %d\n", s1, k1, longestSubstringKDistinct(s1, k1));

    const char* s2 = "aa";
    int k2 = 1;
    printf("Longest substring length for '%s' with at most %d distinct chars: %d\n", s2, k2, longestSubstringKDistinct(s2, k2));

    return 0;
}

Time Complexity

CaseTime ComplexityExplanation
Best CaseO(n)Even in the best case (e.g., when the entire string already has ≤ k distinct characters), the algorithm must traverse the entire string once using the end pointer to build the sliding window.
Average CaseO(n)On average, each character is added and removed from the hash map at most once, and the window adjusts accordingly — leading to a linear pass through the string.
Worst CaseO(n)Even in the worst case (when the character set frequently exceeds k), the sliding window still ensures that each character is processed at most twice — once when added and once when removed — resulting in linear time.

Space Complexity

O(k)

Explanation: The hash map stores at most k characters and their frequencies at any point, so space usage grows with k.


Comments

💬 Please keep your comment relevant and respectful. Avoid spamming, offensive language, or posting promotional/backlink content.
All comments are subject to moderation before being published.


Loading comments...