Find Longest Consecutive Sequence in an Array Optimal HashSet Solution

Problem Statement

Given an unsorted array of integers, your task is to find the length of the longest sequence of consecutive numbers that appear in the array.

  • A consecutive sequence is a group of numbers where each number is exactly 1 more than the previous number.
  • The numbers in the array can be in any order and may contain duplicates.
  • You need to return the length of the longest such sequence.

If the array is empty, return 0.

Examples

Input Array Longest Consecutive Sequence Length Description
[100, 4, 200, 1, 3, 2] [1, 2, 3, 4] 4 Longest consecutive sequence starts at 1 and ends at 4
[0, 3, 7, 2, 5, 8, 4, 6, 0, 1] [0, 1, 2, 3, 4, 5, 6, 7, 8] 9 All elements from 0 to 8 form the longest consecutive sequence
[10, 30, 20] [10] 1 No two numbers are consecutive, so each number alone is a sequence
[1, 9, 3, 10, 2, 20] [1, 2, 3] 3 Consecutive sequence from 1 to 3
[5, 5, 5, 5] [5] 1 All elements are duplicates, so only one unique number exists
[] [] 0 Empty array, no sequence exists

Visualization Player

Solution

The goal is to find the longest chain of consecutive numbers that appear in the array, even though the array itself might be completely unsorted.

A consecutive sequence means a series of numbers like [3, 4, 5, 6] — where each number is 1 more than the previous one. We are not concerned with the order of elements in the original array, just whether the numbers exist to form a sequence.

Understanding the Problem Through Cases

Case 1: Normal case with random numbers
In an array like [100, 4, 200, 1, 3, 2], even though the numbers are jumbled, there's a consecutive sequence [1, 2, 3, 4]. We return 4, the length of that sequence.

Case 2: All numbers already in order
If the array is like [1, 2, 3, 4, 5], then the whole array forms one big sequence. We return the array’s length.

Case 3: Duplicates in the array
For arrays like [5, 5, 5], duplicates don’t help us. We only care about unique numbers. So, we ignore repeats and count the length of the sequence among distinct values.

Case 4: Disconnected numbers
If the numbers have big gaps between them like [10, 30, 50], then each number is its own sequence. The answer will be 1.

Case 5: Empty array
If no numbers are given, then there’s no sequence. So the answer is simply 0.

How We Efficiently Solve It

We put all numbers into a HashSet to remove duplicates and allow fast lookups. Then for each number, we check whether it could be the start of a new sequence (i.e., num - 1 is not in the set). If it is a valid start, we keep checking num + 1, num + 2, ... to count how long the sequence continues.

This way, we avoid re-checking numbers that are part of earlier sequences and ensure we process each number only once. The result is a very fast and scalable solution that works even for large arrays.

⏱️ Time Complexity

Since we use a set and visit each number once, the time complexity is O(n), where n is the number of elements in the array.

Algorithm Steps

  1. Given an unsorted array arr of integers.
  2. Insert all elements into a HashSet for constant-time lookups.
  3. Initialize max_length = 0.
  4. For each element num in arr:
  5. → Check if num - 1 is not in the set (meaning num is the start of a new sequence).
  6. → If so, initialize current_length = 1 and incrementally check for num + 1, num + 2, ... while they exist in the set, incrementing current_length.
  7. → Update max_length with the maximum of max_length and current_length.
  8. Return max_length.

Code

Python
JavaScript
Java
C++
C
def longest_consecutive_sequence(arr):
    num_set = set(arr)
    max_length = 0
    
    for num in arr:
        if num - 1 not in num_set:
            current = num
            length = 1
            while current + 1 in num_set:
                current += 1
                length += 1
            max_length = max(max_length, length)

    return max_length

# Sample Input
arr = [100, 4, 200, 1, 3, 2]
print("Longest Consecutive Sequence Length:", longest_consecutive_sequence(arr))

Time Complexity

CaseTime ComplexityExplanation
Best CaseO(n)In the best case, each element is checked only once, and very few sequences are formed. The HashSet operations are constant time on average.
Average CaseO(n)Each number is visited once when building the set and at most once again when scanning for sequence starts, resulting in linear time.
Worst CaseO(n)Even in the worst case, each element is processed at most twice—once during insertion and once during sequence scanning. So time remains linear.

Space Complexity

O(n)

Explanation: A HashSet is used to store all unique elements of the array for quick lookups, which requires linear space in terms of the input size.