author avatar BooleanStack

128. Longest Consecutive Sequence

Master the O(n) solution for finding the longest consecutive sequence in an unsorted array using Hash Sets.

128. Longest Consecutive Sequence

The Longest Consecutive Sequence problem is one of those interview questions that separates the ā€œI can codeā€ folks from the ā€œI understand algorithmsā€ folks. Here’s what you need to do: given an unsorted array of integers, find the length of the longest consecutive elements sequence.

Sounds simple, right? Here’s the catch—you need to solve it in O(n) time.

That one constraint changes everything. It immediately throws out the obvious solution of sorting the array first, since comparison-based sorting takes at least O(nlog⁔n)O(n \log n). Instead, you’ll need to think about data structures that give you constant-time lookups. That’s where things get interesting.

Real-World Analogy


The Naive Approach: Sorting

Let’s start with why sorting feels natural (even though it won’t work for our time constraint).

If we sort [100, 4, 200, 1, 3, 2], we get [1, 2, 3, 4, 100, 200]. Now we can just walk through the array and count consecutive numbers that differ by 1. Easy enough.

Visualization: Sorting Approach

The logic’s solid, but here’s the problem: the sorting step kills our performance. We end up with O(nlog⁔n)O(n \log n) complexity. For huge datasets, that difference between O(nlog⁔n)O(n \log n) and O(n)O(n) really adds up.


The Optimal Approach: Hash Set

To hit that O(n) sweet spot, we need to ditch sorting entirely. Enter the Hash Set—our secret weapon for O(1) lookups.

The Algorithm

Here’s how it works:

  1. Build the Set: Throw all numbers into a Hash Set. This gives us O(1) lookups and automatically handles duplicates.
  2. Find the Starting Points: Loop through the set. For each number num, check if num - 1 exists.
    • If num - 1 exists, then num isn’t the start of a sequence (it’s part of one that started earlier). Skip it.
    • If num - 1 doesn’t exist, bingo—num is the start of a new sequence.
  3. Count the Sequence: When you find a start, use a while loop to check for num + 1, num + 2, etc., until the sequence breaks. Track the length.
  4. Track the Max: Keep updating the maximum length you’ve found.

Interactive Visualization: Hash Set Strategy

Watch how the algorithm identifies sequence starts and walks through them efficiently.

Longest Consecutive Sequence - HashSet Approach

Click "Start" to begin visualization

100
4
200
1
3
2

Current Sequence

0

-

Max Length

0

Longest Sequence

-

Algorithm Explanation

Time Complexity: O(n) - Each number is visited at most twice

Space Complexity: O(n) - HashSet stores all numbers

Key Insight:

Only start counting from numbers that are the beginning of a sequence (num-1 not in set)

Wait, why is this O(n)?

I know what you’re thinking—there’s a while loop inside a for loop. Shouldn’t that be O(n2)O(n^2)?

Here’s the thing: the while loop only runs when we find the start of a sequence. And here’s the key insight—each number gets processed at most twice: once when we check if it’s a start, and once when we count it as part of a sequence. We never process the same sequence multiple times.


Implementation

Here’s the code in Java, C++, and Python. It’s production-ready and handles all the edge cases.

import java.util.HashSet;
import java.util.Set;

class Solution {
    public int longestConsecutive(int[] nums) {
        // Handle edge case: empty array
        if (nums == null || nums.length == 0) {
            return 0;
        }

        // Add all numbers to a HashSet for O(1) lookup
        Set<Integer> numSet = new HashSet<>();
        for (int num : nums) {
            numSet.add(num);
        }

        int longestStreak = 0;

        for (int num : numSet) {
            // Only attempt to build a sequence if 'num' is the start
            // (i.e., num - 1 is not in the set)
            if (!numSet.contains(num - 1)) {
                int currentNum = num;
                int currentStreak = 1;

                // Count consecutive numbers
                while (numSet.contains(currentNum + 1)) {
                    currentNum += 1;
                    currentStreak += 1;
                }

                longestStreak = Math.max(longestStreak, currentStreak);
            }
        }

        return longestStreak;
    }
}

Alternative Approach: Union-Find

Here’s something interesting—you can also solve this using Union-Find (Disjoint Set Union). Think of each number as a node in a graph. When you find adjacent numbers (num and num+1), you union them. The answer becomes the size of the largest connected component.

Visualization: Union-Find

See how connecting adjacent numbers creates components, where the biggest component is your answer.

Longest Consecutive Sequence - Union-Find Approach

Click "Start" to begin visualization

100
4
200
1
3
2

Max Component Size

0

Step

0 / -1

Union-Find Approach

Time Complexity: O(n α(n)) ā‰ˆ O(n) - α is inverse Ackermann function

Space Complexity: O(n) - Parent and rank arrays

How it works:

  • Initialize each number as its own parent
  • For each number, union it with num+1 if it exists
  • Track component sizes during union operations
  • The maximum component size is the answer

Union-Find is a neat approach conceptually, but honestly? It’s more complex to implement and has slightly more overhead than the Hash Set solution. I’d stick with Hash Sets for interviews unless you’re specifically asked about Union-Find.


Complexity Analysis

Let’s break down why the Hash Set solution is so efficient.

Yeah, there’s a while loop inside a for loop, but don’t let that fool you. The inner loop only runs when num - 1 isn’t in the set. This means each number participates in the counting process at most once.

Space Complexity: O(n)

We’re storing all unique elements in the Hash Set. Worst case (all elements are unique)? Linear space.


Common Pitfalls & Best Practices

Let me save you from some mistakes I’ve seen (and made):

  1. Forgetting the ā€œStartā€ Check : If you skip the if (!numSet.contains(num - 1)) check, your algorithm becomes O(n2)O(n^2) in the worst case. Imagine the input [1, 2, 3, 4, 5]—you’d recount the sequence 2,3,4,5, then 3,4,5, and so on. Not good.

  2. Using a List for Lookups :

    Checking if something exists in a List is O(n)O(n). Do that inside a loop and you’re back to O(n2)O(n^2). Always use a Set for this problem.

  3. Handling Duplicates: The problem wants consecutive integers. So [1, 2, 2, 3] has a longest sequence of 3 (1, 2, 3). The Hash Set automatically handles this by deduplicating the input. Nice.

  4. Integer Overflow (the sneaky one): In C++ or Java, watch out for numbers near their max values. If num is Integer.MAX_VALUE, then num + 1 wraps around (overflow). If that wrapped value happens to be in your set, your loop might behave unexpectedly. It’s rare in typical test cases, but mentioning it in interviews shows you think about edge cases.

Conclusion

The Longest Consecutive Sequence problem is a perfect example of the classic space-time tradeoff. We use O(n)O(n) extra space (the Hash Set) to unlock O(n)O(n) time complexity, completely sidestepping the limitations of sorting. The real trick? Recognizing sequence boundaries so we don’t waste time recounting the same sequences.

This pattern—using a Hash Set to enable linear-time traversals of unsorted data—comes up all the time in algorithm problems. Master it, and you’ll have a powerful tool for your next interview.

Similar Posts