Master the art of grouping anagrams using hash maps and canonical representations to solve LeetCode 49 efficiently.
The Group Anagrams problem is one of those classic interview questions that looks scary at first glance but is actually pretty satisfying once you see the trick. The goal? Take a messy list of words and organize them so that words made of the same letters sit together.
It’s a fantastic problem for learning how to design keys for hash maps and how to normalize data into what we call a “canonical form.”
If you’ve ever played Scrabble or solved word jumbles, you’ve essentially run this algorithm in your head: you look at a pile of letters and realize that “eat”, “tea”, and “ate” are just different arrangements of the same building blocks.
Real-World Analogy: Think of a LEGO set. A LEGO car and a LEGO boat might look completely different, but if you take them apart brick by brick, they might consist of the exact same pile of pieces.
In this problem, that “pile of bricks” is our canonical form. We disassemble every word into its base components (letters), and if two words produce the exact same pile, they belong in the same box.
The most intuitive way to handle this? Sort the letters. If you take any anagram and sort its characters alphabetically, they become identical.
By using this sorted string as a key in a Hash Map (or Dictionary), we can group all the matching original strings together. It’s simple, clean, and easy to explain.
Explore how words are processed, sorted into keys, and stored in the hash map.
Time Complexity: O(N × K log K) where N is the number of words and K is the maximum length of a word
To crack this efficiently, we need to lean on two main ideas:
Canonical Representation : Think of this as a unique ID tag. For anagrams, we need a version of the word that looks exactly the same for every member of the group, but different for anyone else.
Key Concept: A Canonical Representation transforms complex data into a standard format (like a sorted string) so we can easily compare and group things.
Hash Map (Dictionary): This is where we store our groups. It gives us lightning-fast lookups ( on average), which is exactly what we need for performance.
Sorting is the most readable solution, but it isn’t always the fastest. It costs per string (where is the string length). If you’re dealing with massive strings, that math starts to hurt.
A faster (though slightly more complex) way uses Character Counts. Since anagrams have the exact same count of every character, we can use the frequency array (e.g., a:1, b:0, c:0, ..., e:1, ...) as the key.
See how we can generate keys without sorting by counting character frequencies.
Time Complexity: O(N × K) where N is the number of words and K is the maximum length of a word
Advantage: Better than sorting approach (O(N × K log K))
Here is the implementation using the Sorting Approach. I’m sticking with this one because it’s the most universal and, honestly, the easiest one to write out on a whiteboard without bugging out. Note that in Python, the dictionary simplifies the logic significantly.
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
# Dictionary to map the sorted string (key) to list of anagrams
anagram_map = defaultdict(list)
for s in strs:
# Create the key by sorting the string
# 'eat' -> ['a', 'e', 't'] -> 'aet'
key = "".join(sorted(s))
# Append the original string to the correct group
anagram_map[key].append(s)
# Return just the grouped lists (values of the map)
return list(anagram_map.values()) You need to know the trade-offs here, especially if your interviewer asks, “Can we make this faster?”
Anagrams are words or phrases formed by rearranging the letters of another word or phrase, using all the original letters exactly once.
✓ Examples of Anagrams:
✗ Not Anagrams:
Time Complexity:
Space Complexity:
I’ve seen a lot of candidates trip up on these specific edge cases. Watch out for them:
⚠️ Pitfall (Python): Lists are mutable and cannot be used as dictionary keys. If you use the Character Count approach in Python, you’ll get an error if you try to use a list like [1, 0, 2...] as a key. You must convert it to a tuple (1, 0, 2...) first.
strs = [] or strs = [""]. The logic usually handles [""] naturally (the key is just an empty string), but an empty list input [] might need an early return in some languages.int[] don’t hash by their content. If you try to use the count array as a key, it’ll use the memory address instead. You have to wrap it in Arrays.toString() or a List to get it to behave.s.toLowerCase()) before generating the key.Believe it or not, this isn’t just a toy problem. This pattern—grouping by a “normalized” key—shows up all over the place:
The “Group Anagrams” problem is really just a lesson in data transformation. By converting messy, varied data into a strict canonical form, we can use Hash Maps to organize chaos instantly. Whether you choose sorting (for readability) or frequency counting (for raw speed), the core principle is the same: find the hidden structure that similar items share.
💡 Pro Tip: In a system design interview, if you’re asked to group billions of documents by similarity, this “Canonical Key + Hash Map” pattern is often the first step in a MapReduce job!