Find All Duplicate Subtrees in a Binary Tree - Visualization

Problem Statement

Given the root of a binary tree, return all duplicate subtrees. For two trees to be considered duplicate, they must have the same structure and the same node values. You can return the root node of any one of the duplicate subtrees. Each duplicate subtree should be included only once in the result.

Examples

Input Tree Duplicate Subtrees (Serialized) Description
[1, 2, 3, 4, null, 2, 4, null, null, 4]
['4', '2,4,null,null,null'] The subtree rooted at value 4 appears three times. The subtree [2, 4] also appears twice.
[1, 2, 3, 4, null, 4, null, null, null, 4]
['4'] The leaf node with value 4 appears multiple times, forming duplicate subtrees.
[1, 2, 3]
[] All subtrees are unique; no duplicate subtree structure exists.
[0, 0, 0, 0, null, null, 0]
['0'] The leaf node with value 0 appears three times. These are duplicate subtrees.
[2, 1, 1, null, null, null, 1]
['1'] There are multiple leaf nodes with value 1, which are considered duplicate subtrees.

Solution

Understanding the Problem

We are given a binary tree, and our goal is to find all duplicate subtrees. A duplicate subtree is one where both the structure and node values are identical to another subtree in the tree. Each duplicate should be returned only once, represented by its root node.

To identify duplicates, we must compare every subtree with all others. But doing this naively would be inefficient. Instead, we use a serialization strategy to transform each subtree into a string format, allowing quick lookup and comparison using a hash map.

Step-by-Step Solution with Example

Step 1: Visualize the Example

Let’s take the input tree: [1,2,3,4,null,2,4,null,null,4]

This corresponds to the following structure:

        1
       /       2   3
     /   /     4   2   4
       /
      4

We can observe:

  • Three nodes with value 4, all as leaf nodes.
  • Two subtrees with structure 2 → 4.
These are candidates for duplicate subtrees.

Step 2: Define Serialization Format

We perform a post-order traversal (left → right → node) and serialize each subtree into a string.

For example:

  • Leaf node 4 becomes "4,#,#"
  • Subtree 2 → 4 becomes "2,4,#,#,#"
This unique string helps us identify duplicates using a hash map that stores counts.

Step 3: Track and Detect Duplicates

We maintain a hash map where keys are serialization strings and values are their frequency.

As we serialize each subtree:

  • If a serialization is seen once, we add the subtree root to the result.
  • If it’s already seen more than once, we skip it to avoid duplicates in the result.

Step 4: Return the Result

From our example:

  • Serialization "4,#,#" appears 3 times → add one root node of 4
  • Serialization "2,4,#,#,#" appears 2 times → add one root node of 2
The final result contains the roots of these duplicate subtrees.

Edge Cases

Case 1: All Subtrees Are Unique

Input: [1,2,3]
Structure:

  1
 / 2   3
Each node is unique. Serialized strings for each subtree do not match. Result: [].

Case 2: Empty Tree

Input: []
There are no nodes in the tree. Hence, no subtrees exist. Result: [].

Case 3: Large Tree with Many Duplicates

Ensure that your solution uses efficient traversal and hashing to handle such cases within time limits.

Finally

This problem is a great example of using serialization to simplify complex structure comparisons. It teaches you how to:

  • Break down trees into reusable string representations
  • Use hash maps to count and identify duplicates efficiently
  • Handle edge cases like null roots and small trees gracefully
This technique is widely applicable in tree-related problems where comparison is required.

Algorithm Steps

  1. Perform a postorder traversal of the binary tree.
  2. For each node, generate a unique serialization string representing the subtree rooted at that node. Use a format like node.val,left_serial,right_serial.
  3. Use a hash map (or dictionary) to count the occurrences of each serialized subtree.
  4. If a serialization appears more than once, add the subtree's root to the result list (ensuring each duplicate is added only once).
  5. Return the list of duplicate subtree roots.

Code

Python
Java
JS
C
C++
Go
Rust
Kotlin
Swift
TS
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def findDuplicateSubtrees(self, root: TreeNode) -> list:
        count = {}
        result = []
        
        def traverse(node):
            if not node:
                return '#'
            serial = f'{node.val},' + traverse(node.left) + ',' + traverse(node.right)
            count[serial] = count.get(serial, 0) + 1
            if count[serial] == 2:
                result.append(node)
            return serial
        
        traverse(root)
        return result

# Example usage:
if __name__ == '__main__':
    # Construct binary tree sample
    #         1
    #        / \
    #       2   3
    #      /   / \
    #     4   2   4
    #        /
    #       4
    root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3, TreeNode(2, TreeNode(4)), TreeNode(4)))
    sol = Solution()
    duplicates = sol.findDuplicateSubtrees(root)
    for node in duplicates:
        print(node.val)

Comments

💬 Please keep your comment relevant and respectful. Avoid spamming, offensive language, or posting promotional/backlink content.
All comments are subject to moderation before being published.


Loading comments...