Find All Duplicate Subtrees in a Binary Tree

Problem Statement❯

Given the root of a binary tree, return all duplicate subtrees. For two trees to be considered duplicate, they must have the same structure and the same node values. You can return the root node of any one of the duplicate subtrees. Each duplicate subtree should be included only once in the result.

Examples❯

Input Tree	Duplicate Subtrees (Serialized)	Description
[1, 2, 3, 4, null, 2, 4, null, null, 4]	['4', '2,4,null,null,null']	The subtree rooted at value 4 appears three times. The subtree [2, 4] also appears twice.
[1, 2, 3, 4, null, 4, null, null, null, 4]	['4']	The leaf node with value 4 appears multiple times, forming duplicate subtrees.
[1, 2, 3]	[]	All subtrees are unique; no duplicate subtree structure exists.
[0, 0, 0, 0, null, null, 0]	['0']	The leaf node with value 0 appears three times. These are duplicate subtrees.
[2, 1, 1, null, null, null, 1]	['1']	There are multiple leaf nodes with value 1, which are considered duplicate subtrees.

Solution❯

Understanding the Problem

We are given a binary tree, and our goal is to find all duplicate subtrees. A duplicate subtree is one where both the structure and node values are identical to another subtree in the tree. Each duplicate should be returned only once, represented by its root node.

To identify duplicates, we must compare every subtree with all others. But doing this naively would be inefficient. Instead, we use a serialization strategy to transform each subtree into a string format, allowing quick lookup and comparison using a hash map.

Step-by-Step Solution with Example

Step 1: Visualize the Example

Let’s take the input tree: [1,2,3,4,null,2,4,null,null,4]

This corresponds to the following structure:

        1
       /       2   3
     /   /     4   2   4
       /
      4

We can observe:

Three nodes with value 4, all as leaf nodes.
Two subtrees with structure 2 → 4.

These are candidates for duplicate subtrees.

Step 2: Define Serialization Format

We perform a post-order traversal (left → right → node) and serialize each subtree into a string.

For example:

Leaf node 4 becomes "4,#,#"
Subtree 2 → 4 becomes "2,4,#,#,#"

This unique string helps us identify duplicates using a hash map that stores counts.

Step 3: Track and Detect Duplicates

We maintain a hash map where keys are serialization strings and values are their frequency.

As we serialize each subtree:

If a serialization is seen once, we add the subtree root to the result.
If it’s already seen more than once, we skip it to avoid duplicates in the result.

Step 4: Return the Result

From our example:

Serialization "4,#,#" appears 3 times → add one root node of 4
Serialization "2,4,#,#,#" appears 2 times → add one root node of 2

The final result contains the roots of these duplicate subtrees.

Edge Cases

Case 1: All Subtrees Are Unique

Input: [1,2,3]
Structure:

  1
 / 2   3

Each node is unique. Serialized strings for each subtree do not match. Result: [].

Case 2: Empty Tree

Input: []
There are no nodes in the tree. Hence, no subtrees exist. Result: [].

Case 3: Large Tree with Many Duplicates

Ensure that your solution uses efficient traversal and hashing to handle such cases within time limits.

Finally

This problem is a great example of using serialization to simplify complex structure comparisons. It teaches you how to:

Break down trees into reusable string representations
Use hash maps to count and identify duplicates efficiently
Handle edge cases like null roots and small trees gracefully

This technique is widely applicable in tree-related problems where comparison is required.

Algorithm Steps❯

Perform a postorder traversal of the binary tree.
For each node, generate a unique serialization string representing the subtree rooted at that node. Use a format like node.val,left_serial,right_serial.
Use a hash map (or dictionary) to count the occurrences of each serialized subtree.
If a serialization appears more than once, add the subtree's root to the result list (ensuring each duplicate is added only once).
Return the list of duplicate subtree roots.

Code

Python

Java

C++

Rust

Kotlin

Swift

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def findDuplicateSubtrees(self, root: TreeNode) -> list:
        count = {}
        result = []
        
        def traverse(node):
            if not node:
                return '#'
            serial = f'{node.val},' + traverse(node.left) + ',' + traverse(node.right)
            count[serial] = count.get(serial, 0) + 1
            if count[serial] == 2:
                result.append(node)
            return serial
        
        traverse(root)
        return result

# Example usage:
if __name__ == '__main__':
    # Construct binary tree sample
    #         1
    #        / \
    #       2   3
    #      /   / \
    #     4   2   4
    #        /
    #       4
    root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3, TreeNode(2, TreeNode(4)), TreeNode(4)))
    sol = Solution()
    duplicates = sol.findDuplicateSubtrees(root)
    for node in duplicates:
        print(node.val)

⬅ Previous TopicFind the kth Ancestor of a Node in a Binary Tree

Next Topic ⮕Find a Value in a Binary Search Tree

Comments

Loading comments...

Find All Duplicate Subtrees in a Binary Tree - Visualization

Contents

Problem Statement❯

Examples❯

Solution❯

Understanding the Problem

Step-by-Step Solution with Example

Step 1: Visualize the Example

Step 2: Define Serialization Format

Step 3: Track and Detect Duplicates

Step 4: Return the Result

Edge Cases

Case 1: All Subtrees Are Unique

Case 2: Empty Tree

Case 3: Large Tree with Many Duplicates

Finally

Algorithm Steps❯

Code

Comments

Find All Duplicate Subtrees in a Binary Tree - Visualization

Contents

Problem Statement❯

Examples❯

Solution❯

Understanding the Problem

Step-by-Step Solution with Example

Step 1: Visualize the Example

Step 2: Define Serialization Format

Step 3: Track and Detect Duplicates

Step 4: Return the Result

Edge Cases

Case 1: All Subtrees Are Unique

Case 2: Empty Tree

Case 3: Large Tree with Many Duplicates

Finally

Algorithm Steps❯

Code

Comments

Player Settings