Check for Duplicate Subtrees in a Binary Tree

Problem Statement

Given a binary tree, determine whether it contains any duplicate subtrees of size two or more. Two subtrees are considered duplicate if they have the same structure and the same node values. Note that subtrees of size 1 (single leaf nodes) should not be considered for duplication check.

Examples

Input Tree Has Duplicate Subtrees? Description
[1, 2, 3, 4, null, 2, 4, null, null, 4]
true The subtree [4] appears more than once and so does [2, 4].
[1, 2, 3, 4, 5, 6, 7]
false All subtrees are unique; no duplication occurs.
[1, 1, 1, null, null, null, 1]
true The subtree consisting of just node [1] is duplicated multiple times.
[] false An empty tree has no subtrees and thus no duplicates.
[1]
false A single-node tree cannot contain any duplicate subtrees.

Solution

Case 1: Tree with Obvious Duplicate Subtrees

In this scenario, the binary tree contains subtrees that appear more than once and are structurally and numerically identical. For example, if a node '2' has a left child '4', and this structure is repeated elsewhere, we consider it a duplicate. These are the kinds of duplications the algorithm targets — not just value repetition, but full structural duplication.

Case 2: Tree with All Unique Subtrees

Sometimes, a tree may share node values (e.g., multiple nodes having value '4'), but their structural layout differs. In such cases, although node values repeat, the structure does not, so it doesn't count as a duplicate subtree. The algorithm needs to distinguish structural sameness from just value duplication.

Case 3: Tree with a Single Node

If the binary tree has only one node, there are no subtrees of size ≥ 2, so it’s impossible to have any duplicates. This is a trivial case and should return false by default.

Case 4: Empty Tree

If the input tree is empty, it doesn’t contain any nodes or subtrees, so it logically follows that there can be no duplicate subtrees. This is another trivial base case handled directly at the start of the algorithm.

How the Algorithm Works

The algorithm works by performing a postorder traversal (left, right, root) and serializing each subtree into a unique string (e.g., "2,4,#,#,#" for a node 2 with a left child 4). These strings are stored in a hash map. If any serialized string occurs more than once and represents a subtree of size ≥ 2, the function returns true.

This serialization strategy ensures that both the structure and values are captured, helping to reliably detect duplicates, even if node values repeat but structures differ.

Algorithm Steps

  1. Traverse the binary tree in a postorder manner and serialize each subtree into a string representation.
  2. For each node, if the node is not a leaf (i.e. it has at least one child), store its serialized string in a hash map with a frequency count.
  3. If the frequency of any serialized subtree becomes 2 (or more), mark that duplicate has been found.
  4. After processing all nodes, if a duplicate subtree (of size 2 or more) is detected, return true; otherwise, return false.

Code

Python
Java
JavaScript
C
C++
C#
Kotlin
Swift
Go
Php
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def hasDuplicateSubtrees(root):
    seen = {}
    duplicate = False

    def serialize(node):
        nonlocal duplicate
        if not node:
            return '#'
        left = serialize(node.left)
        right = serialize(node.right)
        subtree = str(node.val) + ',' + left + ',' + right
        # Only consider subtrees of size >= 2 (node with at least one child)
        if node.left or node.right:
            seen[subtree] = seen.get(subtree, 0) + 1
            if seen[subtree] == 2:
                duplicate = True
        return subtree

    serialize(root)
    return duplicate

# Example usage:
if __name__ == '__main__':
    # Construct a sample binary tree:
    #         1
    #        / \
    #       2   3
    #      /   / \
    #     4   2   4
    #        /
    #       4
    root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3, TreeNode(2, TreeNode(4)), TreeNode(4)))
    print(hasDuplicateSubtrees(root))