Accounts Merge Using Graphs and DSU - Visualization

Problem Statement

You're given a list of accounts. Each account is a list where the first element is a user name and the rest are email addresses belonging to that person. Your task is to merge any accounts that share at least one email. Merging is transitive: if account A shares an email with account B, and B shares an email with account C, all three become one account.

Important: Two accounts with the same name are not necessarily the same person unless they share an email. After merging, return each account with the name followed by the sorted list of unique emails.

Examples

Example 1
Input Accounts:
[
  ["John", "johnsmith@mail.com", "john00@mail.com"],
  ["John", "johnnybravo@mail.com"],
  ["John", "johnsmith@mail.com", "john_newyork@mail.com"],
  ["Mary", "mary@mail.com"]
]
Merged Output:
[
  ["John", "john00@mail.com", "john_newyork@mail.com", "johnsmith@mail.com"],
  ["John", "johnnybravo@mail.com"],
  ["Mary", "mary@mail.com"]
]
Description: First and third John accounts are merged via common email

Example 2
Input Accounts:
[
  ["Alex", "a@mail.com"],
  ["Alex", "b@mail.com"],
  ["Alex", "a@mail.com", "b@mail.com"]
]
Merged Output:
[
  ["Alex", "a@mail.com", "b@mail.com"]
]
Description: All Alex accounts are connected through emails

Example 3
Input Accounts: []
Merged Output: []
Description: No accounts to process

Example 4
Input Accounts:
[
  ["Bob", "bob@mail.com"]
]
Merged Output:
[
  ["Bob", "bob@mail.com"]
]
Description: Single account, no merging needed

Solution

Understanding the Problem

We are given a list of accounts where each account belongs to a person and contains a list of emails. Some people may have multiple accounts, and emails may appear in more than one account. Our goal is to merge all the accounts that belong to the same person, meaning any accounts that share at least one common email should be merged into one.

For example, if "John" has one account with ["john@gmail.com", "john@yahoo.com"] and another with ["john@yahoo.com", "john@outlook.com"], these two should be merged into a single account with all three emails, since "john@yahoo.com" is common to both.

We will solve this using Disjoint Set Union (DSU), a structure that tracks which elements belong to the same group and supports fast union (merge two groups) and find (look up an element's group) operations.

Step-by-Step Solution with Example

Step 1: Visualize the Problem as a Graph

Think of each email as a node in a graph. If two emails are part of the same account, draw an edge between them. This way, all emails that are connected form a cluster that belongs to the same user.
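To make the graph view concrete, here is a minimal sketch that builds the email graph and collects its connected clusters with a depth-first search. The function names build_email_graph and connected_clusters are made up for this illustration. It links only the first email of each account to the others (a star shape), which is an assumption that keeps the edge count small while producing the same clusters as linking every pair.

```python
from collections import defaultdict

def build_email_graph(accounts):
    """Return an adjacency map; each account's first email is linked to its other emails."""
    graph = defaultdict(set)
    for account in accounts:
        emails = account[1:]
        if not emails:
            continue
        first = emails[0]
        graph[first]  # touch the key so a lone email still becomes a node
        for email in emails[1:]:
            graph[first].add(email)
            graph[email].add(first)
    return dict(graph)

def connected_clusters(graph):
    """Collect connected components with an iterative depth-first search."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, cluster = [start], []
        while stack:
            node = stack.pop()
            cluster.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(cluster))
    return clusters

accounts = [
    ["John", "john@gmail.com", "john@yahoo.com"],
    ["John", "john@yahoo.com", "john@outlook.com"],
    ["Mary", "mary@gmail.com"],
]
clusters = connected_clusters(build_email_graph(accounts))
```

Each cluster here corresponds to one merged account. The DSU approach used in the rest of this solution finds the same clusters without storing the edges.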

Step 2: Initialize DSU

We create a DSU (also known as Union-Find) structure to manage which emails are connected. Each email will be a node in this structure, and we'll use union and find operations to group them.
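One possible shape for such a structure is sketched below. The class name DSU and its add method are illustrative choices, and path compression is an optimization assumed here rather than something the steps above require.

```python
class DSU:
    """Union-Find keyed by arbitrary hashable items (emails in this problem)."""

    def __init__(self):
        self.parent = {}

    def add(self, item):
        # A new item starts as its own parent, i.e. its own group
        self.parent.setdefault(item, item)

    def find(self, item):
        self.add(item)
        # First pass: walk up to the root
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root
        while self.parent[item] != root:
            next_item = self.parent[item]
            self.parent[item] = root
            item = next_item
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

dsu = DSU()
dsu.union("a@mail.com", "b@mail.com")
dsu.union("b@mail.com", "c@mail.com")
dsu.add("d@mail.com")
```

After these calls the first three emails share one root, while "d@mail.com" remains in a group of its own.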

Step 3: Build Connections Using Union Operation

Iterate through each account. For every account, pick the first email and union it with all other emails in that account. This connects all emails in that account together.
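The union pass might look like the sketch below. It uses a plain dictionary as the parent map, and connect_accounts is a hypothetical helper name introduced only for this example.

```python
def find(parent, email):
    """Return the root of email, halving the path along the way."""
    while parent[email] != email:
        parent[email] = parent[parent[email]]
        email = parent[email]
    return email

def union(parent, a, b):
    """Attach the group containing b to the group containing a."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a

def connect_accounts(accounts):
    """Union the first email of every account with the account's other emails."""
    parent = {}
    for account in accounts:
        emails = account[1:]
        for email in emails:
            parent.setdefault(email, email)  # register unseen emails as their own root
        for email in emails[1:]:
            union(parent, emails[0], email)
    return parent

accounts = [
    ["John", "john@gmail.com", "john@yahoo.com"],
    ["John", "john@yahoo.com", "john@outlook.com"],
    ["Mary", "mary@gmail.com"],
]
parent = connect_accounts(accounts)
```

Because "john@yahoo.com" appears in both John accounts, the second account's union joins "john@outlook.com" to the group that already holds "john@gmail.com".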

Step 4: Map Emails to User Names

As we process each email, we also maintain a mapping from email to user name so that we can retrieve the correct name when building the final result.
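As a small illustration, the mapping can be built in a single pass. The helper name map_emails_to_names is an assumption for this sketch.

```python
def map_emails_to_names(accounts):
    """Record which user name each email was listed under."""
    email_to_name = {}
    for account in accounts:
        name = account[0]
        for email in account[1:]:
            # Accounts that share an email belong to the same person,
            # so a repeated email always maps to the same name
            email_to_name[email] = name
    return email_to_name

accounts = [
    ["John", "john@gmail.com", "john@yahoo.com"],
    ["John", "john@yahoo.com", "john@outlook.com"],
    ["Mary", "mary@gmail.com"],
]
email_to_name = map_emails_to_names(accounts)
```

Looking up any email in a merged group, including the group's root, then yields the name to put at the front of that group.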

Step 5: Group Emails by Root

After all unions are done, go through every email and use the find operation to determine its representative (root parent). Group all emails with the same root together. These are emails that belong to the same person.
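The grouping step can be sketched on its own. The parent map below is a hand-written example of what the map might look like once the unions have run, and group_by_root is a hypothetical helper name.

```python
from collections import defaultdict

def find(parent, email):
    """Follow parent links until reaching an email that is its own parent."""
    while parent[email] != email:
        email = parent[email]
    return email

def group_by_root(parent):
    """Bucket every email under the root of its group."""
    groups = defaultdict(list)
    for email in parent:
        groups[find(parent, email)].append(email)
    return dict(groups)

# Hypothetical state after the unions: outlook -> yahoo -> gmail (root)
parent = {
    "john@gmail.com": "john@gmail.com",
    "john@yahoo.com": "john@gmail.com",
    "john@outlook.com": "john@yahoo.com",
    "mary@gmail.com": "mary@gmail.com",
}
groups = group_by_root(parent)
```

Note that "john@outlook.com" points at "john@yahoo.com" rather than directly at the root, which is why find has to follow the chain instead of reading parent[email] once.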

Step 6: Sort and Format the Result

For each group of emails, sort them alphabetically and prepend the user name (using our earlier mapping). This gives us the merged account in the correct format.
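A sketch of this last step, assuming the groups are keyed by root email and the email-to-name mapping is already available. Both inputs are written by hand here, and format_groups is an illustrative name.

```python
def format_groups(groups, email_to_name):
    """Turn {root: [emails]} into [[name, sorted unique emails...], ...]."""
    merged = []
    for root, emails in groups.items():
        name = email_to_name[root]  # any email in the group would give the same name
        merged.append([name] + sorted(set(emails)))
    return merged

groups = {
    "john@gmail.com": ["john@gmail.com", "john@yahoo.com", "john@outlook.com"],
    "mary@gmail.com": ["mary@gmail.com"],
}
email_to_name = {
    "john@gmail.com": "John",
    "john@yahoo.com": "John",
    "john@outlook.com": "John",
    "mary@gmail.com": "Mary",
}
merged = format_groups(groups, email_to_name)
```

Sorting uses plain string comparison, so "john@outlook.com" lands before "john@yahoo.com".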

Example:


Input: 
[
  ["John", "john@gmail.com", "john@yahoo.com"],
  ["John", "john@yahoo.com", "john@outlook.com"],
  ["Mary", "mary@gmail.com"]
]

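Process:
- Union "john@gmail.com" with "john@yahoo.com" (first account)
- Union "john@yahoo.com" with "john@outlook.com" (second account)
- Map the three John emails to "John"
- Map "mary@gmail.com" to "Mary"
- Group by root: the three John emails share one root, and "mary@gmail.com" is its own root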

Result:
[
  ["John", "john@gmail.com", "john@outlook.com", "john@yahoo.com"],
  ["Mary", "mary@gmail.com"]
]

Edge Cases

  • Multiple accounts with no overlapping emails: Each account will remain separate.
  • One email used by multiple accounts of the same name: All such accounts should be merged using that common email as the connector.
  • Emails that belong to different users but look similar: Be careful; merge only if the actual email strings match exactly.
  • Empty account list: Simply return an empty result.

Finally

This problem is a good example of applying DSU to connected data. By modeling emails as nodes of a graph and using DSU to track its connected components, we can merge related accounts efficiently. The two details that are easy to miss are keeping the email-to-name mapping while processing the accounts and sorting each group of emails before returning it.

Algorithm Steps

  1. Initialize a parent map for DSU so that every email starts as its own parent: parent[email] = email.
  2. Iterate through each account:
    1. For each email in the account, perform union(email, first_email).
    2. Record email_to_name[email] = account[0].
  3. For each email, find its root parent using find(email) and group emails by root.
  4. For each group, sort the emails and prepend the name.
  5. Return the list of merged accounts.
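
Put together, the steps above could be sketched as follows. This is one possible implementation rather than a reference one: the function name accounts_merge is an assumption, emails are registered in the parent map lazily as they are first seen instead of in a separate initialization pass, and path halving is an added optimization.

```python
from collections import defaultdict

def accounts_merge(accounts):
    parent = {}          # DSU parent map: email -> parent email
    email_to_name = {}   # email -> user name

    def find(email):
        # Walk up to the root, halving the path along the way
        while parent[email] != email:
            parent[email] = parent[parent[email]]
            email = parent[email]
        return email

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Steps 1 and 2: register each email, record its name, union with the first email
    for account in accounts:
        name, emails = account[0], account[1:]
        for email in emails:
            parent.setdefault(email, email)
            email_to_name[email] = name
            union(email, emails[0])

    # Step 3: group emails by root
    groups = defaultdict(list)
    for email in parent:
        groups[find(email)].append(email)

    # Steps 4 and 5: sort each group and prepend the name
    return [[email_to_name[root]] + sorted(emails) for root, emails in groups.items()]

result = accounts_merge([
    ["John", "johnsmith@mail.com", "john00@mail.com"],
    ["John", "johnnybravo@mail.com"],
    ["John", "johnsmith@mail.com", "john_newyork@mail.com"],
    ["Mary", "mary@mail.com"],
])
```

The order of the merged accounts in the returned list depends on dictionary iteration order, so callers that need a fixed order should sort the result themselves.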

Code

C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ACCOUNTS 4
#define MAX_EMAILS 5
#define MAX_EMAIL_LEN 50
#define MAX_NAME_LEN 20
#define MAX_GROUP_EMAILS (MAX_ACCOUNTS * MAX_EMAILS)

// Simple structure to hold account data
typedef struct {
    char name[MAX_NAME_LEN];
    int emailCount;
    char emails[MAX_EMAILS][MAX_EMAIL_LEN];
} Account;

// DSU over account indices (a fixed-size simplification for this demo):
// parent[i] == i means account i is the root of its group
static int parent[MAX_ACCOUNTS];

// Find the root of account x, shortening the path as we go
static int find(int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Merge the groups of accounts a and b; the smaller index stays the root
static void unite(int a, int b) {
    int rootA = find(a), rootB = find(b);
    if (rootA < rootB) parent[rootB] = rootA;
    else parent[rootA] = rootB;
}

// Helper function to check if email exists in an array
static int emailExists(char emails[][MAX_EMAIL_LEN], int count, const char *email) {
    for (int i = 0; i < count; i++) {
        if (strcmp(emails[i], email) == 0) {
            return 1;
        }
    }
    return 0;
}

// qsort comparator for fixed-width email strings
static int compareEmails(const void *a, const void *b) {
    return strcmp((const char *)a, (const char *)b);
}

int main(void) {
    Account accounts[MAX_ACCOUNTS] = {
        {"John", 2, {"johnsmith@mail.com", "john00@mail.com"}},
        {"John", 1, {"johnnybravo@mail.com"}},
        {"John", 2, {"johnsmith@mail.com", "john_newyork@mail.com"}},
        {"Mary", 1, {"mary@mail.com"}}
    };

    // Every account starts in its own group
    for (int i = 0; i < MAX_ACCOUNTS; i++) parent[i] = i;

    // Union every pair of accounts that shares at least one email.
    // The DSU makes merges transitive: A-B and B-C puts A, B, C in one group.
    for (int i = 0; i < MAX_ACCOUNTS; i++) {
        for (int j = i + 1; j < MAX_ACCOUNTS; j++) {
            for (int e = 0; e < accounts[j].emailCount; e++) {
                if (emailExists(accounts[i].emails, accounts[i].emailCount, accounts[j].emails[e])) {
                    unite(i, j);
                    break;
                }
            }
        }
    }

    // For each root, collect the unique emails of its group, sort, and print
    for (int root = 0; root < MAX_ACCOUNTS; root++) {
        if (find(root) != root) continue;
        char group[MAX_GROUP_EMAILS][MAX_EMAIL_LEN];
        int groupCount = 0;
        for (int i = 0; i < MAX_ACCOUNTS; i++) {
            if (find(i) != root) continue;
            for (int e = 0; e < accounts[i].emailCount; e++) {
                if (!emailExists(group, groupCount, accounts[i].emails[e])) {
                    strcpy(group[groupCount++], accounts[i].emails[e]);
                }
            }
        }
        qsort(group, groupCount, MAX_EMAIL_LEN, compareEmails);
        printf("%s: ", accounts[root].name);
        for (int e = 0; e < groupCount; e++) {
            printf("%s%s", group[e], (e == groupCount - 1) ? "\n" : ", ");
        }
    }
    return 0;
}
