Understanding the Problem
We are given two sorted arrays, possibly of different sizes. Our goal is to find the median of the combined data. However, we are not allowed to merge the two arrays because that would take extra time and space.
The challenge is to find the median efficiently—ideally in O(log(min(n, m))
time—by leveraging the fact that the arrays are already sorted.
The median is the middle value in a sorted list. If the combined number of elements is odd, it's the exact middle element. If even, it's the average of the two middle elements.
Step-by-Step Solution with Example
Step 1: Choose the Shorter Array for Binary Search
We always perform binary search on the smaller of the two arrays to minimize the time complexity. Let's call the two arrays nums1
and nums2
.
Example: Let nums1 = [1, 3]
and nums2 = [2, 4, 5]
Step 2: Calculate Total Length and Half Length
Total number of elements = 5, which is odd.
Half length (for partitioning) = (5 + 1) / 2 = 3
.
Step 3: Apply Binary Search on nums1
We’ll apply binary search to find a partition index i
in nums1
, and calculate j = half - i
for nums2
.
Step 4: Define Left and Right Elements at Partition
We compare the elements around the partition to ensure the following:
nums1[i - 1] ≤ nums2[j]
nums2[j - 1] ≤ nums1[i]
If this is true, we’ve found the correct partition.
Step 5: Calculate the Median Based on Partition
In our example, one possible partition is:
nums1 = [1, 3]
→ i = 2
nums2 = [2, 4, 5]
→ j = 1
Left half = [1, 2, 3], Right half = [4, 5]
The median is the maximum of the left half = 3
, because total number is odd.
Step 6: Adjust Binary Search if Partition Is Incorrect
If the partition condition is not met (e.g., nums1[i - 1] > nums2[j]
), we move left. If nums2[j - 1] > nums1[i]
, we move right. This continues until we find the correct split.
Edge Cases
- One array is empty: If
nums1
is empty and nums2 = [1, 2, 3]
, the median is just 2
.
- Both arrays are empty: No median can be defined. Return an error or a null result.
- Arrays have duplicate values or zeros: The binary search-based logic still works since it relies on comparisons, not uniqueness.
- Arrays are of very different lengths: The algorithm remains efficient as it only searches the smaller array.
Finally
This solution avoids merging the arrays and achieves logarithmic time complexity using binary search. It carefully balances the elements on both sides of a conceptual partition to directly identify the median.
It’s efficient, elegant, and perfect for large datasets where a full merge would be too slow.
Comments
Loading comments...