Given an array of integers, return a new array containing only the duplicates.
The values in the resulting array may appear in any order.
Example:
Input: [2, 3, 1, 1, 4, 3, 2, 1] Output: [2, 1, 3]
Your algorithm should run in O(n) time and use O(n) extra space.
The core challenge of this problem is to identify all the duplicate elements in the array efficiently. This problem is significant in various applications such as data cleaning, where duplicates need to be identified and handled. A common pitfall is to use a nested loop approach, which results in O(n^2) time complexity, making it inefficient for large arrays.
To solve this problem efficiently, we can use a hash map (std::unordered_map in C++) to keep track of the frequency of each element. This allows us to identify duplicates in O(n) time.
Let's discuss a naive approach first and then move to the optimized solution.
The naive approach involves using two nested loops to compare each element with every other element, resulting in O(n^2) time complexity. This is not optimal for large arrays.
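For reference, the naive version might look like the following sketch. Note that it needs an extra membership check (here via std::find) to avoid reporting the same value more than once:

```cpp
#include <algorithm>
#include <vector>

// Naive O(n^2) approach: for each element, scan the rest of the array.
// The std::find on the result prevents reporting the same value twice.
std::vector<int> findDuplicatesNaive(const std::vector<int>& nums) {
    std::vector<int> result;
    for (std::size_t i = 0; i < nums.size(); ++i) {
        for (std::size_t j = i + 1; j < nums.size(); ++j) {
            if (nums[i] == nums[j] &&
                std::find(result.begin(), result.end(), nums[i]) == result.end()) {
                result.push_back(nums[i]);
            }
        }
    }
    return result;
}
```

The inner std::find makes this even slower in the worst case, which is exactly why the hash-map approach below is preferable.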
Using a hash map, we can achieve the desired O(n) time complexity. The hash map allows us to count the occurrences of each element in a single pass and then identify duplicates in another pass.
Here is the optimized implementation in C++, with each step annotated in the comments:
#include <iostream>
#include <vector>
#include <unordered_map>

std::vector<int> findDuplicates(const std::vector<int>& nums) {
    // Step 1: Initialize an unordered_map to store the frequency of each element
    std::unordered_map<int, int> frequency;

    // Step 2: Traverse the array and update the frequency of each element
    for (int num : nums) {
        frequency[num]++;
    }

    // Step 3: Initialize a result vector to store the duplicates
    std::vector<int> result;

    // Step 4: Traverse the map and collect elements with a frequency greater than 1
    for (const auto& pair : frequency) {
        if (pair.second > 1) {
            result.push_back(pair.first);
        }
    }

    return result;
}

int main() {
    std::vector<int> nums = {2, 3, 1, 1, 4, 3, 2, 1};
    std::vector<int> duplicates = findDuplicates(nums);

    std::cout << "Duplicates: ";
    for (int num : duplicates) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}
The time complexity of this approach is O(n): we traverse the array once to build the frequency map, then traverse the map, which holds at most n entries, to collect the duplicates. The space complexity is also O(n) due to the additional storage used by the hash map and the result vector.
Consider the following edge cases: an empty array (the result should be empty), an array with no duplicates (the result should also be empty), and an array where every element is the same (the value should appear only once in the result).
Example:
Input: [1, 1, 1, 1] Output: [1]
To test the solution comprehensively, consider test cases such as: a typical mixed array ([2, 3, 1, 1, 4, 3, 2, 1]), an empty array ([]), an array with no duplicates ([1, 2, 3]), and an array where all elements are identical ([1, 1, 1, 1]).
When approaching such problems, consider the following tips: start with the brute-force solution to clarify the requirements, then look for a data structure (often a hash map or set) that eliminates the nested loop; state the time and space complexity of your solution explicitly; and always test against edge cases such as empty inputs and inputs with no duplicates.
In this blog post, we discussed how to find duplicates in an array efficiently using a hash map. We covered the problem definition, approach, algorithm, code implementation, complexity analysis, edge cases, and testing. Understanding and solving such problems is crucial for improving problem-solving skills and preparing for coding interviews.
For further reading and practice, try related variations on your favorite coding practice platform, such as finding the first duplicate, removing duplicates in place, or finding elements that appear exactly twice.