Remove Duplicates from Array II in O(n log n) Time Complexity using Java

Given an array of integers, remove the duplicates in-place such that each unique element appears only once.

Do not allocate extra space for another array. You must do this by modifying the input array in-place with O(1) extra memory.

The resulting array can be in any order.

Example:

Input: [2, 3, 1, 1, 4, 3, -2, 1]
Output: [2, 3, 1, 4, -2]

Note:

Your algorithm should run in O(n log n) time and use O(1) extra space.

Problem Definition

The problem requires us to remove duplicates from an array of integers in-place, ensuring that each unique element appears only once. The solution must not use extra space for another array and should run in O(n log n) time complexity.

Input:

An array of integers, e.g., [2, 3, 1, 1, 4, 3, -2, 1]

Output:

An array of integers with duplicates removed, e.g., [2, 3, 1, 4, -2]

Constraints:

The algorithm should run in O(n log n) time.
The algorithm should use O(1) extra space.

Understanding the Problem

The core challenge is to remove duplicates without using extra space and to do so efficiently. This problem is significant in scenarios where memory usage is critical, such as in embedded systems or large-scale data processing.

Common Applications:

Data cleaning in data preprocessing pipelines.
Optimizing storage in memory-constrained environments.

Potential Pitfalls:

Using extra space for another array, which violates the constraints.
Not handling edge cases like empty arrays or arrays with all identical elements.

Approach

To solve this problem, we can follow these steps:

Naive Solution:

A naive solution would involve using an additional data structure like a set to keep track of unique elements. However, this violates the O(1) space constraint.

Optimized Solution:

We can achieve the desired result by following these steps:

Sort the array. This ensures that duplicate elements are adjacent.
Use a pointer to keep track of the position where the next unique element should be placed.
Traverse the sorted array and overwrite the elements in-place to remove duplicates.

Thought Process:

Sorting the array helps in easily identifying duplicates since they will be adjacent. By using a pointer, we can overwrite the array in-place, ensuring O(1) extra space usage.

Algorithm

Here is a step-by-step breakdown of the algorithm:

Sort the array.
Initialize a pointer p to 0.
Traverse the array from the second element to the end.
If the current element is not equal to the previous element, increment the pointer and update the array at the pointer's position with the current element.
Return the subarray from the start to the pointer's position.

Code Implementation

import java.util.Arrays;

public class RemoveDuplicates {
    public static int[] removeDuplicates(int[] nums) {
        // Step 1: Sort the array
        Arrays.sort(nums);
        
        // Step 2: Initialize a pointer to track the position of unique elements
        int p = 0;
        
        // Step 3: Traverse the sorted array
        for (int i = 1; i < nums.length; i++) {
            // Step 4: If the current element is not equal to the previous element
            if (nums[i] != nums[i - 1]) {
                // Increment the pointer and update the array at the pointer's position
                p++;
                nums[p] = nums[i];
            }
        }
        
        // Step 5: Return the subarray from the start to the pointer's position
        return Arrays.copyOfRange(nums, 0, p + 1);
    }

    public static void main(String[] args) {
        int[] nums = {2, 3, 1, 1, 4, 3, -2, 1};
        int[] result = removeDuplicates(nums);
        System.out.println(Arrays.toString(result)); // Output: [2, 3, 1, 4, -2]
    }
}

Complexity Analysis

The time complexity of the algorithm is O(n log n) due to the sorting step. The space complexity is O(1) as we are modifying the array in-place and not using any additional data structures.

Comparison:

Naive approach: O(n) space complexity due to the use of a set.
Optimized approach: O(1) space complexity by modifying the array in-place.

Edge Cases

Consider the following edge cases:

Empty array: The output should be an empty array.
Array with all identical elements: The output should be an array with a single element.

Examples:

Input: []
Output: []

Input: [1, 1, 1, 1]
Output: [1]

Testing

To test the solution comprehensively, consider the following test cases:

Simple cases with a few elements.
Edge cases like empty arrays and arrays with all identical elements.
Large arrays to test the efficiency of the algorithm.

Test Cases:

assert Arrays.equals(removeDuplicates(new int[]{2, 3, 1, 1, 4, 3, -2, 1}), new int[]{-2, 1, 2, 3, 4});
assert Arrays.equals(removeDuplicates(new int[]{}), new int[]{});
assert Arrays.equals(removeDuplicates(new int[]{1, 1, 1, 1}), new int[]{1});
assert Arrays.equals(removeDuplicates(new int[]{1, 2, 3, 4, 5}), new int[]{1, 2, 3, 4, 5});

Thinking and Problem-Solving Tips

When approaching such problems, consider the following tips:

Understand the constraints and requirements clearly.
Think about how sorting can simplify the problem.
Consider in-place modifications to save space.
Practice similar problems to improve problem-solving skills.

Conclusion

In this blog post, we discussed how to remove duplicates from an array in-place using O(n log n) time complexity and O(1) extra space. We explored the problem definition, approach, algorithm, code implementation, complexity analysis, edge cases, and testing. Understanding and solving such problems is crucial for optimizing memory usage and improving algorithmic efficiency.

Additional Resources

For further reading and practice, consider the following resources: