Longest Subarray with at most K Distinct Integers II in Python (O(n^2) Time Complexity)


Given an array of integers, find the longest subarray that contains at most K distinct integers and return its length.

Example

Input: nums = [1, 2, 1, 2, 3], k = 2
Output: 4
Explanation:the subarray nums[0...3] contains 2 distinct values: [1, 2] and is the longest subarray

Note:

Your algorithm should run in O(n^2) time and use O(n) extra space.


Problem Definition

The problem requires finding the longest subarray within a given array of integers that contains at most K distinct integers. The output should be the length of this subarray.

Input:

  • nums: List of integers.
  • k: An integer representing the maximum number of distinct integers allowed in the subarray.

Output:

  • An integer representing the length of the longest subarray with at most K distinct integers.

Constraints:

  • The algorithm should run in O(n^2) time complexity.
  • The algorithm should use O(n) extra space.

Example:

Input: nums = [1, 2, 1, 2, 3], k = 2
Output: 4
Explanation: The subarray nums[0...3] contains 2 distinct values: [1, 2] and is the longest subarray.

Understanding the Problem

The core challenge is to find the longest contiguous subarray with at most K distinct integers. This problem is significant in scenarios where we need to analyze data streams or sequences with constraints on diversity.

Common applications include data compression, network packet analysis, and substring problems in text processing.

Potential pitfalls include misunderstanding the requirement for contiguous subarrays and not handling edge cases where K is larger than the number of unique elements in the array.

Approach

To solve this problem, we can use a sliding window approach. The idea is to maintain a window that satisfies the condition of having at most K distinct integers and expand or contract this window as needed.

Naive Solution

A naive solution would involve checking all possible subarrays and counting the distinct integers in each. This approach is not optimal as it has a time complexity of O(n^3).

Optimized Solution

We can optimize the solution using a sliding window technique combined with a hash map to keep track of the count of distinct integers within the window. This approach ensures that we only traverse the array once, achieving a time complexity of O(n^2).

Algorithm

Here is a step-by-step breakdown of the sliding window approach:

  1. Initialize two pointers, left and right, both set to the start of the array.
  2. Use a hash map to keep track of the count of each integer within the window.
  3. Expand the window by moving the right pointer and updating the hash map.
  4. If the number of distinct integers exceeds K, contract the window by moving the left pointer until the condition is satisfied.
  5. Keep track of the maximum length of the window that satisfies the condition.

Code Implementation

def longest_subarray_with_k_distinct(nums, k):
    # Initialize pointers and the hash map
    left = 0
    right = 0
    max_length = 0
    count_map = {}
    
    while right < len(nums):
        # Add the current element to the hash map
        if nums[right] in count_map:
            count_map[nums[right]] += 1
        else:
            count_map[nums[right]] = 1
        
        # If the number of distinct integers exceeds k, shrink the window
        while len(count_map) > k:
            count_map[nums[left]] -= 1
            if count_map[nums[left]] == 0:
                del count_map[nums[left]]
            left += 1
        
        # Update the maximum length of the window
        max_length = max(max_length, right - left + 1)
        
        # Move the right pointer to expand the window
        right += 1
    
    return max_length

# Example usage
nums = [1, 2, 1, 2, 3]
k = 2
print(longest_subarray_with_k_distinct(nums, k))  # Output: 4

Complexity Analysis

The time complexity of the sliding window approach is O(n^2) because, in the worst case, each element is processed twice (once by the right pointer and once by the left pointer). The space complexity is O(n) due to the hash map used to store the count of distinct integers.

Edge Cases

Consider the following edge cases:

  • When k is 0, the output should be 0 as no subarray can have 0 distinct integers.
  • When k is greater than the number of unique elements in the array, the output should be the length of the entire array.
  • When the array is empty, the output should be 0.

Testing

To test the solution comprehensively, consider the following test cases:

def test_longest_subarray_with_k_distinct():
    assert longest_subarray_with_k_distinct([1, 2, 1, 2, 3], 2) == 4
    assert longest_subarray_with_k_distinct([1, 2, 1, 3, 4], 2) == 2
    assert longest_subarray_with_k_distinct([1, 2, 1, 2, 3], 3) == 5
    assert longest_subarray_with_k_distinct([], 2) == 0
    assert longest_subarray_with_k_distinct([1, 2, 3, 4, 5], 0) == 0
    assert longest_subarray_with_k_distinct([1, 2, 3, 4, 5], 5) == 5

test_longest_subarray_with_k_distinct()

Thinking and Problem-Solving Tips

When approaching such problems, consider the following tips:

  • Understand the problem requirements and constraints thoroughly.
  • Think about different approaches and their trade-offs.
  • Use diagrams or pseudo-code to visualize the solution.
  • Practice similar problems to improve problem-solving skills.

Conclusion

In this blog post, we discussed how to solve the problem of finding the longest subarray with at most K distinct integers using a sliding window approach. We covered the problem definition, approach, algorithm, code implementation, complexity analysis, edge cases, and testing. Understanding and solving such problems is crucial for improving algorithmic thinking and coding skills.

Additional Resources

For further reading and practice, consider the following resources: