Longest Subarray with at most K Distinct Integers in Python (O(n^2) Time Complexity)


Given an array of integers, find the longest subarray that contains at most K distinct integers and return its length.

Example

Input: nums = [1, 2, 1, 2, 3], k = 2
Output: 4
Explanation:the subarray nums[0...3] contains 2 distinct values: [1, 2] and is the longest subarray

Note:

Your algorithm should run in O(n^2) time and use O(n) extra space.


Problem Definition

The problem requires finding the longest subarray within a given array of integers that contains at most K distinct integers. The input consists of an array of integers and an integer K. The output is the length of the longest subarray that meets the criteria.

Input:

  • nums: List[int] - an array of integers
  • k: int - the maximum number of distinct integers allowed in the subarray

Output:

  • int - the length of the longest subarray with at most K distinct integers

Constraints:

  • 1 ≤ len(nums) ≤ 10^5
  • 0 ≤ nums[i] ≤ 10^5
  • 1 ≤ k ≤ len(nums)

Example:

Input: nums = [1, 2, 1, 2, 3], k = 2
Output: 4
Explanation: The subarray nums[0...3] contains 2 distinct values: [1, 2] and is the longest subarray

Understanding the Problem

The core challenge is to efficiently find the longest subarray with at most K distinct integers. This problem is significant in scenarios where we need to analyze data streams or sequences with constraints on diversity, such as in network traffic analysis or substring problems in text processing.

Potential pitfalls include misunderstanding the requirement of "at most K distinct integers" and not handling edge cases where K is larger than the number of unique elements in the array.

Approach

To solve this problem, we can use a sliding window approach combined with a hash map to keep track of the count of distinct integers within the current window. This approach ensures that we efficiently find the longest subarray without repeatedly traversing the same elements.

Naive Solution

A naive solution would involve checking all possible subarrays and counting the distinct integers in each, which would result in a time complexity of O(n^3). This is not optimal for large arrays.

Optimized Solution

The optimized solution uses a sliding window technique:

  1. Initialize two pointers, left and right, both starting at the beginning of the array.
  2. Use a hash map to keep track of the count of each integer within the window.
  3. Expand the window by moving the right pointer and update the hash map.
  4. If the number of distinct integers exceeds K, shrink the window by moving the left pointer until the number of distinct integers is at most K.
  5. Keep track of the maximum length of the window that meets the criteria.

Algorithm

Here is a step-by-step breakdown of the sliding window algorithm:

  1. Initialize left pointer to 0, max_length to 0, and an empty hash map count_map.
  2. Iterate through the array with the right pointer.
  3. For each element, add it to count_map and update its count.
  4. If the number of distinct integers in count_map exceeds K, move the left pointer to the right until the number of distinct integers is at most K, updating count_map accordingly.
  5. Update max_length with the current window size if it is larger than the previous maximum.
  6. Return max_length after the loop ends.

Code Implementation

def longest_subarray_with_k_distinct(nums, k):
    # Initialize pointers and variables
    left = 0
    max_length = 0
    count_map = {}
    
    # Iterate through the array with the right pointer
    for right in range(len(nums)):
        # Add the current element to the count_map
        if nums[right] in count_map:
            count_map[nums[right]] += 1
        else:
            count_map[nums[right]] = 1
        
        # If the number of distinct integers exceeds k, shrink the window
        while len(count_map) > k:
            count_map[nums[left]] -= 1
            if count_map[nums[left]] == 0:
                del count_map[nums[left]]
            left += 1
        
        # Update the maximum length of the subarray
        max_length = max(max_length, right - left + 1)
    
    return max_length

# Example usage
nums = [1, 2, 1, 2, 3]
k = 2
print(longest_subarray_with_k_distinct(nums, k))  # Output: 4

Complexity Analysis

The time complexity of the sliding window approach is O(n) because each element is processed at most twice (once by the right pointer and once by the left pointer). The space complexity is O(k) due to the hash map storing at most K distinct integers.

Edge Cases

Consider the following edge cases:

  • When the array is empty, the output should be 0.
  • When K is greater than the number of unique elements in the array, the output should be the length of the entire array.
  • When all elements in the array are the same, the output should be the length of the array.

Examples of Edge Cases:

Input: nums = [], k = 2
Output: 0

Input: nums = [1, 1, 1, 1], k = 2
Output: 4

Input: nums = [1, 2, 3, 4, 5], k = 5
Output: 5

Testing

To test the solution comprehensively, consider a variety of test cases:

  • Simple cases with small arrays and small K values.
  • Cases with large arrays and varying K values.
  • Edge cases as discussed above.

Using a testing framework like unittest in Python can help automate and organize these tests.

Thinking and Problem-Solving Tips

When approaching such problems, consider the following tips:

  • Break down the problem into smaller parts and understand each requirement.
  • Think about different approaches and their trade-offs before coding.
  • Use diagrams or pseudo-code to visualize the solution.
  • Practice similar problems to improve problem-solving skills.

Conclusion

In this blog post, we discussed the problem of finding the longest subarray with at most K distinct integers. We explored a sliding window approach to solve the problem efficiently and provided a detailed explanation of the algorithm and its implementation in Python. Understanding and solving such problems is crucial for improving algorithmic thinking and coding skills.

Additional Resources

For further reading and practice, consider the following resources: