Hash Sets in Python


Hash Sets are collections of unique items, i.e., no element can be repeated.

Hash Sets were designed to give us a fast way of adding and looking up unique values in a collection.

The values in a Hash Set can be simple values like strings or integers, or more complex ones like tuples, as long as they are hashable. Mutable containers such as lists, dictionaries, and other sets cannot be stored in a set.
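
In Python, set elements must be hashable. The short sketch below illustrates this with a tuple (accepted) and a list (rejected with a TypeError). The variable names are only illustrative.

```python
# Hashable values such as strings, integers, and tuples can be set elements.
valid_set = {"apple", 42, (1, 2)}

# Mutable containers such as lists are unhashable and are rejected.
try:
    invalid_set = {[1, 2]}
    list_rejected = False
except TypeError:
    list_rejected = True

print((1, 2) in valid_set)  # True
print(list_rejected)        # True
```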

In Python, the Hash Set is implemented as the set object.


Creation:

You can create a new empty set by using set():

emptySet = set()

Creating an empty set takes O(1) time.


Constructing from iterable:

The set constructor also accepts an optional iterable object. If you pass an iterable object to the set constructor, all the unique elements from the object will be added to the new set:

elements = [1, '2', 'apple', 1]
mySet = set(elements)

print(mySet) # e.g. {1, '2', 'apple'} (element order may vary)

This takes O(n) time, where n is the number of elements in the iterable object.
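
Any iterable works here, not only lists. As a small illustration (the variable names are made up for this sketch), the same constructor can consume a string, a range, or a tuple:

```python
letters = set("hello")      # one element per distinct character
numbers = set(range(5))     # 0, 1, 2, 3, 4
pairs = set((1, 2, 2, 3))   # the duplicate 2 is stored once

print(len(letters))  # 4
print(len(numbers))  # 5
print(len(pairs))    # 3
```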


Initialization

Sets containing values can also be initialized by using curly braces:

dataEngineer = {'Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'}

Keep in mind that curly braces can only be used to initialize a set containing values. Empty curly braces ({}) create an empty dictionary, not a set.
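
A quick check makes the difference visible. This is only a sketch with illustrative variable names:

```python
empty_braces = {}    # this is a dictionary
empty_set = set()    # this is a set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```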


Adding elements:

To add an element to the set, you use the add() method:

mySet = {'a', 'b', 'c'}

# Adding new element:
mySet.add('d')

# Adding existing element:
mySet.add('a') # nothing happens

print(mySet) # {'a', 'b', 'c', 'd'}

If the element already exists, add() leaves the set unchanged and does not raise an error.

Adding an element to a set takes O(1) time on average.
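
Because duplicates are ignored, repeatedly calling add() is a simple way to count distinct values. A minimal sketch, using a made-up list of page names:

```python
visited = set()

# Each page is added once, no matter how often it appears in the list.
for page in ["home", "about", "home", "contact", "about"]:
    visited.add(page)

print(len(visited))  # 3
```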


Checking if a value exists:

To check if a set has a specific element, you use the in operator:

mySet = {'a', 'b', 'c'}

print('a' in mySet) # True

exist = 'z' in mySet
print(exist) # False

Checking if a value exists in a set takes O(1) time on average.
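
The negated form, not in, works the same way. As an illustrative sketch (the word lists are invented for this example), a set makes a convenient filter:

```python
stop_words = {"the", "a", "of"}
words = ["the", "speed", "of", "a", "lookup"]

# Keep only the words that are not in the stop-word set.
kept = [word for word in words if word not in stop_words]

print(kept)  # ['speed', 'lookup']
```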


Removing elements:

To delete a specified element from a set, you use the remove() method:

mySet = {'a', 'b', 'c', 'd'}

# Deleting an existing element:
mySet.remove('a')

# Deleting a non-existent element:
try:
    mySet.remove('z') # raises KeyError
except KeyError:
    print("'z' is not in the set")

print(mySet) # {'b', 'c', 'd'} (element order may vary)

If the element is not in the set, remove() raises a KeyError. If you want a method that leaves the set unchanged when the element is not present, you can use discard():

mySet = {'a', 'b', 'c', 'd'}

# Discarding existent element:
mySet.discard('a') # removes 'a'

# Discarding non-existent element:
mySet.discard('z') # nothing happens

print(mySet) # {'b', 'c', 'd'} (element order may vary)

Removing an element from a set takes O(1) time on average.
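
The two methods can be compared side by side. In this sketch the flag removed is a hypothetical helper variable that records whether remove() succeeded:

```python
my_set = {'a', 'b', 'c'}

# remove() signals a missing element with KeyError.
try:
    my_set.remove('z')
    removed = True
except KeyError:
    removed = False

# discard() silently ignores a missing element.
my_set.discard('z')

print(removed)      # False
print(len(my_set))  # 3
```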


Iterating over the Set values:

If we want to iterate over values of the set, we can use the for loop along with the in operator:

mySet = {'b', 'c', 'c', 'd'}

for val in mySet:
    print(val)

# This may print the following (the order can vary between runs):
# b
# d
# c


Notice that the order is not the same as the order in which we initialized the set. A set is unordered, so its iteration order is arbitrary and should not be relied on.

Iterating over a set takes O(n) time.
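
When a predictable order is needed, one option is to sort the elements first. A minimal sketch using the built-in sorted(), which returns a new list and takes O(n log n) time:

```python
my_set = {'d', 'b', 'c'}

# sorted() returns a list of the elements in ascending order.
ordered = sorted(my_set)

for val in ordered:
    print(val)

# This prints:
# b
# c
# d
```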


Space Complexity

A set uses O(n) space, where n is the number of elements existing in the set.


Assignment
Follow the Coding Tutorial and let's play with some Hash Sets.


Hint
Look at the examples above if you get stuck.


Introduction

In this lesson, we will explore the concept of Hash Sets in Python. Hash Sets are a fundamental data structure that allows for the storage of unique items. They are particularly useful in scenarios where you need to ensure that no duplicate elements are present in a collection. Hash Sets are implemented in Python using the set object, which provides efficient methods for adding, removing, and checking for the existence of elements.

Understanding the Basics

Before diving into the more complex aspects of Hash Sets, it's important to understand their basic properties and operations. A Hash Set is an unordered collection of unique elements. This means that each element in the set must be distinct, and the order in which elements are stored is not guaranteed.

Let's start with a simple example:

# Creating an empty set
emptySet = set()

# Creating a set from an iterable
elements = [1, '2', 'apple', 1]
mySet = set(elements)

print(mySet) # Output: {1, '2', 'apple'} (element order may vary)

In this example, we create an empty set and a set from a list of elements. Notice that the duplicate element 1 is only included once in the set.

Main Concepts

Now that we have a basic understanding of Hash Sets, let's delve into some key concepts and techniques:

Examples and Use Cases

Let's look at some examples to see how these concepts are applied in various contexts:

# Example 1: Adding and checking elements
mySet = {'a', 'b', 'c'}
mySet.add('d')
print('a' in mySet) # Output: True
print('z' in mySet) # Output: False

# Example 2: Removing elements
mySet.remove('a')
print(mySet) # Output: {'b', 'c', 'd'} (element order may vary)
mySet.discard('z') # No error raised

# Example 3: Iterating over a set
for val in mySet:
    print(val)

These examples demonstrate how to add, check, remove, and iterate over elements in a set.

Common Pitfalls and Best Practices

When working with Hash Sets, there are some common mistakes to avoid and best practices to follow:
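
One pitfall worth illustrating is changing a set while looping over it: adding or removing elements during iteration raises a RuntimeError. A common workaround, sketched here with made-up data, is to loop over a copy and modify the original:

```python
numbers = {1, 2, 3, 4, 5, 6}

# Iterate over a snapshot so that removing from the original set is safe.
for n in list(numbers):
    if n % 2 == 0:
        numbers.remove(n)

print(sorted(numbers))  # [1, 3, 5]
```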

Advanced Techniques

Once you are comfortable with the basics, you can explore some advanced techniques:

# Advanced Example: Set operations
set1 = {1, 2, 3}
set2 = {3, 4, 5}

union_set = set1.union(set2)
intersection_set = set1.intersection(set2)
difference_set = set1.difference(set2)
symmetric_difference_set = set1.symmetric_difference(set2)

print(union_set) # Output: {1, 2, 3, 4, 5}
print(intersection_set) # Output: {3}
print(difference_set) # Output: {1, 2}
print(symmetric_difference_set) # Output: {1, 2, 4, 5}
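
The same four operations are also available as operators when both operands are sets. This sketch reuses the values from the example above:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

print(set1 | set2)  # union: {1, 2, 3, 4, 5}
print(set1 & set2)  # intersection: {3}
print(set1 - set2)  # difference: {1, 2}
print(set1 ^ set2)  # symmetric difference: {1, 2, 4, 5}
```

Unlike the operators, the methods accept any iterable as their argument, for example set1.union([3, 4, 5]).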

Code Implementation

Here is a comprehensive example that demonstrates the correct use of Hash Sets in Python:

# Comprehensive Example: Working with Hash Sets

# Creating a set
mySet = {'a', 'b', 'c'}

# Adding elements
mySet.add('d')
mySet.add('a') # Duplicate, no effect

# Checking existence
print('a' in mySet) # Output: True
print('z' in mySet) # Output: False

# Removing elements
mySet.remove('a')
mySet.discard('z') # No error raised

# Iterating over the set
for val in mySet:
    print(val)

# Set operations
set1 = {1, 2, 3}
set2 = {3, 4, 5}

union_set = set1.union(set2)
intersection_set = set1.intersection(set2)
difference_set = set1.difference(set2)
symmetric_difference_set = set1.symmetric_difference(set2)

print(union_set) # Output: {1, 2, 3, 4, 5}
print(intersection_set) # Output: {3}
print(difference_set) # Output: {1, 2}
print(symmetric_difference_set) # Output: {1, 2, 4, 5}

Debugging and Testing

When working with Hash Sets, it's important to test your code thoroughly. Here are some tips for debugging and testing:

import unittest

class TestHashSet(unittest.TestCase):
    def test_add_and_check(self):
        mySet = {'a', 'b', 'c'}
        mySet.add('d')
        self.assertTrue('a' in mySet)
        self.assertFalse('z' in mySet)

    def test_remove(self):
        mySet = {'a', 'b', 'c'}
        mySet.remove('a')
        self.assertFalse('a' in mySet)
        mySet.discard('z') # No error raised

    def test_set_operations(self):
        set1 = {1, 2, 3}
        set2 = {3, 4, 5}
        self.assertEqual(set1.union(set2), {1, 2, 3, 4, 5})
        self.assertEqual(set1.intersection(set2), {3})
        self.assertEqual(set1.difference(set2), {1, 2})
        self.assertEqual(set1.symmetric_difference(set2), {1, 2, 4, 5})

if __name__ == '__main__':
    unittest.main()

Thinking and Problem-Solving Tips

When working with Hash Sets, consider the following strategies:
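
One strategy that comes up often is keeping a set of values already seen while scanning a sequence. The sketch below uses it to drop duplicates while preserving the original order. The function name dedupe_keep_order is hypothetical and chosen only for this example.

```python
def dedupe_keep_order(items):
    """Return the items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        # The membership check and the add are both O(1) on average.
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

The whole pass takes O(n) time on average, compared with O(n^2) if the check were done against the result list itself.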

Conclusion

In this lesson, we covered the fundamental concepts of Hash Sets in Python, including creation, adding elements, checking existence, removing elements, and iterating over sets. We also explored advanced techniques, common pitfalls, and best practices. By mastering these concepts, you can efficiently work with unique collections of elements in your programs.

Additional Resources

For further reading and practice, consider the following resources: