Hash Sets in C++


Hash Sets are collections of unique items i.e no element can be repeated.

Hash Sets were designed to give us a way of adding and looking for unique values in a collection in a quick manner.

The values in a Hash Set can be either simple primitives like strings or integers as well as more complex object types like object literals or arrays.

In C++, the Hash Set is implemented as an unordered_set object.


Creation:

To create a new empty unordered_set that holds strings, you use the following syntax:

unordered_set<string> mySet;

Creating an empty Set takes O(1) time.


Constructing from iterable

The Set constructor also accepts an optional iterable object. If you pass an iterable object to the Set constructor, all the unique elements of the object will be added to the new set:

vector<string> elements = {"b", "c", "c", "d"};
unordered_set<string> mySet(elements.begin(), elements.end());

// mySet contains {"b", "c", "d"}

This takes O(n) time, where n is the number of elements in the iterable object.


Adding elements:

To add an element to the set, you use the insert() method:

unordered_set<string> mySet({"b", "c", "c", "d"});

// Adding new elements
mySet.insert("x");

// Adding an existing element
mySet.insert("d"); // nothing happens

// mySet contains {"b", "c", "d", "x"}

The Set first checks if that element already exists and if so, it does nothing and also doesn't raise an error.

Adding an element to a Set takes O(1) time.


Checking if a value exists:

To check if a Set has a specific element, you use the count() method which returns 1 if the set contains the element, otherwise, it returns 0.

unordered_set<string> mySet({"b", "c", "c", "d"});

mySet.count("c"); // returns 1 since 'c' appears once (remember that set keeps unique values)

mySet.count("x"); // returns 0 since it does not exist in the set

Checking if a value exists in a Set takes O(1) time.


Removing elements:

To delete a specified element from a set, you use the erase() method:

unordered_set<string> mySet({"b", "c", "c", "d"});

// Deleting elements
mySet.erase("c");

// Deleting a non-existent element
mySet.erase("z"); // does nothing

// mySet contains {"b", "d"}

The Set first checks if that key exists and if not, it does nothing and also doesn't raise an error, so it's safe to delete non-existing elements.

Removing an element from a Set takes O(1) time.


Iterating over the Set values:

If we want to iterate over values of the unordered_set, we can use the for loop along with the : operator:

unordered_set<string> mySet({"b", "c", "c", "d"});

for (string val : mySet) {
    cout << val << endl;
}

// This will print the following:
// d
// c
// b


Iterating over an unordered_set takes O(n) time.

Notice the order is not the same as we initiated. set keeps the data in random order


Space Complexity

A Set uses O(n) space, where n is the number of elements existing in the Set.


Assignment
Follow the Coding Tutorial and let's play with some Hash Sets.


Hint
Look at the examples above if you get stuck.


Introduction

In this lesson, we will explore the concept of Hash Sets in C++. Hash Sets are a powerful data structure that allows for the storage of unique elements and provides efficient operations for adding, removing, and checking the existence of elements. Understanding Hash Sets is crucial for solving problems that require quick lookups and ensuring uniqueness in collections.

Hash Sets are particularly useful in scenarios where you need to maintain a collection of unique items, such as tracking unique user IDs, filtering duplicate entries, or implementing caching mechanisms.

Understanding the Basics

Before diving into the details of Hash Sets, let's understand some fundamental concepts:

Understanding these basics is essential before moving on to more complex aspects of Hash Sets.

Main Concepts

Let's delve into the key concepts and techniques involved in using Hash Sets in C++:

Examples and Use Cases

Let's look at some examples to understand how Hash Sets can be used in various contexts:

Example 1: Tracking Unique User IDs

#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>

using namespace std;

int main() {
    vector<string> userIds = {"user1", "user2", "user3", "user1", "user4"};
    unordered_set<string> uniqueUserIds(userIds.begin(), userIds.end());

    for (const string& id : uniqueUserIds) {
        cout << id << endl;
    }

    return 0;
}

In this example, we initialize a Hash Set with user IDs from a vector. The Hash Set automatically removes duplicate entries, ensuring that only unique user IDs are stored.

Example 2: Filtering Duplicate Entries

#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>

using namespace std;

vector<string> filterDuplicates(const vector<string>& input) {
    unordered_set<string> uniqueElements(input.begin(), input.end());
    return vector<string>(uniqueElements.begin(), uniqueElements.end());
}

int main() {
    vector<string> data = {"apple", "banana", "apple", "orange", "banana"};
    vector<string> filteredData = filterDuplicates(data);

    for (const string& item : filteredData) {
        cout << item << endl;
    }

    return 0;
}

In this example, we define a function filterDuplicates that takes a vector of strings and returns a new vector with duplicate entries removed using a Hash Set.

Common Pitfalls and Best Practices

When working with Hash Sets, it's important to be aware of common pitfalls and follow best practices:

Advanced Techniques

Let's explore some advanced techniques related to Hash Sets:

Code Implementation

Here is a comprehensive example demonstrating the use of Hash Sets in C++:

#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>

using namespace std;

int main() {
    // Creating an empty unordered_set
    unordered_set<string> mySet;

    // Adding elements to the set
    mySet.insert("apple");
    mySet.insert("banana");
    mySet.insert("orange");

    // Checking if an element exists
    if (mySet.count("banana")) {
        cout << "Banana is in the set" << endl;
    }

    // Removing an element
    mySet.erase("banana");

    // Iterating over the set
    for (const string& fruit : mySet) {
        cout << fruit << endl;
    }

    return 0;
}

This code demonstrates the creation of a Hash Set, adding elements, checking for existence, removing elements, and iterating over the set.

Debugging and Testing

When working with Hash Sets, consider the following tips for debugging and testing:

Thinking and Problem-Solving Tips

Here are some strategies for approaching problems related to Hash Sets:

Conclusion

In this lesson, we explored the concept of Hash Sets in C++ and learned how to create, manipulate, and utilize them effectively. Hash Sets are a powerful tool for ensuring uniqueness and performing efficient lookups in collections. By mastering Hash Sets, you can solve a wide range of programming problems more efficiently.

Keep practicing and exploring further applications of Hash Sets to enhance your programming skills.

Additional Resources

For further reading and practice problems related to Hash Sets, consider the following resources: