Hash Sets are collections of unique items i.e no element can be repeated.
Hash Sets were designed to give us a way of adding and looking for unique values in a collection in a quick manner.
The values in a Hash Set can be either simple primitives like strings or integers as well as more complex object types like object literals or arrays.
In C++, the Hash Set is implemented as an unordered_set
object.
Creation:
To create a new empty unordered_set
that holds strings
, you use the following syntax:
unordered_set<string> mySet;
Creating an empty Set takes O(1)
time.
Constructing from iterable
The Set constructor also accepts an optional iterable object. If you pass an iterable object to the Set constructor, all the unique elements of the object will be added to the new set:
vector<string> elements = {"b", "c", "c", "d"};
unordered_set<string> mySet(elements.begin(), elements.end());
// mySet contains {"b", "c", "d"}
This takes O(n)
time, where n
is the number of elements in the iterable object.
Adding elements:
To add an element to the set, you use the insert()
method:
unordered_set<string> mySet({"b", "c", "c", "d"});
// Adding new elements
mySet.insert("x");
// Adding an existing element
mySet.insert("d"); // nothing happens
// mySet contains {"b", "c", "d", "x"}
The Set first checks if that element already exists and if so, it does nothing and also doesn't raise an error.
Adding an element to a Set takes O(1)
time.
Checking if a value exists:
To check if a Set has a specific element, you use the count()
method which returns 1
if the set contains the element, otherwise, it returns 0
.
unordered_set<string> mySet({"b", "c", "c", "d"});
mySet.count("c"); // returns 1 since 'c' appears once (remember that set keeps unique values)
mySet.count("x"); // returns 0 since it does not exist in the set
Checking if a value exists in a Set takes O(1)
time.
Removing elements:
To delete a specified element from a set, you use the erase()
method:
unordered_set<string> mySet({"b", "c", "c", "d"});
// Deleting elements
mySet.erase("c");
// Deleting a non-existent element
mySet.erase("z"); // does nothing
// mySet contains {"b", "d"}
The Set first checks if that key exists and if not, it does nothing and also doesn't raise an error, so it's safe to delete non-existing elements.
Removing an element from a Set takes O(1)
time.
Iterating over the Set values:
If we want to iterate over values of the unordered_set
, we can use the for
loop along with the :
operator:
unordered_set<string> mySet({"b", "c", "c", "d"});
for (string val : mySet) {
cout << val << endl;
}
// This will print the following:
// d
// c
// b
Iterating over an unordered_set
takes O(n)
time.
Notice the order is not the same as we initiated. set
keeps the data in random order
Space Complexity
A Set uses O(n)
space, where n
is the number of elements existing in the Set.
Assignment
Follow the Coding Tutorial and let's play with some Hash Sets.
Hint
Look at the examples above if you get stuck.
In this lesson, we will explore the concept of Hash Sets in C++. Hash Sets are a powerful data structure that allows for the storage of unique elements and provides efficient operations for adding, removing, and checking the existence of elements. Understanding Hash Sets is crucial for solving problems that require quick lookups and ensuring uniqueness in collections.
Hash Sets are particularly useful in scenarios where you need to maintain a collection of unique items, such as tracking unique user IDs, filtering duplicate entries, or implementing caching mechanisms.
Before diving into the details of Hash Sets, let's understand some fundamental concepts:
Understanding these basics is essential before moving on to more complex aspects of Hash Sets.
Let's delve into the key concepts and techniques involved in using Hash Sets in C++:
unordered_set
using the syntax unordered_set<string> mySet;
. This operation takes constant time, O(1)
.O(n)
, where n
is the number of elements in the iterable.insert()
method to add elements to the Hash Set. If the element already exists, it will be ignored. Adding an element takes constant time, O(1)
.count()
method to check if an element exists in the Hash Set. This operation also takes constant time, O(1)
.erase()
method to remove elements from the Hash Set. If the element does not exist, nothing happens. Removing an element takes constant time, O(1)
.for
loop. Iterating over the set takes linear time, O(n)
.Let's look at some examples to understand how Hash Sets can be used in various contexts:
#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>
using namespace std;
int main() {
vector<string> userIds = {"user1", "user2", "user3", "user1", "user4"};
unordered_set<string> uniqueUserIds(userIds.begin(), userIds.end());
for (const string& id : uniqueUserIds) {
cout << id << endl;
}
return 0;
}
In this example, we initialize a Hash Set with user IDs from a vector. The Hash Set automatically removes duplicate entries, ensuring that only unique user IDs are stored.
#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>
using namespace std;
vector<string> filterDuplicates(const vector<string>& input) {
unordered_set<string> uniqueElements(input.begin(), input.end());
return vector<string>(uniqueElements.begin(), uniqueElements.end());
}
int main() {
vector<string> data = {"apple", "banana", "apple", "orange", "banana"};
vector<string> filteredData = filterDuplicates(data);
for (const string& item : filteredData) {
cout << item << endl;
}
return 0;
}
In this example, we define a function filterDuplicates
that takes a vector of strings and returns a new vector with duplicate entries removed using a Hash Set.
When working with Hash Sets, it's important to be aware of common pitfalls and follow best practices:
Let's explore some advanced techniques related to Hash Sets:
Here is a comprehensive example demonstrating the use of Hash Sets in C++:
#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>
using namespace std;
int main() {
// Creating an empty unordered_set
unordered_set<string> mySet;
// Adding elements to the set
mySet.insert("apple");
mySet.insert("banana");
mySet.insert("orange");
// Checking if an element exists
if (mySet.count("banana")) {
cout << "Banana is in the set" << endl;
}
// Removing an element
mySet.erase("banana");
// Iterating over the set
for (const string& fruit : mySet) {
cout << fruit << endl;
}
return 0;
}
This code demonstrates the creation of a Hash Set, adding elements, checking for existence, removing elements, and iterating over the set.
When working with Hash Sets, consider the following tips for debugging and testing:
Here are some strategies for approaching problems related to Hash Sets:
In this lesson, we explored the concept of Hash Sets in C++ and learned how to create, manipulate, and utilize them effectively. Hash Sets are a powerful tool for ensuring uniqueness and performing efficient lookups in collections. By mastering Hash Sets, you can solve a wide range of programming problems more efficiently.
Keep practicing and exploring further applications of Hash Sets to enhance your programming skills.
For further reading and practice problems related to Hash Sets, consider the following resources: