{"id":4441,"date":"2024-10-21T09:24:52","date_gmt":"2024-10-21T09:24:52","guid":{"rendered":"https:\/\/algocademy.com\/blog\/the-most-overlooked-data-structures-you-need-to-know\/"},"modified":"2024-10-21T09:24:52","modified_gmt":"2024-10-21T09:24:52","slug":"the-most-overlooked-data-structures-you-need-to-know","status":"publish","type":"post","link":"https:\/\/algocademy.com\/blog\/the-most-overlooked-data-structures-you-need-to-know\/","title":{"rendered":"The Most Overlooked Data Structures You Need to Know"},"content":{"rendered":"<p><!DOCTYPE html PUBLIC \"-\/\/W3C\/\/DTD HTML 4.0 Transitional\/\/EN\" \"http:\/\/www.w3.org\/TR\/REC-html40\/loose.dtd\"><br \/>\n<html><body><\/p>\n<article>\n<p>In the vast world of computer science and programming, data structures form the backbone of efficient algorithms and well-designed software. While most developers are familiar with common data structures like arrays, linked lists, and binary trees, there are several lesser-known yet powerful data structures that often go overlooked. In this comprehensive guide, we&#8217;ll explore these underappreciated gems and discuss why they deserve a place in your programming toolkit.<\/p>\n<h2>1. Bloom Filters<\/h2>\n<p>Bloom filters are probabilistic data structures used to test whether an element is a member of a set. They&#8217;re incredibly space-efficient and offer constant-time add and lookup operations.<\/p>\n<h3>How Bloom Filters Work<\/h3>\n<p>A Bloom filter consists of a bit array of m bits, initially all set to 0, and k different hash functions. To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.<\/p>\n<p>To query for an element, feed it to each of the k hash functions to get k array positions. If any of the bits at these positions is 0, the element is definitely not in the set. If all are 1, then either the element is in the set, or the bits have been set to 1 during the insertion of other elements.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Spell checkers<\/li>\n<li>Cache filtering<\/li>\n<li>Network routers<\/li>\n<li>Cryptocurrency wallets<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>import mmh3\nimport math\nimport array\n\nclass BloomFilter:\n    def __init__(self, size, hash_count):\n        self.size = size\n        self.hash_count = hash_count\n        self.bit_array = array.array('b', [0] * size)\n    \n    def add(self, item):\n        for seed in range(self.hash_count):\n            index = mmh3.hash(item, seed) % self.size\n            self.bit_array[index] = 1\n    \n    def check(self, item):\n        for seed in range(self.hash_count):\n            index = mmh3.hash(item, seed) % self.size\n            if self.bit_array[index] == 0:\n                return False\n        return True\n\n# Usage\nbf = BloomFilter(1000000, 5)\nbf.add(\"hello\")\nprint(bf.check(\"hello\"))  # True\nprint(bf.check(\"world\"))  # False<\/code><\/pre>\n<h2>2. Skip Lists<\/h2>\n<p>Skip lists are a probabilistic alternative to balanced trees. They allow for fast search, insertion, and deletion of elements, with an expected time complexity of O(log n) for these operations.<\/p>\n<h3>How Skip Lists Work<\/h3>\n<p>A skip list is essentially a series of linked lists stacked on top of each other. The bottom layer contains all elements, and each higher layer acts as an &#8220;express lane&#8221; for the lists below, allowing for faster traversal.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Implementing sorted sets in Redis<\/li>\n<li>File systems<\/li>\n<li>In-memory databases<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>import random\n\nclass Node:\n    def __init__(self, key, level):\n        self.key = key\n        self.forward = [None] * (level + 1)\n\nclass SkipList:\n    def __init__(self, max_level, p):\n        self.max_level = max_level\n        self.p = p\n        self.header = Node(-1, max_level)\n        self.level = 0\n\n    def random_level(self):\n        lvl = 0\n        while random.random() &lt; self.p and lvl &lt; self.max_level:\n            lvl += 1\n        return lvl\n\n    def insert(self, key):\n        update = [None] * (self.max_level + 1)\n        current = self.header\n\n        for i in range(self.level, -1, -1):\n            while current.forward[i] and current.forward[i].key &lt; key:\n                current = current.forward[i]\n            update[i] = current\n\n        level = self.random_level()\n\n        if level &gt; self.level:\n            for i in range(self.level + 1, level + 1):\n                update[i] = self.header\n            self.level = level\n\n        new_node = Node(key, level)\n\n        for i in range(level + 1):\n            new_node.forward[i] = update[i].forward[i]\n            update[i].forward[i] = new_node\n\n    def search(self, key):\n        current = self.header\n\n        for i in range(self.level, -1, -1):\n            while current.forward[i] and current.forward[i].key &lt; key:\n                current = current.forward[i]\n\n        current = current.forward[0]\n\n        if current and current.key == key:\n            return current.key\n        return None\n\n# Usage\nsl = SkipList(4, 0.5)\nsl.insert(3)\nsl.insert(6)\nsl.insert(7)\nsl.insert(9)\nprint(sl.search(6))  # 6\nprint(sl.search(5))  # None<\/code><\/pre>\n<h2>3. Trie (Prefix Tree)<\/h2>\n<p>A trie, also called a prefix tree, is an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings.<\/p>\n<h3>How Tries Work<\/h3>\n<p>Each node in the trie represents a common prefix of some keys. All the descendants of a node have a common prefix of the string associated with that node. The root is associated with the empty string.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Autocomplete and predictive text<\/li>\n<li>Spell checkers<\/li>\n<li>IP routing tables<\/li>\n<li>Solving word games<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>class TrieNode:\n    def __init__(self):\n        self.children = {}\n        self.is_end_of_word = False\n\nclass Trie:\n    def __init__(self):\n        self.root = TrieNode()\n\n    def insert(self, word):\n        node = self.root\n        for char in word:\n            if char not in node.children:\n                node.children[char] = TrieNode()\n            node = node.children[char]\n        node.is_end_of_word = True\n\n    def search(self, word):\n        node = self.root\n        for char in word:\n            if char not in node.children:\n                return False\n            node = node.children[char]\n        return node.is_end_of_word\n\n    def starts_with(self, prefix):\n        node = self.root\n        for char in prefix:\n            if char not in node.children:\n                return False\n            node = node.children[char]\n        return True\n\n# Usage\ntrie = Trie()\ntrie.insert(\"apple\")\nprint(trie.search(\"apple\"))   # True\nprint(trie.search(\"app\"))     # False\nprint(trie.starts_with(\"app\")) # True<\/code><\/pre>\n<h2>4. Fenwick Tree (Binary Indexed Tree)<\/h2>\n<p>A Fenwick tree, also known as a binary indexed tree, is a data structure that can efficiently update elements and calculate prefix sums in a table of numbers.<\/p>\n<h3>How Fenwick Trees Work<\/h3>\n<p>Fenwick trees represent an array of numbers as a tree, where each node stores the sum of a range of elements. The structure of the tree allows for efficient updates and queries in O(log n) time.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Calculating cumulative frequency<\/li>\n<li>Range sum queries<\/li>\n<li>Computational geometry<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>class FenwickTree:\n    def __init__(self, n):\n        self.size = n\n        self.tree = [0] * (n + 1)\n\n    def update(self, i, delta):\n        while i &lt;= self.size:\n            self.tree[i] += delta\n            i += i &amp; (-i)\n\n    def sum(self, i):\n        s = 0\n        while i &gt; 0:\n            s += self.tree[i]\n            i -= i &amp; (-i)\n        return s\n\n    def range_sum(self, i, j):\n        return self.sum(j) - self.sum(i - 1)\n\n# Usage\nft = FenwickTree(10)\nft.update(1, 3)\nft.update(2, 2)\nft.update(3, 5)\nprint(ft.range_sum(1, 3))  # 10<\/code><\/pre>\n<h2>5. Segment Tree<\/h2>\n<p>A segment tree is a tree data structure used for storing information about intervals, or segments. It allows for efficient querying of cumulative data for a range of array positions, as well as efficient updating of array values.<\/p>\n<h3>How Segment Trees Work<\/h3>\n<p>The segment tree divides an array into a number of segments and stores them in a tree structure. Each node of the tree represents a segment of the array, with leaf nodes representing individual elements and internal nodes representing larger segments.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Range minimum\/maximum queries<\/li>\n<li>Range sum queries<\/li>\n<li>Lazy propagation<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>class SegmentTree:\n    def __init__(self, arr):\n        self.n = len(arr)\n        self.tree = [0] * (4 * self.n)\n        self.build(arr, 0, 0, self.n - 1)\n\n    def build(self, arr, node, start, end):\n        if start == end:\n            self.tree[node] = arr[start]\n        else:\n            mid = (start + end) \/\/ 2\n            self.build(arr, 2 * node + 1, start, mid)\n            self.build(arr, 2 * node + 2, mid + 1, end)\n            self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2]\n\n    def update(self, index, value):\n        self._update(0, 0, self.n - 1, index, value)\n\n    def _update(self, node, start, end, index, value):\n        if start == end:\n            self.tree[node] = value\n        else:\n            mid = (start + end) \/\/ 2\n            if start &lt;= index &lt;= mid:\n                self._update(2 * node + 1, start, mid, index, value)\n            else:\n                self._update(2 * node + 2, mid + 1, end, index, value)\n            self.tree[node] = self.tree[2 * node + 1] + self.tree[2 * node + 2]\n\n    def query(self, left, right):\n        return self._query(0, 0, self.n - 1, left, right)\n\n    def _query(self, node, start, end, left, right):\n        if right &lt; start or end &lt; left:\n            return 0\n        if left &lt;= start and end &lt;= right:\n            return self.tree[node]\n        mid = (start + end) \/\/ 2\n        return (self._query(2 * node + 1, start, mid, left, right) +\n                self._query(2 * node + 2, mid + 1, end, left, right))\n\n# Usage\narr = [1, 3, 5, 7, 9, 11]\nst = SegmentTree(arr)\nprint(st.query(1, 3))  # 15\nst.update(2, 6)\nprint(st.query(1, 3))  # 16<\/code><\/pre>\n<h2>6. Disjoint Set (Union-Find)<\/h2>\n<p>A disjoint-set data structure, also known as a union-find data structure, keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets.<\/p>\n<h3>How Disjoint Sets Work<\/h3>\n<p>The data structure provides near-constant-time operations to add new sets, merge existing sets, and determine whether elements are in the same set. It does this by using techniques like &#8220;union by rank&#8221; and &#8220;path compression&#8221; to keep the tree balanced and flatten the structure during finds.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>Kruskal&#8217;s algorithm for minimum spanning trees<\/li>\n<li>Finding connected components in graphs<\/li>\n<li>Image processing<\/li>\n<li>Network connectivity<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>class DisjointSet:\n    def __init__(self, vertices):\n        self.parent = {v: v for v in vertices}\n        self.rank = {v: 0 for v in vertices}\n\n    def find(self, item):\n        if self.parent[item] != item:\n            self.parent[item] = self.find(self.parent[item])\n        return self.parent[item]\n\n    def union(self, x, y):\n        xroot = self.find(x)\n        yroot = self.find(y)\n\n        if self.rank[xroot] &lt; self.rank[yroot]:\n            self.parent[xroot] = yroot\n        elif self.rank[xroot] &gt; self.rank[yroot]:\n            self.parent[yroot] = xroot\n        else:\n            self.parent[yroot] = xroot\n            self.rank[xroot] += 1\n\n# Usage\nvertices = [\"A\", \"B\", \"C\", \"D\", \"E\"]\nds = DisjointSet(vertices)\nds.union(\"A\", \"B\")\nds.union(\"C\", \"D\")\nprint(ds.find(\"A\") == ds.find(\"B\"))  # True\nprint(ds.find(\"A\") == ds.find(\"C\"))  # False<\/code><\/pre>\n<h2>7. Van Emde Boas Tree<\/h2>\n<p>The van Emde Boas tree (vEB tree) is a tree data structure which implements an associative array with m-bit integer keys. It supports operations such as insert, delete, and find-next in O(log log M) time, where M is the maximum value of a key.<\/p>\n<h3>How Van Emde Boas Trees Work<\/h3>\n<p>The vEB tree recursively divides the range of possible keys into clusters. Each node in the tree represents a range of integers and contains information about which clusters in its range contain elements.<\/p>\n<h3>Use Cases<\/h3>\n<ul>\n<li>IP routing tables<\/li>\n<li>Priority queues with integer keys<\/li>\n<li>Implementing other data structures like segment trees<\/li>\n<\/ul>\n<h3>Code Example<\/h3>\n<pre><code>import math\n\nclass VEBTree:\n    def __init__(self, universe_size):\n        self.universe_size = universe_size\n        self.min = None\n        self.max = None\n        if universe_size &lt;= 2:\n            self.summary = None\n            self.cluster = None\n        else:\n            self.summary = VEBTree(int(math.sqrt(universe_size)))\n            self.cluster = [VEBTree(int(math.sqrt(universe_size))) for _ in range(int(math.sqrt(universe_size)))]\n\n    def high(self, x):\n        return x \/\/ int(math.sqrt(self.universe_size))\n\n    def low(self, x):\n        return x % int(math.sqrt(self.universe_size))\n\n    def index(self, x, y):\n        return x * int(math.sqrt(self.universe_size)) + y\n\n    def insert(self, x):\n        if self.min is None:\n            self.min = self.max = x\n            return\n\n        if x &lt; self.min:\n            x, self.min = self.min, x\n\n        if self.universe_size &gt; 2:\n            if self.cluster[self.high(x)].min is None:\n                self.summary.insert(self.high(x))\n                self.cluster[self.high(x)].min = self.cluster[self.high(x)].max = self.low(x)\n            else:\n                self.cluster[self.high(x)].insert(self.low(x))\n\n        if x &gt; self.max:\n            self.max = x\n\n    def member(self, x):\n        if x == self.min or x == self.max:\n            return True\n        elif self.universe_size == 2:\n            return False\n        else:\n            return self.cluster[self.high(x)].member(self.low(x))\n\n    def successor(self, x):\n        if self.min is not None and x &lt; self.min:\n            return self.min\n        if self.universe_size == 2:\n            if x == 0 and self.max == 1:\n                return 1\n            else:\n                return None\n        if self.cluster[self.high(x)].max is not None and self.low(x) &lt; self.cluster[self.high(x)].max:\n            offset = self.cluster[self.high(x)].successor(self.low(x))\n            return self.index(self.high(x), offset)\n        succ_cluster = self.summary.successor(self.high(x))\n        if succ_cluster is None:\n            return None\n        offset = self.cluster[succ_cluster].min\n        return self.index(succ_cluster, offset)\n\n# Usage\nveb = VEBTree(16)\nveb.insert(2)\nveb.insert(3)\nveb.insert(4)\nveb.insert(5)\nveb.insert(7)\nveb.insert(14)\nprint(veb.member(4))  # True\nprint(veb.member(6))  # False\nprint(veb.successor(5))  # 7\nprint(veb.successor(14))  # None<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>While these data structures may not be as commonly used as arrays or hash tables, they offer powerful solutions to specific problems. Understanding these overlooked data structures can significantly enhance your problem-solving skills and algorithm design capabilities.<\/p>\n<p>As you continue your journey in computer science and programming, remember that the right data structure can make all the difference in the efficiency and elegance of your solutions. Don&#8217;t shy away from exploring these lesser-known structures &acirc;&#8364;&#8220; they might just be the key to solving your next challenging problem.<\/p>\n<p>Keep practicing, experimenting, and expanding your knowledge. The world of data structures is vast and fascinating, with each structure offering unique advantages for specific scenarios. By mastering these overlooked data structures, you&#8217;ll be better equipped to tackle a wide range of programming challenges and optimize your code for various applications.<\/p>\n<\/article>\n<p><\/body><\/html><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the vast world of computer science and programming, data structures form the backbone of efficient algorithms and well-designed software&#8230;.<\/p>\n","protected":false},"author":1,"featured_media":4440,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23],"tags":[],"class_list":["post-4441","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-problem-solving"],"_links":{"self":[{"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/posts\/4441"}],"collection":[{"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/comments?post=4441"}],"version-history":[{"count":0,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/posts\/4441\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/media\/4440"}],"wp:attachment":[{"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/media?parent=4441"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/categories?post=4441"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/algocademy.com\/blog\/wp-json\/wp\/v2\/tags?post=4441"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}