Lowest Common Ancestor: A Fundamental Tree Algorithm Explained

When navigating tree data structures, finding the lowest common ancestor (LCA) between two nodes is a fundamental operation with wide-ranging applications in computer science. This technique is particularly important for developers preparing for technical interviews at top tech companies like Google, Amazon, and Microsoft. In this comprehensive guide, we’ll explore the concept of the lowest common ancestor, implement multiple solutions, and examine its real world applications.
What is the Lowest Common Ancestor?
In a tree data structure, the lowest common ancestor of two nodes is defined as the deepest node that has both nodes as descendants. A node can be a descendant of itself, which means if we’re looking for the LCA of a node and one of its ancestors, the ancestor itself will be the LCA.
To visualize this, consider a family tree where you’re trying to find the most recent common ancestor between two family members. The LCA would be the closest relative shared by both individuals.
Formal Definition
Given a tree (not necessarily binary) and two nodes u and v, their lowest common ancestor is the node w that is an ancestor of both u and v and has the greatest depth in the tree. If u is an ancestor of v, then u is the LCA.
Binary Tree LCA Implementation
Let’s start with the most common case: finding the LCA in a binary tree. We’ll first define a basic binary tree node structure:
class TreeNode {
int val;
TreeNode left;
TreeNode right;
TreeNode(int x) {
val = x;
left = null;
right = null;
}
}
Recursive Approach
The most intuitive way to find the LCA is using recursion. Here’s how it works:
- If the root is null, return null (base case)
- If either p or q matches the root, we’ve found one of our targets, so return the root
- Recursively search for p and q in the left and right subtrees
- If both left and right recursive calls return non-null values, we’ve found our LCA
- Otherwise, return whichever recursive call returned a non-null value
Here’s the implementation in Java:
public TreeNode lowestCommonAncestor(TreeNode root, TreeNode p, TreeNode q) {
// Base case: if root is null or matches either p or q
if (root == null || root == p || root == q) {
return root;
}
// Look for p and q in left and right subtrees
TreeNode left = lowestCommonAncestor(root.left, p, q);
TreeNode right = lowestCommonAncestor(root.right, p, q);
// If both left and right returned non-null, root is the LCA
if (left != null && right != null) {
return root;
}
// Otherwise, return the non-null result
return left != null ? left : right;
}
This elegant recursive solution has a time complexity of O(n), where n is the number of nodes in the tree. In the worst case, we might need to visit every node once. The space complexity is O(h) where h is the height of the tree, due to the recursion stack.
Iterative Approach with Parent Pointers
An alternative approach involves first traversing the tree to add parent pointers to each node, then finding the LCA by tracing upward from both target nodes:
public TreeNode lowestCommonAncestorIterative(TreeNode root, TreeNode p, TreeNode q) {
// Map to store parent pointers
Map<TreeNode, TreeNode> parent = new HashMap<>();
// Stack for DFS traversal
Stack<TreeNode> stack = new Stack<>();
parent.put(root, null);
stack.push(root);
// Find p and q, and meanwhile store all parent pointers
while (!parent.containsKey(p) || !parent.containsKey(q)) {
TreeNode node = stack.pop();
if (node.left != null) {
parent.put(node.left, node);
stack.push(node.left);
}
if (node.right != null) {
parent.put(node.right, node);
stack.push(node.right);
}
}
// Create a set to store all ancestors of p
Set<TreeNode> ancestors = new HashSet<>();
// Add all ancestors of p to the set
while (p != null) {
ancestors.add(p);
p = parent.get(p);
}
// Find the first ancestor of q that's also an ancestor of p
while (!ancestors.contains(q)) {
q = parent.get(q);
}
return q;
}
This iterative solution also has a time complexity of O(n) and a space complexity of O(n) due to the hashmap and set used to store parent pointers and ancestors.
LCA in Binary Search Trees
When working with binary search trees (BSTs), we can leverage the BST property to find the LCA more efficiently. In a BST, all nodes in the left subtree have values less than the node’s value, and all nodes in the right subtree have values greater than the node’s value.
Here’s an optimized approach for finding the LCA in a BST:
public TreeNode lowestCommonAncestorBST(TreeNode root, TreeNode p, TreeNode q) {
// Ensure p has the smaller value
if (p.val > q.val) {
TreeNode temp = p;
p = q;
q = temp;
}
// Navigate the BST
while (root != null) {
// If both nodes are greater than current node, move right
if (p.val > root.val && q.val > root.val) {
root = root.right;
}
// If both nodes are less than current node, move left
else if (p.val < root.val && q.val < root.val) {
root = root.left;
}
// We've found the split point, this is the LCA
else {
return root;
}
}
return null;
}
This BST-specific approach has a time complexity of O(h), where h is the height of the tree. Since we're taking advantage of the BST property, we don't need to explore both subtrees at each step, making it more efficient than the general binary tree solution.
LCA in N-ary Trees
The concept of LCA extends beyond binary trees to n-ary trees (trees where nodes can have any number of children). Here's how we can implement LCA for an n-ary tree:
class Node {
int val;
List<Node> children;
Node(int val) {
this.val = val;
children = new ArrayList<>();
}
}
public Node lowestCommonAncestorNary(Node root, Node p, Node q) {
// Base cases
if (root == null) return null;
if (root == p || root == q) return root;
// Keep track of found nodes
int count = 0;
Node result = null;
// Check all children
for (Node child : root.children) {
Node found = lowestCommonAncestorNary(child, p, q);
if (found != null) {
count++;
result = found;
}
}
// If we found both p and q in different subtrees, root is LCA
if (count == 2) return root;
// If we found only one node, pass it up
return result;
}
Applications of Lowest Common Ancestor
The LCA algorithm has numerous practical applications in computer science and software development:
1. Path Finding in Networks
In network routing, finding the common junction point between two destinations can help optimize routing paths. The LCA algorithm helps identify this junction efficiently.
2. Computational Biology
In phylogenetic trees (evolutionary trees), the LCA helps determine the most recent common ancestor between species, which is crucial for understanding evolutionary relationships.
3. Database Systems
In hierarchical database structures, finding the LCA helps with efficient query processing, especially for queries that need to traverse up and down the hierarchy.
4. File Systems
In file systems, where directories form a tree structure, the LCA can help find the common parent directory between two files or folders.
5. Distance Calculation in Trees
The distance between two nodes in a tree can be calculated using the LCA. If we know the depths of both nodes and their LCA, the distance is:
distance(u, v) = depth(u) + depth(v) - 2 * depth(LCA(u, v))
Optimizing LCA with Preprocessing
For scenarios where we need to perform multiple LCA queries on the same tree, preprocessing techniques can significantly improve performance. One such technique is the Sparse Table method, which allows for O(1) LCA queries after O(n log n) preprocessing.
Euler Tour Technique with Range Minimum Query
This advanced approach involves:
- Performing an Euler tour of the tree (visiting each edge twice)
- Creating arrays for the tour sequence, level/depth of each node in the tour, and first occurrence of each node
- Using a Range Minimum Query (RMQ) data structure on the level array
Here's a sketch of this approach:
// Preprocessing
void preprocessLCA(TreeNode root) {
// Perform Euler tour and fill arrays
eulerTour(root, 0);
// Build RMQ data structure on level array
buildRMQ();
}
// Query for LCA
TreeNode queryLCA(TreeNode u, TreeNode v) {
// Get first occurrences of u and v in the Euler tour
int firstU = firstOccurrence[u.val];
int firstV = firstOccurrence[v.val];
// Ensure firstU <= firstV
if (firstU > firstV) {
int temp = firstU;
firstU = firstV;
firstV = temp;
}
// Query RMQ for minimum level in the range [firstU, firstV]
int minLevelIndex = rmqQuery(firstU, firstV);
// Return the node corresponding to this index in the Euler tour
return tourNodes[minLevelIndex];
}
After preprocessing, this approach allows for O(1) LCA queries, making it extremely efficient for multiple queries on the same tree.
Common Interview Questions
If you're preparing for technical interviews, especially at FAANG companies, here are some common LCA-related questions you might encounter:
1. Find the LCA in a Binary Tree
This is the basic problem we've already covered. Make sure you understand both recursive and iterative approaches.
2. Find the LCA in a BST
Leverage the BST property to optimize your solution, as shown earlier.
3. Find the LCA When Nodes May Not Exist
A variation where you need to first verify that both nodes exist in the tree:
public TreeNode lowestCommonAncestorWithValidation(TreeNode root, TreeNode p, TreeNode q) {
// First check if both nodes exist
if (!exists(root, p) || !exists(root, q)) {
return null;
}
// Then proceed with regular LCA algorithm
return lowestCommonAncestor(root, p, q);
}
private boolean exists(TreeNode root, TreeNode node) {
if (root == null) return false;
if (root == node) return true;
return exists(root.left, node) || exists(root.right, node);
}
4. Find the LCA of Multiple Nodes
Extend the algorithm to find the LCA of more than two nodes:
public TreeNode lowestCommonAncestorMultiple(TreeNode root, List<TreeNode> nodes) {
if (root == null || nodes.contains(root)) {
return root;
}
TreeNode left = lowestCommonAncestorMultiple(root.left, nodes);
TreeNode right = lowestCommonAncestorMultiple(root.right, nodes);
if (left != null && right != null) {
return root;
}
return left != null ? left : right;
}
Conclusion
The Lowest Common Ancestor algorithm is a fundamental technique in tree traversal with applications spanning from basic computer science problems to real world systems. Understanding different approaches to finding the LCA and their respective tradeoffs is essential for any software developer, especially those preparing for technical interviews at top tech companies.
By mastering this algorithm, you'll not only be better prepared for coding interviews but also equipped with a powerful tool for solving a wide range of tree-related problems in your development career. Whether you're working with binary trees, binary search trees, or more complex tree structures, the concept of finding the lowest common ancestor remains a key building block for efficient tree operations.
Remember to practice implementing these solutions and test them on various tree structures to solidify your understanding. The recursive solution is elegant and concise, but make sure you also understand the iterative approach, as interviewers often ask for both. For BSTs, always leverage the ordering property to optimize your solution. With these techniques in your toolkit, you'll be well-prepared to tackle tree-related challenges in both interviews and real world applications.