TIW: BFS, DFS
Breadth First Search and Depth First Search are both essential skills to passing medium problems. If you’ve been following along, I’ve already written BFS once in STL#2. Depending on the problem, there are usually two ways I would go about writing BFS, and also two ways for DFS. But before I go on with solving problems, let me point out a few basic facts:
- BFS is good for finding shortest paths on a graph with no edge weight, or just traversing the whole graph in general.
- DFS is good for traversing the whole graph too, especially trees.
- DFS could be have time complexity issues depending on the graph you are walking, in which case it might be replaceable by BFS/Dijkstra.
- DFS with recursion uses the program stack and might overflow much easier than dynamically allocating memory (using vector, for example).
- There are probably many more use cases of BFS and DFS than you think, on things you would not think of as a graph.
I have a feeling this is going to be long. So here are the key points I will demonstrate:
- BFS on 2D array
- BFS layer by layer
- DFS on tree, iterative and recursive
- DFS as brute force search
BFS on 2D array
Let’s start with a standard problem for BFS: Number of Islands. Given a 2D matrix with ‘1’s and ‘0’s, count the connecting pieces of ‘1’s. The idea: go through each point, if it is land and we haven’t seen it, walk the whole island and mark each land as ‘0’ we see.
bool inboard(int x, int y, int n, int m) { // whether a point (x, y) is in the board with size n times m
return x >= 0 && x < n && y >= 0 && y < m;
}
int numIslands(vector<vector<char>>& grid) {
if (grid.empty())
return 0;
int m = grid.size(), n = grid[0].size();
int ans = 0;
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
if (grid[i][j] == '1') {
ans++;
queue<pair<int, int> > bfs;
bfs.push(make_pair(i, j));
grid[i][j] = '0';
while (!bfs.empty()) {
pair<int, int> pos = bfs.front();
bfs.pop();
for (auto& v : vector<vector<int> >{ {1, 0}, {0, 1}, {-1, 0}, {0, -1} }) {
int next_i = pos.first+v[0], next_j = pos.second+v[1];
if (inboard(next_i, next_j, m, n) && grid[next_i][next_j] == '1') {
grid[next_i][next_j] = '0';
bfs.push(make_pair(next_i, next_j));
}
}
}
}
}
}
return ans;
}
The inboard() function is often useful in 2D array graph problems. Notice the line:
for (auto& v : vector<vector<int> >{ {1, 0}, {0, 1}, {-1, 0}, {0, -1}})
This for loop will give you four directions along which to go, and you just need to add v[0] to your current position’s row index and v[1] to column index to get to the next position. Stop copy pasting code 4 times, please.
A queue<> has 3 functions we need: pop, push and front. If you need a queue that you can access from both sides, you should consider using deque<>, which has [push/pop]_[front/back], and also front() and back(). Everything is of course O(1), so you don’t have to worry.
It is important to note that in the above problem, there is no need to walk the graph in any particular order as far as you get the job done, so I did not mark the distance to each point. A DFS approach would also have worked and might even be preferred. In a DFS recursion approach, the extra memory will be allocated on the program stack instead of the heap, which is quite advantageous in terms of efficiency.
BFS layer by layer
Sometimes doing things layer by layer will make life easier. See Maximum Depth of Binary Tree:
int maxDepth(TreeNode* root) {
if (!root)
return 0;
vector<TreeNode*> bfs{root};
int ans;
for (ans = 0; !bfs.empty(); ans++) {
vector<TreeNode*> next;
for (TreeNode* node : bfs)
for (TreeNode* c : {node->left, node->right})
if (c)
next.push_back(c);
bfs = move(next);
}
return ans;
}
This is essentially the same approach I used for the problem in STL#2. DFS would also work for this problem, and might perform better in terms of space for some test cases, since space for a recursive approach would be O(height) instead of O(width). Anyway I still included this problem here for the sake of completeness. Noteworthy: (1) I made a for loop since I want to get the height, otherwise we can write a while (!bfs.empty()) loop; (2) there is no need to keep track of the visited nodes because we know for sure we will not visit the same node twice.
DFS on tree, iterative and recursive
While BFS is always written using a queue/vector, DFS uses a stack, so we could use either vector or recursion to accomplish the same goal.
I will redo the Maximum Depth problem twice to illustrate the idea:
// Iterative DFS
int maxDepth(TreeNode* root) {
if (!root)
return 0;
vector<pair<TreeNode*, int> > dfs{make_pair(root, 1)};
int ans = 0;
while (!dfs.empty()) {
pair<TreeNode*, int> back = dfs.back();
dfs.pop_back();
ans = max(ans, back.second);
for (TreeNode* c : {back.first->left, back.first->right})
if (c)
dfs.push_back(make_pair(c, back.second+1));
}
return ans;
}
// Recursive DFS
int maxDepth(TreeNode* root) {
return root ? 1+max(maxDepth(root->left), maxDepth(root->right)) : 0;
}
As you can see, recursion on trees is black magic. I mean, what’s shorter than a one liner? Whenever applicable, it often shortens the code by a great margin, and present the logic in a very clear and concise way.
DFS with recursion is great for tree problems that are self-similar. For example, the height of this tree is the height of the left subtree + 1, or the right subtree + 1, whichever is larger.
- Use recursion if you are going to DFS. Except for some Leetcode problems that test on memory limits, I have not found any reason why iterative DFS is superior. Recursion runs on stack while vector dynamically allocate memory in the heap, so in that particular case vector has less limitations on the space constraint.
- In recursion, first handle all base cases, then implement the recurrence relation (well… what else are you supposed to do anyways?)
- The essential problem here is to determine in order to solve the original problem, what information do you need for a sub-problem. Sometimes you need extra information from your parent or children to solve a task.
To illustrate the third point above, I’m going to do a slightly more complicated one (not a one-liner): given a TreeNode*, determine whether the tree is a binary search tree. This is a medium problem on LeetCode.
pair<int, int> helper(TreeNode* root, bool& ans) {
int l = root->val, r = root->val;
if (root->left) {
pair<int, int> left = helper(root->left, ans);
if (left.second >= root->val)
ans = false;
l = left.first;
}
if (root->right) {
pair<int, int> right = helper(root->right, ans);
if (right.first <= root->val)
ans = false;
r = right.second;
}
return make_pair(l, r);
}
bool isValidBST(TreeNode* root) {
if (!root)
return true;
bool ans = true;
helper(root, ans);
return ans;
}
The recurrence relation here is that the largest element in the left subtree must be smaller than the current node and the smallest element in the right subtree must be larger than the current node.
DFS as brute force search
Sometimes some problems require brute force enumeration of all possibilities. Then, recursive DFS would be a good approach. Let’s look at Decode String: given a string in the recursive format: #[x] where # is a number and x is an string, print x multiple times as denoted by #. For example “2[abc]3[cd]ef” becomes “abcabccdcdcdef”.
There is not much to optimize in this problem, since the runtime is lower bounded by the length of the output. The algorithm is basically running each recursive level of helper() as the same bracket level of the original string, passing by reference the current index in the string as we go along.
string helper(string& s, int& i) {
string ans;
while (i < s.length() && s[i] != ']') {
if (s[i] >= 'a' && s[i] <= 'z') {
ans += s[i];
i++;
} else {
int k = 0;
for (; s[i] != '['; i++)
k = k*10+s[i]-'0';
i++;
string ret = helper(s, i);
for (int j = 0; j < k; j++)
ans += ret;
}
}
i++;
return ans;
}
string decodeString(string s) {
int i = 0;
return helper(s, i);
}
There are many applications of DFS brute force, but they could be very different from each other and hard for me to generalize. So this is only a short description of what you can do with DFS.
Recap
Use BFS or recursive DFS for 2D board traversal; use vectors or queues for BFS; use recursion for trees almost always; use recursive DFS to brute force.