## Euler Path: 8 Lines Solution

The problem of Euler path marked a very fundamental moment in algorithm studies. When Euler posed the 7-bridge problem, there was no mathematical tool to solve it, hence he created the tool – graph theory. It sounds like how Newton invented calculus to solve his gravity problems and how Bernoulli invented calculus of variations to solve his brachistochrone problem. As I quote Sutherland,

“Well, I didn’t know it was hard.”

### Problem statement

Say you have arrived in a new country with a bunch of islands and one way bridges between some pairs of islands. As a tourist you would like to visit all the bridges once and only once. Abstractly, an Euler path is a path that traverses all the edges once and only once. Is it always possible? If possible, how do we find such a path?

### Existence of path

Unlike many other problems, we can determine the existence of the path without actually finding the path. There are 2 conditions for it. The first has to do with in and out degrees, and the second is about the connectivity of the graph.

For a directed graph, the in degree is the number of incoming edges to a vertex, and the out  degree is the number of outgoing edges to a vertex. Easily, if we have an Euler path, then there will be a start and end vertex. The start vertex will have an out degree – in degree of 1, the end vertex will have in degree – out degree of 1. Every other vertex will have an equal number of out degree and in degree, because if you have a different number, you obviously cannot go through all of those edges in one path. The only exception is when the start and end vertices are the same, in which case all the vertices have the same number of in and out degrees.

The second condition is that all vertices have to be weakly connected, meaning that by treating all edges as undirected edges, there exists a path between every pair of vertices. The only exception is for the vertices that have no edges – those can just be removed from the graph, and the Euler path trivially does not include them.

Given the above two properties, you can prove there is an Euler path by the following steps. First it is easy to see that if you start walking from the start vertex (out – in = 1) and removing edges as you walk through them, you can only end up at the end vertex (in – out = 1). Then, the remaining graph is full of cycles that can be visited through some vertex in the path we already have, and you just have to merge the cycles and the path to get the Euler path.

It is easier to see the other direction of the proof: if there is an Euler path, both conditions have to be met.

### Implementing existence of path

As mentioned above, we code up the two conditions and check whether we can find a valid starting position, otherwise return -1. Vertices are numbered 0 to n-1. The graph is stored as an adjacency list, meaning that adj[i] has all the neighbors (edges go from i to those). So the out degree of i is adj[i].size().

```int start(vector<vector<int> >& adj) {
// condition 1: in and out degrees
for (int i = 0; i < adj.size(); i++) {
deg[x]--;
}
int ans = -1;
for (int i = 0; i < deg.size(); i++)
if ((ans != -1 && deg[i] != 0 && deg[i] != -1)
|| deg[i] > 1)
return -1;
else if (deg[i] == 1)
ans = i;
if (ans == -1)  // start and end vertices are the same
for (int i = 0; i < adj.size() && ans == -1; i++)
ans = i;
if (ans == -1)  // there is no edge at all
return -1;
// condition 2: connectivity
vector<int> bfs{ans};
vis[ans] = true;
for (int i = 0; i < bfs.size(); i++)
if (!vis[x]) {
bfs.push_back(x);
vis[x] = true;
}
for (int i = 0; i < adj.size(); i++)
return -1;
return ans;
}
```

That is slightly more clumsy that I would like it to be, but it should be clear. Basically just counting in and out degrees and running a BFS on the starting vertex.

### Actually finding the path

As mentioned above in the sketch of proof, finding a path consists of the 3 steps:

1. Walk from the start vertex, removing edges as we use them, until there is nowhere to go. Then we have a path from start to end.
2. For the remaining edges, start at a vertex on the path and randomly walk until we go back to the same vertex. Then we have a cycle. Merge the cycle on the path. For example, if we had a path s->a->b->c->…->t and a cycle c->alpha->beta->c, we merge them to become s->a->b->c->alpha->beta->c->…->t.
3. Repeat step 2 until there is no remaining edge.

Well, that does not sound very easy to write nor very efficient to run, if we literally implemented the above. How many lines of code would that be?

### The 8 lines solution

The answer is 8. Here’s the code:

```void euler(vector<vector<int> >& adj, vector<int>& ans, int pos) {
}
ans.push_back(pos);
}
```

This is a very beautiful solution using recursion, in my opinion. I was quite surprised when I first saw this. Where is everything? Where is getting the main path? Where is getting the cycles? Where are we merging them?

The gist of this recursion is a post-order DFS, meaning that we visit the end of the graph first, and then backtrack. This comes from a very crucial observation: we can always be sure what could be at the end of the path, but not the front. It is important to know that the answer array is in reverse order of visit, i.e. we need to reverse it to get the Euler path. Let’s go through the stages of how the program works.

1. Path from start to end vertex. The first time we push back is when we run out of outgoing edges, which can only be the case of the end vertex, with in-out = 1. In all other vertices, the numbers are the same so if you can go in, you can definitely go out of that vertex. Hence the first push back occurs with the end vertex, and at that point the program execution stack has the entire path.
2. As we return from a recursive call of the function, we are essentially going back from the end to the start. If a vertex on the main path does not have any outgoing edge, we know we will visit it next, so we push it to the ans vector and return from the function.
3. But if a vertex on the path does have an outgoing edge, that means there is at least one cycle including this vertex. Then in the next iteration of the while loop, we will visit one of the outgoing edges and start another round of recursion. Again, this recursion must end on the same vertex because it is the only one with in-out = 1. This recursion gets you a cycle and by the time it returns, the cycle would have been pushed to the ans array already, finishing the merge operation.

By studying this code, there is one interesting point to note. That is, the while loop will only be executed 0 to 2 times in any recursive call. It will only be 0 at the end vertex, and on the main path with no branches, it will be 1. On the main path with branches, it will only be 2 but not more regardless of the out degree, because surely the recursive call for cycle needs to end on that vertex but it will not end until all outgoing edges are used up. Therefore to the program, multiple cycles on one vertex is just one big cycle. On the other vertices on cycles, it works the same way whether they have branches or not.

In case you want to see how to run it, here is the main function I wrote to test it:

```int main() {
int n, q;
cin >> n >> q;
for (int i = 0; i < q; i++) {
int u, v;
cin >> u >> v;
}
if (s == -1) {
cout << "no path" << endl;
return 0;
}
vector<int> ans;
reverse(ans.begin(), ans.end());
for (int x : ans)
cout << x << endl;
return 0;
}
```

That’s it – a simple problem with a simple solution. Leetcode has one Euler path problem, and the algorithm in this blog post comes from the discussion of that problem. Everything is pretty much the same for undirected graphs – you just have to use different data structures to store the edges. The proof is mostly the same with the first condition now about odd/even number of edges at each vertex, as there is no distinction between in and out degree.

## TIW: Binary Indexed Tree

Binary indexed tree, also called Fenwick tree, is a pretty advanced data structure for a specific use. Recall the range sum post: binary indexed tree is used to compute the prefix sum. In the prefix sum problem, we have an input array v, and we need to calculate the sum from the first item to index k. There are two operations. Update: change the number at one index by adding a value (not resetting the value), and query: getting the sum from begin to a certain index. How do we do it? There are two trivial ways:

1. Every time someone queries the sum, just loop through it and return the sum. O(1) update, O(n) query.
2. Precompute the prefix sum array, and return the precomputed value from the table. O(n) update, O(1) query.

To illustrate the differences and better explain what we’re trying to achieve, I will write the code for both approaches. They are not the theme of this post though.

```class Method1 {
private:
vector<int> x;
public:
Method1(int size) {
x = vector<int>(size);
}
void update(int v, int k) {
x[k] += v;
}
int query(int k) {
int ans = 0;
for (int i = 0; i <= k; i++)
ans += x[i];
return ans;
}
};
class Method2 {
private:
vetor<int> s;
public:
Method2(int size) {
s = vector<int>(size);
}
void update(int v, int k) {
for (; k < s.size(); k++)
x[k] += v;
}
int query(int k) {
return s[k];
}
};
```

Read through this and make sure you can write this code with ease. One note before we move on: we’re computing the sum from the first item to index k, but in general we want the range sum from index i to index j. To obtain range sum, you can simply subtract the prefix sums: query(j)-query(i-1).

OK, that looks good. If we make a lot of updates, we use method 1; if we make a lot of queries, we use method 2. What if we make the same amount of updates and queries? Say we make n each operations, then no matter which method we use, we end up getting O(n^2) time complexity (verify!). We either spend too much time pre-computing or too much time calculating the sum over and over again. Is there any way to do better?

Yes, of course! Instead of showing the code and convincing you that it works, I will derive it from scratch.

The quest of log(n)

The problem: say we have same amount of updates and queries, and we do not want to bias the computation on one of them. So we do a bit of pre-computation, and a bit of summation. That’s the goal.

Say we have an array of 8 numbers, {1, 2, 3, 4, 5, 6, 7, 8}. To calculate the sum of first 7 numbers, we would like to sum up a bunch of numbers (since there has to be a bit of summation). But the amount of numbers to be summed has to be sub-linear. Let’s say we want it to be log(n). log2(7) is almost 3, then maybe we can sum 3 numbers. In this case, we choose to sum the 3 numbers: sum{1, 2, 3, 4}, sum{5, 6} and sum{7}. Assume that we have these sums already pre-computed, we have log(n) numbers to sum, hence querying will be log(n). For clarity, let me put everything in a table:

Table 1a

sum{1} = sum{1}

sum{1, 2} = sum{1, 2}

sum{1, 2, 3} = sum{1, 2} + sum{3}

sum{1, 2, 3, 4} = sum{1, 2, 3, 4}

sum{1, 2, 3, 4, 5} = sum{1, 2, 3, 4} + sum{5}

sum{1, 2, 3, 4, 5, 6} = sum{1, 2, 3, 4} + sum{5, 6}

sum{1, 2, 3, 4, 5, 6, 7} = sum{1, 2, 3, 4} + sum{5, 6} + sum{7}

sum{1, 2, 3, 4, 5, 6, 7, 8} = sum{1, 2, 3, 4, 5, 6, 7, 8}

The left hand side of the table is the query, and all the terms on the right hand side are pre-computed. If you look closely enough you will see the pattern: for summing k numbers, first take the largest power of 2, 2^m, that is ≤ k, and pre-compute it. Then for the rest of the numbers, k-2^m, take the largest power of 2, 2^m’ such that 2^m’ ≤ k-2^m, and pre-compute it, and so on.

There are two steps to do: show that querying (adding terms on the right hand side) is log(n) and show that pre-computing the terms on the right hand side is log(n).

Querying is log(n) is easily seen, because by taking out the largest power of 2 each time, we will at least take out half of the numbers (Use proof by contradiction). Taking out no less than one half each time, after O(log(n)) time we would have taken out all of it.

Now we are one step from finishing on the theoretical side: how do we pre-compute those terms?

Let’s say we want to change the number 1 into 2, essentially carrying out update(1, 0). Look at the terms above: we need to change sum{1}, sum{1, 2}, sum{1, 2, 3, 4} and sum{1, 2, 3, 4, 5, 6, 7, 8}. Each time we update one more pre-computed term, we cover double the number of elements in the array. Therefore we also only need to update log(n) terms. Let’s see it in a table:

Table 1b

update 1: sum{1}, sum{1, 2}, sum{1, 2, 3, 4}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 2: sum{1, 2}, sum{1, 2, 3, 4}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 3: sum{3}, sum{1, 2, 3, 4}, sum {1, 2, 3, 4, 5, 6, 7, 8}

update 4: sum{1, 2, 3, 4}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 5: sum{5}, sum{5, 6}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 6: sum{5, 6}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 7: sum{7}, sum{1, 2, 3, 4, 5, 6, 7, 8}

update 8: sum{1, 2, 3, 4, 5, 6, 7, 8}

Cool, now we have a vague idea about what to pre-compute for update and what to add for query. Now we should figure out the details of the code.

How is the code written?

First, we need to determine the representation of the pre-computed terms. Here is a list of all pre-computed terms:

{1}, {1, 2}, {3}, {1, 2, 3, 4}, {5}, {5, 6}, {7}, {1, 2, 3, 4, 5, 6, 7, 8}

The last number of each term is unique and covers the range 1-8. That’s great news! We can use a vector to store these terms easily, and let the index of the array be the last number of the term. For example, the sum of {5, 6} will be stored at bit.

First, the query operation. Let’s revisit the table with binary representation of numbers:

Table 2a: revised version of table 1a, with sums written as bit elements, indices in binary

query 0001: bit

query 0010: bit

query 0011: bit+bit

query 0100: bit

query 0101: bit+bit

query 0110: bit+bit

query 0111: bit+bit+bit

query 1000: bit

Do you see the pattern yet? Hint: for queries that have k ones, we have k terms on the right. The pattern is that while the index has at least 2 ones, we remove the lowest bit that is one, then move on to the next term. 0111->0110->0100. Finally, here’s the code:

```int query(vector<int>& bit, int k) {
int ans = 0;
for (k++; k; k -= k & (-k))
ans += bit[k];
return ans;
}
```

After all the work we’ve been through, the code is extremely concise! Two things to notice: the k++ is to change the indexing from 0-based to 1-based, as we can see from the above derivation we go from 1 to 8, instead of 0 to 7. The second thing is the use of k & (-k) to calculate the lowest bit. You can refer to the previous blog post on bitwise operations.

OK, we’re almost done. What about update? Another table:

Table 2b: revised version of table 1b

update 0001: bit, bit, bit, bit

update 0010: bit, bit, bit

update 0011: bit, bit, bit

update 0100: bit, bit

update 0101: bit, bit, bit

update 0110: bit, bit

update 0111: bit, bit

update 1000: bit

What’s the pattern this time? Hint: again, look for the lowest bit! Yes, this time instead of removing the lowest bit, we add the lowest bit of the index to itself. This is less intuitive than the last part. For example, lowest bit of 0101 is 1, so the next index is 0101+1 = 0110; lowest bit is 0010, next index is 0110+0010 = 1000.

So here’s the code, note also the k++:

```void update(vector<int>& bit, int v, int k) {
for (k++; k < bit.size(); k += k & (-k))
bit[k] += v;
}
```

This is deceivingly easy! That can’t be right. It can’t be that easy… Actually it can, if you look at the category of this post; nothing I write about is hard.

Actually it was easy because we were just matching patterns and assuming it would generalize. Let’s study the for loops a little more to understand why and how low bits are involved in this. This is rather complicated, so for practical purposes you might as well skip them.

First, observe that the low bit of each index indicates the number of integers that index sums over. Say, 0110 has low bit 0010 which is 2, so bit is a sum of two numbers: 5 and 6. This is by design, since this is exactly how we picked the pre-computed terms, so there is no way of explanation.

Second, bit[k] is the sum from indices k-lowbit(k)+1 to k. This is a direct consequence from (1) bit[k] is a summation that ends at the kth number and (2) bit[k] sums over lowbit(k) numbers.

In light of this fact, the code for querying becomes clear: for an index k, we first get the sum from k-lowbit(k)+1 to k from bit[k], then we need to find the sum from 1 to k-lowbit(k). The latter becomes a sub-problem, which is solved by setting k-lowbit(k) as the new k value and going into the next iteration.

For updating, it is much trickier. From the above, we have l-lowbit(l) < k ≤ l, iff bit[l] includes k. Below is a sketch of proof, the actual proof will include more details and be more tedious and boring to go through. For the kth number, bit[k+lowbit(k)] must include it. This is because the lowbit of k+lowbit(k) must be at least 2 times lowbit(k), so k+lowbit(k)-lowbit(k+lowbit(k)) ≤ k-lowbit(k) < k ≤ k+lowbit(k), satisfying the inequality. Also, k can be updated to k+lowbit(k) in the next iteration, because given lowbit(k) < lowbit(m) < lowbit(n) and that bit[m] includes k and bit[n] includes m, bit[n] must include k as well. Till now, we have shown that the bit[k] values we have modified in the for loop must include k.

Then, we also need to show that all bit[l] values that include k are modified in our loop. We can actually count all the bit[l] values that include k: it is equal to one plus the number of zeros before lowbit(k). It is not difficult to see how the for loop reduces the number of zeros before lowbit(k) each time the loop moves on to the next iteration. The only question remaining is why that number? Let’s look at the table 2b again. The numbers of terms for the first four entries, i.e. {4, 3, 3, 2}, are one more than the number of terms for the second four entries, i.e. {3, 2, 2, 1}. This is by design, because bit covers the first four but not the second four, and everything else is pretty symmetric. Again, the first two entries have one more coverage than the second two entries, because bit records the first two but not the second two. Hence, each time we “go down the tree” on a “zero edge” (appending a 0 to the index prefix), the numbers will be covered once more than if we “go down the tree” on a “one edge” (appending a 1 to the index prefix). After we hit the low bit, no more terms of smaller low bits will cover this index, and of course the index itself includes itself, thus the plus one. This is a basic and very rough explanation on how the numbers of zeros relate to the number of terms including a certain index. Here we have argued semi-convincingly the update loop is valid and complete.

OK, anyway, time for practice: Range Sum Query – Mutable

It’s literally implementing a binary indexed tree, nothing more.

```class NumArray {
private:
vector<int> bit;
void update_helper(int v, int k) {
for (k++; k < bit.size(); k += k & (-k))
bit[k] += v;
}
int query_helper(int k) {
int ans = 0;
for (k++; k; k -= k & (-k))
ans += bit[k];
return ans;
}
public:
NumArray(vector<int> &nums) {
bit.resize(nums.size()+1);
for (int i = 0; i < nums.size(); i++)
update_helper(nums[i], i);
}
void update(int i, int val) {
update_helper(val-query_helper(i)+query_helper(i-1), i);
}
int sumRange(int i, int j) {
return query_helper(j)-query_helper(i-1);
}
};
```

It got a little complicated because I didn’t store the original values, so we need some work on line 21 to calculate the change at a certain index given the new value and the old range sums. But that’s nothing important.

That’s it for the basic introduction of binary indexed trees. There are some variants to it, such as replacing the + sign in update function to a min or max function to take prefix min or max, or extending the algorithm to a 2D matrix, aka 2D binary indexed tree. We can even use it for some dynamic programming questions. There are in fact a few more questions on Leetcode that uses this data structure. But that’s for later.

I learned binary indexed tree through the TopCoder tutorial. If you think I did a really bad job and you do not understand at all, you can refer to it as well.