In this section, we examine the problem of checking the equality of two trees. Suppose we are given two trees \(T_1\) and \(T_2\), and we want to provide a linear-time algorithm to check whether \(T_1\) and \(T_2\) are equal. Here, "equality" means that if we ignore vertex labels, the two trees can be drawn such that their shapes are completely identical.
Assume we know that vertex \(u_1\) from tree \(T_1\) must correspond to vertex \(u_2\) from tree \(T_2\). Then we can check the equality of two rooted trees, which appears simpler than the current problem.
Thus, if we can solve the equality problem for two rooted n-node trees in \(O(f(n))\), we will have a solution that solves the current problem in \(O(n \times f(n))\). It suffices to fix \(u_1\) as a constant vertex and enumerate over possible \(u_2\).
The key idea here is to select \(u_1\) and \(u_2\) with a specific property such that the number of nodes with this property is small. For example, if we consider \(u_1\) to be a leaf, then \(u_2\) must also be a leaf. Instead of enumerating over all \(n\) possibilities for \(u_2\), it suffices to enumerate over the leaves of \(T_2\).
In previous sections, we learned that every tree has at most 2 centroids. Therefore, if we choose the desired property to be "being a centroid", it suffices to fix \(u_1\) as one of the centroids of tree \(T_1\) and enumerate over all centroids of tree \(T_2\) (of which there are at most 2). This reduces the problem to \(O(f(n))\) complexity.
1def is_equal(t1, t2):
2 # Check equality of two rooted trees t1 and t2
3 if len(t1.children) != len(t2.children): # Agar tedad farzandan motafavet bashad
4 return False
5
6 # Morattab sazi farzandan bar asas reshte parantezi
7 c1 = sorted(t1.children, key=lambda x: x.bracket_string)
8 c2 = sorted(t2.children, key=lambda x: x.bracket_string)
9
10 for i in range(len(c1)):
11 if not is_equal(c1[i], c2[i]): # Barresi bazgashti barabari
12 return False
13 return True
Let \(r_1\) be the root of tree \(T_1\) and \(r_2\) be the root of tree \(T_2\). We want to determine if these two trees are equal. First, if the number of children of \(r_1\) and \(r_2\) differ, the trees are clearly unequal. If they have the same number of children, we need to determine which child of \(r_1\) corresponds to which child of \(r_2\). Once this mapping is established, we can solve the problem recursively by verifying the equality of each child subtree of \(r_1\) with its corresponding subtree in \(T_2\).
In other words, we need to permute the sequence of children of \(r_1\) and \(r_2\) in some way, then verify the equality of the \(i\)-th child subtree of \(r_1\) with that of \(r_2\).
The key idea is to sort the children of each node based on a specific characteristic. This eliminates the need to check all permutations of children - we simply compare them in this canonical order.
We assign a parenthesis sequence to each rooted tree \(T\) with root \(r\). First, we recursively compute the parenthesis sequences for each child subtree of \(r\). Then, we sort \(r\)'s children based on the lexicographical order of their corresponding strings. This establishes the fixed ordering we wanted to assign to the children of each node. Finally, the parenthesis sequence of \(T\) is formed as \(S = (S_1S_2...S_k)\), where \(S_i\) is the parenthesis sequence of the \(i\)-th child subtree of \(r\).
It can be shown that two rooted trees are equal if and only if their corresponding parenthesis sequences are identical.
MOD = 10**18 + 3
P = 10**9 + 7
def dfs(u, par):
h = []
# find children
for v in adj[u]:
if v != par:
h.append(dfs(v, u))
# sort hashes and calculate parent's hash
h.sort()
res = 1
for i in range(len(h)):
res += h[i] * pow(P, i + 1, MOD)
res %= MOD
return res
Since working with strings (concatenating two strings or checking equality of two strings) requires \(O(n)\) operations, this makes us think of assigning each vertex a number representing a string instead of directly working with strings!
Using the above logic, we calculate the number for each vertex as follows: First, we obtain the numbers for the children, then sort them. Assuming they are \(H_1,...,H_k\), our number will be \(H = 1 + \sum H_i \times P^i\) modulo \(M\), where \(M\) and \(P\) are two random prime numbers. This technique is called hashing. Since \(M\) and \(P\) are random numbers, we can assume the resulting numbers are random, and the probability of assigning the same number to two different trees will be extremely low. (To be more confident, we can use more \(M\) and \(P\) pairs to make the failure probability increasingly smaller).
The implementation of the algorithm we described is as follows:
const int maxn = 1e5 + 10, P = 101, M = 1e9 + 9;
vector <int> v[maxn];
int calc(int u, int par = -1){
vector<int> vec;
for(int y : v[u]){
if(y != par)
vec.push_back(calc(y, u));
}
sort(vec.begin(), vec.end());
int H = 0;
for(int x : vec)
H = (1ll * H * P + x) % M;
H = (1 + H) % M;
return H;
}
int main(){
int n;
cin >> n;
for(int i = 0; i < n - 1; i++){
int a, b;
cin >> a >> b;
v[a].push_back(b);
v[b].push_back(a);
}
cout << calc(1) << "\n"; // hash derakht rishe dar az 1
return 0;
}