We have a weighted directed graph \(G\). The weights of all edges in \(G\) are non-negative.
Now we want to find the length of the shortest path from the source vertex \(sc\) to all other vertices, where the length of a path is the sum of its edge weights.
We define an array named \(dis\). During the execution of the algorithm, \(dis_u\) holds the length of the shortest path from \(sc\) to vertex \(u\) among the paths whose next-to-last vertex (the vertex just before \(u\)) has already been selected. (What "selected" means is explained below.) Initially, every entry of the \(dis\) array is \(\infty\), except that \(dis_{sc} = 0\), since the empty path from \(sc\) to itself has length zero. We select \(sc\) first and update the \(dis\) values of \(sc\)'s neighbors.
Now, in each step of the algorithm, among the vertices that have not yet been selected, we choose the one with the minimum \(dis\) value and call it \(v\).
Now we prove that the length of the shortest path from \(sc\) to \(v\) is exactly the current value of \(dis_v\). We argue by contradiction and assume there exists a shorter path, of length \(P < dis_v\). Take the last selected vertex along this path and call it \(last\). (Such a vertex exists, because \(sc\) lies on the path and has been selected.) Call the vertex that follows \(last\) on the path \(u\). (\(u\) also exists, because \(v\) itself has not been selected yet, so \(last \ne v\).)
By the contradictory assumption, \(P < dis_v\). Since edge weights are non-negative, the prefix of this path that ends at \(u\) has length at most \(P\), so it too is shorter than \(dis_v\). The vertex preceding \(u\) on the path, namely \(last\), has been selected, and the \(dis\) value of a selected vertex is its final shortest distance; so when \(last\) was selected and its edge to \(u\) was relaxed, \(dis_u\) became at most the length of this prefix. Hence \(dis_u \le P < dis_v\). But \(u\) is an unselected vertex, so this contradicts the choice of \(v\) as the unselected vertex with minimum \(dis\) value (and if \(u = v\), it is a contradiction outright). Therefore, our contradictory assumption is false, and the claim is proven.
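Written as a chain of inequalities (the notation \(w(last, u)\) for the weight of the edge from \(last\) to \(u\) is just shorthand introduced here), the argument is:
\[
dis_u \;\le\; dis_{last} + w(last, u) \;\le\; \big(\text{length of the path's prefix ending at } u\big) \;\le\; P \;<\; dis_v.
\]
The first inequality holds because the edge from \(last\) to \(u\) was relaxed when \(last\) was selected, the second because \(dis_{last}\) is at most the length of the prefix ending at \(last\), and the third because the remaining edge weights are non-negative.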
Now that we have proven that \(dis_v\) is exactly the length of the shortest path to \(v\), we select it. Then, for every neighbor \(adj\) of \(v\), i.e. every vertex \(adj\) such that there is an edge from \(v\) to \(adj\) with weight \(w\), we perform \(dis_{adj} = \min(dis_{adj}, dis_v + w)\). We continue this process until all vertices have been selected.
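As a small sketch of this relaxation step, assuming the graph is stored as an adjacency list \(g\) where \(g[v]\) holds \((adj, w)\) pairs for the edges going out of \(v\) (the function name and representation are just choices made for this example):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Relax all outgoing edges of a newly selected vertex v.
// g[v] holds (neighbor, weight) pairs; dis is the array of tentative distances.
void relax_neighbors(int v,
                     const vector<vector<pair<int, long long>>>& g,
                     vector<long long>& dis) {
    for (auto [adj, w] : g[v]) {
        // A path that goes through v and then this edge has length dis[v] + w;
        // keep it if it is shorter than the best known path to adj.
        dis[adj] = min(dis[adj], dis[v] + w);
    }
}
```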
There are two general ways to implement this algorithm.
The first approach: in each step, we iterate over all unselected vertices to find the minimum, which takes \(\mathcal{O}(n)\) operations per step. Since the algorithm has \(n\) steps, the complexity of our program becomes \(\mathcal{O}(n^2)\).
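A minimal sketch of this approach (the adjacency-list representation \(g\), the constant \(INF\), and the function name are assumptions made for this example, not fixed by the text):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;  // plays the role of "infinity"

// O(n^2) Dijkstra: returns the shortest-path lengths from sc to every vertex.
// g[v] holds (neighbor, weight) pairs for the edges going out of v.
vector<long long> dijkstra(int n, int sc,
                           const vector<vector<pair<int, long long>>>& g) {
    vector<long long> dis(n, INF);
    vector<bool> selected(n, false);
    dis[sc] = 0;
    for (int step = 0; step < n; step++) {
        // Among the unselected vertices, find the one with minimum dis.
        int v = -1;
        for (int u = 0; u < n; u++)
            if (!selected[u] && (v == -1 || dis[u] < dis[v]))
                v = u;
        if (dis[v] == INF) break;  // the remaining vertices are unreachable
        selected[v] = true;
        // Relax all edges going out of v.
        for (auto [adj, w] : g[v])
            dis[adj] = min(dis[adj], dis[v] + w);
    }
    return dis;
}
```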
The second approach: in each step, instead of iterating over all vertices, we use a data structure that can find the minimum faster, such as \(\texttt{std::set}\) or \(\texttt{std::priority\_queue}\) in C++. (Assume we use a \(\texttt{std::set}\) here.)
In each step, the minimum \(dis\) value is the first element of the set, so it can be found in \(\mathcal{O}(1)\). Each time we change the value of an entry of \(dis\), we must also update it in the set (an erase followed by an insert), which takes \(\mathcal{O}(\log(n))\).
Since, in each step, the \(dis\) values of the neighbors of the selected vertex might change, the total number of updates is proportional to the sum of the out-degrees of all vertices, which is exactly \(m\) in a directed graph. So, in total, the updates cost \(\mathcal{O}(m \log(n))\), and the complexity of the program becomes \(\mathcal{O}(n + m \log(n))\).
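A sketch of the \(\texttt{std::set}\)-based version, under the same assumptions about \(g\) and \(INF\) as the previous snippet; the set stores \((dis_u, u)\) pairs, so its first element is always the unselected vertex with minimum \(dis\):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;  // plays the role of "infinity"

// Dijkstra with a std::set ordered by (dis value, vertex).
// g[v] holds (neighbor, weight) pairs for the edges going out of v.
vector<long long> dijkstra(int n, int sc,
                           const vector<vector<pair<int, long long>>>& g) {
    vector<long long> dis(n, INF);
    dis[sc] = 0;
    set<pair<long long, int>> st;  // (dis[u], u) for reachable, unselected u
    st.insert({0, sc});
    while (!st.empty()) {
        // The first element is the unselected vertex with minimum dis: select it.
        int v = st.begin()->second;
        st.erase(st.begin());
        for (auto [adj, w] : g[v]) {
            if (dis[v] + w < dis[adj]) {
                // Remove the stale entry (if any) before updating dis[adj].
                st.erase({dis[adj], adj});
                dis[adj] = dis[v] + w;
                st.insert({dis[adj], adj});
            }
        }
    }
    return dis;
}
```

Each update is one erase plus one insert in the set, which matches the \(\mathcal{O}(\log(n))\) cost per update counted above.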