Novel metric for the use of Minimum Spanning Trees in phylogenetic tree studies
The use of trees for phylogenetic representations started in the
middle of the 19th century. One of the best-known examples is Charles
Darwin's sole illustration in "The Origin of Species". The
simplicity of the tree representation makes it still the method of
choice today to easily convey the diversification and relationships
between species. Yet trees suffer from several drawbacks that are not
always clear to researchers. Since several different algorithms can be
used to infer and draw the tree, one must be aware of each algorithm's
set of assumptions.
In the analysis of sequence-based microbial typing methods, Minimum
Spanning Trees (MSTs) are becoming the standard for representing
relationships between strains. However, MSTs suffer from several
limitations that can mislead the interpretation of the resulting
tree. A single tree is reported out of a multitude of possible and
equally optimal solutions, and no statistical metrics exist to
evaluate them; these shortcomings motivated a recent heuristic
approach to address them.
We present a new edge betweenness metric for undirected and weighted
graphs. This metric is defined as the fraction of minimum spanning
trees in which a given edge is present, and it is motivated by the
need to evaluate phylogenetic trees. Moreover, we provide
results and methods for the exact computation of this metric,
based on Kirchhoff's well-known matrix tree theorem.
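
To make the definition concrete, the following minimal sketch computes the metric by brute-force enumeration on a tiny graph. It is an illustration only and not the Kirchhoff-based method referred to above: it lists every spanning tree, keeps those of minimum total weight, and reports the fraction that contain each edge. The function names, the `(u, v, weight)` edge format, and the example graph are assumptions made for this sketch; the graph is assumed to be connected and to have integer weights, so that totals can be compared exactly.

```python
from fractions import Fraction
from itertools import combinations


def is_spanning_tree(n, edges):
    """Return True if the given n - 1 edges join all n vertices without a cycle."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


def mst_edge_fractions(n, edges):
    """Map each edge to the fraction of minimum spanning trees containing it.

    Exhaustive enumeration: only practical for very small graphs.
    """
    trees = [t for t in combinations(edges, n - 1) if is_spanning_tree(n, t)]
    best = min(sum(w for _, _, w in t) for t in trees)
    msts = [t for t in trees if sum(w for _, _, w in t) == best]
    return {e: Fraction(sum(e in t for t in msts), len(msts)) for e in edges}


# Hypothetical example: a triangle with equal weights plus one pendant edge.
EDGES = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 2)]

for edge, fraction in mst_edge_fractions(4, EDGES).items():
    print(edge, fraction)
```

In this example the graph has three equally optimal MSTs. Each triangle edge appears in two of them (fraction 2/3), while the pendant edge appears in all three (fraction 1), so a single reported tree would hide the fact that one third of the optimal solutions disagree with it on any given triangle edge. Enumeration grows exponentially with graph size, which is why an exact method based on the matrix tree theorem is of interest.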