Reconstructing a minimum reticulation network from phylogenetic trees is used in evolutionary studies. In this thesis, we focus on finding temporal networks using cherry-picking sequences for binary trees with all taxa. Finding such a minimum reticulation temporal network is NP-h
...
Reconstructing a minimum reticulation network from phylogenetic trees is used in evolutionary studies. In this thesis, we focus on finding temporal networks using cherry-picking sequences for binary trees with all taxa. Finding such a minimum reticulation temporal network is NP-hard.
We introduce an algorithm to find a minimum reticulation network with a running time of O(2^n poly(n,t)). In addition, this study explores potential enhancements to the algorithm through the utilisation of branch and bound.
Additionally, we introduce a similar algorithm to determine the existence of a temporal phylogenetic network. This algorithm is improved upon by integrating a new concept called cherry growing. This leads to a notable speed-up in performance.
Furthermore, we examine the application of Graph Neural Networks (GNN) in heuristics to find a cherry-picking sequence which can be used to construct a network. This is done by classifying leaves into good, which leads to optimal solutions, and bad leaves. To assess this, two types of data were employed: one simulating evolutionary models and the other employing a fully random approach. The best-performing GNN model has a 97.4% accuracy for evolution-based data and a 79.1% accuracy for random-based data.
The GNN models are implemented as predictors in two classes of heuristics. The first generates a cherry-picking sequence by repeatedly picking leaves. The second class of heuristics is based on a tree search heuristic. This tree-search-based heuristic outperforms the cherry-picking-based heuristic. Furthermore, the GNN heuristics outperform their random variant, even for problems substantially larger than the GNN was trained on.
We also examine the use of GNN in predicting the existence of a phylogenetic temporal network given a set of trees. The best-performing GNN found for this problem has an accuracy of 80.3%.