Paper title
Designing weights for quartet-based methods when data is heterogeneous across lineages
Paper authors
Paper abstract
Homogeneity across lineages is a common assumption in phylogenetics, according to which nucleotide substitution rates remain constant over time and do not depend on the lineage. This simplifying hypothesis is often adopted to make the process of sequence evolution more tractable, but its validity has been explored and questioned in several papers. Dealing successfully with the general case (heterogeneity across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets (ASAQ) based on algebraic and semi-algebraic tools, and thus especially suited to data evolving under heterogeneous rates. The method combines the weights of two previous methods by means of a test based on the positivity of the branch length estimated with the paralinear distance. ASAQ is statistically consistent when applied to data from the general Markov (GM) model, accounts for rate and base-composition heterogeneity among lineages, and assumes neither stationarity nor time-reversibility. Second, we test and compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely Quartet Puzzling, Weight Optimization and Wilson's method) in combination with ASAQ weights and with other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data, and they support Weight Optimization with ASAQ weights as a reliable and successful reconstruction method.
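The positivity test mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`det`, `paralinear`, `interior_length`, `supports_split`) are hypothetical, the paralinear distance is written in one common unnormalized form, d(x, y) = -ln(|det J| / sqrt(det D_x det D_y)), and the sketch assumes that "the branch length" means the interior branch of the quartet, estimated from the four-point condition. How ASAQ uses the outcome of the test to combine the two sets of weights is not described in the abstract and is not modeled here.

```python
import math

BASES = "ACGT"  # assumption: gap-free, uppercase nucleotide alignments


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return result


def paralinear(x, y):
    """Paralinear distance between two aligned sequences (unnormalized form)."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be aligned and non-empty")
    counts = [[0] * 4 for _ in range(4)]
    for a, b in zip(x, y):
        counts[BASES.index(a)][BASES.index(b)] += 1
    n = len(x)
    joint = [[c / n for c in row] for row in counts]  # joint frequencies J
    row_freq = [sum(row) for row in joint]            # diagonal of D_x
    col_freq = [sum(joint[i][j] for i in range(4)) for j in range(4)]  # D_y
    det_joint = abs(det(joint))
    det_marginals = math.prod(row_freq) * math.prod(col_freq)
    if det_joint == 0.0 or det_marginals == 0.0:
        raise ValueError("paralinear distance is undefined for this pair")
    return -math.log(det_joint / math.sqrt(det_marginals))


def interior_length(d_ab, d_cd, d_ac, d_ad, d_bc, d_bd):
    """Interior branch length of the quartet ab|cd via the four-point condition."""
    return (d_ac + d_ad + d_bc + d_bd) / 4.0 - (d_ab + d_cd) / 2.0


def supports_split(a, b, c, d):
    """True if the estimated interior branch of ab|cd is positive."""
    length = interior_length(
        paralinear(a, b), paralinear(c, d),
        paralinear(a, c), paralinear(a, d),
        paralinear(b, c), paralinear(b, d),
    )
    return length > 0.0
```

For distances that are additive on the tree ab|cd, `interior_length` returns the true interior branch length, while the same formula applied to either of the other two splits returns a negative value. That sign difference is what a positivity test of this kind relies on.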