Paper title
Classification of particle trajectories in living cells: machine learning versus statistical testing hypothesis for fractional anomalous diffusion
Paper authors
Paper abstract
Single-particle tracking (SPT) has become a popular tool for studying the intracellular transport of molecules in living cells. Inferring the character of their dynamics is important because it determines the organization and functions of the cells. For this reason, one of the first steps in the analysis of SPT data is identifying the diffusion type of the observed particles. The most popular method for identifying the class of a trajectory is based on the mean square displacement (MSD). However, due to its known limitations, several other approaches have already been proposed. With recent advances in algorithms and the development of modern hardware, classification attempts rooted in machine learning (ML) are of particular interest. In this work, we apply two ML ensemble algorithms, random forest and gradient boosting, to the problem of trajectory classification. We present a new set of features used to transform the raw trajectory data into the input vectors required by the classifiers. The resulting models are then applied to real data for G protein-coupled receptors and G proteins. The classification results are compared with recent statistical methods that go beyond MSD.
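For readers unfamiliar with the MSD-based approach mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common power-law model MSD(t) ~ t^alpha, where alpha < 1 indicates subdiffusion, alpha = 1 normal diffusion, and alpha > 1 superdiffusion. The function names, the maximum lag of 10, and the simulated trajectories are all hypothetical choices made for illustration.

```python
import numpy as np


def time_averaged_msd(traj, max_lag):
    """Time-averaged MSD of one trajectory of shape (n_steps, n_dims)."""
    lags = np.arange(1, max_lag + 1)
    msd = np.array([
        # Mean squared displacement over all pairs of points `lag` steps apart.
        np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    return lags, msd


def anomalous_exponent(traj, max_lag=10):
    """Estimate alpha from a linear fit of log(MSD) against log(lag)."""
    lags, msd = time_averaged_msd(traj, max_lag)
    slope, _intercept = np.polyfit(np.log(lags), np.log(msd), 1)
    return slope


# Illustrative synthetic data (an assumption, not data from the paper):
# a 2D Brownian path and the same path with a constant drift added.
rng = np.random.default_rng(0)
brownian = np.cumsum(rng.normal(size=(2000, 2)), axis=0)
directed = brownian + 0.5 * np.arange(2000)[:, None]

alpha_brownian = anomalous_exponent(brownian)
alpha_directed = anomalous_exponent(directed)
print(f"alpha (Brownian): {alpha_brownian:.2f}")
print(f"alpha (with drift): {alpha_directed:.2f}")
```

The fitted exponent for the Brownian path should land near 1, while the drifting path should give a noticeably larger value. Estimates of this kind are sensitive to trajectory length and to the chosen range of lags, which is one of the limitations that motivates the alternative approaches the abstract refers to.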