Paper title
A probabilistic latent variable model for detecting structure in binary data
Paper authors
Paper abstract
We introduce a novel probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), which has previously been used for disease and topic modelling. We demonstrate the model's capability by extracting structure from recordings of retinal neurons, but it can be applied widely to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" the spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together because of mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of one cell assembly. When the latent variable of a cell assembly is active, it reduces the probability that the neurons belonging to this assembly are inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme that alternates between inferring the latent states and adjusting the model parameters. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white-noise and natural-movie stimuli. We also apply the model to spiking responses recorded from retinal ganglion cells (RGCs) during stimulation with a movie and discuss the structure found.
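The Noisy-OR likelihood described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the parametrisation by per-assembly "silent" probabilities, and the leak term for background firing are all assumptions made for illustration. It shows only how an active assembly lowers the probability that its member neurons stay inactive; inference and the expectation maximization updates are not covered.

```python
from math import prod


def noisy_or_spike_prob(z, silent_prob, leak):
    """Return P(neuron i spikes | z) for every neuron under a Noisy-OR likelihood.

    z           : list of 0/1 flags, z[j] = 1 if (hypothetical) assembly j is active
    silent_prob : silent_prob[j][i] = probability that neuron i stays silent
                  despite assembly j being active (1.0 means "not a member")
    leak        : leak[i] = probability that neuron i stays silent when no
                  assembly is active (assumed background term)
    """
    probs = []
    for i in range(len(leak)):
        # A neuron is silent only if the background and every active
        # assembly all fail to make it fire, so the factors multiply.
        p_silent = leak[i] * prod(
            silent_prob[j][i] for j, active in enumerate(z) if active
        )
        probs.append(1.0 - p_silent)
    return probs


# Illustrative values (not taken from the paper): 2 assemblies, 4 neurons.
silent_prob = [
    [0.2, 0.2, 1.0, 1.0],  # assembly 0 drives neurons 0 and 1
    [1.0, 0.3, 0.3, 1.0],  # assembly 1 drives neurons 1 and 2
]
leak = [0.95, 0.95, 0.95, 0.95]

p_none = noisy_or_spike_prob([0, 0], silent_prob, leak)
p_first = noisy_or_spike_prob([1, 0], silent_prob, leak)
p_both = noisy_or_spike_prob([1, 1], silent_prob, leak)
```

With these illustrative numbers, activating assembly 0 raises the spike probability of its two member neurons while leaving the others at the background rate, and a neuron shared by two active assemblies becomes more likely to fire than under either assembly alone.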