Paper Title
How recurrent networks implement contextual processing in sentiment analysis
Paper Authors
Paper Abstract
Neural networks have a remarkable capacity for contextual processing: using recent or nearby inputs to modify the processing of the current input. For example, in natural language, contextual processing is necessary to correctly interpret negation (e.g. phrases such as "not bad"). However, our ability to understand how networks process context is limited. Here, we propose general methods for reverse engineering recurrent neural networks (RNNs) to identify and elucidate contextual processing. We apply these methods to understand RNNs trained on sentiment classification. This analysis reveals inputs that induce contextual effects, quantifies the strength and timescale of these effects, and identifies sets of these inputs with similar properties. Additionally, we analyze contextual effects related to differential processing of the beginning and end of documents. Using the insights learned from the RNNs, we improve baseline Bag-of-Words models with simple extensions that incorporate contextual modification, recovering more than 90% of the RNN's performance increase over the baseline. This work yields a new understanding of how RNNs process contextual information, and provides tools that should yield similar insight more broadly.
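To make the idea of "contextual modification" of a Bag-of-Words model concrete, here is a minimal sketch in Python. It is not the paper's implementation: the word valences, the modifier words, and their gains and timescales below are invented for illustration. The sketch only shows the general mechanism the abstract describes, in which modifier words (e.g. "not", "very") rescale the valence of the next few input words before they are summed.

```python
# Hypothetical sketch of a Bag-of-Words sentiment score with contextual
# modification. All lexicon values and modifier parameters are assumptions,
# not values from the paper.

# Per-word sentiment valences (assumed for illustration).
VALENCE = {"good": 1.0, "bad": -1.0, "great": 1.5, "terrible": -1.5}

# Modifier words rescale the valence of the next few words:
# word -> (gain, timescale in tokens). Both numbers are assumptions.
MODIFIERS = {"not": (-1.0, 3), "never": (-1.0, 3), "very": (1.5, 2)}

def contextual_bow_score(tokens):
    """Sum per-word valences, letting a recent modifier rescale them."""
    score = 0.0
    gain, remaining = 1.0, 0  # active modifier gain and tokens it still affects
    for tok in tokens:
        if tok in MODIFIERS:
            gain, remaining = MODIFIERS[tok]
            continue
        effective = gain if remaining > 0 else 1.0
        score += effective * VALENCE.get(tok, 0.0)
        if remaining > 0:
            remaining -= 1
            if remaining == 0:
                gain = 1.0  # modifier effect has decayed
    return score

if __name__ == "__main__":
    # Negation flips "bad" to a positive contribution.
    print(contextual_bow_score("not bad".split()))         # 1.0
    # An intensifier amplifies "good".
    print(contextual_bow_score("very good movie".split())) # 1.5
```

A plain Bag-of-Words model would score "not bad" as negative; the modifier mechanism is what lets this toy version recover the correct interpretation, which is the kind of contextual effect the abstract attributes to the trained RNNs.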