Paper title
Anomaly detection optimization using big data and deep learning to reduce false-positive
Authors
Abstract
Anomaly-based intrusion detection systems (IDS) have been a hot research topic because they can detect new threats, whereas signature-based IDS detect only threats whose signatures are already known. This matters all the more now that advanced technologies have increased both the number of hacking tools and the impact of attacks. The main problem with any anomaly-based model is its high false-positive rate, which is why anomaly-based IDS are not commonly deployed in practice: such a model classifies an unseen pattern as a threat even when the pattern is normal but simply absent from the training dataset. This is an overfitting problem, in which the model fails to generalize. Training on a dataset large enough to include all possible normal cases might be an optimal solution, but it cannot be applied in practice. Although we can increase the number of training samples to cover many more normal cases, we still need a model with a greater ability to generalize. In this paper, we propose applying a deep model instead of traditional models because of its greater ability to generalize, so that combining big data with a deep model yields fewer false positives. We compare machine learning and deep learning algorithms for optimizing anomaly-based IDS by decreasing the false-positive rate. We ran an experiment on the NSL-KDD benchmark and compared our results with one of the best classifiers used in traditional learning for IDS optimization. The experiment shows a 10% reduction in false positives when deep learning is used instead of traditional learning.
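The metric the abstract centers on can be made concrete with a minimal sketch. It assumes the standard definition of the false-positive rate, FP / (FP + TN), with label 1 for attack and 0 for normal traffic. The function name and the toy label lists are illustrative assumptions, not the authors' code or their NSL-KDD results.

```python
def false_positive_rate(y_true, y_pred):
    """Return FP / (FP + TN), treating 1 as attack and 0 as normal."""
    # False positive: normal traffic (0) flagged as an attack (1).
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # True negative: normal traffic (0) correctly left unflagged (0).
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("y_true contains no normal samples")
    return fp / (fp + tn)


# Toy labels for illustration only (hypothetical, not experimental data).
y_true = [0, 0, 0, 0, 1, 1]
baseline_pred = [1, 1, 0, 0, 1, 1]  # hypothetical traditional classifier
deep_pred = [1, 0, 0, 0, 1, 1]      # hypothetical deep model

fpr_baseline = false_positive_rate(y_true, baseline_pred)
fpr_deep = false_positive_rate(y_true, deep_pred)
print(f"baseline FPR = {fpr_baseline:.2f}, deep FPR = {fpr_deep:.2f}")
```

Comparing two classifiers on the same held-out labels in this way is the kind of comparison the abstract describes; which of the two is lower on real data depends entirely on the trained models.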