Paper Title

Flexible Log File Parsing using Hidden Markov Models

Authors

Nadine Kuhnert, Andreas Maier

Abstract

We aim to model the processing of unknown files. As the content of log files often evolves over time, we establish a dynamic statistical model that learns and adapts processing and parsing rules. First, similar to Vaarandi [10], we limit the amount of unstructured text by focusing only on those frequent patterns that lead to the desired output table. Second, we transform the frequent patterns found, together with the output that states the parsed table, into a Hidden Markov Model (HMM). We use this HMM as a specific yet flexible representation of a pattern for log file processing. When changes in the raw log file distort learned patterns, we aim for the model to adapt automatically in order to maintain high-quality output. After training our model on one system type and applying the model and the resulting parsing rule to a different system with slightly different log file patterns, we achieve an accuracy of over 99%.
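
To make the idea of an HMM as a parsing rule more concrete, here is a minimal sketch of how hidden states could stand for output table columns and be decoded per token with the Viterbi algorithm. It is not the authors' implementation: the state names, the `token_class` helper, and all probabilities are hypothetical and hand-picked for illustration, and the training and automatic adaptation described in the abstract are not reproduced.

```python
import math

# Hypothetical hidden states: the output table column each token belongs to.
STATES = ["TIMESTAMP", "LEVEL", "MESSAGE"]


def token_class(token):
    """Map a raw token to a coarse observation symbol (illustrative only)."""
    if token[0].isdigit():
        return "numeric"
    if token.isupper():
        return "upper"
    return "word"


# Hand-picked probabilities for illustration; a real model would learn these.
START = {"TIMESTAMP": 0.8, "LEVEL": 0.1, "MESSAGE": 0.1}
TRANS = {
    "TIMESTAMP": {"TIMESTAMP": 0.5, "LEVEL": 0.4, "MESSAGE": 0.1},
    "LEVEL": {"TIMESTAMP": 0.05, "LEVEL": 0.05, "MESSAGE": 0.9},
    "MESSAGE": {"TIMESTAMP": 0.05, "LEVEL": 0.05, "MESSAGE": 0.9},
}
EMIT = {
    "TIMESTAMP": {"numeric": 0.9, "upper": 0.05, "word": 0.05},
    "LEVEL": {"numeric": 0.05, "upper": 0.9, "word": 0.05},
    "MESSAGE": {"numeric": 0.2, "upper": 0.1, "word": 0.7},
}


def viterbi(tokens):
    """Return the most likely state (column) for each token of one log line."""
    if not tokens:
        return []
    obs = [token_class(t) for t in tokens]
    # score[s]: best log-probability of any state path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # one back-pointer table per transition
    for o in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            pointers[s] = best
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


line = "2019-03-01 12:00:01 ERROR disk full"
print(list(zip(line.split(), viterbi(line.split()))))
```

Because the labels come from probabilities rather than a fixed regular expression, a line with an extra or reordered token can still be assigned to columns, which is the kind of flexibility the abstract attributes to the HMM representation.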
