Farshchi, Mostafa and Weber, Ingo and Della Corte, Raffaele and Pecchia, Antonio and Cinque, Marcello and Schneider, Jean-Guy and Grundy, John. Technical Report: Anomaly Detection for a Critical Industrial System using Context, Logs and Metrics. Technical Report. [undefined].


Document type: Monograph (Technical Report)
Language: English
Title: Technical Report: Anomaly Detection for a Critical Industrial System using Context, Logs and Metrics
Farshchi, Mostafa (mfarshchi@swin.edu.au)
Weber, Ingo (ingo.weber@data61.csiro.au)
Della Corte, Raffaele (raffaele.dellacorte2@unina.it)
Pecchia, Antonio (antonio.pecchia@unina.it)
Cinque, Marcello (macinque@unina.it)
Schneider, Jean-Guy (jschneider@swin.edu.au)
Grundy, John (john.grundy@monash.edu.au)
Author(s): Mostafa Farshchi, Ingo Weber, Raffaele Della Corte, Antonio Pecchia, Marcello Cinque, Jean-Guy Schneider, John Grundy
Institution: Università di Napoli Federico II
Additional institutions: Data61, CSIRO, Sydney, Australia; Swinburne University of Technology, Melbourne, Australia; Monash University, Melbourne, Australia
Department: Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione
Keywords: Anomaly detection, contextual anomaly, system monitoring, log analysis, change detection.
References:
[1] M. Farshchi, I. Weber, R. Della Corte, A. Pecchia, M. Cinque, J.-G. Schneider, and J. Grundy, “Contextual anomaly detection for a critical industrial system based on logs and metrics,” in Proc. 14th European Dependable Computing Conference (EDCC), Sep. 2018.
[2] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Computing Surveys (CSUR), vol. 41, no. 3, pp. 15:1–15:54, 2009.
[3] O. Ibidunmoye, F. Hernandez-Rodriguez, and E. Elmroth, “Performance anomaly detection and bottleneck identification,” ACM Computing Surveys (CSUR), vol. 48, no. 4, pp. 1–35, 2015.
[4] L. Cherkasova, K. Ozonat, N. Mi, J. Symons, and E. Smirni, “Anomaly? Application change? Or workload change? Towards automated detection of application performance anomaly and change,” in Proc. Intl. Conf. on Dependable Systems and Networks (DSN), Jun. 2008, pp. 452–461.
[5] H. Kang, X. Zhu, and J. L. Wong, “DAPA: diagnosing application performance anomalies for virtualized infrastructures,” in Proc. USENIX Hot-ICE Workshop, 2012, pp. 1–8.
[6] J. P. Magalhães and L. M. Silva, “Anomaly detection techniques for web-based applications: An experimental study,” in Proc. IEEE Intl. Symp. on Network Computing and Applications, Aug. 2012, pp. 181–190.
[7] N. Gurumdimma, A. Jhumka, M. Liakata, E. Chuah, and J. Browne, “CRUDE: combining resource usage data and error logs for accurate error detection in large-scale distributed systems,” in Proc. IEEE Symposium on Reliable Distributed Systems (SRDS), Sep. 2016, pp. 51–60.
[8] M. Farshchi, J.-G. Schneider, I. Weber, and J. Grundy, “Metric selection and anomaly detection for cloud operations using log and metric correlation analysis,” Journal of Systems and Software, 2017.
[9] A. J. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in Proc. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2007, pp. 76–86.
[10] M. Cinque, D. Cotroneo, R. Della Corte, and A. Pecchia, “Characterizing direct monitoring techniques in software systems,” IEEE Transactions on Reliability, vol. 65, no. 4, pp. 1665–1681, Dec. 2016.
[11] I. Weber, C. Li, L. Bass, X. Xu, and L. Zhu, “Discovering and Visualizing Operations Processes with POD-Discovery and POD-Viz,” in Proc. DSN, Jun. 2015, pp. 537–544.
[12] T. Wang, J. Wei, W. Zhang, H. Zhong, and T. Huang, “Workload-aware anomaly detection for web applications,” Journal of Systems and Software, vol. 89, pp. 19–32, Mar. 2014.
[13] T. Wang, W. Zhang, C. Ye, J. Wei, H. Zhong, and T. Huang, “FD4C: automatic fault diagnosis framework for web applications in cloud computing,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 1, pp. 61–75, 2016.
[14] T. Kelly, “Transaction mix performance models: Methods and application to performance anomaly detection,” in Proc. 20th ACM Symposium on Operating Systems Principles. ACM, 2005, pp. 1–3.
[15] A. Pecchia, S. Russo, and S. Sarkar, “Assessing invariant mining techniques for cloud-based utility computing systems,” IEEE Transactions on Services Computing, pp. 1–1, 2017.
[16] M. Farshchi, J.-G. Schneider, I. Weber, and J. Grundy, “Experience report: Anomaly detection of cloud application operations using log and cloud metric correlation analysis,” in Proc. IEEE Intl. Symp. on Software Reliability Engineering (ISSRE), Nov. 2015, pp. 24–34.
[17] V. Hodge and J. Austin, “A survey of outlier detection methodologies,” Artificial Intelligence Review, vol. 22, no. 2, pp. 85–126, Oct. 2004.
[18] M. Agyemang, K. Barker, and R. Alhajj, “A comprehensive survey of numeric and symbolic outlier mining techniques,” Intelligent Data Analysis, vol. 10, no. 6, pp. 521–538, 2006.
[19] G. Aceto, A. Botta, W. de Donato, and A. Pescapè, “Cloud monitoring: A survey,” Computer Networks, vol. 57, no. 9, pp. 2093–2115, 2013.
[20] P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, “An evaluation study on log parsing and its use in log mining,” in Proc. IEEE/IFIP Intl. Conf. on Dependable Systems and Networks (DSN), June 2016, pp. 654–661.
[21] V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” in Soviet Physics Doklady, vol. 10, 1966, pp. 707–710.
[22] Q. Fu, J. G. Lou, Y. Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in Proc. IEEE Intl. Conf. on Data Mining, Dec. 2009, pp. 149–158.
[23] R. P. J. C. Bose and W. M. P. van der Aalst, “Discovering signature patterns from event logs,” in Proc. IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Apr. 2013, pp. 111–118.
[24] W. Xu, “System problem detection by mining console logs,” Ph.D. dissertation, University of California, Berkeley, 2010.
[25] J. Stearley, “Towards informatic analysis of syslogs,” in Proc. IEEE International Conference on Cluster Computing, Sep. 2004, pp. 309–318.
[26] R. Vaarandi, M. Kont, and M. Pihelgas, “Event log analysis with the logcluster tool,” in Proc. IEEE Military Communications Conference (MILCOM), Nov. 2016, pp. 982–987.
[27] J. W. Osborne, “Prediction in multiple regression,” Practical Assessment, Research and Evaluation (PARE), vol. 7, no. 2, pp. 1–9, 2000.
[28] J. P. Onyango and A. Plews, A textbook of basic statistics. East African Publishers, 1987.
[29] J. Friedman, T. Hastie, and R. Tibshirani, The elements of statistical learning. Springer, 2001.
[30] A. Avizienis, J. C. Laprie, B. Randell, and C. Landwehr, “Basic concepts and taxonomy of dependable and secure computing,” IEEE Trans. on Dependable and Secure Computing, vol. 1, no. 1, pp. 11–33, Jan. 2004.
[31] J. Makhoul, F. Kubala, R. Schwartz, and R. Weischedel, “Performance measures for information extraction,” in Proceedings of DARPA broadcast news workshop, 1999, pp. 249–252.
[32] P. Bodik, A. Fox, M. J. Franklin, M. I. Jordan, and D. A. Patterson, “Characterizing, modeling, and generating workload spikes for stateful services,” in Proc. 1st ACM Symposium on Cloud Computing. ACM, 2010, pp. 241–252.
[33] M. Sladescu, A. Fekete, K. Lee, and A. Liu, “GEAP: a generic approach to predicting workload bursts for web hosted events,” in Proc. 15th International Conference Web Information Systems Engineering (WISE). Springer International Publishing, Oct. 2014, pp. 319–335.
[34] A. Mehta, J. Drango, J. Tordsson, and E. Elmroth, “Online spike detection in cloud workloads,” in Proc. IEEE International Conference on Cloud Engineering, Mar. 2015, pp. 446–451.
[35] A. Patcha and J. Park, “An overview of anomaly detection techniques: Existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.
[36] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering: An Introduction. Kluwer Academic, 2000.
MIUR scientific-disciplinary sectors: Area 01 - Mathematical and computer sciences > INF/01 - Computer science
Area 09 - Industrial and information engineering > ING-INF/05 - Information processing systems
Deposited on: 25 Jun 2018 06:28
Last modified: 25 Jun 2018 06:28
URI: http://www.fedoa.unina.it/id/eprint/11969
DOI: 10.6093/UNINA/FEDOA/11969


Recent advances in contextual anomaly detection attempt to combine resource metrics and event logs to uncover unexpected system behaviors and malfunctions at runtime. These techniques are highly relevant for critical software systems, where monitoring is often mandated by international standards and guidelines. In this technical report, we analyze the effectiveness of a metrics-logs contextual anomaly detection technique in a middleware for Air Traffic Control systems. Our study addresses the challenges of applying such techniques to a new case study with a dense volume of logs and a finer monitoring sampling rate. We propose an automated abstraction approach to infer system activities from dense logs, and we use regression analysis to derive the anomaly detector. We observed that detection accuracy is impacted when resource metrics change abruptly or when anomalies are asymptomatic in both resource metrics and event logs. Guided by our experimental results, we propose and evaluate several actionable improvements, which include a change detection algorithm and the use of time windows in contextual anomaly detection. This technical report accompanies the paper “Contextual Anomaly Detection for a Critical Industrial System based on Logs and Metrics” [1] and provides further details on the analysis method, case study and experimental results.
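
As a rough illustration of the regression idea mentioned in the abstract, the sketch below fits a line that predicts a resource metric from the number of log events seen in a time window, then flags a window when the observed metric deviates from the prediction by more than a threshold. It is not the report's implementation: the single predictor, the function names (`fit_line`, `train_detector`, `is_anomalous`), the `k`-sigma threshold and the sample data are all assumptions made for this example.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def train_detector(event_counts, metric_values, k=3.0):
    """Fit the regression on anomaly-free windows and derive a residual threshold.

    The threshold of k residual standard deviations is an assumed rule,
    chosen only to keep the example short.
    """
    slope, intercept = fit_line(event_counts, metric_values)
    residuals = [y - (slope * x + intercept)
                 for x, y in zip(event_counts, metric_values)]
    return slope, intercept, k * pstdev(residuals)


def is_anomalous(detector, event_count, metric_value):
    """Flag a window whose metric is far from what its log activity predicts."""
    slope, intercept, threshold = detector
    predicted = slope * event_count + intercept
    return abs(metric_value - predicted) > threshold


# Hypothetical training windows: log events per window and CPU usage (%).
train_counts = [10, 20, 30, 40, 50, 60]
train_cpu = [12.0, 21.5, 31.0, 42.5, 51.0, 62.0]
detector = train_detector(train_counts, train_cpu)

low_activity_high_cpu = is_anomalous(detector, 30, 60.0)   # flagged
high_activity_high_cpu = is_anomalous(detector, 60, 61.0)  # not flagged
```

The two calls at the end show why the log context matters: roughly the same CPU reading is flagged when the logs indicate little activity and accepted when they indicate heavy activity.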
