Persia, Fabio (2013) Finding unexplained activities in time-stamped observation data. [Doctoral thesis]

Item Type: Doctoral thesis
Language: English
Title: Finding unexplained activities in time-stamped observation data
Creators: Persia, Fabio (fabio.persia@unina.it)
Date: 2 April 2013
Number of Pages: 108
Institution: Università degli Studi di Napoli Federico II
Department: Ingegneria Elettrica e delle Tecnologie dell'Informazione
Doctoral school: Ingegneria dell'informazione
Doctoral programme: Ingegneria informatica ed automatica
Doctoral cycle: 25
Doctoral programme coordinator: Garofalo, Francesco (franco.garofalo@unina.it)
Tutors:
Picariello, Antonio (picus@unina.it)
Moscato, Vincenzo (vmoscato@unina.it)
Uncontrolled Keywords: unexplained activities, activity recognition, data reasoning
MIUR scientific-disciplinary sectors: Area 09 - Industrial and Information Engineering > ING-INF/05 - Information Processing Systems
Thematic areas (7th Framework Programme): INFORMATION AND COMMUNICATION TECHNOLOGIES > "Smarter" machines, better services
Date Deposited: 05 Apr 2013 12:35
Last Modified: 22 Jul 2014 11:32
URI: http://www.fedoa.unina.it/id/eprint/9362
DOI: 10.6092/UNINA/FEDOA/9362

Abstract

Activity recognition is a major challenge for the research community. Numerous techniques already exist that find occurrences of activities in time-stamped observation data (e.g., a video or a sequence of transactions at a website), with each occurrence having an associated probability. However, all these techniques rely on models encoding a priori knowledge of either normal or malicious behavior, so they cannot deal with events such as “zero day” attacks that have never been seen before. In practice, none of these methods can quantify how well the available models explain a sequence of events observed in an observation stream. The goal of this thesis is different: to address this issue, we want to find the subsequences of the observation data, called unexplained sequences, that known models are not able to “explain” with a certain confidence. We start with a known set A of activities (both innocuous and dangerous) that we wish to monitor, and we aim to identify subsequences of an observation sequence that are poorly explained, e.g., because they may contain occurrences of activities that have never been seen or anticipated before (i.e., activities not in A). We formally define the probability that a sequence of observations is unexplained (totally or partially) w.r.t. A. We develop efficient algorithms to identify the top-k Totally and Partially Unexplained Activities w.r.t. A. These algorithms leverage theorems that enable us to speed up the search for totally/partially unexplained activities. We describe experiments on real-world video and cyber security datasets showing that our approach works well in practice in terms of both running time and accuracy.
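
As a rough illustration of the idea of ranking subsequences by how poorly known activities explain them, the following toy sketch may help. It is not the formal definition or the algorithms of the thesis. It assumes, purely for illustration, that detected occurrences of activities in A are independent events, and it treats a window as totally unexplained when none of the occurrences overlapping it actually took place. The names `Occurrence`, `unexplained_score`, and `top_k_unexplained`, and the fixed window length, are hypothetical.

```python
from heapq import nlargest
from typing import List, Tuple

# Hypothetical representation of a detected occurrence of a known activity:
# (first observation index, last observation index, probability), inclusive.
Occurrence = Tuple[int, int, float]


def unexplained_score(start: int, end: int, occurrences: List[Occurrence]) -> float:
    """Toy score: probability that no occurrence overlapping [start, end] is real.

    Assumes occurrences are independent (an illustrative simplification).
    """
    score = 1.0
    for occ_start, occ_end, prob in occurrences:
        overlaps = occ_start <= end and occ_end >= start
        if overlaps:
            score *= 1.0 - prob
    return score


def top_k_unexplained(
    n_obs: int, window: int, occurrences: List[Occurrence], k: int
) -> List[Tuple[float, int, int]]:
    """Return the k fixed-length windows with the highest toy score.

    Each result is (score, start index, end index). This is a brute-force
    scan over all windows, with none of the pruning the abstract mentions.
    """
    candidates = [
        (unexplained_score(s, s + window - 1, occurrences), s, s + window - 1)
        for s in range(n_obs - window + 1)
    ]
    return nlargest(k, candidates)


if __name__ == "__main__":
    # Ten observations; two known-activity occurrences cover indices 0..5.
    detected = [(0, 3, 0.9), (2, 5, 0.5)]
    for score, start, end in top_k_unexplained(10, 3, detected, k=2):
        print(f"observations {start}..{end}: score {score:.2f}")
```

In this example the windows lying entirely after index 5 receive the highest score, since no known occurrence overlaps them.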
