Drago, Carlo (2011) The Density Valued Data Analysis in a Temporal Framework: The Data Model Approach. [Doctoral thesis] (Unpublished)


Item Type: Doctoral thesis
Resource language: English
Title: The Density Valued Data Analysis in a Temporal Framework: The Data Model Approach
Author: Drago, Carlo (c.drago@mclink.it)
Date: 30 November 2011
Number of Pages: 479
Institution: Università degli Studi di Napoli Federico II
Department: Mathematics and Statistics
Doctoral school: Mathematical and Computer Sciences
Doctoral programme: Statistics
Doctoral cycle: 24
Doctoral programme coordinator:
Lauro, Carlo Natale (clauro@unina.it)
Lauro, Carlo (clauro@unina.it)
Scepi, Germana (germana.scepi@unina.it)
Keywords: Time Series, Forecasting, Clustering.
MIUR scientific-disciplinary sectors: Area 13 - Economics and Statistics > SECS-S/01 - Statistics
Area 01 - Mathematics and Computer Science > INF/01 - Computer Science
Area 13 - Economics and Statistics > SECS-S/02 - Statistics for Experimental and Technological Research
Date Deposited: 15 Dec 2011 15:53
Last Modified: 05 Dec 2014 14:37
URI: http://www.fedoa.unina.it/id/eprint/9001
DOI: 10.6092/UNINA/FEDOA/9001

Collection description

High-frequency data are characterized by an overwhelming number of observations within the reference period, often a single day. Typically, these data are summarized by their average or by the variation of the observed values, expressed through the upper and lower values (or suitable quantiles). This interval or range usually conveys useful information about the variability of the data. More recently, histograms and boxplots have been employed to obtain a more informative representation of high-frequency data. Anomalies and random or systematic errors can affect such representations and their subsequent interpretation and use. To address these problems, and assuming the classical decomposition of data as the sum of a model plus an error, we propose to represent intra-period high-frequency data by density models such as beanplots, based on a suitable mixture of distributions. The location, size and shape of such models are summarized by the estimated model coefficients and visualized by means of classical beanplot silhouettes. On this model-based approach we build a beanplot time series, that is, a vector time series whose elements are the estimated coefficients of each beanplot. In this way we can solve the problem of storing high-frequency data by means of a few coefficients: in fact, only one beanplot and the generating matrix are required. The main advantage of this representation and the corresponding visualization, however, lies in their capacity to highlight anomalies or anticipate structural pattern changes in a beanplot time series, as well as to provide useful tools for short-period forecasting. In this respect, it is fruitful to use multivariate control chart techniques to provide signals of anomalous observations or early warnings of structural changes.
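As a rough illustration of the representation described above, the following Python sketch turns each period of high-frequency observations into a short coefficient vector and stacks the vectors into a vector time series. It is not the thesis's R code. The two-component Gaussian mixture fitted by EM is an assumed stand-in for the mixture model used in the thesis, the function name `fit_two_gaussian_mixture` is hypothetical, and the data are synthetic.

```python
import math
import random


def fit_two_gaussian_mixture(x, n_iter=200, tol=1e-9):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns [w_low, mu_low, sd_low, mu_high, sd_high], where "low" is the
    component with the smaller mean and w_low is its mixing weight.
    """
    xs = sorted(x)
    n = len(xs)
    # Initialise each component from one half of the sorted sample.
    halves = [xs[: n // 2], xs[n // 2:]]
    mu = [sum(h) / len(h) for h in halves]
    sd = [
        max(math.sqrt(sum((v - m) ** 2 for v in h) / len(h)), 1e-6)
        for h, m in zip(halves, mu)
    ]
    w = [0.5, 0.5]
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        ll = 0.0
        for v in xs:
            d = [
                w[k]
                * math.exp(-0.5 * ((v - mu[k]) / sd[k]) ** 2)
                / (sd[k] * math.sqrt(2 * math.pi))
                for k in range(2)
            ]
            total = max(d[0] + d[1], 1e-300)
            resp.append((d[0] / total, d[1] / total))
            ll += math.log(total)
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            w[k] = nk / n
            mu[k] = sum(r[k] * v for r, v in zip(resp, xs)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    lo, hi = (0, 1) if mu[0] <= mu[1] else (1, 0)
    return [w[lo], mu[lo], sd[lo], mu[hi], sd[hi]]


# Synthetic example: five "days", each a bimodal sample whose level drifts up.
rng = random.Random(0)
periods = [
    [rng.gauss(100 + t, 0.5) for _ in range(300)]
    + [rng.gauss(103 + t, 0.5) for _ in range(200)]
    for t in range(5)
]

# One coefficient vector per period: the rows form the vector time series.
coefs = [fit_two_gaussian_mixture(p) for p in periods]
for t, row in enumerate(coefs):
    print(t, [round(c, 3) for c in row])
```

Each day's 500 observations are reduced to five numbers, which is the storage saving the abstract refers to. The rows of `coefs` can then be passed to any multivariate time series, control chart or forecasting method.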
At the same time, these models are useful for studying the evolution in the medium and long run, using classical approaches developed for multivariate time series or approaches based on time series factor analysis applied to multivariate sequences of coefficient vectors. These models of single or multiple beanplot time series over the chosen period are also useful in forecasting problems. In the case of multiple beanplot time series based on different sets of high-frequency data observed simultaneously, or on the same set observed on different occasions, cluster analysis methods can be used to search for suitable prototypes for building composite indicators, or to discover homogeneous (and contiguous) time segments corresponding to pattern changes. The tools considered in this thesis are useful in various financial applications such as trading, stock picking, statistical arbitrage and risk management.

The thesis is structured as follows:
Chapter 1. The Analysis of Massive Data Sets
Chapter 2. Complex Data in a Temporal Framework
Chapter 3. Foundations of Interval Data Representations
Chapter 4. Foundations of Boxplot and Histogram Data Representations
Chapter 5. Foundations of Density Valued Data: Representations
Chapter 6. Visualization and Exploratory Analysis of Beanplot Data
Chapter 7. Beanplot Modelling
Chapter 8. Beanplot Time Series Forecasting
Chapter 9. Beanplot Time Series Clustering
Chapter 10. Beanplot Model Evaluation
Chapter 11. Case Studies: Market Monitoring, Asset Allocation, Statistical Arbitrage and Risk Management

The thesis is accompanied by a library of programs in R built on the presented methods.
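The clustering idea mentioned in the abstract (prototypes and contiguous homogeneous time segments) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method or the R library of the thesis. Plain k-means is an assumed choice of clustering method, the functions `kmeans` and `contiguous_segments` are hypothetical helpers, and the coefficient vectors are made-up values for six consecutive periods.

```python
import math


def kmeans(points, k, n_iter=100):
    """Plain k-means with a deterministic start (k evenly spaced input points)."""
    step = len(points) / k
    centroids = [list(points[int(i * step)]) for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def contiguous_segments(labels):
    """Collapse a label sequence into (first_index, last_index, label) runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i - 1, labels[start]))
            start = i
    return segments


# Made-up coefficient vectors (weight, mean 1, sd 1, mean 2, sd 2) for six
# consecutive periods; the last three mimic a change in the density shape.
coef_vectors = [
    [0.60, 100.1, 0.5, 103.0, 0.5],
    [0.58, 100.3, 0.5, 103.2, 0.6],
    [0.61, 99.9, 0.4, 102.9, 0.5],
    [0.30, 100.0, 1.5, 108.0, 2.0],
    [0.28, 100.2, 1.6, 108.3, 2.1],
    [0.33, 99.8, 1.4, 107.9, 1.9],
]

labels, prototypes = kmeans(coef_vectors, k=2)
segments = contiguous_segments(labels)
print(labels)    # [0, 0, 0, 1, 1, 1]
print(segments)  # [(0, 2, 0), (3, 5, 1)]
```

The centroids in `prototypes` play the role of prototype coefficient vectors, and the boundary between the two segments marks the candidate pattern change. In practice the coefficients would likely need to be standardized before clustering, since they are measured on different scales.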

