Guerriero, Antonio (2022) Operational Accuracy Assessment of CNN-based Image Classifiers. [Doctoral thesis]

Text: Guerriero_Antonio_34.pdf (7MB)
Item Type: Doctoral thesis
Resource language: English
Title: Operational Accuracy Assessment of CNN-based Image Classifiers
Creators: Guerriero, Antonio (antonio.guerriero@unina.it)
Date: 10 March 2022
Number of Pages: 106
Institution: Università degli Studi di Napoli Federico II
Department: Ingegneria Elettrica e delle Tecnologie dell'Informazione
Doctoral programme: Information Technology and Electrical Engineering
Doctoral cycle: 34
Doctoral programme coordinator: Riccio, Daniele (daniele.riccio@unina.it)
Tutors: Russo, Stefano (email unspecified); Pietrantuono, Roberto (email unspecified)
Keywords: Machine Learning, Image Classification, Automatic Oracles, Sampling, Assessment
MIUR scientific-disciplinary sectors: Area 09 - Industrial and information engineering > ING-INF/05 - Information processing systems
Date Deposited: 22 May 2022 21:27
Last Modified: 28 Feb 2024 10:50
URI: http://www.fedoa.unina.it/id/eprint/14445

Collection description

Machine Learning (ML) systems are now widely adopted in many application domains. In Image Classification (IC), where Convolutional Neural Networks (CNNs) are the state-of-the-art ML models, they can even outperform human beings. Once a CNN is deployed in its operational environment, its performance can differ markedly from the pre-release estimate because of unpredictable or unconsidered operating conditions. The life cycle of ML systems in use at large companies such as Google envisages a loop in which the system is continuously monitored in operation, and the gathered data are used to decide the corrections and improvements to apply in the next cycle. Since the amount of monitoring data in a cycle can be huge and the correct output for each operational input is unknown (manual labeling is required), evaluating the accuracy of the system in operation (operational accuracy) is costly and time-consuming. This dissertation addresses the problem of assessing the operational accuracy of CNN-based image classifiers. In line with the emerging life cycle for ML systems, the thesis approaches the assessment problem from two perspectives: online assessment, directly in the operational environment, and offline assessment, in the development environment. Online assessment is based on automated oracles, typically used for failure detection. Its advantage is that it provides a continuous evaluation of the accuracy, as close as possible to the actual one, without requiring human intervention. Its drawback is that it typically relies on probabilistic (not deterministic) knowledge of the correct label of an operational input. Offline assessment typically involves human beings, who have to provide the correct labels of the operational images, ultimately yielding more faithful accuracy estimates.
This thesis experimentally investigates the complementary characteristics of online and offline assessment techniques, and then proposes combining them to provide continuous yet faithful estimates of the operational accuracy of CNN-based image classifiers, limiting human involvement to a level that may be considered affordable in many applications. The thesis proposes and experimentally evaluates two online assessment techniques and one offline technique for evaluating the accuracy of CNNs. The experimental results show their respective strengths and limitations. In particular, the higher cost of offline assessment compared to online assessment is balanced by estimates closer to the actual accuracy. Building on these experimental results, the thesis proposes a hybrid CNN Accuracy Assessment Cycle (CNN-AAC), combining online and offline techniques, which can be integrated into iterative industrial-strength life cycle models for CNN-based systems.
