Guerriero, Antonio (2022) Operational Accuracy Assessment of CNN-based Image Classifiers. [Doctoral thesis]
Item Type: | Doctoral thesis |
---|---|
Resource language: | English |
Title: | Operational Accuracy Assessment of CNN-based Image Classifiers |
Creators: | Guerriero, Antonio (antonio.guerriero@unina.it) |
Date: | 10 March 2022 |
Number of Pages: | 106 |
Institution: | Università degli Studi di Napoli Federico II |
Department: | Ingegneria Elettrica e delle Tecnologie dell'Informazione |
Doctoral programme: | Information technology and electrical engineering |
Doctoral cycle: | 34 |
Doctoral programme coordinator: | Riccio, Daniele (daniele.riccio@unina.it) |
Tutor: | Russo, Stefano; Pietrantuono, Roberto |
Keywords: | Machine Learning, Image Classification, Automatic Oracles, Sampling, Assessment |
MIUR scientific-disciplinary sectors: | Area 09 - Industrial and information engineering > ING-INF/05 - Information processing systems |
Date Deposited: | 22 May 2022 21:27 |
Last Modified: | 28 Feb 2024 10:50 |
URI: | http://www.fedoa.unina.it/id/eprint/14445 |
Collection description
Machine Learning (ML) systems are now widely adopted in many application domains. In the field of Image Classification (IC), where Convolutional Neural Networks (CNNs) represent the state-of-the-art ML models, they can even outperform human beings. The performance of a CNN once deployed in the operational environment can differ substantially from that estimated before release, due to unpredictable or unconsidered operating conditions. The life cycle of ML systems in use at large companies such as Google envisages a loop in which the system is continuously monitored in operation, and the gathered data are used to decide the corrections and improvements to apply in the next cycle. Since the amount of monitoring data in a cycle can be huge and the correct output for each operational input is unknown (manual labeling is required), evaluating the accuracy of the system in operation (operational accuracy) is costly and time-consuming.
This dissertation deals with the problem of assessing the operational accuracy of CNN-based image classifiers. In line with the emerging life cycle for ML systems, the thesis targets the assessment problem from two perspectives: online assessment, directly in the operational environment, and offline assessment, in the development environment. Online assessment is based on automated oracles, typically used for failure detection. Its advantage is that it provides a continuous evaluation of the accuracy, as close as possible to the actual one, without requiring human intervention. The drawback is that online assessment is typically based on probabilistic (not deterministic) knowledge of the correct label of an operational input. Offline assessment typically involves human beings, who have to provide the correct label of the operational images, ultimately yielding more faithful accuracy estimates.
This thesis experimentally investigates the complementary characteristics of online and offline assessment techniques, and then proposes combining them to provide continuous yet faithful estimates of the operational accuracy of CNN-based image classifiers, limiting the involvement of human beings to a level that may be considered affordable in many applications. The thesis proposes and experimentally evaluates two online assessment techniques and one offline technique for evaluating the accuracy of CNNs. The experimental results show their respective strengths and limitations. In particular, the higher cost of offline assessment compared to online assessment is balanced by estimates closer to the actual accuracy. Building on these experimental results, the thesis proposes a hybrid CNN Accuracy Assessment Cycle (CNN-AAC), combining online and offline techniques, which can be integrated into iterative industrial-strength life cycle models for CNN-based systems.