Paper ID: 2411.00876
Resilience to the Flowing Unknown: an Open Set Recognition Framework for Data Streams
Marcos Barcina-Blanco, Jesus L. Lobo, Pablo Garcia-Bringas, Javier Del Ser
Modern digital applications extensively integrate Artificial Intelligence models into their core systems, offering significant advantages for automated decision-making. However, these AI-based systems encounter reliability and safety challenges when handling continuously generated data streams in complex and dynamic scenarios. This work explores the concept of resilient AI systems, which must operate in the face of unexpected events, including instances that belong to patterns not seen during the training process. Regular closed-set classifiers commonly encounter this issue in streaming scenarios, as they are designed to classify every new observation into one of the training patterns, even when it belongs to none of them (i.e., the so-called \textit{over-occupied space} problem). In batch learning, the Open Set Recognition research area has consistently confronted this issue by requiring models to maintain their classification performance when processing query instances from unknown patterns. In this context, this work investigates the application of an Open Set Recognition framework that combines classification and clustering to address the \textit{over-occupied space} problem in streaming scenarios. Specifically, we systematically devise a benchmark comprising different classification datasets with varying ratios of known to unknown classes. Experiments on this benchmark compare the performance of the proposed hybrid framework with that of individual incremental classifiers. A discussion of the results highlights the situations in which the proposed framework performs best and delineates the limitations and hurdles that incremental classifiers encounter when facing the challenges posed by open-world streaming environments.
Submitted: Oct 31, 2024