Forgetful Active Learning with Switch Events - Efficient Sampling for Out-of-Distribution Data

Personnel: Ryan Benkert, Mohit Prabhushankar


Goal/Motivation: The objective of out-of-distribution active learning is the selection of unlabeled data points for enhanced robustness. Specifically, conventional active learning improves performance on data points similar to the training set; out-of-distribution active learning improves performance on dissimilar samples from unknown origins.

Challenges: A major issue in neural network deployment is their sensitivity to unknown inputs. Within the context of deep learning, unknown inputs typically refer to samples originating from settings that significantly differ from the training environment. Since neural networks implicitly model the distribution of their training data, samples originating from the training environment are called in-distribution, while unknown inputs are called out-of-distribution, or OOD for short. In practice, OOD samples result in unpredictable or even random predictions and represent a major challenge for real-world deployment of machine learning paradigms like active learning.

High Level Description of the Work: In active learning, robustness properties are not directly apparent from test set performance. Even when two strategies improve the test set metric by a similar margin, one may significantly outperform the other in out-of-distribution settings (Figure 1). Furthermore, popular approaches are typically not optimized for robustness, and data importance is judged strictly in in-distribution settings.

In this work, we address active learning robustness through the learning dynamics of neural networks. Specifically, we decouple data selection from the model representation by defining sample importance through representation shifts. In each round, we track how frequently the prediction for each unlabeled data point switches and select the most frequently switching points for annotation (Figure 2). For out-of-distribution samples, this step is crucial because model representations are notoriously inaccurate on unknown data [1, 2, 3]. We call our approach “Forgetful Active Learning with Switch Events”, or FALSE for short [4]. FALSE outperforms popular strategies in both out-of-distribution and in-distribution active learning settings (Figure 3).
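
The selection mechanism itself is straightforward to implement. Below is a minimal PyTorch sketch of the switch-event bookkeeping described above; it is not the released implementation, and the helper names (predict_labels, update_switch_counts, select_queries, train_one_epoch) as well as the assumption that the unlabeled pool fits in a single tensor are ours for illustration.

```python
import torch

@torch.no_grad()
def predict_labels(model, pool_x, batch_size=256):
    """Hard predictions for every sample in the unlabeled pool."""
    model.eval()
    preds = [model(pool_x[i:i + batch_size]).argmax(dim=1)
             for i in range(0, len(pool_x), batch_size)]
    return torch.cat(preds)

def update_switch_counts(switch_counts, prev_preds, new_preds):
    """Increment the counter wherever the predicted label changed since the last check."""
    if prev_preds is not None:
        switch_counts += (prev_preds != new_preds).long()
    return switch_counts, new_preds

def select_queries(switch_counts, budget):
    """Return the indices of the `budget` most frequently switching pool samples."""
    return torch.topk(switch_counts, k=budget).indices

# One active learning round (sketch): after every training epoch, re-predict the pool
# and accumulate switch events; at the end of the round, query the top switchers.
# The driver loop is commented out because train_one_epoch, labeled_loader, and
# pool_x are hypothetical placeholders.
# switch_counts = torch.zeros(len(pool_x), dtype=torch.long); prev_preds = None
# for epoch in range(num_epochs):
#     train_one_epoch(model, labeled_loader)
#     new_preds = predict_labels(model, pool_x)
#     switch_counts, prev_preds = update_switch_counts(switch_counts, prev_preds, new_preds)
# query_indices = select_queries(switch_counts, budget=1000)
```

The key design point the sketch illustrates is that sample importance is derived purely from how often the model's prediction flips during training, rather than from confidence or uncertainty values, which are unreliable on out-of-distribution data.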

References:

  1. J. Lee and G. AlRegib, "Gradients as a Measure of Uncertainty in Neural Networks," in IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, Oct. 2020.

  2. G. Kwon, M. Prabhushankar, D. Temel, and G. AlRegib, "Backpropagated Gradient Representations for Anomaly Detection," in Proceedings of the European Conference on Computer Vision (ECCV), SEC, Glasgow, Aug. 23-28 2020.

  3. D. Temel, G. Kwon*, M. Prabhushankar*, and G. AlRegib, "CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition," in Advances in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Intelligent Transportation Systems, Long Beach, CA, Dec. 2017.

  4. R. Benkert, M. Prabhushankar, and G. AlRegib, "Forgetful Active Learning With Switch Events: Efficient Sampling for Out-of-Distribution Data," in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022.