Cross-triggering issue in audio event detection and mitigation
2024
Cross-triggering is a critical problem for applications of audio event detection (AED), particularly in low-resource settings. However, little, if any, attention has been paid to this problem in the AED research community. In this work, we tackle the problem with a regularization approach. We propose a mutual exclusivity regularizer that enforces pairwise exclusivity between two event classes when they do not co-occur. When the regularizer is added to the loss function for network training, an increase in the score of one event class results in a decrease in the score of the other, and vice versa. To quantify the effectiveness of the proposed regularizer, we developed an AED system based on a convolutional neural network (CNN) for the detection of hand claps and door knocks, two transient audio events that share similar spectro-temporal profiles, and conducted experiments on a large-scale real-world dataset (around 274.2 hours). The experimental results show that the proposed approach largely mitigates the cross-triggering issue in various experimental settings. Furthermore, the reduction in cross-triggering leads to an improvement in detection performance.
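To make the idea of a pairwise exclusivity penalty concrete, the following is a minimal sketch and not the formulation used in this work, which the abstract does not spell out. It assumes the network outputs one sigmoid score per class and that the penalty is the mean product of the two class scores, added to a binary cross-entropy loss. The function names, the `weight` parameter, and the example scores are all hypothetical.

```python
import numpy as np


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all examples and classes."""
    probs = np.clip(probs, eps, 1.0 - eps)
    losses = targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs)
    return float(-np.mean(losses))


def exclusivity_penalty(probs, class_a=0, class_b=1):
    """Assumed penalty: mean product of the two class scores.

    The product is large only when both classes score high on the same
    example, which is the cross-triggering case for classes that do not
    co-occur.
    """
    return float(np.mean(probs[:, class_a] * probs[:, class_b]))


def regularized_loss(probs, targets, weight=1.0):
    """Classification loss plus the weighted exclusivity penalty."""
    return binary_cross_entropy(probs, targets) + weight * exclusivity_penalty(probs)


# Hypothetical scores: column 0 = hand clap, column 1 = door knock.
targets = np.array([[1.0, 0.0], [0.0, 1.0]])
cross_triggered = np.array([[0.9, 0.8], [0.7, 0.9]])  # both classes fire
clean = np.array([[0.9, 0.1], [0.1, 0.9]])            # only the true class fires

penalty_cross = exclusivity_penalty(cross_triggered)
penalty_clean = exclusivity_penalty(clean)
loss_cross = regularized_loss(cross_triggered, targets)
loss_clean = regularized_loss(clean, targets)
```

Under this assumed form, the gradient of the penalty with respect to one class score is proportional to the other class score, so a high score for one class pushes the other down during training. That matches the qualitative behavior described above, but the actual regularizer may differ.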