Retrieving Sound Samples of Subjective Interest With User Interaction
Jan Jakubik
DOI: http://dx.doi.org/10.15439/2020F82
Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 387–390 (2020)
Abstract. This paper concerns the retrieval of audio samples with a high degree of user interaction, motivated by a practical use case. We consider an open set recognition scenario in which the goal is to find all occurrences of a subjectively interesting sound selected by a user within a particular audio file. We use only a single starting example and maintain interaction through yes-no answers from the user, indicating whether any new retrieved sound matches the target pattern. We present a small dataset for this task and evaluate a baseline solution based on Nonnegative Matrix Factorization and greedy feature selection.
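The interaction loop described in the abstract can be illustrated with a minimal sketch. It is not the baseline evaluated in the paper: the Nonnegative Matrix Factorization step is replaced by small hand-written feature vectors, and the names `interactive_retrieval`, `greedy_select`, `ask_user` and `budget`, together with the margin criterion used for feature selection, are assumptions made for illustration only. Starting from a single example, the loop proposes the closest unlabeled candidate, records the user's yes-no answer, and greedily reselects the features that best separate confirmed matches from rejected ones.

```python
from typing import Callable, List, Sequence


def centroid_of(candidates: Sequence[Sequence[float]], indices: List[int]) -> List[float]:
    """Per-feature mean of the candidates at the given indices."""
    n_features = len(candidates[indices[0]])
    return [sum(candidates[i][f] for i in indices) / len(indices) for f in range(n_features)]


def distance(a: Sequence[float], b: Sequence[float], features: List[int]) -> float:
    """Mean squared difference restricted to the selected features."""
    return sum((a[f] - b[f]) ** 2 for f in features) / len(features)


def greedy_select(candidates, positives: List[int], negatives: List[int]) -> List[int]:
    """Greedy forward selection (assumed criterion): add the feature that most
    increases the margin between the nearest rejected sample and the farthest
    accepted sample, both measured from the centroid of accepted samples."""
    all_features = list(range(len(candidates[positives[0]])))
    if not negatives:  # nothing to separate from yet, keep every feature
        return all_features
    centroid = centroid_of(candidates, positives)

    def margin(features: List[int]) -> float:
        nearest_neg = min(distance(candidates[i], centroid, features) for i in negatives)
        farthest_pos = max(distance(candidates[i], centroid, features) for i in positives)
        return nearest_neg - farthest_pos

    selected: List[int] = []
    best = float("-inf")
    while len(selected) < len(all_features):
        remaining = [f for f in all_features if f not in selected]
        score, feature = max((margin(selected + [f]), f) for f in remaining)
        if score <= best:  # no further improvement, stop adding features
            break
        selected.append(feature)
        best = score
    return selected


def interactive_retrieval(candidates, seed_index: int,
                          ask_user: Callable[[int], bool], budget: int) -> List[int]:
    """Return indices accepted by the user, starting from one seed example."""
    positives, negatives = [seed_index], []
    unlabeled = set(range(len(candidates))) - {seed_index}
    features = greedy_select(candidates, positives, negatives)
    for _ in range(budget):
        if not unlabeled:
            break
        centroid = centroid_of(candidates, positives)
        # Propose the unlabeled candidate closest to the accepted samples.
        proposal = min(sorted(unlabeled),
                       key=lambda i: distance(candidates[i], centroid, features))
        unlabeled.remove(proposal)
        (positives if ask_user(proposal) else negatives).append(proposal)
        features = greedy_select(candidates, positives, negatives)
    return positives


# Toy data (hypothetical): feature 0 marks the target sound, feature 1 is noise.
candidates = [[1.0, 0.0], [0.9, 5.0], [1.1, -4.0], [0.0, 0.1], [0.1, 0.2]]
target = {0, 1, 2}
asked: List[int] = []


def simulated_user(index: int) -> bool:
    """Stands in for the human listener answering yes or no."""
    asked.append(index)
    return index in target


found = interactive_retrieval(candidates, seed_index=0, ask_user=simulated_user, budget=3)
```

In this toy run the first proposal is a false match, because the noisy feature dominates the distance. After that single "no" answer the selection keeps only the informative feature, and the remaining two questions both retrieve true occurrences.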