Database dependence comparison in detection of physical access voice spoofing attacks

Abstract

Antispoofing challenges are typically designed around a single database, on which a model is trained and tested. The Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) [1] challenge series is a community-led initiative that aims to promote the consideration of spoofing and the development of countermeasures. Analyzing each database individually has been the dominant approach, but this can be misleading. This paper studies the generalization capability of neural-network-based antispoofing systems by combining different databases for training and testing, with the aim of giving a broader view of the advantages of grouping different datasets. We focus on "replay attacks" on physical access data. This type of attack is one of the most difficult to detect, since only a few minutes of audio samples are needed to impersonate the voice of a genuine speaker and gain access to the automatic speaker verification (ASV) system. For this task, we use the databases from the ASVspoof challenges [2], [3], [4] in order to obtain a more concrete and accurate picture of them. We report results on these databases using different neural network architectures and set-ups.

Publication
IberSPEECH 2022