Title
[en] BUILDING A NOISY AUDIO DATASET TO EVALUATE MACHINE LEARNING APPROACHES FOR AUTOMATIC SPEECH RECOGNITION SYSTEMS
Author
[pt] SERGIO COLCHER
Author
[pt] JULIO CESAR DUARTE
Keyword
[en] MACHINE LEARNING
Keyword
[en] NOISY AUDIO
Keyword
[en] AUTOMATIC SPEECH RECOGNITION SYSTEMS
Keyword
[en] DATASETS
Abstract
[en] Automatic speech recognition systems are part of people’s daily lives, embedded in
personal assistants and mobile phones, where they facilitate human-machine interaction and
give access to information in a practically intuitive way. Such systems are usually
implemented using machine learning techniques, especially deep neural networks. Despite
their high performance in transcribing speech to text, few works address recognition in
noisy environments, and the datasets commonly used do not contain noisy audio examples;
the issue is usually only mitigated with data augmentation techniques. This work presents
the process of building a dataset of noisy audio, specifically audio degraded by the
interference commonly present in radio transmissions. Additionally, we present initial
results of a recognizer that uses this data for evaluation, indicating the benefits of
using the dataset in the recognizer’s training process. The recognizer achieves an
average character error rate of 0.4116 on the noisy set (SNR = 30).
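For readers unfamiliar with the metric, character error rate is commonly defined as the character-level edit distance between the reference transcript and the recognizer output, divided by the length of the reference. The sketch below illustrates that common definition only; this record does not state how the authors computed their score, and the function names `levenshtein` and `character_error_rate` are hypothetical.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a score of 0.4116 would correspond to roughly 41 character edits per 100 reference characters. The value can exceed 1.0 when the output is much longer than the reference.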
Cataloging date
2022-10-26
Type
[en] TEXT
Format
application/pdf
Language(s)
ENGLISH
Reference [en]
https://www.maxwell.vrac.puc-rio.br/colecao.php?strSecao=resultado&nrSeq=60957@2
DOI reference
https://doi.org/10.17771/PUCRio.DImcc.60957
Content files
FULL TEXT PDF