Development of Hausa dataset a baseline for speech recognition

doi:10.1016/j.dib.2022.107820

Data in Brief

Volume 40, February 2022, 107820

https://doi.org/10.1016/j.dib.2022.107820

Under a Creative Commons license

open access

Abstract

The Hausa language read-speech dataset was created by recording native Hausa speakers. The recordings took place at the Nile University of Nigeria audio studio and radio broadcasting studio. The recorded data were segmented into unigrams and bigrams. The Hausa speech dataset contains 47 hr of recorded speech. The dataset can be used for automatic speech recognition, speech synthesis, text-to-speech, and speech-to-text applications.

Keywords

Corpus

Automatic speech

NLP

Text-to-speech

Hausa corpus

Data Article