Speech vs nonspeech segmentation of audio signals using support vector machines

Danisman T., ALPKOÇAK A.

IEEE 15th Signal Processing and Communications Applications Conference, Eskişehir, Türkiye, 11-13 June 2007, pp. 854-857 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
DOI Number: 10.1109/siu.2007.4298688
City of Publication: Eskişehir
Country of Publication: Türkiye
Pages: pp. 854-857
Keywords: speech segmentation, statistical signal processing
Affiliated with Akdeniz Üniversitesi: No

Abstract

In this study, we present a method for speech vs. nonspeech segmentation of audio signals extracted from video. We used 4330 seconds of audio extracted from the "Lost" TV series for training. The training set is built automatically using the timestamp information in the subtitles. Silent regions within the resulting speech regions are then discarded in a further analysis step. Next, the standard deviations of MFCC feature vectors of size 20 are computed. Finally, Support Vector Machines (SVM) are used with the one-vs-all method for classification.
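
The labelling and feature-summary steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the subtitles are in SubRip (SRT) format, that audio is cut into fixed one-second windows, that a window counts as speech if it overlaps any subtitle cue, and that the population standard deviation is used. The function names (speech_intervals, label_windows, std_features) are hypothetical. MFCC extraction, silence removal, and the SVM itself are left out; the per-window vectors returned by std_features would be the input to a classifier.

```python
import re
from statistics import pstdev

# SRT timing line, e.g. "00:01:02,500 --> 00:01:05,000" (format is an assumption)
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour, minute, second, millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def speech_intervals(srt_text):
    """Return a (start, end) pair in seconds for every subtitle cue."""
    intervals = []
    for match in TIMING.finditer(srt_text):
        g = match.groups()
        intervals.append((_seconds(*g[:4]), _seconds(*g[4:])))
    return intervals


def label_windows(intervals, total_seconds, window=1.0):
    """Label each fixed-length window 1 (speech) if it overlaps a cue, else 0."""
    labels = []
    for i in range(int(total_seconds // window)):
        w_start, w_end = i * window, (i + 1) * window
        overlaps = any(s < w_end and e > w_start for s, e in intervals)
        labels.append(1 if overlaps else 0)
    return labels


def std_features(frames):
    """Summarise a window's per-frame feature vectors (e.g. 20 MFCCs per frame)
    as one vector of per-coefficient population standard deviations."""
    return [pstdev(column) for column in zip(*frames)]


# Tiny made-up example, not data from the paper.
example_srt = """1
00:00:01,000 --> 00:00:02,500
Hello.

2
00:00:04,500 --> 00:00:05,000
Over here!
"""
example_labels = label_windows(speech_intervals(example_srt), total_seconds=6)
print(example_labels)  # [0, 1, 1, 0, 1, 0]
```

Labelling by overlap is the simplest rule that fits the description. A stricter rule, such as requiring a minimum overlap fraction, would produce fewer mislabelled windows at cue boundaries.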