Abstract
Automatic segmentation of audio streams according to speaker identity and environmental and channel conditions has become an important preprocessing step for speech processing, speaker recognition, and audio mining. This paper presents an automatic speech segmentation system whose main component is a probabilistic neural network (PNN). The performance of the PNN is examined and then enhanced for the segmentation of conversational speech. The results show that a percentage false segmentation (PFS) of 18% can be achieved, and that enhancing the system reduces the PFS to 6.1%. The experiments were carried out on a dataset created by concatenating speech from speakers in the TIMIT database.
Keywords: Speech segmentation, PNN, Probabilistic neural network.