Speaker identification - prototype development and performance

Watts, David Michael Graeme (2006) Speaker identification - prototype development and performance. [USQ Project] (Unpublished)

Abstract

Human speech is our most natural form of communication and conveys both meaning and identity. The identity of a speaker can be determined from the information contained in the speech signal through speaker identification. Speaker identification is concerned with identifying unknown speakers from a database of speaker models previously enrolled in the system. The general process of speaker identification involves two stages. The first stage extracts features from the speech of speakers who are to be enrolled in the system. The second stage determines the identity of a speaker by extracting features from the test speech and comparing them with the enrolled speaker models. Several techniques are available for feature extraction, including Linear Predictive Coding (LPC), Mel-Frequency Cepstral Coefficients (MFCC) and LPC cepstral coefficients (LPCC). These features are used with a classification technique to create a speaker model. Vector Quantization (VQ) is commonly used in speaker identification and produces reliable results. This project demonstrates a prototype speaker identification system tailored for utterances containing fewer than ten words and target sets of fewer than eight voice profiles. VQ (codebook size = 128) with 20-dimensional LPCC features achieves accuracies of 83% and 100% using 12 speakers with the NTIMIT and Alternative (own) corpora, respectively. Tests were conducted using 30 s of training speech and 3 s of testing speech.
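
The two-stage process described in the abstract (enrol a VQ codebook per speaker, then pick the codebook with the lowest average distortion for the test speech) can be sketched as follows. This is an illustrative sketch only, not the project's implementation: the feature vectors are synthetic 20-dimensional stand-ins for LPCC frames, the codebook is trained with plain k-means rather than whichever codebook training method the project used, the codebook size is 8 rather than 128 to keep the toy example small, and all function and speaker names are hypothetical.

```python
import numpy as np


def pairwise_sq_dists(features, codebook):
    """Squared Euclidean distance from every frame to every codeword."""
    diff = features[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2)


def train_codebook(features, size, iters=20, seed=0):
    """Fit a VQ codebook with plain k-means (assumed stand-in for the project's training)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        nearest = pairwise_sq_dists(features, codebook).argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # leave a codeword unchanged if no frame maps to it
                codebook[k] = members.mean(axis=0)
    return codebook


def distortion(features, codebook):
    """Average distance from each frame to its nearest codeword."""
    return pairwise_sq_dists(features, codebook).min(axis=1).mean()


def identify(features, codebooks):
    """Return the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))


# Synthetic data: each hypothetical speaker is a cluster centre in 20 dimensions.
DIM = 20
data_rng = np.random.default_rng(42)
centres = {name: data_rng.normal(scale=3.0, size=DIM)
           for name in ("speaker_a", "speaker_b", "speaker_c")}


def make_frames(centre, n_frames):
    """Draw noisy frames around a speaker's centre (stand-in for LPCC frames)."""
    return centre + data_rng.normal(size=(n_frames, DIM))


# Stage 1: enrolment, using more frames than will be available at test time.
codebooks = {name: train_codebook(make_frames(c, 300), size=8)
             for name, c in centres.items()}

# Stage 2: identification from a much shorter test segment.
test_frames = make_frames(centres["speaker_b"], 30)
print(identify(test_frames, codebooks))
```

The ratio of 300 training frames to 30 test frames loosely mirrors the 30 s to 3 s split reported in the abstract, but the frame counts themselves are arbitrary choices for this sketch.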

Item Type: USQ Project
Refereed: No
Item Status: Live Archive
Depositing User: epEditor USQ
Faculty / Department / School: Historic - Faculty of Engineering and Surveying - Department of Electrical, Electronic and Computer Engineering
Date Deposited: 11 Oct 2007 01:03
Last Modified: 02 Jul 2013 22:43
Uncontrolled Keywords: speech; linear predictive coding (LPC); vector quantization (VQ); gaussian mixture models; NTIMIT
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing
URI: http://eprints.usq.edu.au/id/eprint/2338