Dynamic Time Warping (DTW) is widely used for nonlinear time normalization of utterances in speech recognition systems. Two major problems usually arise when DTW is applied to recognizing speech utterances: (i) the normalization factors used in a warping path; and (ii) finding the K-best warping paths. Although DTW has been modified to compute multiple warping paths by using the Tree-Trellis Search (TTS) algorithm, the use of the actual normalization factor remains a major problem for DTW. In this paper, Parallel Genetic Time Warping (PGTW) is proposed to solve both problems. A database of 95 isolated words extracted from the TIMIT speech database is set up to evaluate the performance of PGTW. In the database, each of the first 15 words has 70 different utterances, and each of the remaining 80 words has only one utterance. For each of the 15 words, one utterance is arbitrarily selected as the test template for recognition. The distance from each test template to the utterances of the same word and to those of the 80 words is calculated with three time warping algorithms: TTS, PGTW, and Sequential Genetic Time Warping (SGTW). A Normal Distribution Model based on Rabiner [23] is used to evaluate the performance of the three algorithms analytically. The results show that PGTW performs better than TTS. They also show that PGTW gives results very similar to those of SGTW while saving about 30% of the CPU time on a single-processor system.
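To make the normalization issue in (i) concrete, the following is a minimal sketch of textbook DTW on one-dimensional sequences. It is not the TTS, SGTW, or PGTW algorithm from the paper. The function names (`dtw`, `normalized_fixed`, `normalized_by_path`), the absolute-difference local cost, and the two normalization choices are assumptions made for illustration only. The abstract does not define the paper's actual normalization factor.

```python
from typing import List, Sequence, Tuple

Path = List[Tuple[int, int]]


def dtw(a: Sequence[float], b: Sequence[float]) -> Tuple[float, Path]:
    """Return (accumulated distance, warping path) for two 1-D sequences.

    Illustrative textbook DTW with steps (1,0), (0,1), (1,1) and an
    absolute-difference local cost; both choices are assumptions.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path: Path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def normalized_fixed(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalize by a path-independent factor (here, len(a) + len(b))."""
    total, _ = dtw(a, b)
    return total / (len(a) + len(b))


def normalized_by_path(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalize by a path-dependent factor (here, the path length)."""
    total, path = dtw(a, b)
    return total / len(path)
```

The two helpers can rank the same pair of utterances differently, because the path-dependent factor is only known once a path has been chosen, whereas the dynamic-programming recursion minimizes the unnormalized sum. This is one way to read the abstract's remark that the actual normalization factor is a problem for DTW.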
|Journal|International Journal of Pattern Recognition and Artificial Intelligence|
|Publication status|Published - Aug 1998|
- Dynamic time warping
- Genetic algorithm
- Parallel genetic algorithm
- Speech recognition
- Tree-trellis search