Characteristics-based effective applause detection for meeting speech

Yan-Xiong LI, Qian-Hua HE, Sam KWONG, Tao LI, Ji-Chen YANG

Research output: Journal Publications › Journal Article (refereed) › peer-review

19 Citations (Scopus)


Applause frequently occurs in multi-participant meeting speech. Detecting applause is important for meeting speech recognition, semantic inference, highlight extraction, etc. In this paper, we first study the characteristic differences between applause and speech, such as duration, pitch, spectrogram and occurrence locations. Then, an effective algorithm based on these characteristics is proposed for detecting applause in a meeting speech stream. In the algorithm, the non-silence signal segments are first extracted using voice activity detection. Afterward, applause segments are detected from the non-silence signal segments based on the characteristic differences between applause and speech, without using any complex statistical models such as hidden Markov models. The proposed algorithm can accurately determine the boundaries of applause in a meeting speech stream and is also computationally efficient. In addition, it can extract applause sub-segments from mixed segments. Experimental evaluations show that the proposed algorithm achieves satisfactory results in detecting applause in meeting speech. Precision rate, recall rate, and F1-measure are 94.34%, 98.04%, and 96.15%, respectively. Compared with the traditional algorithm under the same experimental conditions, a 3.62% improvement in F1-measure is achieved, and about 35.78% of computational time is saved. © 2009 Elsevier B.V. All rights reserved.
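The reported F1-measure can be checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f1_measure` is illustrative rather than taken from the paper.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
precision, recall = 94.34, 98.04
print(f"F1 = {f1_measure(precision, recall):.2f}%")  # prints "F1 = 96.15%"
```

Under that assumption, the three reported figures are mutually consistent.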
Original language: English
Pages (from-to): 1625-1633
Journal: Signal Processing
Issue number: 8
Publication status: Published - Aug 2009
Externally published: Yes


  • Applause characteristics
  • Applause detection
  • Meeting speech
  • Spontaneous speech recognition

