Investigating an Automatic Method for Summarizing and Presenting a Video Speech Using Acoustic Features (음향학적 자질을 활용한 비디오 스피치 요약의 자동 추출과 표현에 관한 연구)

Kim, Hyun Hee (김현희)

doi:10.3743/KOSIM.2012.29.4.191

Journal of the Korean Society for Information Management
Abbr : JKOSIM
2012, 29(4), pp.191~208
DOI : 10.3743/KOSIM.2012.29.4.191
Publisher : Korean Society for Information Management
Research Area : Interdisciplinary Studies > Library and Information Science
Received : November 21, 2012
Accepted : December 13, 2012
Published : December 30, 2012

Kim, Hyun Hee ¹

¹Myongji University

Accredited

ABSTRACT

Two fundamental aspects of speech summary generation are the extraction of key speech content and the presentation style of the extracted speech synopses. We first investigated whether acoustic features (speaking rate, pitch pattern, and intensity) are equally important and, if not, which one can be effectively modeled to compute the significance of segments for lecture summarization. We found that intensity, measured as the difference between the maximum and minimum dB values, is the most efficient factor for speech summarization. We then evaluated this intensity-based method against a keyword-based method on two points: which method produces better speech summaries, and how similar the weight values that the two methods assign to segments are. Finally, we investigated how to present speech summaries to viewers. In sum, we suggest how to extract key segments from a speech video efficiently using acoustic features and how to present the extracted segments to viewers.
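
The intensity-based weighting described in the abstract can be illustrated with a minimal sketch. It assumes that each segment already has an intensity contour in dB (for example, exported from Praat), and it scores a segment by the difference between its maximum and minimum dB values. The segment boundaries, the sample values, the function names, and the top-k selection rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def intensity_range(db_values: Sequence[float]) -> float:
    """Return max dB minus min dB for one segment's intensity contour."""
    if not db_values:
        raise ValueError("segment has no intensity samples")
    return max(db_values) - min(db_values)


def rank_segments(contours: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Score every segment by its intensity range, highest score first."""
    scores = [(seg_id, intensity_range(values)) for seg_id, values in contours.items()]
    # Sort by descending score; ties are broken by segment id for determinism.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


def select_summary(contours: Dict[str, Sequence[float]], top_k: int) -> List[str]:
    """Pick the top_k segments by score and return them in their original order."""
    chosen = {seg_id for seg_id, _ in rank_segments(contours)[:top_k]}
    # Dicts preserve insertion order, so this keeps the order of the talk.
    return [seg_id for seg_id in contours if seg_id in chosen]


# Hypothetical intensity contours (dB) for four segments of one talk.
example_contours = {
    "s1": [60.0, 62.5, 61.0],
    "s2": [55.0, 70.0, 58.0],
    "s3": [63.0, 64.0, 63.5],
    "s4": [50.0, 61.0, 59.0],
}

if __name__ == "__main__":
    for seg_id, score in rank_segments(example_contours):
        print(f"{seg_id}: {score:.1f} dB")
    print("summary segments:", select_summary(example_contours, top_k=2))
```

Returning the selected segments in their original order is a design choice made here so that the extracted summary can be played back in the sequence of the talk; the abstract does not state how the selected segments are ordered for presentation.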

KEYWORDS

speech summarization, acoustic features, prosodic features, TED Talks, Praat

This paper was written with support from the National Research Foundation of Korea.

Journal of the Korean Society for Information Management 2025 KCI Impact Factor : 1.27
