An algorithm for similar utterance section extraction for managing spoken documents

Authors:	Yoshiaki Itoh, Kazuyo Tanaka, Shi-Wook Lee

Affiliation:	(1) Faculty of Software and Information Science, Iwate Prefectural University, Sugo, Takizawa, Iwate 020-0193, Japan; (2) Institute of Library and Information Science, University of Tsukuba, 1-2 Kasuga, Tsukuba 305-8550, Japan; (3) National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba-shi, Ibaraki 305-8568, Japan

Abstract:	This paper proposes a new, efficient algorithm for extracting similar sections between two time sequence data sets. The algorithm, called Relay Continuous Dynamic Programming (Relay CDP), realizes fast matching between arbitrary sections in the reference pattern and the input pattern and enables the extraction of similar sections in a frame-synchronous manner. In addition, Relay CDP is extended to two types of applications that handle spoken documents. The first application is the extraction of repeated utterances in a presentation or a news speech, because repeated utterances are assumed to be important parts of the speech. These repeated utterances can be regarded as labels for information retrieval. The second application is flexible spoken document retrieval. A phonetic model is introduced to cope with the speech of different speakers. The new algorithm allows a user to query by natural utterance and searches spoken documents for any partial matches to the query utterance. We present a detailed explanation of Relay CDP, experimental results for the extraction of similar sections, and results for the two applications using Relay CDP.

Yoshiaki Itoh has been an associate professor in the Faculty of Software and Information Science at Iwate Prefectural University, Iwate, Japan, since 2001. He received the B.E., M.E., and Dr. Eng. degrees from Tokyo University, Tokyo, in 1987, 1989, and 1999, respectively. From 1989 to 2001 he was a researcher and a staff member of Kawasaki Steel Corporation, Tokyo and Okayama. From 1992 to 1994 he was transferred as a researcher to the Real World Computing Partnership, Tsukuba, Japan. Dr. Itoh's research interests include spoken document processing without recognition, audio and video retrieval, and real-time human communication systems. He is a member of ISCA, the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, the Information Processing Society of Japan, and the Japan Society of Artificial Intelligence.

Kazuyo Tanaka has been a professor at the University of Tsukuba, Tsukuba, Japan, since 2002. He received the B.E. degree from Yokohama National University, Yokohama, Japan, in 1970, and the Dr. Eng. degree from Tohoku University, Sendai, Japan, in 1984. From 1971 to 2002 he was a research officer of the Electrotechnical Laboratory (ETL), Tsukuba, Japan, and the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan, where he worked on speech analysis, synthesis, recognition, and understanding, and also served as chief of the speech processing section. His current interests include digital signal processing, spoken document processing, and human information processing. He is a member of IEEE, ISCA, the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, and the Japan Society of Artificial Intelligence.

Shi-Wook Lee received the B.E. and M.E. degrees from Yeungnam University, Korea, and the Ph.D. degree from the University of Tokyo, in 1995, 1997, and 2001, respectively. Since 2001 he has been working in the Research Group of Speech and Auditory Signal Processing at the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan, as a postdoctoral fellow. His research interests include spoken document processing, speech recognition, and understanding.
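
As a rough illustration of the frame-synchronous matching mentioned in the abstract, the sketch below implements plain subsequence dynamic time warping in Python. It is not Relay CDP: it only spots the whole reference pattern inside a longer input, whereas the paper's algorithm matches arbitrary sections of both patterns. The function name `spot_reference`, the scalar frames, and the absolute-difference distance are all assumptions made for this example.

```python
from math import inf


def spot_reference(reference, stream, dist=lambda a, b: abs(a - b)):
    """Frame-synchronous subsequence DTW (illustrative, not Relay CDP).

    For each input frame t, returns (cost, start): the cumulative distance
    of the best warping path aligning the whole reference to
    stream[start..t], and the input frame where that path begins.
    """
    n = len(reference)
    prev_cost = [inf] * n   # DP column for the previous input frame
    prev_start = [-1] * n   # start frame carried along each best path
    results = []
    for t, x in enumerate(stream):
        cost = [inf] * n
        start = [-1] * n
        for i, r in enumerate(reference):
            d = dist(r, x)
            if i == 0:
                # Open start: a match may begin at any input frame.
                cost[0], start[0] = d, t
                continue
            best_cost, best_start = min(
                (prev_cost[i], prev_start[i]),          # input advances only
                (prev_cost[i - 1], prev_start[i - 1]),  # both advance
                (cost[i - 1], start[i - 1]),            # reference advances only
            )
            cost[i] = best_cost + d
            start[i] = best_start
        # Only the last reference frame yields a complete match ending at t.
        results.append((cost[n - 1], start[n - 1]))
        prev_cost, prev_start = cost, start
    return results


if __name__ == "__main__":
    reference = [1, 2, 3]
    stream = [9, 9, 1, 2, 2, 3, 9, 9]
    for t, (cost, start) in enumerate(spot_reference(reference, stream)):
        print(t, cost, start)
```

Because only the previous DP column is kept, each input frame is processed as it arrives, which is the sense of "frame synchronous" used here. In the toy run, the lowest cost is reached at input frame 5 with start frame 2, that is, the section `[1, 2, 2, 3]` of the input.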

Keywords:	Speech labeling; Spoken documents retrieval; Repeated utterance; Similar section extraction; Time sequence data
This article is indexed in SpringerLink and other databases.