
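Statistical Features of Chinese Song Lyrics and its Application to Retrieval (中文歌词的统计特征及其检索应用)
Cite this article: ZHENG Ya-bin, LIU Zhi-yuan, SUN Mao-song. Statistical Features of Chinese Song Lyrics and its Application to Retrieval[J]. Journal of Chinese Information Processing (中文信息学报), 2007, 21(5): 61-67.
Authors: ZHENG Ya-bin (郑亚斌)  LIU Zhi-yuan (刘知远)  SUN Mao-song (孙茂松)
Affiliation: Department of Computer Science and Technology, Tsinghua University, Beijing 100084
Abstract: We carried out several conventional natural language processing experiments on song lyrics. Lyrics are an important expression of a song's semantics, so lyric analysis can complement audio processing of songs. We examined the statistical properties of characters and words in a lyrics corpus against Zipf's law; the experiments show that both distributions largely follow it. Using a vector space model representation, we can find sets of similar lyrics. We also explore how the time annotations in lyrics can support further analysis, such as detecting repeated segments in a song, rhythm segmentation, and retrieval. Preliminary experiments show that our methods are effective to some degree.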

Keywords: computer application  Chinese information processing  song lyrics  Zipf's law  k-nearest neighbors  rhythm
Article ID: 1003-0077(2007)05-0061-07
Received: 2007-04-13
Revised: 2007-04-13; 2007-06-25

Statistical Features of Chinese Song Lyrics and its Application to Retrieval
ZHENG Ya-bin, LIU Zhi-yuan, SUN Mao-song. Statistical Features of Chinese Song Lyrics and its Application to Retrieval[J]. Journal of Chinese Information Processing, 2007, 21(5): 61-67.
Authors:ZHENG Ya-bin  LIU Zhi-yuan  SUN Mao-song
Affiliation:Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Abstract: We report experiments on song lyrics based on natural language processing techniques. Lyrics carry much of a song's semantics; therefore, lyric analysis can complement acoustic methods. We examine a lyrics corpus against Zipf's law using both characters and words as units, and find that the law largely holds for this corpus. We also find sets of mutually similar lyrics by means of the vector space model. Moreover, we discuss how to use the time annotations in lyrics for further analysis: detecting repeated segments in songs, identifying rhythms, retrieving songs, and so on. Preliminary experiments show that the proposed methods are effective to some degree.
Keywords: computer application  Chinese information processing  song lyrics  Zipf's law  k-NN  rhythm
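
The two text-only analyses named in the abstract (a Zipf's law check and similarity search in a vector space model) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `zipf_slope`, `cosine`, and `nearest` are hypothetical, the input is assumed to be already split into characters or words, and raw term counts are assumed as vector weights because the abstract does not state a weighting scheme. Under Zipf's law, frequency is roughly proportional to 1/rank, so the fitted log-log slope should be close to -1.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes at least two distinct tokens; a slope near -1 is consistent
    with Zipf's law.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_tokens, corpus, k=1):
    """Indices of the k lyrics in corpus most similar to the query.

    corpus is a list of token lists; raw counts are an assumed weighting.
    """
    query_vec = Counter(query_tokens)
    ranked = sorted(
        range(len(corpus)),
        key=lambda i: cosine(query_vec, Counter(corpus[i])),
        reverse=True,
    )
    return ranked[:k]
```

For Chinese lyrics, the character-level case needs no segmentation (`list(text)` is enough), while the word-level case depends on a word segmenter, which this sketch leaves out.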
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.