A synergistic strategy for combining thesaurus-based and corpus-based approaches in building ontology for multilingual search engines
Affiliation:1. The American College of Greece;2. College of Computer and Information Sciences;3. Instituto Politécnico Nacional;4. Department of Applied Informatics;5. University of Science and Technology of China (USTC);6. King Abdulaziz University;1. Center of Excellence in Information Assurance (CoEIA), King Saud University (KSU), Riyadh, Saudi Arabia;2. Center of Excellence in Information Assurance (CoEIA), College of Computer and Information Sciences (CCIS), King Saud University (KSU), Riyadh, Saudi Arabia;3. College of Computer and Information Sciences (CCIS), King Saud University (KSU), Riyadh, Saudi Arabia;4. Intelligent Systems Group (ISG), Department of Computing, Macquarie University, NSW 2109, Australia;1. School of Business, Anhui University, 230039 Hefei, China;2. School of Computer Science and Technology, University of Science and Technology of China, 230027 Hefei, China
Abstract:In this article we illustrate a methodology for building a cross-language search engine. We propose a synergistic strategy that combines a thesaurus-based approach with a corpus-based approach. First, a bilingual ontology thesaurus is designed for two languages, English and Spanish, in which a simple bilingual listing of terms, phrases, concepts, and subconcepts is built. Second, term vector translation is used: a statistical multilingual text retrieval technique that maps statistical information about term use between languages (ontology co-learning). This technique maps sets of tf-idf term weights from one language to another. We also applied a query translation method to retrieve multilingual documents, with an expansion technique for phrasal translation. Finally, we present our findings.
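As an illustration of the term vector translation step, the following is a minimal sketch of how tf-idf weights computed in one language might be carried over to another. The abstract does not specify the mapping the authors use, so everything here is an assumption: the toy English-Spanish lexicon `LEXICON`, the helper names `tf_idf` and `translate_vector`, and in particular the choice to split each weight evenly across a term's candidate translations.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def translate_vector(vector, lexicon):
    """Map source-language weights onto target-language terms.

    Assumption: a weight is split evenly across all listed translations;
    terms missing from the lexicon are dropped.
    """
    out = {}
    for term, weight in vector.items():
        translations = lexicon.get(term, [])
        for target in translations:
            out[target] = out.get(target, 0.0) + weight / len(translations)
    return out


# Hypothetical bilingual listing, standing in for the thesaurus.
LEXICON = {
    "engine": ["motor"],
    "search": ["búsqueda", "buscar"],
    "language": ["idioma", "lengua"],
}

docs = [
    ["search", "engine", "search"],
    ["language", "engine"],
    ["language", "language"],
]
vectors = tf_idf(docs)
spanish = translate_vector(vectors[0], LEXICON)
print(spanish)
```

Because every term in the first document has at least one entry in the lexicon, the translated vector keeps the same total weight as the original. A corpus-based variant could replace the even split with translation probabilities estimated from aligned text.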
Keywords:Multi-language search engines; Cross-language search engines; Ontologies; Social networks; Ontology co-learning
This article is indexed in ScienceDirect and other databases.