Stochastic Grammatical Inference of Text Database Structure
Authors: Matthew Young-Lai, Frank Wm. Tompa
Affiliation: Computer Science Department, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1 (both authors)
Abstract: For a document collection in which structural elements are identified with markup, it is often necessary to construct a grammar retrospectively that constrains element nesting and ordering. Others have addressed this problem as an application of grammatical inference. We describe an approach based on stochastic grammatical inference that scales more naturally to large data sets and produces models with richer semantics. We adopt an algorithm that produces stochastic finite automata and describe modifications that enable better interactive control of results. Our experimental evaluation uses four document collections with varying structure.
Keywords: stochastic grammatical inference; text database structure
This article is indexed in SpringerLink and other databases.