Rule-based spreadsheet data transformation from arbitrary to relational tables
Affiliation: 1. Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, OR 97331-7301, USA; 2. The Linus Pauling Institute, Oregon State University, Corvallis, OR 97331-7301, USA; 3. College of Pharmacy, Oregon State University, Corvallis, OR 97331-7301, USA; 4. Environmental Health Sciences Center, Oregon State University, Corvallis, OR 97331-7301, USA; 5. Department of Biological Sciences, Allergan, Inc., Irvine, CA 92623-9534, USA; 6. Department of Chemical Sciences, Allergan, Inc., Irvine, CA 92623-9534, USA; 1. College of Information Science and Engineering, Fujian University of Technology, Fuzhou, Fujian, 350118, China; 2. Fujian Provincial Key Laboratory of Big Data Mining and Applications (Fujian University of Technology), Fuzhou, Fujian, 350118, China
Abstract: The paper discusses issues of rule-based data transformation from arbitrary spreadsheet tables to a canonical (relational) form. We present a novel table object model and a rule-based language for table analysis and interpretation. The model is intended to represent the physical (cellular) and logical (semantic) structure of an arbitrary table during the transformation process. The language allows this process to be expressed as consecutive steps of table understanding, i.e., recovering implicit semantics. Both are implemented in our tool for spreadsheet data canonicalization. The presented case study demonstrates the use of the tool for developing a task-specific rule set to convert data from arbitrary tables of the same genre (government statistical websites) to flat-file databases. The performance evaluation confirms the applicability of the implemented rule set in accomplishing the stated objectives of the application.
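To make the idea of canonicalization concrete, the following is a minimal sketch in Python, not the paper's table object model or rule language. It assumes a very simple layout (first row holds column headers, first column holds row headers, all other non-empty cells are data entries), and the names `Cell`, `load_cells`, `to_records`, `row_label`, `column_label`, and `value` are hypothetical, introduced only for illustration. Each "rule" is an ordinary Python step that recovers one piece of implicit semantics from cell positions.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Physical (cellular) view of a table: a position plus its text."""
    row: int
    col: int
    text: str


def load_cells(grid):
    """Turn a list of rows into a flat, row-major list of Cell objects."""
    return [Cell(r, c, text)
            for r, row in enumerate(grid)
            for c, text in enumerate(row)]


def to_records(grid):
    """Convert a simple cross-tabulated grid into flat (relational) records.

    Assumed layout (hypothetical, for illustration only):
    row 0 = column headers, column 0 = row headers, rest = data entries.
    """
    cells = load_cells(grid)

    # Rule 1: cells in the first row (except the corner) are column headers.
    col_heads = {c.col: c.text for c in cells if c.row == 0 and c.col > 0}

    # Rule 2: cells in the first column (except the corner) are row headers.
    row_heads = {c.row: c.text for c in cells if c.col == 0 and c.row > 0}

    # Rule 3: every remaining non-empty cell is an entry; associate it with
    # the headers of its row and column to form one relational record.
    records = []
    for c in cells:
        if c.row > 0 and c.col > 0 and c.text != "":
            records.append({
                "row_label": row_heads[c.row],
                "column_label": col_heads[c.col],
                "value": c.text,
            })
    return records


# A tiny made-up table: regions by year, with one empty entry.
example_grid = [
    ["",      "2019", "2020"],
    ["Urban", "10",   "12"],
    ["Rural", "7",    ""],
]
example_records = to_records(example_grid)
```

Here `example_records` holds one dictionary per non-empty data cell, which is the flat, one-row-per-fact shape a relational table expects. Real spreadsheet tables have multi-level and merged headers, which is the kind of layout a dedicated model and rule set are meant to handle; this sketch deliberately ignores those cases.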
This article is indexed in ScienceDirect and other databases.