SQL generation from natural language queries with complex calculations on financial data
Citation: Jia-hao HE, Xi-ping LIU, Qing SHU, Chang-xuan WAN, De-xi LIU, Guo-qiong LIAO. SQL generation from natural language queries with complex calculations on financial data[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(2): 277-286. DOI: 10.3785/j.issn.1008-973X.2023.02.008
Authors: Jia-hao HE  Xi-ping LIU  Qing SHU  Chang-xuan WAN  De-xi LIU  Guo-qiong LIAO
Affiliation: School of Information Management, Jiangxi University of Finance and Economics, Nanchang 330013, Jiangxi, China
Funding: National Natural Science Foundation of China (62076112, 61972184); Natural Science Foundation of Jiangxi Province (20192BAB207017); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ190255); Jiangxi Provincial Postgraduate Innovation Special Fund (YC2021-B130)
Abstract: The problem of generating structured query language (SQL) statements from natural language queries (Text-to-SQL) in the financial domain was investigated. First, SOFT, a Text-to-SQL dataset for the financial domain, was constructed. The dataset covers common queries in the financial domain, has distinctive features, and poses challenges to Text-to-SQL research. Then FinSQL, a Text-to-SQL model optimized to support complex queries in the financial domain, was proposed. In particular, the characteristics of row calculation queries, a class of queries involving complex numerical calculations, were analyzed, and a divide-and-conquer method was proposed: a row calculation query is decomposed into several subqueries, an SQL statement is generated for each subquery, and the subquery SQL statements are then combined to obtain the SQL statement for the original query. Experimental results on the SOFT dataset show that the proposed FinSQL model outperforms existing methods on complex queries and, in particular, handles row calculation queries well.

Keywords: Text-to-SQL  natural language query  financial domain  row calculation query  divide-and-conquer method
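
The divide-and-conquer step described in the abstract can be illustrated with a minimal sketch. It is not the FinSQL implementation: the `income` table, its columns, the sample figures, and the helper `combine_subqueries` are all hypothetical, and the subquery SQL is written by hand where a Text-to-SQL model would generate it. The sketch also assumes that a row calculation query is one whose answer combines values taken from different rows, such as "By how much did company A's revenue grow from 2019 to 2020?".

```python
import sqlite3


def combine_subqueries(sub_sqls, operator):
    """Join scalar subqueries with an arithmetic operator into one SELECT.

    Hypothetical helper: it only illustrates the "combine" step.
    """
    wrapped = [f"({sql})" for sql in sub_sqls]
    return "SELECT " + f" {operator} ".join(wrapped)


# Hypothetical in-memory table standing in for a financial database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE income (company TEXT, year INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO income VALUES (?, ?, ?)",
    [("A", 2019, 100.0), ("A", 2020, 130.0), ("B", 2020, 90.0)],
)

# Divide: one subquery per row the question refers to.
# In a real system each of these would come from a Text-to-SQL model.
sub_2020 = "SELECT revenue FROM income WHERE company = 'A' AND year = 2020"
sub_2019 = "SELECT revenue FROM income WHERE company = 'A' AND year = 2019"

# Combine: wrap each subquery as a scalar expression and apply the operator.
combined_sql = combine_subqueries([sub_2020, sub_2019], "-")
growth = conn.execute(combined_sql).fetchone()[0]
print(combined_sql)
print(growth)
```

Each subquery here is a simple single-row lookup, which is the kind of query existing Text-to-SQL models are generally better at; the arithmetic across rows is deferred to the combination step.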