1.

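郑耀东 李旭峰 陈和平 贺桂娇《计算机系统应用》2023,32(12):32-42

Research on converting natural language to SQL (NL2SQL) has high practical value, and as deep learning has matured, a growing number of researchers have applied it to the NL2SQL task. This paper reviews the state of NL2SQL research for both English and Chinese, summarizes the datasets and models released year by year, and compares the characteristics of the four major Chinese NL2SQL datasets. It describes the basic framework of deep-learning-based NL2SQL and the typical models suited to single-table simple questions and cross-table complex questions in Chinese, introduces commonly used model evaluation methods, and outlines directions for future research.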

2.

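Question Understanding Combining Database Structure and Content

袁志祥 任冬冬 洪旭东 孙国华《计算机工程》2021,47(3):71-76,82

Question understanding is the foundation on which a model converts a natural language question into SQL. Most current deep learning models rely only on the database structure and do not draw on database content to fully understand the question when generating the SQL query. Building on the SQLova model, this paper proposes a question understanding method based on table structure and content. Attention mechanisms over table structure and table content yield a more accurate semantic representation of the question, and sub-classification tasks fill an SQL sketch to complete the query. Tested on the Chinese dataset released for the first Chinese NL2SQL Challenge on Alibaba Cloud, the method achieves 78% accuracy, 1.8% higher than the model that does not use table content; on the WikiSQL dataset its accuracy is 1.4% higher than that of SQLova, showing that it can effectively improve the accuracy of generated SQL queries.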
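The sketch-filling step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a WikiSQL-style sketch (`SELECT $AGG $COLUMN WHERE $COLUMN $OP $VALUE ...`), and the function `fill_sketch`, the operator lists, and the sample table are hypothetical names chosen for illustration. In a real model, each slot index would come from a separate classifier.

```python
# Hypothetical slot vocabularies, loosely following the WikiSQL convention.
AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<"]

def fill_sketch(table, columns, sel_col, agg, conds):
    """Assemble a SQL string from predicted slot values.

    sel_col and agg are indices into columns and AGG_OPS;
    conds is a list of (column index, operator index, value) triples.
    """
    select = columns[sel_col]
    if AGG_OPS[agg]:  # an empty string means "no aggregation"
        select = f"{AGG_OPS[agg]}({select})"
    sql = f"SELECT {select} FROM {table}"
    if conds:
        # repr() quotes string values and leaves numbers bare;
        # this is for illustration only and is not injection-safe.
        parts = [f"{columns[c]} {COND_OPS[o]} {value!r}" for c, o, value in conds]
        sql += " WHERE " + " AND ".join(parts)
    return sql

columns = ["city", "year", "population"]
sql = fill_sketch("census", columns, sel_col=2, agg=4,
                  conds=[(0, 0, "Beijing"), (1, 1, 2020)])
print(sql)
```

Because the sketch fixes the overall shape of the query, the model only has to solve several small classification problems rather than generate free-form SQL text.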

3.

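An NL2SQL Model Based on a Self-Pruning Heterogeneous Graph

黄君扬 王振宇 梁家卿 肖仰华《计算机工程》2022,48(9):71

Converting natural language to Structured Query Language (NL2SQL) is an important task in semantic parsing, and its core is the joint learning of the database schema and the natural language question. Existing work builds a heterogeneous graph by jointly encoding the entire database schema with the question, which introduces a large amount of useless information into the graph and ignores the differing importance of different pieces of schema information. To improve the logical and execution accuracy of NL2SQL models, this paper proposes SPRELA, an NL2SQL model based on a self-pruning heterogeneous graph and a relative position attention mechanism. The model adopts a sequence-to-sequence framework with the ELECTRA pre-trained language model as its backbone. Expert knowledge is introduced to build a preliminary heterogeneous graph over the database schema and the question. The preliminary graph is then self-pruned according to the question, and a multi-head relative position attention mechanism encodes the pruned schema together with the question. A tree-structured decoder and a predefined SQL grammar decode the SQL statement. Experiments on the Spider dataset show that SPRELA reaches an execution accuracy of 71.1%, 1.1 percentage points higher than the RaSaP model of a comparable parameter scale, and that it aligns the database schema with the question more effectively and so better captures the semantics of natural language queries.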

4.

Compilation and Optimization of SQL Query Statements for Column Databases

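甄真 陈虎 张林亚《计算机工程》2013,39(6)

Column databases built on heterogeneous multi-core CPU and GPU platforms can handle massive data and complex queries, but their optimization is concentrated at the lower layers, and the back-end execution sequence can only be produced by manual hard-coding, which cannot accommodate diverse SQL queries. To address this, the paper designs and implements a compiler that automatically converts SQL query statements into execution sequences, and studies common subexpression elimination across multiple complex expressions and a method for merging primitive dependency graphs. Comparison with hand-written code shows that the compiler speeds up the evaluation of arithmetic expressions and shortens the execution time of SQL queries.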
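Common subexpression elimination across several expressions can be sketched as follows. This is a generic illustration of the technique, not the compiler described in the paper: the tuple-based expression encoding, the function name `eliminate_common_subexpressions`, and the sample columns are all assumptions made for the example.

```python
from typing import Dict, List, Tuple, Union

Expr = Union[str, Tuple]  # a column name, or (operator, left, right)

def eliminate_common_subexpressions(exprs: List[Expr]):
    """Return (steps, outputs) so that each distinct subexpression runs once."""
    temp_of: Dict[Tuple, str] = {}               # structural key -> temporary
    steps: List[Tuple[str, str, str, str]] = []  # (temp, op, left, right)

    def visit(e: Expr) -> str:
        if isinstance(e, str):      # leaf: a column reference
            return e
        op, left, right = e
        key = (op, visit(left), visit(right))
        if key not in temp_of:      # first occurrence: emit one primitive
            temp_of[key] = f"t{len(temp_of)}"
            steps.append((temp_of[key], *key))
        return temp_of[key]         # later occurrences reuse the temporary

    outputs = [visit(e) for e in exprs]
    return steps, outputs

# Two SELECT expressions that share price * (one - discount).
price_net = ("*", "price", ("-", "one", "discount"))
exprs = [price_net, ("*", price_net, ("+", "one", "tax"))]
steps, outputs = eliminate_common_subexpressions(exprs)
for step in steps:
    print(step)
```

Here the shared product is computed once and reused by the second expression, so four primitives are emitted instead of six.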

5.

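Generating SQL Statements from Chinese Database Queries via Dual Learning

赵志超 游进国 何培蕾 李晓武《中文信息学报》2023,(3):164-172

Supervised learning for Chinese NL2SQL (natural language to SQL) currently requires large amounts of labeled data. To address this, the paper proposes a dual learning approach that performs weakly supervised learning on a small training set to generate SQL statements from Chinese queries. Two tasks are trained together, one converting natural language to SQL and the other converting SQL back to natural language, so that the model learns the duality constraints between the tasks and captures more relevant semantic information. Different proportions of unlabeled data are also used during training to verify the effectiveness of dual learning for NL2SQL parsing. Experiments on the Chinese and English datasets ATIS, GEO, and TableQA show that, compared with the baselines Seq2Seq, Seq2Tree, Seq2SQL, and -dual, the proposed model improves accuracy by at least 2.1%. On the Chinese TableQA dataset, dual learning raises execution accuracy by at least 5.3%, and with only 60% of the labeled data it achieves results similar to supervised learning with 90% of the labeled data.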

6.

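Exploring the Application of a Government Intelligent Search Engine Based on Natural Language Processing

姚俊华 汤代佳《软件工程》2023,(2):59-62+58

To keep pace with increasingly intelligent question answering systems, this paper proposes a government intelligent search engine system based on technology that converts natural language into SQL (Structured Query Language). Users enter a natural language question and obtain the relevant data directly, displayed intuitively as tables, charts, and other forms. An algorithmic model, SQLModel, is built that incorporates SQL grammar and enhanced column information, and NL2SQL (Natural Language Processing to Structured Query Language) technology is used to design the system, which is tested on the population data of a city. The results show that the technique effectively reduces the complexity of data applications, supports multidimensional complex queries, makes data search easier for business departments, and improves the efficiency of government data search.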

7.

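Research and Design of a Natural Language to Structured Query Language Algorithm Incorporating LSTM

孙红 黄瓯严《小型微型计算机系统》2023,(1):63-67

Natural language to SQL (NL2SQL) is an important topic in the information field. Current state-of-the-art NL2SQL work targets English datasets, and methods designed for English rarely perform well when applied directly to Chinese. This paper first improves the conventional SQLNet model by incorporating a pre-trained model to strengthen its feature extraction. It then improves the classification model and the condition-value model separately: an LSTM is added to the classification model to capture further features, and regular expressions and similar techniques are used in the condition-value model to preprocess special conditional clauses. Experiments show that both improvements effectively raise model performance.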

8.

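A Paraphrase-Based Chinese Natural Language Interface

张俊驰 胡婕 刘梦赤《计算机应用》2016,36(5):1290-1295

Traditional natural language interfaces to databases that rely mainly on syntactic parsing identify user semantics with low accuracy and need large amounts of manually annotated training data. To address these problems, this paper proposes a paraphrase-based method for implementing a Chinese natural language interface to databases (NLIDB). First, the words in the user's sentence that denote database entities are extracted, and a set of candidate trees with corresponding formalized natural language expressions is built. Next, a paraphrase classifier trained on a web question-answering corpus selects the expression closest in meaning. Finally, the corresponding candidate tree is converted into a Structured Query Language (SQL) statement. Experiments show that the method achieves F1 scores of 83.4% and 90% on the US geography corpus (GeoQueries880) and the restaurant corpus (RestQueries250), respectively, both better than syntactic parsing methods. The comparison indicates that a paraphrase-based NLIDB handles the semantic gap between users and databases better.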

9.

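Design of a Database Query Tool Based on Fuzzy Algorithms  Total citations: 16, self-citations: 5, citations by others: 11

周泓 徐小良 汪乐宇《计算机应用研究》2001,18(5):15-17

SQL database queries can retrieve data under complex conditions, but SQL can only represent and process precise data and cannot express the fuzziness of natural language. For a specific personnel information database, this paper proposes expressions for fuzzy words, fuzzy membership functions, and fuzzy operators based on a person's age, height, and weight, together with a method for automatically generating fuzzy SQL statements. The resulting fuzzy-algorithm-based query tool for the personnel information database has been applied in an IC card management information system for the temporary resident population, and it is a useful reference for other fuzzy database queries.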
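One common way to turn a fuzzy word into a crisp SQL condition is an alpha-cut of its membership function. The following minimal sketch illustrates that idea only; the abstract does not say which construction the paper uses, and the trapezoid parameters, the `FUZZY_TERMS` vocabulary, and the table and column names are all hypothetical.

```python
def trapezoid_cut(a, b, c, d, alpha):
    """Interval on which a trapezoidal membership function is >= alpha.

    Membership rises linearly on [a, b], equals 1 on [b, c],
    and falls linearly on [c, d].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    low = a + alpha * (b - a)
    high = d - alpha * (d - c)
    return low, high

# Hypothetical fuzzy vocabulary: word -> (column, trapezoid parameters).
FUZZY_TERMS = {
    "young": ("age", (16, 18, 25, 35)),
    "tall": ("height_cm", (170, 180, 210, 220)),
}

def fuzzy_to_sql(table, words, alpha=0.5):
    """Translate fuzzy words into one crisp SQL statement via alpha-cuts.

    Joining the ranges with AND matches the min operator for fuzzy
    conjunction: min(m1, m2) >= alpha holds exactly when both do.
    """
    conditions = []
    for word in words:
        column, params = FUZZY_TERMS[word]
        low, high = trapezoid_cut(*params, alpha)
        conditions.append(f"{column} BETWEEN {low:g} AND {high:g}")
    return f"SELECT * FROM {table} WHERE " + " AND ".join(conditions)

sql = fuzzy_to_sql("person", ["young", "tall"], alpha=0.5)
print(sql)
```

Lowering `alpha` widens every range and returns more rows, which gives a single knob for how strictly the fuzzy words are interpreted.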

10.

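Research on a Chinese Natural Language Query Model for Databases  Total citations: 1, self-citations: 1, citations by others: 1

许龙飞 唐世渭《计算机科学》1999,26(8):43-46

1. Introduction. In recent years, the Chinese query models used in Chinese-language database query interfaces in China have mainly been the intermediate-language conversion model based on relational-algebra-like expressions, the Chinese query model based on database E-R semantics, and the condition-centered language understanding model. Weighing the strengths and weaknesses of these models, the authors propose a new query model based on database E-R semantics. Its main feature is that it adopts a database E-R semantic understanding model, breaks away from the traditional framework of purely linguistic theory, combines Chinese query sentences with the semantics of the database model they refer to and with background knowledge, and establishes an SQL-like tabular intermediate language, MQL,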

11.

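Research on an Ontology-Based Natural Language Query Interface to Databases  Total citations: 3, self-citations: 1, citations by others: 2

李虎 田金文 王缓缓 石勇《计算机科学》2010,37(6):200-205

This paper proposes a system model and design framework for an Ontology-based natural language query interface to relational databases. WordNet serves as the base lexical database, and a domain lexicon is defined on top of it, which improves the recognition rate of syntactic analysis. The knowledge representation capability of the Ontology is used to store the conceptual model of the relational database and to extend that model's content. The Ontology is also linked to WordNet synsets, which improves the semantic recognition rate. A user's query passes through syntactic and semantic analysis to produce the intermediate representation language DRS, which is converted into SQL by templates; the DBMS then executes the SQL and returns the results. Experiments show that the approach is practical and feasible, and that progressively refining the definition of the Ontology knowledge base can greatly improve the query hit rate. Defining the domain lexicon and domain knowledge through WordNet and the Ontology also improves the system's portability, so the method can be readily ported to other domains.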

12.

Seamlessly integrating similarity queries in SQL

M. C. N. Barioni H. L. Razente A. J. M. Traina C. Traina Jr 《Software》2009,39(4):355-384

Modern database applications are increasingly employing database management systems (DBMS) to store multimedia and other complex data. To adequately support the queries required to retrieve these kinds of data, the DBMS need to answer similarity queries. However, the standard structured query language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactical constructions to express similarity predicates to the existing SQL syntax and describes the implementation of a similarity retrieval engine that allows posing similarity queries using the language extension in a relational DBMS. The engine allows the evaluation of every aspect of the proposed extension, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright © 2008 John Wiley & Sons, Ltd.

13.

Enriching the conceptual basis for query formulation through relationship semantics in databases

《Information Systems》2001,26(6):445-475

The rapid increase in end-user computing calls into question the suitability of existing database query languages (DBQLs). Because the typical DB end-user is not a DB specialist, it is essential that DBQLs use concepts that are as close as possible to those in the end-users’ cognitive mental model and adopt interface techniques that are suited to end-users’ abilities. Concept-based query languages are well suited for this. This realization has motivated further research in conceptual, or semantic, query approaches. However, the primary focus in this field has been on semantic query optimization, not on query formulation. In this study, we address ourselves to the problem of formulation of queries using concepts. We propose a concept-based query language, called the conceptual query language (CQL), which allows for the conceptual abstraction of database queries and exploits the rich semantics of data models to ease and facilitate query formulation. The CQL approach uses the relationship semantics of semantic data models to render transparent the technical complexities of existing DB query languages. Association semantics are also used to automatically construct query graphs and pseudo-natural language explanations of queries, and to generate SQL code. A set theoretic formalism for conceptual queries is developed and used. This paper discusses the design of CQL, its expressive power, its implementation, and the strategies for CQL query processing. The implementation of a CQL prototype is briefly discussed in this paper. User experiments were carried out extensively and showed the advantage of CQL over alternative languages such as SQL.

14.

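Design and Implementation of SQL-Based HBase Queries

袁兆争 邵秀丽 闫凯境 李丹 郭建军 《计算机与现代化》2017,(7):20

In the era of the Internet and big data, SQL relational databases can no longer keep up with ever-growing data volumes, and NoSQL databases such as HBase have become extremely important. HBase operations are relatively complex, however, so this paper designs and implements SQL-based HBase queries that let HBase users work with the database through familiar SQL query statements. A compiler for the SQL language is built first; it converts an SQL statement into a syntax tree and then converts the syntax tree into the corresponding HBase operations. Coprocessors handle the aggregate functions and complex expressions in SQL queries, and join queries are also supported.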

15.

Extending a Natural Language Interface with Geospatial Queries

Chintaphally V.R. Neumeier K. McFarlane J. Cothren J. Thompson C.W. 《Internet Computing, IEEE》2007,11(6):82-85

In this installation of architectural perspectives, we describe an extension of a menu-based natural language interface (MBNLI) to support geospatial queries. Our extension makes it easier for application analysts and even inexperienced users to phrase complex queries without knowing the relational database query language SQL, database schemas (table structures), spatial operators, or spatial indexes.

16.

Detecting coherent explorations in SQL workloads

《Information Systems》2020

This paper presents a proposal aiming at better understanding a workload of SQL queries and detecting coherent explorations hidden within the workload. In particular, our work investigates SQLShare (Jain et al., 2016), a database-as-a-service platform targeting scientists and data scientists with minimal database experience, whose workload was made available to the research community. According to the authors of (Jain et al., 2016), this workload is the only one containing primarily ad-hoc hand-written queries over user-uploaded datasets. We analyzed this workload by extracting features that characterize SQL queries and we investigate three different machine learning approaches to use these features to separate sequences of SQL queries into meaningful explorations. The first approach is unsupervised and based only on similarity between contiguous queries. The second approach uses transfer learning to apply a model trained over a dataset where ground truth is available. The last approach uses weak labeling to predict the most probable segmentation from heuristics meant to label a training set. We ran several tests over various query workloads to evaluate and compare the proposed methods.

17.

Rewriting rules to permeate complex similarity and fuzzy queries within a relational database system

Penzo W. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(2):255-270

In recent years, the availability of complex data repositories (e.g., multimedia, genomic, semistructured databases) has paved the way to new potentials as to data querying. In this scenario, similarity and fuzzy techniques have proven to be successful principles for effective data retrieval. However, most proposals are domain specific and lack a general and integrated approach to deal with generalized complex queries, i.e., queries where multiple conditions are expressed, possibly on complex as well as on traditional data. To overcome such limitations, much work has been devoted to the development of middleware systems to support query processing on multiple repositories. On a similar line, we present a formal framework to permeate complex similarity and fuzzy queries within a relational database system. As an example, we focus on multimedia data, which is represented in an integrated view with common database data. We have designed an application layer that relies on an algebraic query language, extended with MM-tailored operators, and that maps complex similarity and fuzzy queries to standard SQL statements that can be processed by a relational database system, exploiting standard facilities of modern extensible RDBMS. To show the applicability of our proposal, we implemented a prototype that provides the user with rich query capabilities, ranging from traditional database queries to complex queries gathering a mixture of Boolean, similarity, and fuzzy predicates on the data.

18.

Semantic interpretation of a database query language

Eva M. Mueckstein Galina Datskovsky Moerdler 《Data & Knowledge Engineering》1985,1(2):123-138

In this paper, we will discuss a system that semantically interprets a formal database accessing language and generates natural language from this interpretation. In the past, the major way of communication between a user and a database was by means of a formal language. One such language is the SQL query language. Even though constructed as a user friendly language, SQL exemplifies the same difficulties for users as do other formal languages, namely a fairly rigid syntax, the necessity of variable binding, the lack of pronouns, and in the case of erroneous queries error messages that do not provide much insight. To alleviate some of the formal language problems, yet utilize the power of the formal language, we set out to build a natural language ‘umbrella’ for the SQL user. Our goal was not to build a natural language query system, but rather to use semantic knowledge and natural language for paraphrasing the formal language (SQL) and producing error messages as a feedback mechanism. In this way we build a genuine help facility, which would not only aid the user in dealing with SQL, but also trap erroneous queries.