Similar Literature
 Found 20 similar articles (search time: 15 ms)
1.
2.
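Unlike sentences in Indo-European languages, Chinese sentences are often complex sentences composed of multiple clauses. However, current Chinese semantic role annotation corpora and labeling systems do not pay sufficient attention to this feature of modern Chinese. Because of data sparsity, there is still no effective method for identifying arguments that lie in a different clause from their verb, which directly hampers research on semantic role labeling of real Chinese text. This work combines statistics and rules to identify such cross-clause arguments. A basic rule first identifies the arguments of most verbs; the weak points of the rule are then located, and a statistical decision tree that fuses multiple features is used to build a model that further improves identification accuracy. Experimental results show that, for cross-clause arguments, rule-based identification alone achieves an F-score of 65.3%, which rises to 67.2% once the decision tree is added.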

3.
The study of the spatial folding of peptides is a very difficult task that requires time-consuming computation. The complexity of the problem demands tools that predict, in a simple manner, basic properties such as the secondary structure starting from the amino acid sequence, which contains all the information necessary to determine the folding of a protein. The study of secondary structure is of considerable interest, in particular the prediction of regular structures, because these regions, like alpha-helices and beta-sheets, may form nucleation sites (M.J.E. Sternberg and J.M. Thornton, Nature 2H (1978) 15-20; B. Robson and R.H. Pain, Biochem. J. 155 (1976) 331-344). The aim of this paper is to propose a procedure for secondary structure prediction based on statistics (B. Robson and J. Garnier, Introduction to Proteins and Protein Engineering (Elsevier, Amsterdam, 1986); J. Garnier, D.J. Osguthorpe and B. Robson, J. Mol. Biol. 120 (1977) 97-120) and heuristic rules, also taking experimental data into account.

4.
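Computer-aided quality (CAQ) is an integral part of computer integrated manufacturing systems (CIMS), and the quality control chart is one of its key technologies. This paper analyzes the shortcomings of applying traditional control charts to CIM systems and proposes a mean (x̄) control chart suited to small-batch production, so that the chart works well when the production batch is small or during the early stage of production. As the amount of inspection data increases, its control limits converge to those obtained by the traditional method, so it is equally suitable for large-batch production.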

5.
Inventory management is an important area of production control. In 1999, Pfohl et al. [Pfohl, H.-C., Cullmann, O., & Stölzle, W. (1999). Inventory management with statistical process control: Simulation and evaluation. Journal of Business Logistics, 20, 101–120] developed a real-time inventory decision support system by using the individual control charts for monitoring the inventory level (i.e., stock quantity) and the market demand, in which a series of decision rules are provided to help the inventory manager to determine the time and the quantity to order. In the present paper, a real-time inventory decision system is proposed by incorporating Western Electric run rules into the decision rules of the system. Since the data of demand sometimes present a pattern of time series (i.e., autocorrelation may exist in the data of demand), in the proposed decision system the ARMA control chart is employed to monitor the market demand and the individual control chart is used to monitor the inventory level. A simulation study is conducted to investigate the effects of demand pattern and autocorrelation on the proposed inventory decision system and to verify the effectiveness of the system. The index “service level” is selected as the key indicator for the system performance. Based on the results of the simulation study, it is shown that the performance of the proposed inventory decision system is quite consistent with service level always greater than 90% for various demand patterns.
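
The Western Electric run rules mentioned in this abstract can be sketched as follows. This is a minimal illustration of the four standard zone rules applied to a series with a known center line and standard deviation; it is not the decision system of the paper, and the function name `western_electric_signals` and its parameters are hypothetical.

```python
def western_electric_signals(values, center, sigma):
    """Return the sorted indices at which any of the four rules fires.

    center and sigma are assumed known (in practice they are estimated
    from in-control data).
    """
    z = [(v - center) / sigma for v in values]
    signals = set()
    for i in range(len(z)):
        # Rule 1: a single point beyond 3 sigma.
        if abs(z[i]) > 3:
            signals.add(i)
        # Rule 2: two of three consecutive points beyond 2 sigma, same side.
        if i >= 2:
            w = z[i - 2:i + 1]
            if sum(x > 2 for x in w) >= 2 or sum(x < -2 for x in w) >= 2:
                signals.add(i)
        # Rule 3: four of five consecutive points beyond 1 sigma, same side.
        if i >= 4:
            w = z[i - 4:i + 1]
            if sum(x > 1 for x in w) >= 4 or sum(x < -1 for x in w) >= 4:
                signals.add(i)
        # Rule 4: eight consecutive points on the same side of the center.
        if i >= 7:
            w = z[i - 7:i + 1]
            if all(x > 0 for x in w) or all(x < 0 for x in w):
                signals.add(i)
    return sorted(signals)
```

A run of eight inventory readings slightly above the center line would trigger Rule 4 even though no single reading is extreme, which is the kind of early warning the run rules add to a plain 3-sigma chart.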

6.
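Research on Automatic Extraction of Scientific and Technical Terms Based on a Combination of Statistics and Rules   Total citations: 4, self-citations: 0, citations by others: 4
Automatic extraction of scientific and technical terms is an important research topic in Chinese information processing, with wide application in information retrieval, machine translation, and especially patent translation. Working from the patent translation task, this paper studies methods for recognizing scientific and technical terms in patents. After analyzing existing methods, it proposes a term recognition method that uses a conditional random field model for tagging and then applies rules to post-process incorrectly recognized results. Experimental results show that the proposed combination of statistics and rules is effective, reaching an F-score of 84.4% in the open test.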

7.
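Automatic recognition of coordinate structures is a difficult problem in language information processing. This work combines statistics and rules to identify the boundaries of coordinate structures. First, based on the position of the conjunction, a maximum entropy model identifies the left boundary and the right boundary of the coordinate structure by scanning leftward and rightward respectively. Next, predefined rules derived from the properties of coordinate structures are applied to post-process the automatically identified boundaries, yielding the final left and right boundaries. The training and test sets contain 12,396 and 1,219 coordinate structures respectively. Experiments show that the method achieves a performance of 78.1%, of which the rule-based post-processing contributes an improvement of 3.4%.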

8.
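Uyghur Sentence Boundary Detection Combining Statistics and Rules   Total citations: 1, self-citations: 0, citations by others: 1
Sentence boundary detection is a fundamental problem for natural language processing systems such as part-of-speech tagging and syntactic parsing. This paper proposes a Uyghur sentence boundary detection method that combines statistics and rules. First, an ambiguous-segment classification algorithm classifies the segments; second, rule-based sentence boundary detection is applied to unambiguous segments; finally, a maximum entropy model detects sentence boundaries in ambiguous segments. The method uses rules to compensate for the maximum entropy model's tendency, caused by data sparsity, to misjudge cases that contain no ambiguity at all, and uses the maximum entropy model to resolve ambiguity effectively. This improves the robustness of the algorithm, and recall reaches 98.77%.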

9.
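Research on Identifying Multi-Category Common Words Based on Statistics and Rules   Total citations: 1, self-citations: 0, citations by others: 1
Words that belong to more than one part of speech (multi-category words) are one of the key problems in Chinese part-of-speech tagging. This paper studies the identification of multi-category words among common words, taking into account the different features that affect identification. Statistical methods, namely the conditional random field model, the maximum entropy model, and k-nearest neighbors, are applied in turn. Based on the characteristics of the multi-category words themselves and their relations within the surrounding sentence, different feature templates (word information, part-of-speech information, and so on) are used for each method to extract features from the training corpus, and good experimental results are obtained. For words whose identification results remain unsatisfactory, a rule-based method is also tried: rules for multi-category words are constructed, tested repeatedly, and the rule base is refined. Under the same conditions, this yields results better than those of the statistical methods.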

10.
A demerit control chart with linguistic weights   Total citations: 1, self-citations: 0, citations by others: 1
A classical demerit control chart is used to monitor counts of several different categories of defects simultaneously in a complex product. The traditional recommendation is to plot the demerit statistic, a weighted sum of the number of defects of each category, on a control chart. Such an approach assumes that all defects within the same category are equally severe and that a crisp weight is assigned subjectively. Furthermore, assigning an actual crisp weight to each category is somewhat difficult for process and quality engineers. A linguistic variable is more suitable for representing importance and severity. Thus, on the basis of fuzzy set theory, a fuzzy demerit control chart that uses linguistic weights to represent the severity of each category is proposed. The procedure for constructing the proposed chart is described in five steps. In addition, a fuzzy ranking method using α-cuts is adopted to generate the crisp statistic and control limits, in keeping with the conventions of classical control charts. A guideline is suggested for deciding the values of α and the width of the control limits. A numerical example shows that this approach can provide more realistic modeling to monitor the number of demerits per inspection unit and identify process variation. This revised version was published in June 2005 with corrected page numbers.
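
For orientation, the classical (crisp-weight) demerit statistic that this abstract starts from can be sketched as below. It assumes defect counts in each category are independent Poisson variables with known in-control rates; the fuzzy, linguistic-weight version proposed in the paper is not reproduced here. The function names and the example weights are hypothetical.

```python
import math

def demerit_statistic(weights, counts):
    """Weighted sum of defect counts for one inspection unit."""
    return sum(w * c for w, c in zip(weights, counts))

def demerit_limits(weights, rates, width=3.0):
    """Return (lower, center, upper) limits for the demerit statistic.

    Assumes independent Poisson counts with in-control mean rates[i]
    per inspection unit, so the mean is sum(w * rate) and the variance
    is sum(w**2 * rate). The lower limit is truncated at zero.
    """
    center = sum(w * lam for w, lam in zip(weights, rates))
    sigma = math.sqrt(sum(w * w * lam for w, lam in zip(weights, rates)))
    lower = max(0.0, center - width * sigma)
    return lower, center, center + width * sigma

# Illustrative weights for four severity classes (an assumption, not
# values taken from the paper).
example_weights = [100, 50, 10, 1]
example_demerits = demerit_statistic(example_weights, [0, 1, 2, 3])
```

The sketch makes the abstract's criticism concrete: every defect in a class receives the same fixed number, which is the rigidity the linguistic weights are meant to relax.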

11.
Nonparametric control charts can provide a robust alternative in practice to the data analyst when there is a lack of knowledge about the underlying distribution. A nonparametric exponentially weighted moving average (NPEWMA) control chart combines the advantages of a nonparametric control chart with the better shift detection properties of a traditional EWMA chart. A NPEWMA chart for the median of a symmetric continuous distribution was introduced by Amin and Searcy (1991) using the Wilcoxon signed-rank statistic (see Gibbons and Chakraborti, 2003). This is called the nonparametric exponentially weighted moving average Signed-Rank (NPEWMA-SR) chart. However, important questions remained unanswered regarding the practical implementation as well as the performance of this chart. In this paper we address these issues with a more in-depth study of the two-sided NPEWMA-SR chart. A Markov chain approach is used to compute the run-length distribution and the associated performance characteristics. Detailed guidelines and recommendations for selecting the chart’s design parameters for practical implementation are provided along with illustrative examples. An extensive simulation study is done on the performance of the chart including a detailed comparison with a number of existing control charts, including the traditional EWMA chart for subgroup averages and some nonparametric charts i.e. runs-rules enhanced Shewhart-type SR charts and the NPEWMA chart based on signs. Results show that the NPEWMA-SR chart performs just as well as and in some cases better than the competitors. A summary and some concluding remarks are given.
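
The plotting statistic of an EWMA chart built on the Wilcoxon signed-rank statistic can be sketched roughly as follows. The sketch assumes continuous data (no ties and no observations exactly equal to the target median), a starting value of zero, and steady-state limits; the smoothing constant and limit multiplier used by any particular study are design choices not given in the abstract, so `lam` and `L` are placeholders and all function names are hypothetical.

```python
import math

def signed_rank(sample, target):
    """Wilcoxon signed-rank statistic of one subgroup about the target."""
    diffs = [x - target for x in sample]
    # Rank the absolute deviations from smallest (rank 1) to largest.
    order = sorted(range(len(diffs)), key=lambda j: abs(diffs[j]))
    sr = 0
    for rank, j in enumerate(order, start=1):
        if diffs[j] > 0:
            sr += rank
        elif diffs[j] < 0:
            sr -= rank
    return sr

def npewma_sr(subgroups, target, lam=0.1):
    """EWMA path Z_i = lam * SR_i + (1 - lam) * Z_{i-1}, with Z_0 = 0."""
    z, path = 0.0, []
    for sample in subgroups:
        z = lam * signed_rank(sample, target) + (1 - lam) * z
        path.append(z)
    return path

def steady_state_limit(n, lam, L):
    """Symmetric limit +/- L * sd(SR) * sqrt(lam / (2 - lam)).

    Under the in-control, symmetric, continuous assumption the
    signed-rank statistic has mean 0 and variance n(n+1)(2n+1)/6.
    """
    sd_sr = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
    return L * sd_sr * math.sqrt(lam / (2 - lam))
```

Because only signs and ranks enter the statistic, its in-control distribution does not depend on the underlying symmetric continuous distribution, which is the robustness property the abstract emphasizes.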

12.
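Web news grows exponentially and spreads rapidly, and news about emergencies is scattered across the Internet; at the same time, traditional text classification methods suffer from low accuracy and efficiency, and it is hard to find emergency news on a specific topic. To address these problems, a multi-level automatic classification method for Web emergency news that combines rules and statistics is proposed. Category keywords are first extracted to form a rule base; classification rules then divide emergencies into four major classes; finally, a naive Bayes classifier subdivides the news within each major class. The result is a two-layer classification model based on rules and statistics. Experimental results show that both the precision and the recall of the method exceed 90%, and that its classification efficiency is also generally higher than that of traditional classification methods.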

13.
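To improve the accuracy of control chart pattern recognition, the raw features of control chart patterns are fused with shape features to obtain classification features, and a support vector machine is used as the pattern classifier. The fused features retain the global pattern information contained in the raw features, while the added shape features reinforce the local geometric characteristics of some easily confused patterns, which effectively increases the separability of different patterns. Using a support vector machine as the classifier ensures good recognition performance even with high-dimensional features and small samples. Simulation results show that the recognition accuracy of the proposed method is clearly higher than that of several other control chart pattern recognition methods based on shape features.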

14.
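Research on the automatic recognition of preposition usage is an important part of building a knowledge base of function word usage in modern Chinese. Building on existing work, this paper compares the strengths and weaknesses of rule-based and statistical methods and proposes an automatic preposition usage recognition algorithm that combines rules with a conditional random field statistical model. In a test on the People's Daily corpus from February to May 2000, the accuracy of the algorithm is 14.64% and 5.22% higher than that of the rule-based method and the statistical method used alone, respectively.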

15.
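Automatic solving of mathematical problems is an important task in artificial intelligence. With automatic solving of word problems as the goal, and stratified sampling word problems from college entrance examination mathematics papers as the object of study, this paper focuses on identifying the semantic roles of sentences in stratified sampling word problems. Following the principle of stratified sampling, five core semantic roles are first defined for representing the meaning of such a problem: the population, the sample, the strata in the population, the strata in the sample, and the relations between entities. With these five roles, the core task in understanding a word problem becomes deciding the semantic role of each sentence in the problem text. A sentence semantic role classification method that combines feature words with an n-gram model is proposed for this purpose. On the test set, the whole-problem recognition accuracy rises from 17.95% with the feature-word-only method to 64.1%. The results indicate that combining feature words with an n-gram model improves the accuracy of problem understanding.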

16.
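Automatic extraction of Chinese domain terms is a fundamental topic in Chinese information processing, with wide application in natural language generation, information retrieval, text summarization, and other fields. To address domain term extraction, a method that fuses rules with multiple statistical strategies is adopted. Starting from two perspectives, wordhood and domain relevance, a domain term extraction algorithm is proposed and a corresponding extraction system is built. The system pipeline comprises candidate term acquisition based on left and right information entropy expansion, a wordhood filtering strategy that combines part-of-speech collocation rules with a knowledge base of boundary occurrence probabilities, and a domain relevance filtering strategy based on term frequency-inverse document frequency (TF-IDF). The algorithm can extract not only the common terms of a domain but also new domain words. Experimental results show that the domain term extraction system built on this method reaches an accuracy of 84.33%, and that the proposed method effectively supports the automatic extraction of Chinese domain terms.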
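
The left and right information entropy used for candidate acquisition can be illustrated with a small sketch. It measures how varied the characters immediately to the left and right of a candidate string are in a corpus; high entropy on both sides suggests the candidate is a free-standing unit. This is a generic illustration, not the system described in the abstract, and the function names and toy corpus are hypothetical.

```python
import math
from collections import Counter

def _entropy(counter):
    """Shannon entropy (base 2) of a frequency table; 0.0 if empty."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counter.values())

def neighbor_entropy(corpus, candidate):
    """Return (left_entropy, right_entropy) of candidate in corpus.

    corpus is a list of strings; every occurrence of candidate
    contributes its adjacent left and right characters, when they exist.
    """
    left, right = Counter(), Counter()
    for sentence in corpus:
        start = sentence.find(candidate)
        while start != -1:
            end = start + len(candidate)
            if start > 0:
                left[sentence[start - 1]] += 1
            if end < len(sentence):
                right[sentence[end]] += 1
            start = sentence.find(candidate, start + 1)
    return _entropy(left), _entropy(right)

# Toy corpus: "bc" appears with two different neighbours on each side.
toy_left, toy_right = neighbor_entropy(["abcd", "xbcy"], "bc")
```

A candidate whose left or right entropy is near zero is usually a fragment of a longer term, so one plausible use is to extend it in that direction until the entropy rises.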

17.
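Microcomputer & Its Applications (《微型机与应用》), 2020, (1): 92-99
Traditional methods for analyzing tram energy consumption do not assess abnormal data and have difficulty revealing the characteristics of energy consumption data and the latent relationships among the data. To address this, an integrated tram energy consumption analysis method that combines control charts with grey relational analysis is proposed. Control charts are used to analyze fluctuations in the energy consumption data, and a grey relational analysis algorithm is used to analyze the factors that influence energy consumption. Simulation experiments use energy consumption data from the Guangzhou tram line already in operation. The results show that the method monitors fluctuations in each kind of energy consumption data more effectively, identifies abnormal energy consumption data, and finds the factors that affect energy consumption. It supports decisions on energy saving for tram systems and offers a reference for analyzing the energy consumption data of tram systems already in operation.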

18.
Control charts based on the generalized likelihood ratio test (GLRT) are attractive from both theoretical and practical points of view. Most existing work on detecting changes in the process mean and variance assumes that the shifts remain constant over time, and the case of patterned mean and variance changes has not been well studied. In this research, we propose a new control chart that integrates the exponentially weighted moving average (EWMA) procedure with the GLRT statistic to monitor processes with patterned mean and variance shifts. An attractive advantage of our control chart is its reference-free property. Owing to the good properties of the GLRT and EWMA procedures, our simulation results show that the proposed chart provides effective and robust detection for various types of shifts. The implementation of the proposed control chart is illustrated with a real data example from chemical process control.

19.

Control charts are commonly used tools in statistical process control for the detection of shifts in process parameters. Shewhart-type charts are efficient for large shift values, whereas cumulative sum (CUSUM) charts are effective in detecting medium and small shifts. Control chart use commonly assumes that data are free of outliers and parameters are known or correctly estimated based on an in-control process. In practice, these assumptions are not often true because some processes occasionally have outliers. Monitoring the location parameter is usually based on mean charts, which are seriously affected by violations of these assumptions. In this paper we propose several CUSUM median control charts based on auxiliary variables, and offer comparisons with their corresponding mean control charts. To monitor the location parameter, we examined the performance of mean and median control charts in the presence and absence of outliers. Both symmetric and non-symmetric processes were studied to examine the properties of the proposed control charts to monitor the location parameter using CUSUM control charts. We used different run length measures to study in-control and out-of-control performances of CUSUM charts. Results revealed that our proposed control charts perform much better than the traditional charts in the presence of outliers. A real application of our study was provided using data on concrete compressive strength as it relates to the quality of cement manufacturing.
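
A basic two-sided tabular CUSUM on subgroup medians gives a feel for why a median-based chart resists outliers. This sketch uses the standard CUSUM recursions with a reference value `k` and decision interval `h`; it does not include the auxiliary-variable estimators proposed in the paper, and the function name and the numbers in the usage note are hypothetical.

```python
from statistics import median

def cusum_median(subgroups, target, k, h):
    """Return the subgroup indices at which the CUSUM exceeds h.

    Upper and lower sums follow the usual tabular recursions:
        C+ = max(0, C+ + (m - target) - k)
        C- = max(0, C- - (m - target) - k)
    where m is the subgroup median.
    """
    c_plus = c_minus = 0.0
    alarms = []
    for i, sample in enumerate(subgroups):
        m = median(sample)
        c_plus = max(0.0, c_plus + (m - target) - k)
        c_minus = max(0.0, c_minus - (m - target) - k)
        if c_plus > h or c_minus > h:
            alarms.append(i)
    return alarms
```

For example, with `target=10`, `k=0.5` and `h=1.0`, a first subgroup of `[10, 10, 100]` contributes a median of 10, so the single outlier leaves both sums at zero, whereas a sustained shift of the medians to 11 accumulates until the upper sum crosses `h`.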


20.
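New words on microblogs are formed with great freedom and extreme complexity, and the traditional C/NC-value method both identifies new word boundaries with low accuracy and fails to recognize low-frequency new words. To address these problems, a microblog new word extraction method is proposed that fuses hand-crafted heuristic rules, an improved C/NC-value algorithm, and a conditional random field (CRF) model. On one hand, the hand-crafted heuristic rules are word formation rules obtained by classifying and summarizing microblog new words, designed from the perspectives of part of speech (POS), character category, and ideographic symbols. On the other hand, the improved C/NC-value method reconstructs the NC-value objective function by introducing statistics such as word frequency, adjacency entropy, and mutual information, and a CRF model is trained to recognize new words, with the aim of improving both boundary identification accuracy and the recognition of low-frequency new words. Experimental results show that, compared with traditional methods, the proposed method effectively improves the F-score of microblog new word recognition.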
