共查询到20条相似文献,搜索用时 109 毫秒
1.
左右熵在自然语言处理领域有着广泛应用,但目前尚无有效方法实施大规模语料中海量模式的左右熵快速计算。提出了一种计算方法,对于某长度字串计算熵,首先按长度提取语料中的全部字串,使用外部排序和归并获取字串的出现频率,然后分别剔除首尾字符构造待计算字串的频率提供文件,最后使用文件记录频率对比来计算右熵和左熵。分析和实验表明,该方法的计算量同语料规模成线性关系,适于大规模语料中海量字串的左右熵计算。 相似文献
2.
3.
基于海量语料的热点新词识别是汉语自动处理领域的一项基础性课题,因要求快速处理大规模语料,且在新词检测中需要更多智力因素,在研究中存在较多困难。构建了一个基于海量语料的网络热点新词识别框架,整合了所提出的基于逐层剪枝算法的重复模式提取,基于统计学习模型的新词检测及基于组合特征的新词词性猜测等3个重要算法,用以提高新词识别的处理能力和识别效果。实验和数据分析表明,该框架能高效可靠地从大规模语料中提取重复模式,构造候选新词集合,并能有效实施新词检测和新词属性识别任务,处理效果达到了目前的较好水平。 相似文献
4.
现有的排序算法很难实现自定义顺序的字符串排序,提出一种自定义顺序的字符串快速排序方法.在应用连续编号定义字符排序顺序的基础上,使用哈希表结构将字符串转换成对应的整型数组,以字符的最大编号作为基数排序算法的新基数,实现字符串的基数排序.分析和实验表明,本文方法可有效实现自定义顺序的字符串排序,是一个时间和空间复杂度都是线性的排序算法,比快速排序(Quick Sort)具有更好的时间性能,且可以方便地推广到其它语言的字串排序中. 相似文献
5.
对中文字符串排序,最快算法的时间复杂度是O(nlgn)。基数排序算法是目前最快的排序方法之一,时间复杂度是O(dn),但其一般适用于相同长度的整型数据排序。提出了一种快速的变换方法,将字符串转换为与之等长的整型数组,使用基数排序算法对代表字串的整型数组排序,用以实现对字符串的快速排序。实验表明,提出的算法能快速地进行中文字符串排序,比快速排序算法具有更好的性能,且排序时间与数据规模之间是线性关系,算法的时间复杂度为O(dn)。 相似文献
6.
7.
8.
随着科技的发展,磁盘分区的备份与还原在生产和生活中得到了广泛的应用。将以Partclone开源软件为基础,提出基于链式描述符索引方式的分区备份算法,算法中将分区中相连的数据块合并成I/O簇,并以链式描述符作为I/O簇的索引,从而进行磁盘分区数据的备份与还原。该算法通过减少备份还原过程中的读写次数,缩短分区备份还原的时间。同时对该算法下的实验数据进行分析,并对链式描述符索引方式的优化进行讨论。 相似文献
9.
以DBSCAN算法为基础,提出一种基于四叉树的快速聚类算法。新算法选择处于核心点的中空球形邻域中的点作为种子点来扩展类,大大减少区域查询的次数,降低I/O开销;使用快速生成的四叉树进行区域查询,在提高查询效率的同时,有效缩短构造空间索引的时间。文中对二维模拟数据和真实数据进行测试,结果表明新算法是有效的。 相似文献
10.
分布式网络爬虫的广泛应用使得搜索引擎的数据规模呈几何式增长,面对数以TB甚至PB量级的数据,单机模式下的PageRank算法由于CPU、I/O和内存的开销过大导致效率低下。为此,提出一种基于MapReduce框架的并行PageRank算法。在算法的一次迭代过程中,利用Map函数对网页拓扑信息文件进行解析,使用Reduce函数计算网页得分,从而并行化PageRank算法的中间迭代过程。通过计算全局网页得分控制迭代次数,得到较精确的网页排序结果。实验结果表明,该算法在保持原有单机PageRank算法整体网页排序精度的基础上,具有较好的集群性能和较快的执行速度。 相似文献
11.
European Community policy and the market 总被引:1,自引:0,他引:1
C. Lloyd 《Journal of Computer Assisted Learning》1993,9(2):86-91
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven. 相似文献
12.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。 相似文献
13.
David Poole 《Computational Intelligence》1989,5(2):97-110
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given. 相似文献
14.
Watts S. Humphrey 《Annals of Software Engineering》2002,14(1-4):39-72
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical. 相似文献
15.
基于复小波噪声方差显著修正的SAR图像去噪 总被引:4,自引:1,他引:3
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。 相似文献
16.
R. NOSS 《Journal of Computer Assisted Learning》1987,3(1):2-12
Abstract This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development. 相似文献
17.
C. Xiang 《控制理论与应用(英文版)》2014,12(3):217-217
正The demands of a rapidly advancing technology for faster and more accurate controllers have always had a strong influence on the progress of automatic control theory.In recent years control problems have been arising with increasing frequency in widely different areas,which cannot be addressed using conventional control techniques.The principal reason for this is the fact that a highly competitive economy is forcing systems to operate in regimes where 相似文献
18.
《浙江大学学报:C卷英文版》2014,(7)
正Aim The Journals of Zhejiang University-SCIENCE(A/B/C)areedited by the international board of distinguished Chinese andforeign scientists,and are aimed to present the latest devel-opments and achievements in scientific research in China andoverseas to the world’s scientific circles,especially to stimulateand promote academic exchange between Chinese and for-eign scientists everywhere. 相似文献
19.
Alan V. Di Vittorio 《Remote sensing of environment》2009,113(9):1948-2750
The relative concentrations of different pigments within a leaf have significant physiological and spectral consequences. Photosynthesis, light use efficiency, mass and energy exchange, and stress response are dependent on relationships among an ensemble of pigments. This ensemble also determines the visible characteristics of a leaf, which can be measured remotely and used to quantify leaf biochemistry and structure. But current remote sensing approaches are limited in their ability to resolve individual pigments. This paper focuses on the incorporation of three pigments—chlorophyll a, chlorophyll b, and total carotenoids—into the LIBERTY leaf radiative transfer model to better understand relationships between leaf biochemical, biophysical, and spectral properties.Pinus ponderosa and Pinus jeffreyi needles were collected from three sites in the California Sierra Nevada. Hemispheric single-leaf visible reflectance and transmittance and concentrations of chlorophylls a and b and total carotenoids of fresh needles were measured. These data were input to the enhanced LIBERTY model to estimate optical and biochemical properties of pine needles. The enhanced model successfully estimated reflectance (RMSE = 0.0255, BIAS = 0.00477, RMS%E = 16.7%), had variable success estimating transmittance (RMSE = 0.0442, BIAS = 0.0294, RMS%E = 181%), and generated very good estimates of carotenoid concentrations (RMSE = 2.48 µg/cm2, BIAS = 0.143 µg/cm2, RMS%E = 20.4%), good estimates of chlorophyll a concentrations (RMSE = 10.7 µg/cm2, BIAS = − 0.992 µg/cm2, RMS%E = 21.1%), and fair estimates of chlorophyll b concentrations (RMSE = 7.49 µg/cm2, BIAS = − 2.12 µg/cm2, RMS%E = 43.7%). Overall root mean squared errors of reflectance, transmittance, and pigment concentration estimates were lower for the three-pigment model than for the single-pigment model. The algorithm to estimate three in vivo specific absorption coefficients is robust, although estimated values are distorted by inconsistencies in model biophysics. The capacity to invert the model from single-leaf reflectance and transmittance was added to the model so it could be coupled with vegetation canopy models to estimate canopy biochemistry from remotely sensed data. 相似文献
20.
Philip Marks 《Cryptologia》2018,42(1):1-80
This article discusses the history and design of the special versions of the bombe key-finding machines used by Britain’s Government Code & Cypher School (GC&CS) during World War II to attack the Enigma traffic of the Abwehr (the German military intelligence service). These special bombes were based on the design of their more numerous counterparts used against the traffic of the German armed services, but differed from them in important ways that highlight the adaptability of the British bombe design, and the power and flexibility of the diagonal board. Also discussed are the changes in the Abwehr indicating system that drove the development of these machines, the ingenious ways in which they were used, and some related developments involving the bombes used by the U.S. Navy’s cryptanalytic unit (OP-20-G). 相似文献