Similar literature
 20 similar articles found (search time: 31 ms)
1.
The Web is a universal repository of human knowledge and culture that has allowed ideas and information to be shared on an unprecedented scale. It can also be considered a universal digital library interconnecting digital libraries in multiple domains and languages. Besides advances in information technology, the global economy has also accelerated the development of inter-organizational information systems. Managing knowledge obtained from multilingual information systems in multiple geographical regions is an essential component of contemporary inter-organizational information systems. An organization cannot claim to be global unless it is capable of overcoming the cultural and language barriers in its knowledge management. Cross-lingual semantic interoperability is a challenge in multilingual knowledge management systems. Dictionaries are widely used in commercial systems to cross the language barrier. However, the terms available in a dictionary are always limited. As language evolves, new words are created from time to time; for example, new technical terms and named entities such as RFID and Baidu. To solve the problem of cross-lingual semantic interoperability, an associative constraint network approach is investigated for constructing a cross-lingual thesaurus automatically. In this work, we investigate the backmarking algorithm and the forward evaluation algorithm for resolving the constraint satisfaction problem represented by the associative constraint network. Experiments show that the forward evaluation algorithm outperforms the backmarking algorithm in precision and recall, but the backmarking algorithm is more efficient.
We have also benchmarked against our earlier technique, the Hopfield network, and showed that the associative constraint network (with either backmarking or forward evaluation) outperforms it in precision, recall, and efficiency.

2.
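Cross-lingual document clustering organizes documents written in different languages into clusters according to content or topic. This paper uses cross-lingual word similarity computation to extend the monolingual Generalized Vector Space Model (GVSM) to cross-lingual document representation, yielding the Cross-Lingual Generalized Vector Space Model (CLGVSM), and compares the performance of different similarity measures in document clustering. It also proposes a feature selection algorithm suited to GVSM. Experiments show that when GVSM is constructed with the SOCPMI word similarity measure, cross-lingual document clustering outperforms LSA.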
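As a rough illustration of the GVSM idea in the abstract above, the following sketch scores two documents through a term-term similarity matrix, so that documents in different languages can match even when they share no terms. It is a minimal sketch, not the paper's CLGVSM: the function name `gvsm_similarity`, the three-term vocabulary, and the similarity values are all assumptions made up for the example, and the matrix stands in for whatever word similarity measure (such as SOCPMI) is actually used.

```python
import math

def gvsm_similarity(doc_a, doc_b, term_sim):
    """Cosine-style GVSM score: a^T S b / sqrt((a^T S a) * (b^T S b))."""
    def bilinear(x, y):
        # Sum over all term pairs, weighted by their similarity.
        return sum(x[i] * term_sim[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    denom = math.sqrt(bilinear(doc_a, doc_a) * bilinear(doc_b, doc_b))
    return bilinear(doc_a, doc_b) / denom if denom else 0.0

# Hypothetical shared vocabulary: [en "cat", zh "cat", en "car"].
# The similarity values are invented for illustration only.
TERM_SIM = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

english_doc = [1, 0, 0]   # contains only the English word
chinese_doc = [0, 1, 0]   # contains only its Chinese translation
car_doc = [0, 0, 1]       # contains only an unrelated word

cross_lingual_score = gvsm_similarity(english_doc, chinese_doc, TERM_SIM)
unrelated_score = gvsm_similarity(english_doc, car_doc, TERM_SIM)
plain_vsm_score = gvsm_similarity(english_doc, chinese_doc, IDENTITY)
```

With the identity matrix the score reduces to the ordinary cosine of the standard vector space model, which is zero here because the two documents share no terms; the off-diagonal similarities are what let translations match.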

3.
4.
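Current research on sentence summarization mainly targets the monolingual setting, in which the source sentence and the target summary phrase are in the same language; however, monolingual sentence summarization severely limits rapid access to text information in other languages. To address this problem, a cross-lingual sentence summarization system is proposed. Borrowing the idea of back-translation, the source side of a monolingual sentence summarization parallel corpus is translated into another language by a neural machine translation system and paired with the target-side summary phrases of that corpus to form a cross-lingual pseudo-parallel corpus. On this basis, a contrastive attention mechanism is used to capture the unrelated information between the target-side and source-side sequences, which addresses the length mismatch between source and target sentences in the conventional attention mechanism. Experimental results show that, compared with a pipeline-based monolingual sentence summarization system, the cross-lingual system generates summary phrases that are more fluent and closer to human expression, and approaches the quality of monolingual sentence summarization.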

5.
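As demand for multilingual information on the Internet keeps growing, cross-lingual word embeddings have become an important basic tool and have been applied successfully in natural language processing areas such as machine translation, information retrieval, and text sentiment analysis. Cross-lingual word embeddings are a natural extension of monolingual word embeddings: cross-lingual word representations map different languages into a shared low-dimensional vector space and transfer knowledge between languages, so that word meaning can be captured accurately in a multilingual setting. Research on cross-lingual word embedding models has been fruitful in recent years, and researchers have proposed many methods for generating them. Through a review of the literature on existing models, this paper surveys recent developments in cross-lingual word embedding models, methods, and techniques. According to the training method, it divides them into supervised, unsupervised, and semi-supervised approaches, and summarizes and compares in detail the principles and representative work of each; finally, it outlines the evaluation and applications of cross-lingual word embeddings and analyzes the challenges and future directions.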

6.
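To address the poor alignment of conventional cross-lingual word embedding methods on low-resource language pairs with large differences, such as Chinese and Vietnamese, a Chinese-Vietnamese cross-lingual word embedding method incorporating word-cluster alignment constraints is proposed. Chinese and Vietnamese monolingual word embeddings are trained on separate monolingual corpora. Three types of association (near-synonyms, words of the same category, and words of the same topic) are used to mine word-cluster alignment information from the bilingual dictionary and incorporate it into the training of the mapping matrix, so that the mapping matrix further learns common features and mapping relations shared by similar words across languages. The cross-lingual mapping then maps the monolingual embeddings of both languages into one shared space for alignment, so that Chinese and Vietnamese word embeddings with the same meaning lie close together, and cosine similarity is used to find the Vietnamese translation of each unlabeled Chinese word in that space, building Chinese-Vietnamese aligned word pairs and thus realizing cross-lingual word embedding. Experimental results show that, compared with the conventional supervised and unsupervised cross-lingual word embedding methods Multi_w2v, Orthogonal, VecMap, and Muse, the method effectively improves the generalization of the mapping matrix to unlabeled words and mitigates poor alignment in the low-resource Chinese-Vietnamese setting; on the Chinese-Vietnamese bilingual dictionary induction task, its alignment accuracy at P@1 and P@5 is 2.2 percentage points higher than that of the best baseline.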
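The retrieval and evaluation step described above (map a source embedding into the shared space, rank target words by cosine similarity, score with P@k) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: the mapping matrix is a hand-picked 2-D rotation rather than a learned one, the word keys such as `zh_water` and `vi_water` and all vectors are invented toy data, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def apply_mapping(matrix, vec):
    """Multiply the mapping matrix by a source-language embedding."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def top_k_translations(src_vec, mapping, tgt_emb, k):
    """Rank target words by cosine similarity to the mapped source vector."""
    mapped = apply_mapping(mapping, src_vec)
    ranked = sorted(tgt_emb, key=lambda w: cosine(mapped, tgt_emb[w]), reverse=True)
    return ranked[:k]

def precision_at_k(src_emb, tgt_emb, mapping, gold, k):
    """Fraction of source words whose gold translation is in the top k."""
    hits = sum(1 for s, t in gold.items()
               if t in top_k_translations(src_emb[s], mapping, tgt_emb, k))
    return hits / len(gold)

# Toy data, invented for illustration: a 90-degree rotation as the "mapping".
MAPPING = [[0.0, -1.0], [1.0, 0.0]]
SRC_EMB = {"zh_water": [1.0, 0.0], "zh_fire": [0.0, 1.0], "zh_earth": [0.2, 1.0]}
TGT_EMB = {"vi_water": [0.1, 1.0], "vi_fire": [-1.0, 0.2], "vi_earth": [-1.0, 0.4]}
GOLD = {"zh_water": "vi_water", "zh_fire": "vi_fire", "zh_earth": "vi_earth"}

p_at_1 = precision_at_k(SRC_EMB, TGT_EMB, MAPPING, GOLD, k=1)
p_at_2 = precision_at_k(SRC_EMB, TGT_EMB, MAPPING, GOLD, k=2)
```

In this toy setup the gold translation of `zh_earth` is only the second-nearest neighbour, so P@1 is lower than P@2. That gap is the reason dictionary induction results are commonly reported at more than one cutoff, as the abstract does with P@1 and P@5.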

7.
A generalization of the Little–Hopfield neural network model for associative memories is presented that considers the case of a continuum of processing units. The state space corresponds to an infinite-dimensional Euclidean space. A dynamics is proposed that minimizes an energy functional that is a natural extension of the discrete case. The case in which the synaptic weight operator is defined through the autocorrelation rule (Hebb rule) with orthogonal memories is analyzed. We also consider the case of memories that are not orthogonal. Finally, we discuss the generalization of the nondeterministic, finite-temperature dynamics.

8.
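Deep-learning-based cross-lingual sentiment analysis models rely on a pre-trained bilingual word embedding (BWE) dictionary to obtain vector representations of source-language and target-language text. To address the difficulty of obtaining a BWE dictionary, this paper proposes a cross-lingual text sentiment analysis method based on sentiment feature representations of word vectors, which introduces sentiment supervision from the source language to obtain sentiment-aware word vector representations for the source language, so that the word vectors...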

9.
Modern Web search engines still have many limitations: search terms are not disambiguated, search terms in one query cannot be in different languages, the retrieved media items have to be in the same language as the search terms, and search results are not integrated across a live stream of different media channels, including TV, online news and social media. The system described in this paper enables all of this by combining a media stream processing architecture with cross-lingual and cross-modal semantic annotation, search and recommendation. All those components were developed in the xLiMe project.

10.
We systematically investigate the computational complexity of constraint satisfaction problems for constraint languages over an infinite domain. In particular, we study a generalization of the well-established notion of maximal constraint languages from finite to infinite domains. If the constraint language can be defined with an ω-categorical structure, then maximal constraint languages are in one-to-one correspondence with minimal oligomorphic clones. Based on this correspondence, we derive general tractability and hardness criteria for the corresponding constraint satisfaction problems.

11.
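This paper proposes a cross-lingual sentiment classification algorithm based on semantic fusion, targeting product reviews. From the perspective of short-text semantic representation, the algorithm first pre-generates word embedding vectors with the open-source tool Word2Vec to obtain representations of information in different languages. Second, based on the statistical correlation between word vectors of different languages, it uses an auto-associative memory relation to fuse and extract cross-lingual document semantics. It then exploits the local receptive fields and weight sharing of convolutional neural networks to fuse the complex semantic representations under the auto-associative memory model, obtaining fused features for phrases of different lengths. The deep neural network can learn dense combinations of high-level semantic features for any language and output classification predictions. To verify the algorithm, the model's experimental results were compared with those of several recent models. The results show that the model is suitable for positive/negative sentiment classification of cross-lingual sentiment corpora and performs markedly better than the other existing algorithms.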

12.
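A text retrieval system based on concept space   Cited by: 10 (self-citations: 3; citations by others: 10)
Information retrieval today suffers from information overload and vocabulary mismatch. This article proposes a new retrieval method to alleviate both problems. Building on text clustering, the method is based on a concept space and combined with conventional keyword retrieval, helping users locate the information they need quickly and accurately. The article introduces this retrieval method, focusing on generating the concept space with co-occurrence analysis and a Hopfield network.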

13.
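The Hopfield network is also known as an associative memory network. Based on the Hopfield neural network, this paper constructs an associative memory for use in computer programming. Associative recall is the key function of this memory: it both stores and associates information, and can recall a complete, clear stored pattern from incomplete or fuzzy input. On this principle, a Hopfield associative memory and the Eclipse plug-in mechanism are used to build a dynamic help plug-in with extensible knowledge, embedded in the Eclipse development tool, that recalls complete Java code from incomplete Java code. The paper further discusses the application prospects and development directions of Hopfield neural networks in computer programming.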
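The associative recall described above, recovering a complete stored pattern from a damaged one, can be illustrated with a textbook discrete Hopfield network trained by the Hebb rule. This is a minimal sketch and not the plug-in's implementation: the 8-unit bipolar patterns, the function names `train_hebbian` and `recall`, and the single flipped bit are all assumptions chosen to keep the example small. How Java code would be encoded as such patterns is not stated in the abstract and is not modelled here.

```python
def train_hebbian(patterns):
    """Build the weight matrix with the Hebb rule; no self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights

def recall(weights, state, max_sweeps=10):
    """Asynchronously update units until a full sweep changes nothing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

# Two orthogonal bipolar patterns, invented for illustration.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = train_hebbian([PATTERN_A, PATTERN_B])

# A damaged copy of PATTERN_A with its first unit flipped.
corrupted = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(WEIGHTS, corrupted)
```

Here `recovered` equals `PATTERN_A`: the flipped unit receives a positive field from the other units and is restored on the first sweep, after which the state is stable.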

14.
This article presents a simulation study for validation of an adaptation methodology for learning weights of a Hopfield neural network configured as a static optimizer. The quadratic Liapunov function associated with the Hopfield network dynamics is leveraged to map the set of constraints associated with a static optimization problem. This approach leads to a set of constraint-specific penalty or weighting coefficients whose values need to be defined. The methodology leverages a learning-based approach to define values of constraint weighting coefficients through adaptation. These values are in turn used to compute values of network weights, effectively eliminating the guesswork in defining weight values for a given static optimization problem, which has been a long-standing challenge in artificial neural networks. The simulation study is performed using the Traveling Salesman problem from the domain of combinatorial optimization. Simulation results indicate that the adaptation procedure is able to guide the Hopfield network towards solutions of the problem starting with random values for weights and constraint weighting coefficients. At the conclusion of the adaptation phase, the Hopfield network acquires weight values which readily position the network to search for local minimum solutions. The demonstrated successful application of the adaptation procedure eliminates the need to guess or predetermine the values for weights of the Hopfield network.

15.
Cooperative updating in the Hopfield model.   Cited by: 2 (self-citations: 0; citations by others: 2)
We propose a new method for updating units in the Hopfield model. With this method, two or more units change at the same time so as to reach the lowest-energy state among all possible states. Since this updating algorithm is based on the detailed balance equation, convergence to the Boltzmann distribution is guaranteed. If our algorithm is applied to finding the minimum energy in constraint satisfaction and combinatorial optimization problems, convergence is faster than with the usual neural network algorithm. This is shown by experiments with the travelling salesman problem, the four-color problem, the N-queen problem, and the graph bi-partitioning problem. In constraint satisfaction problems, for which earlier neural networks are effective in some cases, our updating scheme works well. Even though we still encounter the problem of ending up in local minima, our updating scheme has a great advantage over the usual updating scheme used in combinatorial optimization problems. We also discuss parallel computing using our updating algorithm.

16.
We classify the computational complexity of all constraint satisfaction problems where the constraint language is preserved by all permutations of the domain. A constraint language is preserved by all permutations of the domain if and only if all the relations in the language can be defined by Boolean combinations of the equality relation. We call the corresponding constraint languages equality constraint languages. For the classification result we apply the universal-algebraic approach to infinite-valued constraint satisfaction, and show that an equality constraint language is tractable if it admits a constant unary polymorphism or an injective binary polymorphism, and is NP-complete otherwise. We also discuss how to determine algorithmically whether a given constraint language is tractable.

17.
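Based on a WDM double-ring network, this paper discusses the wavelength assignment problem for realizing the Hopfield communication pattern on it, and proposes a routing strategy and a wavelength assignment scheme. On this basis, the number of wavelengths required to implement the Hopfield algorithm is given.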

18.
We propose an energy formulation for homomorphic graph matching by the Hopfield network and a Lyapunov indirect method-based learning approach to adaptively learn the constraint parameter in the energy function. The adaptation scheme eliminates the need to specify the constraint parameter empirically and generates valid and better quality mappings than the analog Hopfield network with a fixed constraint parameter. The proposed Hopfield network with constraint parameter adaptation is applied to match silhouette images of keys and results are presented.

19.
Sentence alignment using P-NNT and GMM   Cited by: 2 (self-citations: 0; citations by others: 2)
Parallel corpora have become an essential resource for work in multilingual natural language processing. However, sentence-aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present two new approaches to aligning English–Arabic sentences in bilingual parallel corpora, based on probabilistic neural network (P-NNT) and Gaussian mixture model (GMM) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the probabilistic neural network and the Gaussian mixture model. Another set of data was used for testing. Using the probabilistic neural network and Gaussian mixture model approaches, we achieved error reductions of 27% and 50%, respectively, over the length-based approach when applied to a set of parallel English–Arabic documents. In addition, the results of P-NNT and GMM outperform those of the combined model, which exploits length, punctuation and cognates in a dynamic framework. The GMM approach also outperforms Melamed's and Moore's approaches. Moreover, these new approaches are valid for any language pair and are quite flexible, since the feature vector may contain more, fewer, or different features than the ones used in the current research, such as a lexical matching feature or Hanzi characters in Japanese–Chinese texts.

20.
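Managing, studying, and using the increasingly rich language resources of ethnic minorities has significant practical value. To bridge the gap caused by language differences, a bilingual topical word embedding model is proposed for cross-lingual text classification in the Chinese-Korean setting. The paper extends the word embedding model and the topic model to the bilingual setting and combines the two, reducing the impact of ambiguity on cross-lingual text classification accuracy. First, Chinese and Korean word embeddings are trained on large-scale word-aligned parallel sentence pairs; second, a topic model is used to represent the Chinese and Korean classification corpora, yielding word embeddings that carry topic information; finally, the topical word embeddings are fed into a text classifier for training and prediction. Experimental results show that accuracy on the Chinese-Korean cross-lingual text classification task reaches 91.76%, a level suitable for practical use, and that the proposed model represents the multiple senses of polysemous words well.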
