1.
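Bilingual lexicon induction, a foundation of machine translation, is an important task in natural language processing. Because they require no supervision, unsupervised bilingual lexicon induction methods have become a research focus. Unsupervised methods rely on the isomorphism between the word embeddings of different languages, yet few methods currently exist for improving that isomorphism. This paper proposes an isomorphism enhancement method based on a mixed corpus that improves the isomorphism between the word embeddings of different languages and thereby improves the quality of the induced bilingual lexicon. The method clearly improves the performance of lexicons extracted from Chinese and English Wikipedia.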
2.
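Existing sentiment analysis methods for microblog text mostly target single-language corpora, such as Chinese corpora. However, mixing Chinese and English has gradually become an important way for individuals to express opinions. This paper proposes a multi-class sentiment analysis method based on a bilingual lexicon: it builds a bilingual multi-class sentiment lexicon and uses it to perform multi-class semantic orientation analysis of microblog text, so as to capture group opinion more accurately and detect trends in public opinion promptly. Compared with majority voting, support vector machines, and a K-nearest-neighbour classifier based on cosine distance, the proposed model classifies well, with clear improvements in classification accuracy, F1 score, and other measures.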
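As a rough illustration of lexicon-based multi-class classification over mixed Chinese-English text, the sketch below sums per-class weights of matched lexicon entries and returns the highest-scoring class. The lexicon entries, class names, weights, and the `classify` function are all hypothetical, and the input is assumed to be already tokenized; the abstract does not describe the actual lexicon or scoring rule.

```python
from collections import Counter

# Hypothetical bilingual multi-class sentiment lexicon: word -> (class, weight).
# The entries and class names are illustrative only, not taken from the paper.
LEXICON = {
    "开心": ("joy", 1.0),
    "happy": ("joy", 1.0),
    "生气": ("anger", 1.0),
    "angry": ("anger", 1.0),
    "难过": ("sadness", 1.0),
    "sad": ("sadness", 1.0),
}


def classify(tokens, lexicon=LEXICON, default="neutral"):
    """Return the sentiment class with the largest summed lexicon weight."""
    scores = Counter()
    for tok in tokens:
        entry = lexicon.get(tok.lower())  # lower() leaves Chinese tokens unchanged
        if entry is not None:
            label, weight = entry
            scores[label] += weight
    if not scores:
        return default  # no lexicon word matched
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


if __name__ == "__main__":
    print(classify(["今天", "so", "happy", "开心"]))
```

A real system would also need word segmentation for the Chinese portions and some handling of negation and intensity, which this sketch leaves out.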
3.
4.
5.
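To better determine the semantic relations between entities in distantly supervised sentences and to extract linguistic information accurately, a distant supervision relation extraction method based on deep learning is proposed. Distant supervision is used to obtain the information already stored in relation triples, and the data to be learned are then labelled to build the dataset for distantly supervised relation extraction. On this basis, a supervised execution framework is designed, and the defined sentence-level feature conditions are used to learn the labels to be extracted, completing the ...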
6.
7.
8.
刘婉婉 《信息技术与信息化》2021,(1):241-243
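Neural machine translation models mainly learn their parameters in a supervised setting: the encoder encodes the source language into a continuous vector representation, and the decoder decodes the target language from that representation. For low-resource languages, supervised learning does not perform well. Transfer learning can alleviate this problem, but the resulting models generalise poorly and do not produce the expected translations. Inspired by transfer learning, this paper proposes an unsupervised meta-learning strategy for building the translation model, which will ...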
9.
11.
12.
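To address the problem that traditional phrase alignment methods depend on external resources and seldom consider the intrinsic features of parallel sentence pairs, a Korean-Chinese noun phrase alignment method incorporating bilingual word embeddings is proposed. Using a parallel corpus, monolingual word embeddings are trained separately and then mapped across languages to obtain bilingual word embeddings. A phrase extraction model based on phrase formation patterns is built, together with a phrase alignment model that incorporates bilingual word embeddings, phrase length, and part-of-speech similarity. Experimental results show that incorporating Korean-Chinese bilingual word embeddings extracts phrase features more effectively and thereby achieves phrase alignment.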
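One way to picture an alignment model that combines the three signals named in the abstract is a weighted sum of embedding similarity, length similarity, and part-of-speech similarity. The sketch below is only an assumption about how such a score could look: the averaging of word vectors, the Jaccard overlap of tags, the weights, and all function names are hypothetical, and the toy embeddings are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def phrase_vector(words, emb):
    """Average the embeddings of the words that have one; None if none do."""
    vecs = [emb[w] for w in words if w in emb]
    if not vecs:
        return None
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]


def length_similarity(a, b):
    """Ratio of the shorter phrase length to the longer one."""
    longest = max(len(a), len(b))
    return min(len(a), len(b)) / longest if longest else 0.0


def pos_similarity(tags_a, tags_b):
    """Jaccard overlap of the two part-of-speech tag sets."""
    sa, sb = set(tags_a), set(tags_b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def alignment_score(src, tgt, emb, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of the three similarities; the weights are hypothetical."""
    (src_words, src_tags), (tgt_words, tgt_tags) = src, tgt
    u, v = phrase_vector(src_words, emb), phrase_vector(tgt_words, emb)
    emb_sim = cosine(u, v) if u is not None and v is not None else 0.0
    w_emb, w_len, w_pos = weights
    return (w_emb * emb_sim
            + w_len * length_similarity(src_words, tgt_words)
            + w_pos * pos_similarity(src_tags, tgt_tags))


def best_alignment(src, candidates, emb):
    """Pick the candidate target phrase with the highest alignment score."""
    return max(candidates, key=lambda tgt: alignment_score(src, tgt, emb))
```

Each phrase is passed as a `(words, tags)` pair, and `emb` is assumed to be a single dictionary holding both languages' vectors after the cross-lingual mapping.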
13.
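Nearest neighbour search is becoming increasingly important in large-scale image retrieval. Many hashing methods have been proposed for nearest neighbour search because they offer fast queries and low memory use. However, existing methods pay insufficient attention to the sparse structure of the data when constructing hash functions, so this paper proposes an unsupervised image hashing method based on sparse autoencoders. The method introduces sparsity into the learning of the hash function: it uses the KL divergence of the sparse autoencoder to impose a sparsity constraint on the hash codes, which enhances discriminability during the locality-preserving mapping, and it uses the L2 norm to constrain the quantization error of the hash codes. Experiments on two public image retrieval datasets, CIFAR-10 and YouTube Faces, verify the superiority of the proposed algorithm over other unsupervised hashing algorithms.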
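The two penalty terms mentioned in the abstract can be sketched with the standard sparse-autoencoder formulation: a Bernoulli KL divergence between a target sparsity and each code unit's mean activation, plus a squared L2 distance between real-valued codes and their binarized versions. The abstract does not give the paper's exact loss, so the target sparsity `rho`, the 0.5 threshold, and the function names below are assumptions.

```python
import math


def mean_activation(codes):
    """Mean activation of each code unit over a batch of codes in [0, 1]."""
    n = len(codes)
    return [sum(row[j] for row in codes) / n for j in range(len(codes[0]))]


def kl_sparsity(rho, rho_hat, eps=1e-8):
    """Sum over units of KL(rho || rho_hat_j) for Bernoulli distributions.

    rho is the (assumed) target sparsity, with 0 < rho < 1; rho_hat holds the
    mean activation of each unit and is clamped away from 0 and 1.
    """
    total = 0.0
    for r in rho_hat:
        r = min(max(r, eps), 1.0 - eps)
        total += rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
    return total


def quantization_error(codes, threshold=0.5):
    """Squared L2 distance between real-valued codes and their binary codes."""
    return sum(
        (h - (1.0 if h >= threshold else 0.0)) ** 2
        for row in codes
        for h in row
    )


if __name__ == "__main__":
    batch = [[0.9, 0.1, 0.2], [0.8, 0.0, 0.4]]
    print(kl_sparsity(0.1, mean_activation(batch)))
    print(quantization_error(batch))
```

In training, both terms would be added to the reconstruction or locality-preserving objective with their own weighting coefficients, which are not specified in the abstract.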
14.
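Words are the smallest unit of sentiment expression in text, and analysing the sentiment orientation of word semantics is the basis of text sentiment classification. A basic sentiment lexicon built from Chinese sentiment words can be used to judge the polarity of unknown sentiment words. Starting from the HOWNET sentiment word set, this paper uses a sememe similarity algorithm to build a basic Chinese sentiment lexicon and proposes an information fusion method that merges this lexicon with Tongji University's commendatory and derogatory word lexicon, establishing a mapping from specific sentiment words to sentiment labels and the corresponding sentiment weights. Experimental results show that the method achieves good classification performance.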
15.
To address the misclassification problems of the unsupervised polarimetric Wishart classification algorithm based on Freeman decomposition, an unsupervised Polarimetric Synthetic Aperture Radar (SAR) Interferometry (PolInSAR) classification algorithm based on optimal coherence set parameters is proposed. The algorithm uses the result of Freeman decomposition to divide the image into three basic categories: surface scattering, volume scattering, and double-bounce scattering. The PolInSAR optimal coherence set parameters are then used to subdivide each of the three basic categories into 9 categories, so that the whole image is divided into 27 categories. Because both the Freeman decomposition result and the optimal coherence set parameters indicate specific scattering characteristics, the 27 categories are merged into 16 on the basis of physical meaning. Finally, Wishart clustering is applied to obtain the final classification result. To preserve the purity of scattering characteristics, pixels are restricted from being classified together with pixels of different scattering characteristics. The final classification results effectively resolve the misclassification problem: buildings are effectively distinguished from vegetation in urban areas, and roads are well distinguished from grass. E-SAR PolInSAR data from the German Aerospace Center (DLR) are used to verify the effectiveness of the algorithm.
16.
17.
18.
Junchang Xin Zhongyang Wang Shuo Tian Zhiqiong Wang 《Multidimensional Systems and Signal Processing》2017,28(3):1013-1030
Unsupervised Extreme Learning Machine (US-ELM) is a widely used machine learning method. With good noise robustness and data representation, as well as fast clustering, US-ELM is suitable for processing noisy nuclear magnetic resonance (NMR) images. This paper therefore proposes a brain NMR image segmentation approach based on US-ELM. First, a median filter is applied to reduce the influence of noise. Second, US-ELM maps the original data into an embedded space, which represents the characteristics of pixel points more effectively, and the k-means method then performs the image segmentation; this approach is named NS-UE. Next, because spatial fuzzy C-means (spFCM) handles NMR images with noise caused by intensity inhomogeneity better than k-means does, an image segmentation approach based on US-ELM and spFCM (NS-UF) is proposed to improve clustering in the embedded space. Finally, extensive experiments on real data demonstrate the efficiency and effectiveness of the proposed approaches under various experimental settings.
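The first and last steps of the pipeline described above (median filtering, then clustering pixels) can be sketched in a few lines. This is a deliberately reduced illustration: it clusters raw filtered intensities with plain k-means and omits both the US-ELM embedding and spFCM, so it is not the NS-UE or NS-UF method. The function names, window radius, and initialization scheme are assumptions.

```python
import statistics


def median_filter(img, radius=1):
    """Replace each pixel by the (low) median of its clipped square window."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            # median_low always returns an actual pixel value, even when the
            # clipped window at the border has an even number of pixels.
            out[y][x] = statistics.median_low(window)
    return out


def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar intensities, centers spread evenly at start."""
    lo, hi = min(values), max(values)
    if k > 1:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centers = [sum(values) / len(values)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center when a cluster receives no points.
        centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def segment(img, k=2):
    """Denoise with a median filter, then label each pixel by nearest center."""
    smooth = median_filter(img)
    centers = kmeans_1d([v for row in smooth for v in row], k)
    return [
        [min(range(k), key=lambda c: abs(v - centers[c])) for v in row]
        for row in smooth
    ]
```

In the approach the abstract describes, the clustering would run on the US-ELM embedding of each pixel rather than on its raw intensity.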