Similar Documents
 20 similar documents found (search time: 906 ms)
1.
2.
We examine various logics that combine knowledge, awareness, and change of awareness. An agent can become aware of propositional variables, but also of other agents or of herself. The dual operation to becoming aware, forgetting, can also be modelled. Our proposals are based on a novel notion of structural similarity that we call awareness bisimulation, the obvious notion of modal similarity for structures encoding knowledge and awareness.

3.
MatchSim: a novel similarity measure based on maximum neighborhood matching
Measuring object similarity in a graph is a fundamental data-mining problem in various application domains, including Web linkage mining, social network analysis, information retrieval, and recommender systems. In this paper, we focus on the neighbor-based approach, built on the intuition that "similar objects have similar neighbors", and propose a novel similarity measure called MatchSim. Our method recursively defines the similarity between two objects as the average similarity of the maximum-matched similar neighbor pairs between them. We show that MatchSim conforms to the basic intuition of similarity and can therefore overcome the counterintuitive contradiction in SimRank. Moreover, MatchSim can be viewed as an extension of the traditional neighbor-counting scheme that takes the similarities between neighbors into account, leading to higher flexibility. We present the MatchSim score computation process and prove its convergence. We also analyze its time and space complexity and suggest two accelerating techniques: (1) a simple pruning strategy and (2) an approximation algorithm for maximum matching computation. Experimental results on real-world datasets show that although our method is computationally less efficient, it outperforms classic methods in terms of accuracy.
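The core recursive step described above — score a pair of nodes by the best one-to-one pairing of their neighbors — can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses brute-force enumeration for the maximum matching (the paper suggests an approximation algorithm for large graphs), and the toy graph and initial similarity matrix are invented for the example.

```python
from itertools import permutations

def matchsim_step(sim, neighbors):
    """One MatchSim iteration: sim(a,b) = average similarity of the
    maximum-weight matching between the neighbor sets of a and b."""
    new_sim = {}
    nodes = list(neighbors)
    for a in nodes:
        for b in nodes:
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = neighbors[a], neighbors[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Brute-force maximum matching: try every injective
            # assignment of the smaller neighbor set into the larger.
            small, large = (na, nb) if len(na) <= len(nb) else (nb, na)
            best = max(
                sum(sim.get((u, v), sim.get((v, u), 0.0))
                    for u, v in zip(small, perm))
                for perm in permutations(large, len(small))
            )
            # Normalize by the larger neighborhood size.
            new_sim[(a, b)] = best / max(len(na), len(nb))
    return new_sim

# Toy graph: nodes a and b share the single neighbor x.
neighbors = {"a": ["x"], "b": ["x"], "x": ["a", "b"]}
sim0 = {(u, v): 1.0 if u == v else 0.0
        for u in neighbors for v in neighbors}
sim1 = matchsim_step(sim0, neighbors)
print(sim1[("a", "b")])  # shared neighbor x matches itself -> 1.0
```

After one step, `a` and `b` already score 1.0 because their entire neighborhoods match; iterating `matchsim_step` to a fixed point yields the converged scores the abstract refers to.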

4.
The K-Nearest Neighbor (K-NN) search problem is to find the K closest, most similar objects to a given query. K-NN search is essential for many applications, such as information retrieval and visualization, machine learning and data mining. The exponential growth of data makes approximate approaches to this problem necessary. Permutation-based indexing is one of the most recent techniques for approximate similarity search. Objects are represented by permutation lists ordering their distances to a set of selected reference objects, following the idea that two neighboring objects share the same surroundings. In this paper, we propose a novel quantized representation of permutation lists, with its related data structure, for effective retrieval on single-core and multicore architectures. Our novel permutation-based indexing strategy is built to be fast, memory-efficient and scalable. This is experimentally demonstrated in comparison to existing proposals using several large-scale datasets of millions of documents and of different dimensions.
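The basic permutation-list idea above can be sketched in a few lines: rank a fixed set of reference objects by distance to each object, then compare objects through their rankings (the Spearman footrule is one common choice; the paper's quantized representation is a refinement not shown here). The 1-D points and reference set are invented for illustration.

```python
def permutation_list(obj, references, dist):
    """Rank reference objects by distance to obj (closest first);
    return position[i] = rank of reference i in obj's ordering."""
    order = sorted(range(len(references)),
                   key=lambda i: dist(obj, references[i]))
    position = [0] * len(references)
    for rank, i in enumerate(order):
        position[i] = rank
    return position

def footrule(p, q):
    """Spearman footrule distance between two permutation lists."""
    return sum(abs(a - b) for a, b in zip(p, q))

# 1-D toy example with three reference points.
refs = [0.0, 5.0, 10.0]
d = lambda x, y: abs(x - y)
p1 = permutation_list(1.0, refs, d)   # closest reference: 0.0
p2 = permutation_list(1.5, refs, d)   # same ordering as p1
p3 = permutation_list(9.0, refs, d)   # closest reference: 10.0
print(footrule(p1, p2), footrule(p1, p3))  # 0 4
```

Neighboring objects (1.0 and 1.5) induce identical permutations, while a distant object (9.0) reverses the ranking, which is exactly the "same surroundings" intuition the abstract describes.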

5.
This paper presents an approach for automatic grading of essays. Student essays are compared against a model, or key, essay provided by the teacher. The similarity between a student essay and the model essay is measured by the cosine of their contained angle in an n-dimensional semantic space. The model essay is preprocessed by removing stopwords, extracting keywords, assigning weights to keywords to reflect their importance, and finally linking every keyword to a subject-oriented synonym list. The student essay, by comparison, is preprocessed by removing stopwords and then extracting keywords. The keywords extracted from the model essay and from the student essays, together with the weights provided by the teacher, are used to build feature vectors for the teacher and student essays. The assigned grade depends on the similarity between these vectors, calculated using the cosine formula. A simulator was implemented to test the viability of the proposed approach. It was fed with university-level student essays gathered from a database management course over three semesters. The results were very encouraging, and the agreement between the auto-grader and a human grader was as good as the agreement between human graders.
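The pipeline above (stopword removal, keyword weighting, cosine similarity) can be sketched as follows. The stopword list, teacher weights, and example sentences are invented for illustration, and the synonym-list step is omitted.

```python
import math
import re

STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

def keyword_vector(text, weights=None):
    """Bag-of-keywords vector after stopword removal; optional
    teacher-provided weights boost important keywords."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    vec = {}
    for t in tokens:
        vec[t] = vec.get(t, 0.0) + (weights or {}).get(t, 1.0)
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

model = "A transaction is an atomic unit of work in a database"
answer = "A transaction is an atomic work unit executed by a database"
score = cosine(keyword_vector(model, {"transaction": 2.0, "atomic": 2.0}),
               keyword_vector(answer))
print(round(score, 2))  # 7/sqrt(77) -> 0.8
```

A grade can then be derived by mapping the cosine score onto the grading scale, e.g. `grade = round(score * 100)`.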

6.
7.
All distance learning participants (students, professors, instructors, mentors, tutors and others) would like to know how well students have assimilated the study materials being taught. The analysis and assessment of the knowledge students have acquired over a semester are an integral part of the independent studies process at the most advanced universities worldwide. A formal test or exam during the semester would cause needless stress for students. To resolve this problem, the authors of this article have developed a Biometric and Intelligent Self-Assessment of Student Progress (BISASP) System. The research results obtained are comparable with those of other similar studies. The article ends with two case studies demonstrating practical operation of the BISASP System. The first case study analyses the interdependencies between microtremors, stress and student marks. The second compares the marks assigned to students during e-self-assessment, prior to the e-test and during the e-test. The dependence, determined in the second case study, between the marks students scored in the real examination and the marks based on their self-evaluation is statistically significant (significance > 0.99). The original contribution of this article, compared to previously published research results, is as follows: the BISASP System is superior to traditional self-assessment systems due to its use of voice stress analysis and a special algorithm, which permits a more detailed analysis of the knowledge attained by a student.

8.
In recent years, researchers have paid increasing attention to data mining in practical applications. Aimed at the problem of symptom classification in traditional Chinese medicine, this paper proposes a novel computing model that uses the similarities among attributes of high-dimensional data to compute the similarity between tuples. The model treats data attributes as basis vectors in m dimensions and each tuple as the sum of all its attribute vectors. Based on a priori concept-similarity information among attributes, it introduces a novel distance algorithm to compute the similarity distance between any pair of attribute vectors. In this way, computing the similarity between tuples reduces to formulas over attribute vectors and their projections onto each other, and the similarity between any pair of tuples can be worked out by evaluating these vectors and formulas. The paper also presents a novel classification algorithm based on this similarity computing model and successfully applies it to symptom classification in traditional Chinese medicine. The efficiency of the algorithm is demonstrated by extensive experiments.

9.
The traditional problem of similarity search requires finding, within a set of points, those closest to a query point q according to a distance function d. In this paper we introduce the novel problem of metric information filtering (MIF): in this scenario, each point x_i comes with its own distance function d_i, and the task is to efficiently determine those points that are close enough, according to d_i, to a query point q. MIF can be seen as an extension both of the similarity search problem and of approaches currently used in content-based information filtering, since in MIF user profiles (points) and new items (queries) are compared using arbitrary, personalized metrics. We introduce the basic concepts of MIF and provide alternative resolution strategies aimed at reducing processing costs. Our experimental results show that the proposed solutions are indeed effective in reducing evaluation costs.
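The MIF setting above — each profile carries its own metric d_i and its own matching radius — can be sketched naively as a linear scan (the paper's contribution is precisely the index structures that avoid this scan; the profiles, metrics, and thresholds below are invented for the example).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Profile:
    point: tuple          # user profile as a feature vector
    dist: Callable        # the profile's personal metric d_i
    threshold: float      # "close enough" radius for this profile

def metric_filter(profiles, query):
    """Return the profiles whose own metric judges the query item
    close enough (naive MIF: one distance evaluation per profile)."""
    return [p for p in profiles if p.dist(p.point, query) <= p.threshold]

# Two profiles at the origin, with different personal metrics.
l1 = lambda x, y: sum(abs(a - b) for a, b in zip(x, y))
linf = lambda x, y: max(abs(a - b) for a, b in zip(x, y))

profiles = [
    Profile((0.0, 0.0), l1, 1.5),
    Profile((0.0, 0.0), linf, 0.5),
]
matches = metric_filter(profiles, (1.0, 0.4))
print(len(matches))  # L1 distance 1.4 <= 1.5 matches; L-inf 1.0 > 0.5 does not
```

The same query item matches one profile and not the other purely because their metrics differ, which is the personalization that distinguishes MIF from ordinary similarity search.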

10.
Automatic short-answer grading is a key problem in smart education. The main reasons current automatic grading is inaccurate are: (1) pre-defined reference answers cannot cover the diversity of student responses; (2) the match between a student answer and the reference answer is not characterized accurately. To address these problems, this paper selects representative student answers via clustering and maximum similarity to build a more complete set of reference answers, covering as many response variations as possible; on this basis, an attention-based deep neural network model is used to better characterize the match between student answers and reference answers. Experimental results on relevant datasets show that the proposed model effectively improves grading accuracy.
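The first step described here — pick representative student answers to enlarge the reference set — can be sketched with a crude greedy stand-in for the paper's clustering-plus-maximum-similarity selection (the similarity function, threshold 0.5, and sample answers are all invented for the example; the paper's neural matching model is not shown).

```python
def jaccard(a, b):
    """Word-overlap similarity between two short answers."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def pick_representatives(answers, k):
    """Greedily pick up to k representatives: each pick maximizes
    total similarity to the not-yet-covered answers, then removes
    the answers it covers (similarity >= 0.5)."""
    chosen = []
    remaining = list(answers)
    for _ in range(min(k, len(remaining))):
        best = max(remaining,
                   key=lambda a: sum(jaccard(a, b) for b in remaining))
        chosen.append(best)
        remaining = [b for b in remaining
                     if jaccard(best, b) < 0.5 and b != best]
        if not remaining:
            break
    return chosen

answers = [
    "photosynthesis converts light energy into chemical energy",
    "plants convert light energy into chemical energy",
    "it makes food for the plant",
]
reps = pick_representatives(answers, 2)
print(len(reps))  # 2: one per distinct way of answering
```

The two selected representatives cover both phrasings in the toy set, so a grader matching against them sees more of the answer diversity than a single teacher-written reference would.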

11.
In this paper we introduce VideoGraph, a novel non-linear representation of the scene structure of a video. Unlike the classical linear sequential organization, VideoGraph condenses the video content along the timeline by structuring it into scenes, materialized as a two-dimensional graph that enables non-linear exploration of the scenes and their transitions. To construct VideoGraph, we adopt a sub-shot-induced method to evaluate the spatio-temporal similarity between shot segments of a video. The scene structure is then derived by grouping similar shots and identifying the valid transitions between scenes. The final stage represents the scene structure as a graph with respect to the scene transition topology. VideoGraph provides a condensed representation at the scene level and facilitates non-linear browsing of videos. Experimental results demonstrate its effectiveness and efficiency in exploring and accessing video content.

12.
DNA sequence comparison by a novel probabilistic method
This paper proposes a novel method for comparing DNA sequences. Using a graphical representation, we construct probability distributions over DNA sequences. These probability distributions can then be used for similarity studies via the symmetrised Kullback-Leibler divergence. After presenting our method, we test it on six DNA sequences taken from the threonine operons of Escherichia coli K-12 and Shigella flexneri. Our approach is then used to study the evolution of primates using mitochondrial DNA data, allowing us to reconstruct a phylogenetic tree for primate evolution. In addition, we use our technique to analyze the classification and phylogeny of the Tomato Yellow Leaf Curl Virus (TYLCV) based on its whole genome sequences. These examples show that large volumes of DNA sequences can be handled more easily and more quickly by our approach than by existing multiple-alignment methods. Moreover, our method, unlike other approaches, requires no human intervention, because it can be applied automatically.
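The comparison step above can be sketched as follows. Note the distribution used here is a plain smoothed nucleotide-frequency distribution, a simplified stand-in for the paper's graphical-representation-derived distributions; the three short sequences are invented for the example.

```python
import math
from collections import Counter

def base_distribution(seq, alphabet="ACGT", eps=1e-6):
    """Smoothed nucleotide frequency distribution; eps keeps the
    KL divergence finite when a base is absent."""
    counts = Counter(seq)
    total = len(seq)
    return [(counts[b] + eps) / (total + eps * len(alphabet))
            for b in alphabet]

def sym_kl(p, q):
    """Symmetrised Kullback-Leibler divergence KL(p||q) + KL(q||p)."""
    kl = lambda u, v: sum(a * math.log(a / b) for a, b in zip(u, v))
    return kl(p, q) + kl(q, p)

s1 = "ACGTACGTAC"
s2 = "ACGAACGAAC"   # all T's replaced: composition shifted
s3 = "ACGTACGTAG"   # one substitution: composition nearly unchanged
d12 = sym_kl(base_distribution(s1), base_distribution(s2))
d13 = sym_kl(base_distribution(s1), base_distribution(s3))
print(d13 < d12)  # the single-substitution sequence is closer -> True
```

Pairwise divergences computed this way can feed directly into standard distance-based tree builders (e.g. neighbor joining) to obtain a phylogeny, which is the workflow the abstract applies to primate mitochondrial DNA.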

13.
A string similarity join finds similar pairs between two collections of strings. Many applications, e.g., data integration and cleaning, can significantly benefit from an efficient string-similarity-join algorithm. In this paper, we study string similarity joins with edit-distance constraints. Existing methods usually employ a filter-and-refine framework and suffer from the following limitations: (1) They are inefficient for data sets with short strings (average string length no larger than 30); (2) They involve large indexes; (3) They are expensive to support dynamic update of data sets. To address these problems, we propose a novel method called trie-join, which can generate results efficiently with small indexes. We use a trie structure to index the strings and utilize the trie structure to efficiently find similar string pairs based on subtrie pruning. We devise efficient trie-join algorithms and pruning techniques to achieve high performance. Our method can be easily extended to support dynamic update of data sets efficiently. We conducted extensive experiments on four real data sets. Experimental results show that our algorithms outperform state-of-the-art methods by an order of magnitude on data sets with short strings.

14.
We propose a novel knowledge-based technique for inter-document similarity computation, called Context Semantic Analysis (CSA). Several specialized approaches built on top of specific knowledge bases (e.g. Wikipedia) exist in the literature, but CSA differs from them in that it is designed to be portable to any RDF knowledge base. In fact, our technique relies on a generic RDF knowledge base (e.g. DBpedia or Wikidata) to extract a Semantic Context Vector, a novel model for representing the context of a document, which CSA exploits to compute inter-document similarity effectively. Moreover, we show how CSA can be applied effectively in the Information Retrieval domain. Experimental results show that: (i) for the general task of inter-document similarity, CSA outperforms baselines built on top of traditional methods and achieves performance similar to approaches built on top of specific knowledge bases; (ii) for Information Retrieval tasks, enriching documents with context (i.e., employing the Semantic Context Vector model) improves the result quality of the state-of-the-art technique that employs similar semantic enrichment.

15.
The similarity search problem has received considerable attention in the database research community. In sensor network applications, this problem is even more important due to the imprecision of sensor hardware and the variation of environmental parameters. Traditional similarity search mechanisms are both ill-suited and inefficient for these highly energy-constrained sensors. One difficulty is that it is hard to predict which sensor holds the most similar (or closest) data item, so many or even all sensors need to send their data to the query node for further comparison. In this paper, we propose a similarity search algorithm (SSA), a novel framework based on the concept of the Hilbert curve over a data-centric storage structure, for efficiently processing similarity search queries in sensor networks. SSA avoids the need to collect data from all sensors in the network when searching for the most similar data item. The performance study reveals that this mechanism is highly efficient and significantly outperforms previous approaches in processing similarity search queries.

16.
One promise of current information retrieval systems is the capability to identify risk groups for certain diseases and pathologies through automatic analysis of vast Electronic Medical Records repositories. However, the complexity and degree of specialization of the language used by experts in this context make this task both challenging and complex. In this work, we introduce a novel experimental study to evaluate the performance of two semantic similarity metrics (Path and Intrinsic IC-Path, both widely accepted in the literature) in a real-life information retrieval situation. To achieve this goal, and given the lack of methodologies for this context in the literature, we propose a straightforward information retrieval system for the biomedical field based on the UMLS Metathesaurus and on semantic similarity metrics. In contrast with previous studies, which focus on testbeds with limited and controlled sets of concepts, we use a large amount of information (101,712 medical documents extracted from the TREC Medical Records Track 2011). Our results show that in real-life cases both metrics display similar performance: Path (F-measure = 0.430) and Intrinsic IC-Path (F-measure = 0.427). We therefore suggest that the use of Intrinsic IC-Path is not justified in real scenarios.
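The Path metric mentioned above is conventionally defined as 1 / (1 + shortest-path length) between two concepts in an is-a hierarchy. A minimal sketch, using an invented toy concept graph in place of the UMLS Metathesaurus:

```python
from collections import deque

def shortest_path_len(graph, a, b):
    """BFS shortest-path length in an undirected concept graph."""
    if a == b:
        return 0
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == b:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # disconnected concepts

def path_similarity(graph, a, b):
    """Path metric: 1 / (1 + shortest path length)."""
    d = shortest_path_len(graph, a, b)
    return 1.0 / (1 + d) if d is not None else 0.0

# Toy is-a hierarchy (edges stored in both directions).
edges = [("disease", "infection"), ("infection", "pneumonia"),
         ("disease", "neoplasm")]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

print(path_similarity(graph, "pneumonia", "infection"))  # 0.5
print(path_similarity(graph, "pneumonia", "neoplasm"))   # 0.25
```

Intrinsic IC-Path additionally weights the path by information content estimated from the hierarchy itself; the abstract's finding is that, at this corpus scale, that extra machinery did not pay off.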

17.
In the specialized literature, there are many approaches for capturing textual measures: textual similarity, textual readability and textual sentiment. This paper proposes a new sentiment similarity measure between pairs of words using a fuzzy-based approach in which words are treated as single-valued neutrosophic sets. We build our study with the aid of the lexical resource SentiWordNet 3.0, as our intended scope is to design a new word-level similarity measure calculated from the sentiment scores of the words involved. Our study pays particular attention to polysemous words, because these words are a real challenge for any application that processes natural language data. To our knowledge, this approach is quite new in the literature, and the results obtained give us hope for further investigation.
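The idea of scoring word pairs from sentiment triples can be sketched as follows. This is a deliberately simplified stand-in for the paper's neutrosophic construction: words are reduced to (positive, negative, objective) triples, the similarity is one minus their normalized L1 distance, and the triples below are made up for the example rather than taken from SentiWordNet 3.0.

```python
def sentiment_similarity(w1, w2):
    """Similarity between two words represented as (positive,
    negative, objective) score triples: 1 - mean absolute
    component difference."""
    dist = sum(abs(a - b) for a, b in zip(w1, w2)) / 3.0
    return 1.0 - dist

# Illustrative SentiWordNet-style triples (invented for the example).
good = (0.75, 0.0, 0.25)
great = (0.75, 0.0, 0.25)
awful = (0.0, 0.875, 0.125)

print(sentiment_similarity(good, great))        # identical triples -> 1.0
print(sentiment_similarity(good, awful) < 0.5)  # opposite polarity -> True
```

For polysemous words, one would first have to aggregate or select among the per-sense triples a resource like SentiWordNet provides, which is exactly the difficulty the abstract highlights.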

18.
Link-based similarity measures play a significant role in many graph-based applications. Consequently, measuring node similarity in a graph is a fundamental problem of graph data mining. Personalized PageRank (PPR) and SimRank (SR) have emerged as the most popular and influential link-based similarity measures. Recently, a novel link-based similarity measure, Penetrating Rank (P-Rank), which enriches SR, was proposed. In practice, PPR, SR and P-Rank scores are calculated by iterative methods, and as the number of iterations increases, so does the overhead of the calculation. The ideal solution is for computation within the minimum number of iterations to suffice to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. We therefore focus in this paper on designing accurate and tight upper bounds for PPR, SR, and P-Rank. Our upper bounds are built on the following intuition: the smaller the difference between two consecutive iteration steps, the smaller the difference between the theoretical and iterative similarity scores. Furthermore, we demonstrate the effectiveness of our upper bounds in the scenario of top-k similar-node queries, where they help accelerate query processing. We also run a comprehensive set of experiments on real-world data sets to verify the effectiveness and efficiency of our upper bounds.
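The consecutive-step intuition above is easy to see in code: iterate PPR by power iteration and stop when the L1 gap between successive iterates falls below a tolerance. This is a generic sketch of the iterative setting the paper analyzes, not the paper's bound itself; the adjacency list is invented for the example.

```python
def ppr(adj, source, alpha=0.15, tol=1e-8):
    """Power iteration for personalized PageRank. Stops when the L1
    difference between consecutive iterates drops below tol -- the
    same consecutive-step gap that tight error bounds are built on."""
    n = len(adj)
    p = [1.0 / n] * n
    iters = 0
    while True:
        # Restart mass goes to the source node.
        q = [alpha * (1.0 if i == source else 0.0) for i in range(n)]
        for u in range(n):
            out = adj[u]
            if out:
                share = (1 - alpha) * p[u] / len(out)
                for v in out:
                    q[v] += share
            else:
                q[source] += (1 - alpha) * p[u]  # dangling node: restart
        gap = sum(abs(a - b) for a, b in zip(p, q))
        p, iters = q, iters + 1
        if gap < tol:
            return p, iters

# Tiny directed graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}.
adj = [[1, 2], [2], [0]]
scores, iters = ppr(adj, source=0)
print(round(sum(scores), 6), scores[0] > scores[2])
```

Because the iteration is a contraction with factor (1 - alpha), the consecutive-step gap geometrically bounds the remaining distance to the fixed point, which is why a small gap certifies a small error.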

19.
Information Systems, 2004, 29(5): 405-420
This paper discusses the effective processing of similarity search that supports time warping in large sequence databases. Time warping enables sequences with similar patterns to be found even when they are of different lengths. Prior methods for processing similarity search with time warping failed to employ multi-dimensional indexes without false dismissal, since the time warping distance does not satisfy the triangle inequality. They have to scan the entire database, and thus suffer serious performance degradation in large databases. Another method, which employs the suffix tree and does not assume any distance function, also shows poor performance due to the large tree size. In this paper, we propose a novel method for similarity search that supports time warping. Our primary goal is to enhance search performance in large databases without permitting any false dismissal. To attain this goal, we have devised a new distance function, Dtw-lb, which consistently underestimates the time warping distance and satisfies the triangle inequality. Dtw-lb uses a 4-tuple feature vector that is extracted from each sequence and is invariant to time warping. For efficient processing of similarity search, we employ a multi-dimensional index that uses the 4-tuple feature vector as its indexing attributes and Dtw-lb as its distance function. We prove that our method incurs no false dismissal. To verify the superiority of our method, we have performed extensive experiments. The results reveal that our method achieves a significant speedup, up to 43 times faster on a data set containing real-world S&P 500 stock data sequences, and up to 720 times on data sets containing very large volumes of synthetic data sequences. The performance gain increases: (1) as the number of data sequences increases, (2) as the average length of data sequences increases, and (3) as the tolerance in a query decreases.
Considering the characteristics of real databases, these tendencies imply that our approach is suitable for practical applications.
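One classic way to build a warping-invariant 4-tuple feature vector and a lower-bound distance of the kind described above is (first, last, max, min) with a max-of-differences bound. This sketch follows that well-known feature-filtering construction with an L-infinity-style DTW; the exact definition in the paper may differ, and the sequences are invented for the example.

```python
def features(s):
    """4-tuple (first, last, max, min): invariant to time warping,
    since repeating elements changes none of the four values."""
    return (s[0], s[-1], max(s), min(s))

def d_lb(s, q):
    """Lower-bound distance on the feature vectors. With the
    max-aggregated DTW below it never exceeds the true distance:
    first/last elements are always aligned, and the global max/min
    differences are forced on some alignment."""
    return max(abs(a - b) for a, b in zip(features(s), features(q)))

def dtw(s, q):
    """DTW with |x - y| local cost, max-aggregated along the path,
    so it is directly comparable to the L-infinity lower bound."""
    INF = float("inf")
    n, m = len(s), len(q)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            D[i][j] = max(cost,
                          min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]))
    return D[n][m]

s = [1.0, 3.0, 5.0, 4.0]
q = [1.0, 1.0, 4.0, 6.0, 3.0]
print(d_lb(s, q) <= dtw(s, q))  # lower bound never exceeds true distance -> True
```

In the indexed search, candidates whose feature-vector distance to the query already exceeds the tolerance are pruned without computing the expensive DTW, and the lower-bound property guarantees no false dismissal.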

20.
Authorship Identification Based on Semantic Analysis
Authorship identification is a widely applied line of research. Its key problems are extracting, from a work, features that represent the author's stylistic register, and using those style features to estimate the stylistic similarity between works. Traditional identification methods mainly examine features representing literary style such as word choice, sentence construction, and paragraph organization; among them, analyses based on the frequencies of punctuation marks and of the most common function words are widely accepted. Based on stylistics theory and the HowNet knowledge base, this paper proposes a new similarity estimation method based on lexical semantic analysis, which makes effective use of words beyond function words and achieves good identification performance.
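The traditional baseline this abstract builds on — compare function-word frequency profiles with a similarity measure — can be sketched as follows. The function-word list and the three text snippets are invented for the example; the paper's HowNet-based semantic extension is not shown.

```python
import math
import re

FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "it"]

def style_vector(text):
    """Relative frequencies of common function words: a crude
    content-independent fingerprint of writing style."""
    tokens = re.findall(r"[a-z]+", text.lower())
    n = max(len(tokens), 1)
    return [tokens.count(w) / n for w in FUNCTION_WORDS]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

known = "It is the end of the story and the moral of it is plain."
disputed = "The end of the tale is near and the moral is plain to see."
other = "Run fast. Jump high. Win big."
print(cosine(style_vector(known), style_vector(disputed)) >
      cosine(style_vector(known), style_vector(other)))  # True
```

Attribution then amounts to assigning the disputed text to the candidate author whose style vector it is most similar to; the paper's contribution is enriching these vectors with lexical semantics from HowNet.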


Copyright©北京勤云科技发展有限公司  京ICP备09084417号