Similar Documents
20 similar documents found.
1.
The reliability-based impact (R-impact) factor, defined as the cited half-life multiplied by the citation impact factor, measures both the citation impact and the long-lasting impact of published journals. There are currently several ways to calculate citation rates, each with limitations. This paper provides a new analysis approach to the ranking and citation of published journals. Because the cited half-life is the number of publication years, counted back from the current year, that account for 50% of the citations received by a journal in the current year, it characterizes the age of the majority of cited articles. By using the cited half-life as the sampling time variable, we can obtain normalized values of the citations and the citable items; these values avoid distortions caused by differences between journals or fields. A more substantial improvement is then suggested for the impact factor and the relative impact factor, which can effectively measure not only the short-term performance of journals but also their long-term performance. On this basis, the relative R-impact (RRI) is proposed as an effective improvement of the R-impact.
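As a rough illustration of the definition above (not code from the paper), here is a minimal sketch computing the R-impact factor as the product of cited half-life and impact factor; the journal figures are invented for the example.

```python
def r_impact(cited_half_life_years: float, impact_factor: float) -> float:
    """R-impact as defined above: cited half-life multiplied by the citation impact factor."""
    return cited_half_life_years * impact_factor

# Hypothetical journals: same impact factor, different longevity of citations.
print(r_impact(cited_half_life_years=3.2, impact_factor=4.0))  # 12.8
print(r_impact(cited_half_life_years=8.5, impact_factor=4.0))  # 34.0 -> longer-lasting impact
```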

2.
We define the emerging research field of applied data science as the knowledge discovery process in which analytic systems are designed and evaluated to improve the daily practices of domain experts. We investigate adaptive analytic systems as a novel research perspective on the three intertwining aspects within the knowledge discovery process in healthcare: domain and data understanding for physician- and patient-centric healthcare, data preprocessing and modelling using natural language processing and (big) data analytic techniques, and model evaluation and knowledge deployment through information infrastructures. We align these knowledge discovery aspects with the design science research steps of problem investigation, treatment design, and treatment validation, respectively. We note that the adaptive component in healthcare system prototypes may translate to data-driven personalisation aspects, including personalised medicine. We explore how applied data science for patient-centric healthcare can thus empower physicians and patients to more effectively and efficiently improve healthcare. We propose meta-algorithmic modelling as a solution-oriented design science research framework in alignment with the knowledge discovery process to address the three key dilemmas in the emerging "post-algorithmic era" of data science: depth versus breadth, selection versus configuration, and accuracy versus transparency.

3.
Pattern discovery: a data-driven approach to decision support (total citations: 1; self-citations: 0; others: 1)
Decision support nowadays increasingly targets large-scale, complicated systems and domains. The success of a decision support system relies mainly on its ability to process large amounts of data and to efficiently extract useful knowledge from those data, especially knowledge previously unknown to the decision makers. For a large-scale system, traditional knowledge acquisition models become inefficient and/or more biased, owing to the subjectivity of the experts or the presuppositions of certain ideas or algorithmic procedures. Today, with the rapid development of computer technologies, the capability to collect data has greatly advanced, and data have become the most valuable resource of an organization. We present a fundamental framework for intelligent decision support that analyzes large amounts of mixed-mode data (data with a mixture of continuous and categorical values) in order to bridge the subjective and objective sides of a decision support process. Considering significant associations of artifacts (events) inherent in the data as patterns, we define patterns as statistically significant associations among feature values, represented by joint events or hypercells in the feature space. We then present an algorithm that automatically discovers statistically significant hypercells (patterns) based on: 1) a residual analysis, which tests the significance of the deviation when the occurrence of a hypercell differs from its expectation, and 2) an optimization formulation that enables recursive discovery. By discovering patterns from data sets with such an objective measure, the nature of the problem domain is revealed. The discovered patterns can then be interpreted, or used for inference, to solve specific problems.
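To make the residual-analysis step more concrete, here is a minimal sketch of a standardized-residual significance test for a candidate hypercell (joint event). The paper's adjusted-residual statistic and recursive optimization are not reproduced here, so the expected-count model and the normal-quantile threshold below are illustrative assumptions.

```python
import math

def is_significant_hypercell(observed: int, expected: float, z_threshold: float = 1.96) -> bool:
    """Flag a joint event (hypercell) whose occurrence deviates significantly from expectation.

    A simple standardized residual (observed - expected) / sqrt(expected) is compared
    against a normal quantile; the paper's adjusted-residual formulation is analogous.
    """
    residual = (observed - expected) / math.sqrt(expected)
    return abs(residual) > z_threshold

# Example: two feature values co-occur 40 times while independence predicts 18.
print(is_significant_hypercell(observed=40, expected=18.0))  # True -> candidate pattern
```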

4.
5.
The scholarly big data network is a complex network of citations from the research community across the globe. An effective scholar-assessment structure is essential for scholars, researchers, and universities, and research publications are an important factor in university rankings. The rapid growth of digital publishing makes scholarly data increasingly challenging to manage, yet these data can now be accessed readily through various data analysis techniques. In this paper, a new framework is designed for big scholarly data, and an amoeboid approach, article-optimal citation flow (A-OCF), is used to find the optimal flow of citations in the big scholarly data network. A novel modern metric for article quality (MMAQ) is proposed to identify the quality of articles. The performance analysis uses different bibliometric measures, including impact-factor citations, conference-proceedings citations, and other citations, with the purpose of measuring the quality of cited articles. The scholar-analytics results are compared with existing techniques. We have also analyzed central articles in a research area through the MMAQ metric and tested it on benchmark data sets.

6.
We focus on domain adaptation, a branch of transfer learning concerned with transferring knowledge from one domain to another when the data distributions differ. Specifically, we investigate unsupervised domain adaptation methods, which have abundant labeled examples from a source domain and unlabeled examples from a target domain available. We aim to minimize the distribution divergences between the domains using optimal transport with subdomain adaptation. Previous methods have mainly focused on reducing global distribution discrepancies between the domains, but such approaches cannot capture fine-grained information and do not consider the structure or geometry of the data. To handle these limitations, we propose Optimal Transport via Subdomain Adaptation (OTSA). Our method utilizes the sliced Wasserstein metric to reduce transportation costs while preserving geometrical data information, and the local maximum mean discrepancy (LMMD) to compute the local discrepancy within each domain category, which helps capture relevant features. Experiments were conducted on six standard domain adaptation datasets, and our method outperformed the majority of baselines, increasing the average accuracy over the baselines on OfficeHome (67.7% to 68.31%), Office-Caltech10 (91.8% to 96.33%), IMAGECLEF-DA (87.9% to 89.9%), VisDA-2017 (79.6% to 81.83%), Office31 (88.17% to 89.11%), and PACS (69.08% to 83.72%).
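The sliced Wasserstein metric mentioned above can be computed generically by projecting features onto random directions and solving one-dimensional optimal transport via sorting. The sketch below is that generic NumPy computation, not the authors' OTSA implementation; the feature dimensions, sample sizes, and number of projections are arbitrary.

```python
import numpy as np

def sliced_wasserstein(source: np.ndarray, target: np.ndarray,
                       n_projections: int = 50, seed: int = 0) -> float:
    """Approximate the sliced Wasserstein-2 distance between two equally sized feature sets."""
    rng = np.random.default_rng(seed)
    dim = source.shape[1]
    total = 0.0
    for _ in range(n_projections):
        direction = rng.normal(size=dim)
        direction /= np.linalg.norm(direction)
        # 1-D optimal transport between projected samples reduces to sorting.
        proj_s = np.sort(source @ direction)
        proj_t = np.sort(target @ direction)
        total += np.mean((proj_s - proj_t) ** 2)
    return float(np.sqrt(total / n_projections))

# Toy source/target features of equal size (required by this simple sorting formulation).
src = np.random.default_rng(1).normal(0.0, 1.0, size=(256, 64))
tgt = np.random.default_rng(2).normal(0.5, 1.0, size=(256, 64))
print(sliced_wasserstein(src, tgt))
```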

7.
In some domains, such as isolating problems in computer networks and discovering stock market irregularities, there is more interest in patterns consisting of infrequent but highly correlated items than in patterns that occur frequently (as defined by minsup, the minimum support level). We describe the m-pattern, a new type of pattern defined in terms of minp, the minimum probability of mutual dependence of the items in the pattern. We show that all infrequent m-patterns can be discovered by an efficient algorithm that makes use of: (1) a linear algorithm to qualify an m-pattern; (2) an effective candidate-pruning technique based on a necessary condition for the presence of an m-pattern; and (3) a level-wise search for m-pattern discovery (which is possible because m-patterns are downward closed). Further, we consider frequent m-patterns, which are defined in terms of both minp and minsup. Using synthetic data, we study the scalability of our algorithm. We then apply the algorithm to data from a production computer network, both to show the m-patterns present and to contrast them with frequent patterns. We show that when minp=0 our algorithm is equivalent to finding frequent patterns, whereas with a larger minp it yields a modest number of highly correlated items, making it possible to mine infrequent but highly correlated itemsets. To date, many actionable m-patterns have been discovered in production systems.
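A minimal sketch of how a candidate itemset might be qualified against minp and minsup follows. The abstract does not spell out the mutual-dependence test, so the check used below (every item must predict the whole pattern with probability at least minp) is one plausible reading and should be treated as an assumption, as should the toy event log.

```python
def qualifies_as_m_pattern(transactions, itemset, minp, minsup=0.0):
    """Assumed reading of the m-pattern check: support(itemset) / support({x}) >= minp for
    every item x in the itemset, and support(itemset) >= minsup (minsup=0 for the purely
    infrequent case)."""
    itemset = set(itemset)
    n = len(transactions)
    support_full = sum(1 for t in transactions if itemset <= set(t)) / n
    if support_full < minsup:
        return False
    for x in itemset:
        support_x = sum(1 for t in transactions if x in t) / n
        if support_x == 0 or support_full / support_x < minp:
            return False
    return True

# Toy event log: 'disk_err' and 'raid_alarm' are rare but almost always co-occur.
log = [["disk_err", "raid_alarm"], ["login"], ["login"],
       ["disk_err", "raid_alarm", "login"], ["login", "cpu_high"]] * 20
print(qualifies_as_m_pattern(log, {"disk_err", "raid_alarm"}, minp=0.9))  # True
```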

8.
9.
A cross-domain recommendation algorithm based on a shared knowledge model (total citations: 3; self-citations: 0; others: 3)
李林峰, 刘真, 魏港明, 任爽, 葛梦凡. 《电子学报》 (Acta Electronica Sinica), 2018, 46(8): 1947-1953
The spread of the Internet has led to the continuous accumulation of vast amounts of information. Recommender systems, as an effective means of addressing information overload, help people quickly and accurately filter out content of interest. However, because user-item rating data are extremely sparse and new users or new items suffer from the "cold start" problem, traditional recommendation algorithms have high computational complexity and low accuracy. Users employ various applications across different Internet domains and thus accumulate large amounts of behavioural data and rating information in each domain. From the perspective of user groups, preference similarities exist across domains, so sharing knowledge models that represent these preferences between domains can improve recommendation accuracy in a new domain and alleviate the cold-start problem. This paper proposes SKP (Sharing Knowledge Pattern), a cross-domain recommendation algorithm based on shared knowledge models. By factorizing the user-item rating matrix of each domain, latent feature matrices for users and items are obtained; clustering the user and item latent features separately yields a rating knowledge model of user groups over item groups. The final recommendations are produced by combining the domain-specific knowledge model of the target domain with the shared knowledge models of all domains. Datasets from three different domains were analysed and partitioned, and experiments were carried out in a physical cluster environment. The results show that, by exploiting data-dense auxiliary domains, the proposed SKP algorithm achieves higher precision and a lower RMSE than existing single-domain and cross-domain algorithms.
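A heavily simplified sketch of the pipeline described above: factorize a domain's rating matrix, cluster the latent user and item factors, and form a group-level rating model (a "knowledge pattern"). The library choices (scikit-learn NMF and KMeans) and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.cluster import KMeans

def knowledge_pattern(ratings: np.ndarray, k_factors=8, k_user_groups=4, k_item_groups=4):
    """Return a (user-group x item-group) rating pattern plus the group assignments."""
    nmf = NMF(n_components=k_factors, init="random", random_state=0, max_iter=500)
    user_factors = nmf.fit_transform(ratings)   # latent user features
    item_factors = nmf.components_.T             # latent item features
    user_groups = KMeans(n_clusters=k_user_groups, random_state=0, n_init=10).fit_predict(user_factors)
    item_groups = KMeans(n_clusters=k_item_groups, random_state=0, n_init=10).fit_predict(item_factors)
    pattern = np.zeros((k_user_groups, k_item_groups))
    for ug in range(k_user_groups):
        for ig in range(k_item_groups):
            block = ratings[np.ix_(user_groups == ug, item_groups == ig)]
            pattern[ug, ig] = block[block > 0].mean() if (block > 0).any() else 0.0
    return pattern, user_groups, item_groups

# Toy auxiliary-domain rating matrix (0 = unrated).
R = np.random.default_rng(0).integers(0, 6, size=(100, 80)).astype(float)
P, ug, ig = knowledge_pattern(R)
print(P.shape)  # (4, 4) group-level rating knowledge pattern
```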

10.
Artificial intelligence techniques for monitoring dangerous infections (total citations: 1; self-citations: 0; others: 1)
The monitoring and detection of nosocomial infections is a very important problem in hospitals. A hospital-acquired, or nosocomial, infection is a disease that develops after admission to the hospital as the consequence of a treatment (not necessarily a surgical one) performed by the medical staff. Nosocomial infections are dangerous because they are caused by bacteria with dangerous (critical) resistance to antibiotics. The problem is serious all over the world: in Italy, some 5-8% of patients admitted to hospital develop this kind of infection. To reduce this figure, medical practitioners should adopt infection-control policies. To support them in this complex task, we have developed a system, called MERCURIO, capable of managing different aspects of the problem. The objectives of this system are the validation of microbiological data and the creation of a real-time epidemiological information system. The system is useful for laboratory physicians, because it supports them in the execution of microbiological analyses; for clinicians, because it supports them in defining prophylaxis and the most suitable antibiotic therapy and in monitoring patients' infections; and for epidemiologists, because it allows them to identify outbreaks and to study infection dynamics. To achieve these objectives, we have adopted expert system and data mining techniques. We have also integrated a statistical module that monitors the diffusion of nosocomial infections in the hospital over time and interacts closely with the knowledge-based module. Data mining techniques have been used to improve the system's knowledge base; this knowledge discovery process is not antithetical, but complementary, to manual knowledge elicitation. To verify the reliability of the tasks performed by MERCURIO and the usefulness of the knowledge discovery approach, we performed a test based on a dataset of real infection events. In the validation task MERCURIO achieved an accuracy of 98.5%, a sensitivity of 98.5%, and a specificity of 99%; in the therapy-suggestion task it also achieved very high accuracy and specificity. The test provided many insights to the experts as well (we discovered some of their mistakes). The knowledge discovery approach was very effective in validating part of the MERCURIO knowledge base, and also in extending it with new validation rules, confirmed by the interviewed microbiologists and specific to the hospital laboratory under consideration.

11.
Improving clinical decision support through case-based data fusion (total citations: 1; self-citations: 0; others: 1)
This paper presents an information fusion technique based on a knowledge discovery model and the case-based reasoning decision framework. Using signal data and database records from the heart-disease risk estimation domain, three data fusion methods are discussed. Two of these methods combine information at the retrieval-outcome level, and one merges data at the discovery-input level. The results of these three models are compared and evaluated against the performance of single-source models. It is shown that the methods that fuse information at the retrieval-outcome level are significantly superior.

12.
Gao Shiqi's (高士其) thinking on popular-science writing (total citations: 1; self-citations: 1; others: 0)
Gao Shiqi (高士其) was a famous Chinese popular-science writer who came of age in the 1930s. He was a master of the scientific essay, an important genre of popular-science writing, and pioneered the new genre of modern scientific poetry. As a writer of scientific literature, his works are notable not only for their scientific, artistic, and ideological qualities, but also for successfully localizing scientific knowledge. The ideological character of his early works is reflected mainly in their strong combative spirit, closely tied to the historical context of the time.

13.
The single-beat reconstruction of electrical cardiac sources from body-surface electrocardiogram data might become an important issue for clinical application. The feasibility and field of application of noninvasive imaging methods strongly depend on development of stable algorithms for solving the underlying ill-posed inverse problems. We propose a novel spatiotemporal regularization approach for the reconstruction of surface transmembrane potential (TMP) patterns. Regularization is achieved by imposing linearly formulated constraints on the solution in the spatial as well as in the temporal domain. In the spatial domain an operator similar to the surface Laplacian, weighted by a regularization parameter, is used. In the temporal domain monotonic nondecreasing behavior of the potential is presumed. This is formulated as side condition without the need of any regularization parameter. Compared to presuming template functions, the weaker temporal constraint widens the field of application because it enables the reconstruction of TMP patterns with ischemic and infarcted regions. Following the line of Tikhonov regularization, but considering all time points simultaneously, we obtain a linearly constrained sparse large-scale convex optimization problem solved by a fast interior point optimizer. We demonstrate the performance with simulations by comparing reconstructed TMP patterns with the underlying reference patterns.
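Written out, the optimization described above has roughly the following form. This is a paraphrase of the abstract with symbols chosen here rather than taken from the paper: data-fidelity and spatially regularized terms are summed over all time instants and minimized subject to monotonicity in time.

```latex
\min_{x_1,\dots,x_T} \;\; \sum_{t=1}^{T} \Bigl( \lVert A x_t - b_t \rVert_2^2
  + \lambda \, \lVert L x_t \rVert_2^2 \Bigr)
\quad \text{subject to} \quad x_{t+1} \ge x_t, \qquad t = 1,\dots,T-1,
```

where \(x_t\) denotes the surface TMP pattern at time instant \(t\), \(A\) the forward (lead-field) matrix, \(b_t\) the measured body-surface data, \(L\) the surface-Laplacian-like spatial operator, and \(\lambda\) the spatial regularization parameter; the elementwise inequality encodes the monotonic nondecreasing temporal constraint, which needs no regularization parameter of its own.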

14.
Using US patents as a surrogate measure of technological position, the competitive positions of the industrial nations (the US, United Kingdom, France, West Germany, Canada, and Japan) in high-technology areas during the period 1975 to 1988 are examined. A high-technology industry is defined as one that requires a high proportion of R&D expenditure and employs a high proportion of scientists and engineers. High-tech industries are further subdivided into four categories (equipment, consumer durables, nondurables, and intermediate products) in terms of their market and/or use. How different countries have specialized in different product market areas within the high-tech sector is also examined. To better understand the impact of the patents, the citations per patent for the different countries and the citation performance ratio, i.e., a country's share of the most highly cited patents worldwide, are examined.
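As a small arithmetic illustration of the two measures used here, the sketch below computes citations per patent and the citation performance ratio (a country's share of the world's most highly cited patents); the country figures are invented for the example.

```python
def citations_per_patent(total_citations: int, total_patents: int) -> float:
    """Average number of citations received per patent."""
    return total_citations / total_patents

def citation_performance_ratio(highly_cited_by_country: int, highly_cited_worldwide: int) -> float:
    """Share of the world's most highly cited patents held by one country."""
    return highly_cited_by_country / highly_cited_worldwide

# Hypothetical figures for one country in one high-tech category.
print(citations_per_patent(total_citations=5400, total_patents=1200))                       # 4.5
print(citation_performance_ratio(highly_cited_by_country=85, highly_cited_worldwide=500))   # 0.17
```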

15.
Because multiple solutions are possible, the unwrapping of interferometric fringe patterns in the spatial domain is an ill-posed problem that needs some a priori knowledge of the ground morphology to resolve the ambiguities. This is especially true for interferometric SAR (Synthetic Aperture Radar) data. In this paper we propose a different approach to InSAR processing for retrieving the height of ground points independently of each other, unlike most conventional phase unwrapping procedures, which operate in the spatial domain. The basic idea is to repeat raw-data focusing using range sub-bands centered at different frequencies, in order to obtain, for each point, the history of the interferometric phase variation versus frequency. We introduce the general framework of the method together with considerations on the theoretical limits of its applicability, and then report the results of simulations of a wide-band SAR system. We show that, under certain conditions, height values can be retrieved over a network of coherent and strong scatterers, even when they are enclosed in low-coherence areas.

16.
Due to the increasingly data-intensive clinical environment, physicians now have unprecedented access to detailed clinical information from a multitude of sources. However, applying this information to guide medical decisions for a specific patient case remains challenging. One issue is related to presenting information to the practitioner: displaying a large amount of (often irrelevant) information frequently leads to information overload. Next-generation interfaces for the electronic health record (EHR) should not only make patient data easily searchable and accessible, but also synthesize fragments of evidence documented throughout the record to explain the etiology of a disease and its clinical manifestation in individual patients. In this paper, we describe our efforts toward creating a context-based EHR, which employs biomedical ontologies and (graphical) disease models as sources of domain knowledge to identify the relevant parts of the record to display. We hypothesize that knowledge (e.g., variables, relationships) from these sources can be used to standardize, annotate, and contextualize information from the patient record, improving access to relevant parts of the record and informing medical decision making. To achieve this goal, we describe a framework that aggregates and extracts findings and attributes from free-text clinical reports, maps findings to concepts in available knowledge sources, and generates a tailored presentation of the record based on the information needs of the user. We have implemented this framework in a system called Adaptive EHR, and demonstrate its capabilities to present and synthesize information for neuro-oncology patients. This paper highlights the challenges and potential applications of leveraging disease models to improve the access, integration, and interpretation of clinical patient data.

17.
This paper presents a transfer learning-based approach for induction motor fault diagnosis, in which transfer principal component analysis (TPCA) is proposed to improve diagnostic performance for induction motors under various working conditions. TPCA is developed to minimize the distribution difference between training and testing data by mapping cross-domain data into a shared latent space in which the domain difference is reduced. The trained model achieves good performance on the testing data by using learned features consisting of common latent principal components. Experimental results show that the proposed approach outperforms traditional machine learning techniques and can effectively diagnose induction motor faults under various working conditions.
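TPCA itself is not specified in this abstract beyond mapping cross-domain data into a shared latent space. As a rough stand-in for that idea, the sketch below fits a single PCA on pooled, per-domain mean-centred source and target features so that both domains share one set of latent components; this is an assumption about the spirit of the method, not its actual formulation, and the toy feature matrices are invented.

```python
import numpy as np
from sklearn.decomposition import PCA

def shared_latent_features(source: np.ndarray, target: np.ndarray, n_components: int = 10):
    """Project source (training) and target (testing) features into one shared latent space."""
    # Centre each domain separately so a simple mean shift between domains is removed.
    source_c = source - source.mean(axis=0)
    target_c = target - target.mean(axis=0)
    pca = PCA(n_components=n_components).fit(np.vstack([source_c, target_c]))
    return pca.transform(source_c), pca.transform(target_c)

# Toy vibration-feature matrices for two working conditions of an induction motor.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(300, 40))
tgt = rng.normal(0.3, 1.2, size=(200, 40))
Zs, Zt = shared_latent_features(src, tgt)
print(Zs.shape, Zt.shape)  # (300, 10) (200, 10)
```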

18.
We present a general framework for resource discovery, composition, and substitution in mobile ad hoc networks that exploits knowledge representation techniques. Key points of the proposed approach are: (1) the reuse of discovery information at the network layer in order to build a fully unified semantic-based discovery and routing framework; and (2) the use of semantic annotations to orchestrate elementary resources into personalized services through a concept-covering procedure, and to allow the automatic substitution of components that are no longer suitable or available. Using the ns-2 simulator, we evaluated the performance of the proposed framework with reference to a disaster recovery scenario. In particular, the impact of the number of available services and of active clients was investigated under various mobility conditions and for several service-covering threshold levels. The results show that: (1) the proposed framework is highly scalable, given that its overall performance improves as the number of active clients increases; (2) the traffic load due to clients is negligible; (3) very high hit ratios can be reached even with a very small number of available service providers; and (4) increasing the number of providers can push hit ratios very close to 100% at the expense of an increased traffic load. Finally, the effectiveness of the cross-layer interaction between the routing and resource discovery protocols is also evaluated and discussed.

19.
A citation represents the relationship between the cited and the citing document, and vice versa. Citations are widely used to measure different aspects of knowledge-based achievement, such as institutional rankings, author rankings, journal impact factors, research grants, and peer judgments. A fair evaluation of research requires both a quantitative and a qualitative assessment of citations. To perform the qualitative analysis of citations, researchers have tried to classify citations into binary classes (i.e., important and non-important), using metadata, content, citation counts, cue words or phrases, sentiment analysis, keywords, and machine learning approaches. However, the state-of-the-art results for binary classification remain inadequate for computing the different aspects of researchers and their work. This research therefore proposes an in-text citation sentiment analysis-based approach to binary classification that improves on the state of the art. Different machine learning models are evaluated to determine in-text citation sentiments, and the sentiment results are used to derive positive, negative, and neutral citation counts. Furthermore, cosine similarity scores between citing-cited paper pairs are calculated and used as an additional feature. The sentiment counts and cosine similarity scores are then used as features for binary classification, performed with SVM, KLR, and random forest classifiers. The proposed approach is evaluated on a benchmark dataset and compared with two state-of-the-art approaches; with a random forest classification model it achieves an F-measure of 0.83 (an improvement of 13.6%) on dataset 1 and 0.67 (an improvement of 8%) on dataset 2.
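A minimal sketch of the feature-plus-classifier pipeline described above: in-text citation sentiment counts and a TF-IDF cosine similarity between the citing context and the cited text are combined into a feature vector for a random forest. The feature layout, toy data, and all hyperparameters are illustrative, not the paper's exact configuration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.ensemble import RandomForestClassifier

def citation_pair_features(citing_context: str, cited_text: str, sentiment_counts):
    """Feature vector: [positive, negative, neutral citation counts, cosine similarity]."""
    tfidf = TfidfVectorizer().fit([citing_context, cited_text])
    sim = cosine_similarity(tfidf.transform([citing_context]),
                            tfidf.transform([cited_text]))[0, 0]
    pos, neg, neu = sentiment_counts
    return [pos, neg, neu, sim]

# Toy training set of citation pairs: feature vectors + important(1)/non-important(0) labels.
X = np.array([
    citation_pair_features("we extend this method", "a method for X", (3, 0, 1)),
    citation_pair_features("see also prior surveys", "a survey of Y", (0, 0, 2)),
    citation_pair_features("our results contradict", "results on Z", (0, 2, 1)),
    citation_pair_features("background reading", "unrelated topic", (0, 0, 1)),
])
y = np.array([1, 0, 1, 0])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X))
```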

20.
Automatic discovery of physical topology plays a crucial role in enhancing the manageability of modern metro Ethernet networks. Despite the importance of the problem, earlier research and commercial network management tools have typically concentrated either on discovering logical topology or on proprietary solutions targeting specific product families. Recent works have demonstrated that network topology can be determined using the standard Simple Network Management Protocol (SNMP) Management Information Base (MIB), but these algorithms depend on address forwarding table (AFT) entries and can find only spanning-tree paths in an Ethernet mesh network. A previous work by Breitbart et al. requires that AFT entries be complete; however, that can be a risky assumption in a realistic Ethernet mesh network. In this paper, we propose a new physical topology discovery algorithm that works without complete knowledge of the AFT entries. Our algorithm can discover the complete physical topology of a metro Ethernet network, including inactive interfaces eliminated by the spanning tree protocol. The effectiveness of the algorithm is demonstrated by an implementation.
