Similar literature
Found 20 similar documents; search took 31 ms.
1.
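Based on physical examination reports of middle-aged and elderly people, the Apriori algorithm is used to mine the relationships among examination indicators, providing diagnostic reference and advice for doctors and patients. From the physical examination data of a Grade-A tertiary hospital in Anhui Province, people aged 40 and over were selected as the study population. The Apriori association-rule algorithm from data mining was then applied to analyze the associations among examination indicators including overweight, electrocardiogram, fatty liver, blood lipids, blood pressure, blood glucose, routine urinalysis, smoking, drinking, and total cholesterol. The study shows that unhealthy personal habits, overweight, advanced age, high blood glucose, and fatty liver are closely related and influence one another. It recommends that middle-aged and elderly people strengthen the prevention of chronic diseases and maintain regular daily routines.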
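The level-wise search behind Apriori can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the function names `frequent_itemsets` and `confidence`, the indicator labels, and the sample records are all hypothetical, and each examination record is assumed to have already been reduced to a set of abnormal findings.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the support of each candidate and keep the frequent ones.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, pruning any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (both must be frequent)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical records: one set of abnormal findings per examinee.
records = [
    {"overweight", "fatty_liver", "high_glucose"},
    {"overweight", "fatty_liver"},
    {"overweight", "high_glucose"},
    {"smoking"},
    {"overweight", "fatty_liver", "smoking"},
]
itemsets = frequent_itemsets(records, min_support=0.4)
print(confidence(itemsets, {"fatty_liver"}, {"overweight"}))
```

A rule such as "fatty liver implies overweight" is then reported only when both its support and its confidence clear user-chosen thresholds; the thresholds used in the study are not given in the abstract.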

2.
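An interval-probability risk-type hybrid multi-attribute decision-making method with unknown attribute weights   Total citations: 2 (self-citations: 0, citations by others: 2)
For interval-probability risk-type hybrid multi-attribute decision-making problems with unknown criterion weights, a decision method based on entropy weights and projection theory is proposed. First, conversion relations are established from linguistic variables and uncertain linguistic variables to trapezoidal fuzzy numbers, so that the hybrid data are converted into a uniform set of trapezoidal fuzzy numbers. Then the risk-type decision matrix is converted into a deterministic decision matrix by taking expected values, the criterion weights are determined by the entropy weight method, the weighted decision matrix is computed, and the alternatives are ranked by the relative closeness of their projections onto the positive and negative ideal alternatives. Finally, an application case illustrates the effectiveness of the method.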
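The entropy weight step can be illustrated on a crisp matrix. The sketch below assumes the decision matrix has already been converted to non-negative deterministic values (rows are alternatives, columns are criteria) and that every column has a positive sum. The function name `entropy_weights` and the sample matrix are hypothetical, and the fuzzy conversion and projection steps of the paper are not reproduced.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: criteria whose values vary more get larger weights."""
    m = len(matrix)      # number of alternatives (needs m >= 2)
    n = len(matrix[0])   # number of criteria
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)  # assumed > 0
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:        # treat 0 * log(0) as 0
                entropy -= p * math.log(p)
        entropy /= math.log(m)  # normalize entropy to [0, 1]
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread < 1e-12:
        # Every criterion is uniform across alternatives: fall back to equal weights.
        return [1.0 / n] * n
    return [d / spread for d in divergences]


# Hypothetical 3 x 2 deterministic decision matrix.
weights = entropy_weights([[0.6, 0.2], [0.5, 0.9], [0.7, 0.4]])
print(weights)
```

In this example the second criterion separates the alternatives more strongly than the first, so it receives the larger weight.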

3.
Qinghua, Zongxia, Daren. Pattern Recognition, 2007, 40(12): 3509-3521
Feature subset selection has become an important challenge in pattern recognition, machine learning, and data mining. Because numerical and categorical features carry different semantics, there are two strategies for selecting hybrid attributes: discretizing numerical variables or numericalizing categorical features. In this paper, we introduce a simple and efficient hybrid attribute reduction algorithm based on a generalized fuzzy-rough model. A theoretical framework for the fuzzy-rough model based on fuzzy relations is presented, which lays the foundation for constructing the algorithm. We derive several attribute significance measures from the proposed fuzzy-rough model and construct a forward greedy algorithm for hybrid attribute reduction. The experiments show that using variable precision fuzzy inclusion to compute the decision positive region gives the best classification performance: the number of selected features is the smallest, yet the accuracy is the highest.

4.
In an era of many diseases and increased longevity, more attention has been paid to chronic diseases that require constant health care. Under these circumstances, research and development (R&D) on smart-device-based constant health care has drawn great attention. With the emergence of wearable devices, personal health devices (PHDs), and smartphones, a variety of content for constant health care has been developed. With these devices, users can collect personal health records (PHRs) that include data such as activity amount, heart rate, stress, and blood sugar. The range of the collected PHRs can be limited by the equipment or the surrounding environment. To overcome this problem, it is necessary to compare a user with similar users in a cluster. It is also necessary to provide a service that can analyze and visually display the collected personal-health information. In this paper, we propose the mining of health-risk factors using PHR similarity in a hybrid P2P network. This is a method of predicting a user's health status using similarity-based data mining, where the PHRs are employed in a hybrid P2P environment consisting of a peer, a server, and a gateway. In a hybrid P2P environment, a user receives feedback on the result of a structured-data analysis. A peer searches for other peers and gateways through a server and exchanges information. Depending on the data type, the PHR is divided into medical health examination, self-diagnosis, and personal-health data. The medical health examination contains the personal-health data that are generated regularly by a medical institution. Self-diagnosis covers data on mental health, pain, and fatigue, which can change often but cannot be collected by devices. Personal-health data are the data that individuals can collect in everyday life.
For the PHR-data analysis, an index is given to each attribute, and preprocessing is performed after a binary-code conversion. To predict a user's health status, the PHR data are clustered on the basis of similarity in a hybrid P2P environment. The similarity between a user's PHR and a PHR found in the network is measured. After the measurement, an index is given to each PHR that meets the minimum similarity, and the PHR is incorporated into a Similarity PHR Group. The Similarity PHR Group changes flexibly depending on the user's PHR status and the statuses of the users who have accessed the hybrid P2P network. A representative value of the Similarity PHR Group is extracted and then compared with the user's PHR to judge the user's health status. The proposed method is suitable for a smart health service for chronic diseases requiring constant care, elderly health, and aftercare. This is a user-oriented health-care and promotion service in which a user's health status can be predicted by mining the health-risk factors of PHRs.

5.
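To address the high computational complexity of static attribute reduction methods when data are added dynamically to an incomplete hybrid decision system, an incremental attribute reduction method for incomplete hybrid data under variable precision is proposed. First, conditional entropy is used to measure attribute significance under the variable precision model. Next, the incremental update of conditional entropy and the update mechanism of the attribute reduct when data are added dynamically are analyzed and designed in detail. On this basis, a heuristic greedy strategy is used to construct an incremental attribute reduction algorithm, which dynamically updates the attribute reduct for incomplete hybrid data containing both numerical and symbolic attributes. Experiments were run on five real hybrid data sets from the UCI repository. In terms of reduction, when the incremental algorithm processes the Echocardiogram, Hepatitis, Autos, Credit, and Dermatology data sets with an incremental split of 90% + 10%, the number of attributes is reduced from 12, 19, 25, 17, and 34 to 6, 7, 10, 11, and 13, which is 50.0%, 36.8%, 40.0%, 64.7%, and 38.2% of the original attribute sets, respectively. In terms of running time, the incremental algorithm takes on average 2.99 s, 3.13 s, 9.70 s, 274.19 s, and 50.87 s on the five data sets, whereas the static algorithm takes on average 284.92 s, 302.76 s, 1062.23 s, 3510.79 s, and 667.85 s. The running time of the incremental algorithm depends on the number of instances, the number of attributes, and the distribution of attribute value types. The experimental results show that the incremental attribute reduction algorithm is significantly faster than the static algorithm and can effectively remove redundant attributes from the data.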

6.
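In classification problems on high-dimensional, small-sample data, the complexity of the high-dimensional attributes limits the prediction accuracy of classification models. To improve accuracy further, a classification algorithm based on linear regression and attribute ensembles is proposed. First, linear regression is used to build an Attribute Linear Classifier (ALC) for each attribute. Second, to avoid the loss of accuracy caused by too many ALCs, the empirical loss from the empirical risk minimization strategy is used as the criterion for selecting the best ALCs. Finally, majority voting is applied to combine the selected ALCs. Experiments on high-dimensional, small-sample gene expression data sets show that the algorithm achieves higher accuracy than logistic regression, support vector machines, and random forests.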

7.
Although multiple attribute decision making (MADM) problems with both individual attribute data of a single alternative and collaborative attribute data of pairwise alternatives exist in the real world, they have seldom been a focus of research. This paper proposes a MADM method using individual and collaborative attribute data in a fuzzy environment, in which experts use linguistic variables to express their opinions. In the method, the evaluation matrix of the individual attribute data and the judgment matrix of the collaborative attribute data are first constructed. Then, the central dominance of one alternative outranking all other alternatives is defined for aggregating the collaborative data. From this, an integrated decision matrix incorporating individual and collaborative attribute data is constructed. Further, based on an extended TOPSIS, the fuzzy positive-ideal solution (FPIS) and the fuzzy negative-ideal solution (FNIS) are determined, and the relative closeness of each alternative to the FPIS and FNIS is calculated to determine the ranking order of all alternatives. Finally, two examples are used to illustrate the applicability of the proposed method.
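The relative-closeness ranking at the end of TOPSIS can be sketched with crisp numbers. This is a simplification under stated assumptions: the paper works with fuzzy linguistic data and an extended TOPSIS, whereas the sketch below uses plain real values, treats every criterion as a benefit criterion, and takes the weights as given. The function name `topsis_closeness` and the sample data are hypothetical.

```python
import math


def topsis_closeness(matrix, weights):
    """Relative closeness of each alternative to the ideal solution (crisp TOPSIS).

    Assumes benefit criteria only, no all-zero column, and at least two
    alternatives that differ (otherwise the final division is by zero).
    """
    n = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Positive-ideal and negative-ideal solutions, column by column.
    ideal = [max(col) for col in zip(*weighted)]
    anti_ideal = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)       # distance to the positive ideal
        d_neg = math.dist(row, anti_ideal)  # distance to the negative ideal
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical integrated decision matrix: 3 alternatives, 2 criteria.
scores = topsis_closeness([[7, 9], [8, 6], [5, 7]], [0.6, 0.4])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

A closeness of 1 means the alternative coincides with the positive ideal and 0 means it coincides with the negative ideal, so alternatives are ranked in descending order of score.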

8.
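By combining the concepts of attribute significance and dependency from rough set theory with a hierarchical clustering discretization algorithm, a dynamic discretization algorithm for the continuous attributes of taxpayers is proposed. First, each continuous attribute of the tax data objects is divided into two classes. Rough set theory is then used to compute the significance of each condition attribute with respect to the decision attribute, and classes are added in descending order of significance. Finally, the classification that keeps the same dependency as the original set of data objects is output. The algorithm partitions the data objects into classes dynamically and thereby discretizes the continuous attributes of taxpayers. Experimental results from expert analysis and association analysis verify that the algorithm achieves high accuracy and performance in discretizing the continuous attributes of taxpayers.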

9.
The publication of microdata is pivotal for medical research, data analysis, and data mining. Published data contain a substantial amount of sensitive information; for example, a hospital may publish many sensitive attributes such as diseases, treatments, and symptoms. Releasing multiple sensitive attributes is undesirable because it puts the privacy of individuals at risk. The main vulnerability of this approach is that if an adversary succeeds in identifying a single sensitive attribute, the other sensitive attributes can be identified by correlation. A variety of techniques such as SLOMS, SLAMSA, and others already exist for the anonymization of multiple sensitive attributes; however, these techniques fall short in preserving privacy and ensuring data utility. We propose an efficient approach, (p, k)-Angelization, for the anonymization of multiple sensitive attributes. The proposed approach protects the privacy of individuals and yields promising results in terms of utility compared with currently used techniques. The (p, k)-Angelization approach not only preserves privacy by eliminating the threat of background join and non-membership attacks but also reduces information loss, thus improving the utility of the released information.

10.
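Research on modeling and optimal control of a class of hybrid systems   Total citations: 2 (self-citations: 2, citations by others: 0)
To address the computational complexity of optimal control for hybrid systems, a mixed-integer nonlinear programming algorithm combined with constraint programming is used to solve the optimal control of hybrid models of industrial processes that incorporate logic rules. Computational examples show that the hybrid modeling approach can make full use of the mechanistic model of the industrial plant together with operator or expert experience to build a more accurate model of the system. The mixed-integer nonlinear programming algorithm combined with constraint programming solves the optimal control problem of the hybrid model relatively quickly, so the method can be used for real-time control of industrial processes.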

11.
An L-150 pilot Jameson flotation cell was instrumented and a distributed control system was developed. The parameters of a metallurgical phenomenological model were estimated from industrial data. A steady-state simulator was built based on this nonlinear model. This hybrid system combines on-line measured operating variables with virtual variables characterizing the feed. All these variables are fed on-line to a simulator to predict the characteristics of the concentrate and tailings. The expert system modifies the set points of the distributed control system and includes two routines: expert feedback and feedforward control. Several cases for different feed conditions are discussed.

12.
Instance-based attribute identification in database integration   Total citations: 3 (self-citations: 0, citations by others: 3)
Most research on attribute identification in database integration has focused on integrating attributes using schema and summary information derived from the attribute values. No research has attempted to fully explore the use of attribute values to perform attribute identification. We propose an attribute identification method that employs schema and summary instance information as well as properties of attributes derived from their instances. Unlike other attribute identification methods that match only single attributes, our method matches attribute groups for integration. Because our attribute identification method fully explores data instances, it can identify corresponding attributes to be integrated even when schema information is misleading. Three experiments were performed to validate our attribute identification method. In the first experiment, the heuristic rules derived for attribute classification were evaluated on 119 attributes from nine public domain data sets. The second was a controlled experiment validating the robustness of the proposed attribute identification method by introducing erroneous data. The third experiment evaluated the proposed attribute identification method on five data sets extracted from online music stores. The results demonstrated the viability of the proposed method. Received: 30 August 2001. Accepted: 31 August 2002. Published online: 31 July 2003. Edited by L. Raschid.

13.
A hierarchical model for object-oriented design quality assessment   Total citations: 1 (self-citations: 0, citations by others: 1)
The paper describes an improved hierarchical model for the assessment of high-level design quality attributes in object-oriented designs. In this model, structural and behavioral design properties of classes, objects, and their relationships are evaluated using a suite of object-oriented design metrics. The model relates design properties such as encapsulation, modularity, coupling, and cohesion to high-level quality attributes such as reusability, flexibility, and complexity using empirical and anecdotal information. The links from design properties to quality attributes are weighted in accordance with their influence and importance. The model is validated by comparing its results with empirical data and expert opinion on several large commercial object-oriented systems. A key attribute of the model is that it can easily be modified to include different relationships and weights, thus providing a practical quality assessment tool adaptable to a variety of demands.

14.
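Incremental attribute reduction is a new type of attribute reduction method for dynamic data sets. However, existing incremental attribute reduction methods have rarely been studied for incomplete hybrid information systems. To address this, an incremental attribute reduction algorithm for the case where attributes are added is proposed. A neighborhood tolerance relation is introduced for incomplete hybrid information systems. Based on the granulation monotonicity of the neighborhood tolerance relation, an incremental method is proposed for updating the neighborhood tolerance conditional entropy when attributes are added to the information system, together with an incremental attribute reduction algorithm based on neighborhood tolerance conditional entropy for incomplete hybrid information systems. Experimental analysis demonstrates the effectiveness of the algorithm.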

15.
We present a general technique for dynamizing a class of problems whose underlying structure is a computation graph embedded in a tree. We introduce three fully dynamic data structures, called path attribute systems, tree attribute systems, and linear attribute grammars, which extend and generalize the dynamic trees of Sleator and Tarjan. More specifically, we associate values, called attributes, with the nodes and paths of a rooted tree. Path attributes form a path attribute system if they can be maintained in constant time under path concatenation. Node attributes form a tree attribute system if the tree attributes of the tail of a path Π can be determined in constant time from the path attributes of Π. A linear attribute grammar is a tree-based linear expression such that the values of a node μ are calculated from the values at the parent, siblings, and/or children of μ. We provide a framework for maintaining path attribute systems, tree attribute systems, and linear attribute grammars in a fully dynamic environment using linear space and logarithmic time per operation. We also demonstrate the applicability of our techniques by showing examples of graph and geometric problems that can be efficiently dynamized, including biconnectivity and triconnectivity queries, planarity testing, drawing trees and series-parallel digraphs, slicing floorplan compaction, point location, and many optimization problems on bounded tree-width graphs. Received May 13, 1994; revised October 12, 1995.

16.

This article presents a framework and case study of knowledge revision and maintenance. In doing so, it relies on a real-world application of medical decision making, namely acute abdominal pain in children (AAPC), and makes use of an integrated learning and knowledge revision system called MOBAL. The presented framework integrates expert knowledge with empirical knowledge. MOBAL serves as a vehicle of interaction between the expert user and the medical records, which brings empirical knowledge into the process. An iterative-cyclic framework is used to formalize knowledge revision and maintenance. Results are demonstrated using AAPC. The article focuses on the interaction between the expert user and the revision system, while offering a complete case study account of using MOBAL in a real-world setting.

17.
Business operation performance is related to corporate profitability and directly affects investment choices in the stock market. This paper proposes a hybrid method that combines the ordered weighted averaging (OWA) operator and rough set theory after an attribute selection procedure to deal with multi-attribute forecasting problems concerning the revenue growth rate of the electronics industry. In the attribute selection step, the four most important attributes among 12 attributes collected from the related literature are determined via five attribute selection methods and used as the input to the subsequent steps of the proposed method. The OWA operator can adjust the weight of an attribute based on the situation of a decision-maker and aggregate the different attribute values of each instance into a single value; the aggregated values are then used to generate classification rules by rough set theory for forecasting operation performance. To verify the proposed method, this research collects the financial data of 629 publicly listed electronics firms on the TSE (Taiwan Stock Exchange) and OTC (Over-the-Counter) market in 2004 and 2005 to forecast the revenue growth rate. The results show that the proposed method outperforms the listed methods.
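The OWA aggregation step can be sketched as follows. This is a minimal illustration of the standard OWA operator, in which weights are attached to rank positions rather than to particular attributes. The function name `owa`, the sample attribute values, and the weight vectors are hypothetical, and the abstract does not say how the paper derives its weights.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical normalized values of four selected attributes for one firm.
firm = [0.2, 0.9, 0.5, 0.4]
optimistic = owa(firm, [0.7, 0.2, 0.1, 0.0])   # emphasizes the largest values
pessimistic = owa(firm, [0.0, 0.1, 0.2, 0.7])  # emphasizes the smallest values
print(optimistic, pessimistic)
```

Putting all the weight on the first position returns the maximum, putting it on the last returns the minimum, and equal weights return the arithmetic mean, which is how the operator adapts to a decision-maker's attitude.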

18.
Since naïve Bayesian classifiers are suitable for processing discrete attributes, many methods have been proposed for discretizing continuous ones. However, none of the previous studies applies more than one discretization method to the continuous attributes in a data set for naïve Bayesian classifiers. Different approaches use different information embedded in continuous attributes to determine the boundaries for discretization. It is likely that discretizing the continuous attributes in a data set using different methods can exploit the information embedded in the attributes more thoroughly and thus improve the performance of naïve Bayesian classifiers. In this study, we propose a nonparametric measure to evaluate the level of dependence between a continuous attribute and the class. The nonparametric measure is then used to develop a hybrid method for discretizing continuous attributes so that the accuracy of the naïve Bayesian classifier can be enhanced. This hybrid method is tested on 20 data sets, and the results demonstrate that discretizing the continuous attributes in a data set by various methods generally yields higher prediction accuracy.

19.
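A two-stage data conflict resolution method based on Markov logic networks   Total citations: 1 (self-citations: 0, citations by others: 1)
In data integration, resolving data conflicts accurately is key to the quality of the integrated data. Existing methods mainly resolve conflicts one attribute at a time. Because they neither distinguish the degree of conflict of different attributes nor consider how conflict resolution for one attribute affects another, their accuracy is low. To address these shortcomings, this paper proposes a two-stage data conflict resolution method based on Markov logic networks. The method partitions attributes by degree of conflict and processes them in two stages: (1) in the first stage, weakly conflicting attributes are resolved with simple rules such as voting and mutual corroboration among facts; (2) in the second stage, the results of the first stage are used, and rules for the mutual influence between data sources and facts, the interdependence among data sources, and the influence of weakly conflicting attributes on strongly conflicting attributes are added in order to resolve the strongly conflicting attributes. Experiments on a large amount of real data show that the method resolves conflicts in integrated data effectively and with high accuracy.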

20.
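When outlier data in medical databases are detected, the lack of steps such as data filtering leads to long execution times, low detection efficiency, and a low outlier detection rate. To address this, an outlier data detection algorithm for medical databases based on hierarchical deep learning is proposed. First, a dynamic grid partitioning method divides the space into sparse and dense regions, which reduces the scale of the data to be examined and shortens the execution time. Then, a hierarchical deep learning process fuses expert knowledge with information on the distribution of attribute values to detect the outlier data in the medical database. Experimental results show that the algorithm detects outlier data in medical databases accurately in a short time and has practical advantages over traditional algorithms.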


Copyright©北京勤云科技发展有限公司  京ICP备09084417号