Similar Literature
20 similar documents found (search time: 15 ms)
1.
This paper explores the true managerial efficiencies of the branches of a case bank in Taiwan. With 123 branches of the case bank comprising the sample, the study finds that, after adjusting for environmental factors and statistical noise, managerial efficiency values from a three-stage data envelopment analysis (DEA) vary significantly from those of the traditional DEA model. This finding suggests that environmental variables have a significant effect on branch efficiency. Moreover, scale inefficiency is the major cause of operating inefficiency in the case bank, and most branches are operating at the stage of increasing returns to scale. With regard to the branches' business scope, those that operate both loan and wealth management services have better managerial efficiency than those that focus on wealth management only. In terms of deposit amount, branches with a higher deposit amount generate better managerial efficiency. Finally, regional location shows no significant effect on branches' managerial efficiency in Taiwan.

2.
Computers & Geosciences, 2006, 32(9): 1451-1460
A modelling concept is presented that enables a quantitative evaluation of transport and natural attenuation processes during bank filtration. The aim is to identify ranges of degradation rates for which bank filtration is effective or ineffective. Such modelling should accompany experimental work, as otherwise the meaning of the determined degradation rates for a field situation remains uncertain. The presented concept is a combination of analytical and numerical methods, solving the differential equations directly for the steady state. It is implemented using FEMLAB® code and demonstrated on a typical idealized situation with a single well near a straight bank boundary. The method can be applied to confined, unconfined, and partially confined/unconfined aquifers, and may be extended for applications in more complex situations, including a clogging layer, galleries of pumping and recharge wells, etc.

3.
Very little research in knowledge discovery has studied how to incorporate statistical methods to automate linear correlation discovery (LCD). We present an automatic LCD methodology that adopts statistical measurement functions to discover correlations among database attributes. Our methodology automatically pairs attribute groups having potential linear correlations, measures the linear correlation of each pair of attribute groups, and confirms the discovered correlation. The methodology is evaluated in two sets of experiments. The results demonstrate the methodology's ability to facilitate linear correlation discovery for databases with large amounts of data.
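The pairing-and-measuring step can be sketched with Pearson's r as the statistical measurement function; the attribute names, the synthetic data, and the 0.9 threshold below are illustrative assumptions, not the paper's actual measurement functions:

```python
import numpy as np

def discover_linear_correlations(table, threshold=0.9):
    """Pair up numeric attributes and keep those pairs whose absolute
    Pearson correlation exceeds `threshold`.

    `table` maps attribute name -> 1-D array of values.
    Returns a list of (attr_a, attr_b, r) tuples.
    """
    names = list(table)
    found = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a = np.asarray(table[names[i]], dtype=float)
            b = np.asarray(table[names[j]], dtype=float)
            r = np.corrcoef(a, b)[0, 1]  # Pearson's r
            if abs(r) >= threshold:
                found.append((names[i], names[j], r))
    return found

x = np.arange(100.0)
data = {"x": x,
        "y": 3 * x + 5,  # exact linear relation with x
        "noise": np.random.default_rng(0).normal(size=100)}
pairs = discover_linear_correlations(data)
```

Only the (x, y) pair survives the threshold; the confirmation step of the paper's methodology (e.g. significance testing) is not shown here.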

4.
Using data mining techniques on retail data from a bank, this study analyzes customers' investment preferences. A CART decision tree was used for feature selection, finding that conservative customers who are over 30 years old, hold assets above 50,000, and have stable employment are more inclined to purchase the bank's fund products. In addition, a logistic regression model was built to predict the probability that a customer purchases a fund. The results show that the customer groups selected through these data mining methods have a higher purchase probability, greatly improving the working efficiency of bank staff.
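The CART-style split criterion behind such feature screening can be illustrated on a single feature; the ages and purchase labels below are invented for illustration and are not the study's bank data:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_split(values, labels):
    """Find the threshold on one numeric feature that minimizes the
    weighted Gini impurity of the two resulting groups -- a single
    CART-style split."""
    order = np.argsort(values)
    v, y = values[order], labels[order]
    best_t, best_g = None, float("inf")
    for i in range(1, len(v)):
        if v[i] == v[i - 1]:
            continue  # no threshold between equal values
        t = (v[i] + v[i - 1]) / 2
        g = (i * gini(y[:i]) + (len(v) - i) * gini(y[i:])) / len(v)
        if g < best_g:
            best_t, best_g = t, g
    return best_t, best_g

# Hypothetical data: customers over ~30 buy (1), younger ones do not (0).
age = np.array([22, 25, 28, 33, 35, 40, 45, 52], dtype=float)
buys = np.array([0, 0, 0, 1, 1, 1, 1, 1])
threshold, impurity = best_split(age, buys)
```

On this toy data the best threshold falls between 28 and 33, splitting the classes perfectly (zero impurity), which mirrors the "older than 30" rule the abstract reports.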

5.
Analysis of cancer data: a data mining approach
Abstract: Even though cancer research has traditionally been clinical and biological in nature, data-driven analytic studies have become a common complement in recent years. In medical domains where data- and analytics-driven research is successfully applied, new and novel research directions are identified to further advance the clinical and biological studies. In this research, we used three popular data mining techniques (decision trees, artificial neural networks and support vector machines) along with the most commonly used statistical analysis technique, logistic regression, to develop prediction models for prostate cancer survivability. The data set contained around 120 000 records and 77 variables. A k-fold cross-validation methodology was used in model building, evaluation and comparison. The results showed that support vector machines are the most accurate predictor (with a test set accuracy of 92.85%) for this domain, followed by artificial neural networks and decision trees.
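The k-fold evaluation loop itself is easy to sketch. The nearest-centroid classifier and the synthetic two-class data below are stand-ins for illustration; the study's actual models were decision trees, neural networks, SVMs and logistic regression on real cancer records:

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Classify each test row by the closest class centroid."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def k_fold_accuracy(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and average held-out accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = nearest_centroid_predict(X[train], y[train], X[test])
        accs.append((pred == y[test]).mean())
    return float(np.mean(accs))

# Two well-separated Gaussian blobs -> near-perfect cross-validated accuracy.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
acc = k_fold_accuracy(X, y)
```

Running each candidate model through the same `k_fold_accuracy` loop is what makes the accuracy comparison in the abstract fair: every model sees identical train/test partitions.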

6.
Data mining is an emerging and highly innovative discipline with broad applications in many fields, and its application in chemometrics is attracting increasing attention. This paper applies data mining techniques to the study of nuclear magnetic resonance (NMR) spectra: the relationships between NMR spectra and the structures of organic compounds are expressed as association rules between characteristic ranges and substructures, and the corresponding mining algorithm is given.

7.
Although the integration of engineering data within the framework of product data management systems has been successful in recent years, the holistic analysis (from a systems engineering perspective) of multi-disciplinary data, or of data based on different representations and tools, is still not realized in practice. At the same time, the application of advanced data mining techniques to complete designs is very promising and bears a high potential for synergy between different teams in the development process. In this paper, we propose shape mining as a framework to combine and analyze data from engineering design across different tools and disciplines. In the first part of the paper, we introduce unstructured surface meshes as meta-design representations that enable us to apply sensitivity analysis, design concept retrieval and learning, as well as methods for interaction analysis, to heterogeneous engineering design data. We propose a new measure of relevance to evaluate the utility of a design concept. In the second part of the paper, we apply the formal methods to passenger car design. We combine data from different representations, design tools and methods for a holistic analysis of the resulting shapes. We visualize sensitivities and sensitive cluster centers (after feature reduction) on the car shape. Furthermore, we are able to identify conceptual design rules using tree induction and to create interaction graphs that illustrate the interrelation between spatially decoupled surface areas. Shape data mining in this paper is studied for a multi-criteria aerodynamic problem, i.e. drag force and rear lift; however, the extension to quality criteria from different disciplines is straightforward as long as the meta-design representation remains applicable.

8.
E-learning systems provide a promising solution as an information exchanging channel. Improved technologies could mean faster and easier access to information, but they do not necessarily ensure the quality of this information; for this reason it is essential to develop valid and reliable methods of quality measurement and to carry out careful information quality evaluations. This paper proposes an assessment model for information quality in e-learning systems based on the quality framework we proposed previously: the framework consists of 14 quality dimensions grouped into three quality factors: intrinsic, contextual representation and accessibility. We use relative importance as a parameter in a linear equation for the measurement scheme. Previously, we implemented a goal-question-metric approach to develop a set of quality metrics for the identified quality attributes within the proposed framework. In this paper, the proposed metrics are computed to produce a numerical rating indicating the overall quality of the information published in a particular e-learning system. The data collection and evaluation processes were automated using a web data extraction technique, and results on a case study are discussed. This assessment model could be useful to e-learning system designers, providers and users, as it provides a comprehensive indication of the quality of information in such systems.
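A minimal sketch of such a relative-importance-weighted linear measurement scheme; the dimension names and weights below are hypothetical (the paper's framework has 14 dimensions and its own importance values):

```python
def information_quality_score(metric_scores, weights):
    """Weighted linear combination of per-dimension quality metrics.

    `metric_scores`: dict mapping quality dimension -> score in [0, 1]
    `weights`: dict mapping quality dimension -> relative importance
               (normalized here so the result stays in [0, 1])
    """
    total_weight = sum(weights.values())
    return sum(metric_scores[d] * weights[d] for d in metric_scores) / total_weight

# Hypothetical dimensions and relative-importance weights:
scores = {"accuracy": 0.9, "timeliness": 0.6, "accessibility": 0.8}
weights = {"accuracy": 3, "timeliness": 1, "accessibility": 2}
iq = information_quality_score(scores, weights)
```

Here iq = (0.9*3 + 0.6*1 + 0.8*2) / 6, i.e. roughly 0.82, a single numerical rating of the kind the abstract describes.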

9.
The recent trend of collecting huge and diverse datasets has created a great challenge in data analysis. One characteristic of these gigantic datasets is that they often contain significant amounts of redundancy. The use of very large multi-dimensional data results in more noise, redundant data, and the possibility of unconnected data entities. To efficiently manipulate data represented in a high-dimensional space and to address the impact of redundant dimensions on the final results, we propose a new technique for dimensionality reduction using Copulas and the LU-decomposition (forward substitution) method. The proposed method compares favorably with existing approaches on real-world datasets (Diabetes, Waveform, two versions of Human Activity Recognition based on Smartphone, and Thyroid, taken from the machine learning repository) in terms of dimensionality reduction and efficiency, as evaluated with statistical and classification measures.
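The forward-substitution step named above can be sketched as follows; this shows only the triangular solve, not the paper's coupling of LU decomposition with Copulas:

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular matrix L by forward
    substitution: each x[i] uses only the already-computed x[0..i-1]."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 1.0, 5.0]])
b = np.array([2.0, 5.0, 16.0])
x = forward_substitution(L, b)  # satisfies L @ x == b
```

Because L is triangular, the solve costs O(n^2) instead of the O(n^3) of a general linear solve, which is why LU-based pipelines factor once and then substitute.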

10.
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we perform a systematic introduction and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in l
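The prefix-projection idea behind PrefixSpan can be sketched as below, restricted to single-item sequence elements for brevity; real implementations additionally handle itemset elements and use pseudo-projection for efficiency:

```python
from collections import defaultdict

def prefixspan(db, min_support):
    """Minimal PrefixSpan sketch: grow frequent sequential patterns by
    recursively projecting the database on each frequent item.

    `db` is a list of sequences (lists of items); patterns are tuples.
    Returns a dict {pattern: support}.
    """
    results = {}

    def mine(prefix, projected):
        # Count each item once per sequence in the projected database.
        counts = defaultdict(int)
        for seq in projected:
            for item in set(seq):
                counts[item] += 1
        for item, support in counts.items():
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project: keep the suffix after the first occurrence of `item`.
            new_projected = [seq[seq.index(item) + 1:]
                             for seq in projected if item in seq]
            mine(pattern, new_projected)

    mine((), db)
    return results

db = [["a", "b", "c"], ["a", "c"], ["a", "b"], ["b", "c"]]
patterns = prefixspan(db, min_support=2)
```

Unlike generation-and-test methods such as GSP, no candidate is ever created that does not actually occur in the (projected) data, which is the key to the pattern-growth methodology the study surveys.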

11.
Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are “overfitting” (allow only for what has actually been observed) while other parts may be “underfitting” (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between “overfitting” and “underfitting”. To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the “theory of regions”, the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.
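The first step (event log to transition system) can be sketched with a simple "last k activities" state abstraction. This is only one illustrative choice among the configurable abstractions; the second, region-based synthesis step is not shown:

```python
def build_transition_system(log, horizon=1):
    """Construct a transition system from an event log, where a state is
    the tuple of the last `horizon` activities seen so far in a trace.

    `log` is a list of traces (lists of activity names). A larger
    `horizon` yields a more specific (more "overfitting") model, a
    smaller one a more general (more "underfitting") model -- this is
    the knob that lets the user control the balance.
    Returns a set of (state, activity, next_state) transitions.
    """
    transitions = set()
    for trace in log:
        state = ()
        for activity in trace:
            next_state = (state + (activity,))[-horizon:]
            transitions.add((state, activity, next_state))
            state = next_state
    return transitions

log = [["register", "check", "pay"],
       ["register", "pay"]]
ts = build_transition_system(log, horizon=1)
```

With horizon=1 the two traces share their first transition, so the system has four transitions; with a large horizon every distinct prefix would become its own state and the model would exactly reproduce the log.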

12.
Longitudinal data refer to the situation where repeated observations are available for each sampled object. Clustered data, where observations are nested in a hierarchical structure within objects (without time necessarily being involved) represent a similar type of situation. Methodologies that take this structure into account allow for the possibilities of systematic differences between objects that are not related to attributes and autocorrelation within objects across time periods. A standard methodology in the statistics literature for this type of data is the mixed effects model, where these differences between objects are represented by so-called “random effects” that are estimated from the data (population-level relationships are termed “fixed effects,” together resulting in a mixed effects model). This paper presents a methodology that combines the structure of mixed effects models for longitudinal and clustered data with the flexibility of tree-based estimation methods. We apply the resulting estimation method, called the RE-EM tree, to pricing in online transactions, showing that the RE-EM tree is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. We also apply it to a smaller data set examining accident fatalities, and show that the RE-EM tree strongly outperforms a tree without random effects while performing comparably to a linear model with random effects. We also perform extensive simulation experiments to show that the estimator improves predictive performance relative to regression trees without random effects and is comparable or superior to using linear models with random effects in more general situations.

13.
One major challenge in the content-based image retrieval (CBIR) and computer vision research is to bridge the so-called “semantic gap” between low-level visual features and high-level semantic concepts, that is, extracting semantic concepts from a large database of images effectively. In this paper, we tackle the problem by mining the decisive feature patterns (DFPs). Intuitively, a decisive feature pattern is a combination of low-level feature values that are unique and significant for describing a semantic concept. Interesting algorithms are developed to mine the decisive feature patterns and construct a rule base to automatically recognize semantic concepts in images. A systematic performance study on large image databases containing many semantic concepts shows that our method is more effective than some previously proposed methods. Importantly, our method can be generally applied to any domain of semantic concepts and low-level features. Wei Wang received his Ph.D. degree in Computing Science and Engineering from the State University of New York (SUNY) at Buffalo in 2004, under Dr. Aidong Zhang's supervision. He received the B.Eng. in Electrical Engineering from Xi'an Jiaotong University, China in 1995 and the M.Eng. in Computer Engineering from National University of Singapore in 2000, respectively. He joined Motorola Inc. in 2004, where he is currently a senior research engineer in Multimedia Research Lab, Motorola Applications Research Center. His research interests can be summarized as developing novel techniques for multimedia data analysis applications. He is particularly interested in multimedia information retrieval, multimedia mining and association, multimedia database systems, multimedia processing and pattern recognition. 
He has published 15 research papers in refereed journals, conferences, and workshops, has served in the organization committees and the program committees of IADIS International Conference e-Society 2005 and 2006, and has been a reviewer for some leading academic journals and conferences. In 2005, his research prototype of “seamless content consumption” was awarded the “most innovative research concept of the year” from the Motorola Applications Research Center. Dr. Aidong Zhang received her Ph.D. degree in computer science from Purdue University, West Lafayette, Indiana, in 1994. She was an assistant professor from 1994 to 1999, an associate professor from 1999 to 2002, and has been a professor since 2002 in the Department of Computer Science and Engineering at the State University of New York at Buffalo. Her research interests include bioinformatics, data mining, multimedia systems, content-based image retrieval, and database systems. She has authored over 150 research publications in these areas. Dr. Zhang's research has been funded by NSF, NIH, NIMA, and Xerox. Dr. Zhang serves on the editorial boards of International Journal of Bioinformatics Research and Applications (IJBRA), ACM Multimedia Systems, the International Journal of Multimedia Tools and Applications, and International Journal of Distributed and Parallel Databases. She was the editor for ACM SIGMOD DiSC (Digital Symposium Collection) from 2001 to 2003. She was co-chair of the technical program committee for ACM Multimedia 2001. She has also served on various conference program committees. Dr. Zhang is a recipient of the National Science Foundation CAREER Award and SUNY Chancellor's Research Recognition Award.

14.
Motivated by a growing need for intelligent housing to accommodate ageing populations, we propose a novel application of intertransaction association rule (IAR) mining to detect anomalous behaviour in smart home occupants. An efficient mining algorithm that avoids the candidate generation bottleneck limiting the application of current IAR mining algorithms on smart home data sets is detailed. An original visual interface for the exploration of new and changing behaviours distilled from discovered patterns using a new process for finding emergent rules is presented. Finally, we discuss our observations on the emergent behaviours detected in the homes of two real world subjects.

15.
Frequent pattern mining (FPM) is an important data mining paradigm to extract informative patterns like itemsets, sequences, trees, and graphs. However, no practical framework for integrating the FPM tasks has been attempted. In this paper, we describe the design and implementation of the Data Mining Template Library (DMTL) for FPM. DMTL utilizes a generic data mining approach, where all aspects of mining are controlled via a set of properties. It uses a novel pattern property hierarchy to define and mine different pattern types. This property hierarchy can be thought of as a systematic characterization of the pattern space, i.e., a meta-pattern specification that allows the analyst to specify new pattern types by extending this hierarchy. Furthermore, in DMTL all aspects of mining are controlled by a set of different mining properties. For example, the kind of mining approach to use, the kind of data types and formats to mine over, and the kind of back-end storage manager to use are all specified as a list of properties. This provides tremendous flexibility to customize the toolkit for various applications. The flexibility of the toolkit is exemplified by the ease with which support for a new pattern can be added. Experiments on synthetic and public datasets are conducted to demonstrate the scalability provided by the persistent back-end in the library. DMTL has been publicly released as open-source software and has been downloaded by numerous researchers from all over the world.

16.
A data warehouse is an important decision support system with cleaned and integrated data for knowledge discovery and data mining systems. In practice, data warehouse mining systems have provided many applicable solutions in industry, yet there are still many problems that cause users difficulty in discovering knowledge, or that even prevent them from obtaining the real and useful knowledge they need. To improve the overall data warehouse mining process, we present an intelligent data warehouse mining approach incorporating schema ontology, schema constraint ontology, domain ontology and user preference ontology. The structures of these ontologies are illustrated, and how they benefit the mining process is demonstrated by examples utilizing rule mining. Finally, we present a prototype multidimensional association mining system which, with intelligent assistance through the support of the ontologies, can help users build useful data mining models, prevent ineffective pattern generation, discover concept-extended rules, and provide an active knowledge re-discovering mechanism.

17.
During restructuring processes due to mergers and acquisitions, banks frequently face the problem of having redundant branches competing in the same market. In this work, we introduce a new Capacitated Branch Restructuring Model which extends the available literature on delocation models. It considers both closing-down and long-term operations' costs, and addresses the problem of resizing open branches in order to maintain a constant service level. We also consider the presence of competitors and allow for ceding market share whenever the restructuring costs are prohibitively high. We test our model in a real-life scenario, obtaining a reduction of about 40% of the network size and annual savings of over 45% in operation costs from the second year on. We finally perform a sensitivity analysis on critical parameters. This analysis shows that the final design of the network depends on certain strategic decisions concerning the redundancy of the branches, as well as their proximity to the demand nodes and to the competitor's branches. At the same time, this design is quite robust to changes in the parameters associated with the adjustments to service capacity and with the market reaction.

18.
The discovery of knowledge through data mining provides a valuable asset for addressing decision-making problems. Although a list of features may characterize a problem, it is often the case that a subset of those features may more strongly influence a certain group of events constituting a sub-problem within the original problem. We propose a divide-and-conquer strategy for data mining that uses both data-based sensitivity analysis, for extracting feature relevance, and expert evaluation, for splitting the problem of characterizing telemarketing contacts to sell bank deposits. As a result, the call direction (inbound/outbound) was considered the most suitable candidate feature. The re-evaluation of the inbound telemarketing sub-problem led to a large increase in targeting performance, confirming the benefits of such an approach, considering the importance of telemarketing for business, in particular bank marketing.

19.
20.
The prediction of bank performance is an important issue. Bad performance of banks may result in bankruptcy, which is in turn expected to influence a country's economy. Since the early 1970s, many researchers have made predictions on such issues. However, until recent years, most of them used traditional statistics to build prediction models. With the vigorous development of data mining techniques, many researchers have begun to apply those techniques to various fields, including performance prediction systems. However, data mining techniques pose the problem of parameter setting. Therefore, this study applies particle swarm optimization (PSO) to obtain suitable parameter settings for a support vector machine (SVM) and a decision tree (DT), and to select a subset of beneficial features, without reducing the classification accuracy rate. To evaluate the proposed approaches, a dataset collected from Taiwanese commercial banks is used as the source data. The experimental results showed that the proposed approaches could obtain better parameter settings, reduce unnecessary features, and significantly improve classification accuracy.
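A minimal PSO sketch of the parameter-search idea; the quadratic objective below stands in for the cross-validated error of an SVM or decision tree as a function of its parameter vector, and the inertia/acceleration coefficients are common defaults, not the study's settings:

```python
import random

def pso_minimize(objective, bounds, n_particles=20, iters=60, seed=0):
    """Minimal particle swarm optimization. Particles fly through the
    parameter space pulled toward their own best and the swarm's best
    positions. `bounds` is a list of (low, high) per dimension."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]                       # inertia
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])  # cognitive
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))    # social
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Stand-in objective with known optimum at (3, -1); in the paper's
# setting this would be a model's cross-validation error over its
# parameter settings (e.g. SVM cost and kernel width).
best, best_val = pso_minimize(lambda p: (p[0] - 3) ** 2 + (p[1] + 1) ** 2,
                              bounds=[(-10, 10), (-10, 10)])
```

Because PSO needs only objective evaluations, not gradients, it suits noisy, non-differentiable targets like cross-validated accuracy, which is why it pairs naturally with SVM and DT tuning.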


Copyright©北京勤云科技发展有限公司  京ICP备09084417号