1.
Sparsity of a classifier is a desirable condition for high-dimensional data and large sample sizes. This paper investigates the two complementary notions of sparsity for binary classification: sparsity in the number of features and sparsity in the number of examples. Several different losses and regularizers are considered: the hinge loss and ramp loss, and ℓ₂, ℓ₁, approximate ℓ₀, and capped ℓ₁ regularization. We propose three new objective functions that further promote sparsity: the capped ℓ₁ regularization with hinge loss, and the ramp loss versions of approximate ℓ₀ and capped ℓ₁ regularization. We derive difference of convex functions algorithms (DCA) for solving these novel non-convex objective functions. The proposed algorithms are shown to converge in a finite number of iterations to a local minimum. Using simulated data and several data sets from the University of California Irvine (UCI) machine learning repository, we empirically investigate the fraction of features and examples required by the different classifiers.
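As a rough illustration of the losses and regularizers named in this abstract, the sketch below uses common textbook definitions of the hinge loss, the ramp loss, and the capped ℓ₁ penalty, and writes the two non-convex ones as a difference of two convex functions, which is the form a DCA works with. The function names, the clipping level of 1 for the ramp loss, and the `cap` parameter are assumptions made for this example; the paper's own parametrization may differ.

```python
def hinge_loss(margin: float) -> float:
    """Hinge loss max(0, 1 - m), where m = y * f(x) is the signed margin."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin: float) -> float:
    """Ramp loss: the hinge loss clipped at 1 (assumed clipping level)."""
    return min(1.0, hinge_loss(margin))


def ramp_loss_dc(margin: float) -> float:
    """The same ramp loss as a difference of two convex functions."""
    return hinge_loss(margin) - max(0.0, -margin)


def capped_l1(weights, cap: float) -> float:
    """Capped l1 penalty: each |w_i| contributes at most `cap`."""
    return sum(min(abs(w), cap) for w in weights)


def capped_l1_dc(weights, cap: float) -> float:
    """The same penalty as (convex l1 norm) minus (convex excess over the cap)."""
    convex_part = sum(abs(w) for w in weights)
    subtracted_part = sum(max(0.0, abs(w) - cap) for w in weights)
    return convex_part - subtracted_part
```

Both decompositions agree with the direct definitions at every point; a DCA would linearize the subtracted convex part at the current iterate and solve the resulting convex problem.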
2.
We offer an efficient approach based on difference of convex functions (DC) optimization for self-organizing maps (SOM). We consider SOM as an optimization problem with a nonsmooth, nonconvex energy function and investigate DC programming and the DC algorithm (DCA), an innovative approach in the nonconvex optimization framework, to solve this problem effectively. Furthermore, an appropriate training version of this algorithm is proposed. The numerical results on many real-world datasets show the efficiency of the proposed DCA-based algorithms in terms of both solution quality and topographic maps.
3.
Ridge logistic regression has been used successfully in text categorization problems, and it has been shown to reach the same performance as the Support Vector Machine, with the main advantage of computing a probability value rather than a score. However, the dense solution of the ridge makes it impractical for large-scale categorization. On the other hand, LASSO regularization is able to produce sparse solutions, but its performance is dominated by the ridge when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method which tries to approach the ridge solution by a sparse solution. The method first computes the ridge solution and then performs feature selection. The experimental evaluations show that our method gives a solution which is a good trade-off between the ridge and LASSO solutions.
4.
We present a methodology for managing outsourcing projects from the vendor's perspective, designed to maximize the value to both the vendor and its clients. The methodology is applicable across the outsourcing lifecycle, providing the capability to select and target new clients, manage the existing client portfolio, and quantify the realized benefits to the client resulting from the outsourcing agreement. Specifically, we develop a statistical analysis framework to model client behavior at each stage of the outsourcing lifecycle, including: (1) a predictive model and tool for white-space client targeting and selection (opportunity identification), (2) a model and tool for client risk assessment and project portfolio management (client tracking), and (3) a systematic analysis of outsourcing results (impact analysis) to gain insights into potential benefits of IT outsourcing as a part of a successful management strategy. Our analysis is formulated in a logistic regression framework, modified to allow for non-linear input–output relationships, auxiliary variables, and small sample sizes. We provide examples to illustrate how the methodology has been successfully implemented for targeting, tracking, and assessing outsourcing clients within the IBM Global Services division.
Scope and purpose: The predominant literature on IT outsourcing often examines various aspects of the vendor–client relationship, strategies for successful outsourcing from the client perspective, and key sources of risk to the client, generally ignoring the risk to the vendor. However, in the rapidly changing market, a significant share of risks and responsibilities falls on the vendor, as outsourcing contracts are often renegotiated, providers replaced, or services brought back in house. With the transformation of outsourcing engagements, the risk on the vendor's side has increased substantially, driving the vendor's financial and business performance and eventually impacting the value delivered to the client. As a result, only well-run vendor firms with robust processes and tools that allow identification and active management of risk at all stages of the outsourcing lifecycle are able to deliver value to the client. This paper presents a framework and methodology for managing a portfolio of outsourcing projects from the vendor's perspective, throughout the entire outsourcing lifecycle. We address three key stages of the outsourcing process: (1) opportunity identification and qualification (i.e., selection of the most likely new clients), (2) client portfolio risk management during engagement and delivery, and (3) quantification of benefits to the client throughout the life of the deal.
5.
Monitoring gene expression profiles is a novel approach to cancer diagnosis. Several studies have shown that sparse logistic regression is a useful classification method for gene expression data. Not only does it give a sparse solution with high accuracy, it also provides the user with explicit probabilities of classification in addition to the class information. However, its optimal extension to more than two classes is not obvious. In this paper, we propose a multiclass extension of sparse logistic regression. Analysis of five publicly available gene expression data sets shows that the proposed method outperforms the standard multinomial logistic model in prediction accuracy as well as gene selectivity.
7.
An omnibus test for a generalized version of the martingale difference hypothesis (MDH) is proposed. This generalized hypothesis includes the usual MDH, testing for constancy of conditional moments such as conditional homoscedasticity (ARCH effects), and testing for directional predictability. A unified approach for dealing with all of these testing problems is proposed. These hypotheses are long-standing problems in econometric time series analysis, and typically have been tested using the sample autocorrelations or, in the spectral domain, using the periodogram. Since these hypotheses also cover nonlinear predictability, tests based on those second-order statistics are inconsistent against uncorrelated processes in the alternative hypothesis. In order to circumvent this problem, pairwise integrated regression functions are introduced as measures of linear and nonlinear dependence. The proposed test does not require choosing a lag order that depends on the sample size, smoothing the data, or formulating a parametric alternative model. Moreover, the test is robust to higher-order dependence, in particular to conditional heteroskedasticity. Under general dependence the asymptotic null distribution depends on the data generating process, so a bootstrap procedure is considered and a Monte Carlo study examines its finite sample performance. Then, the martingale and conditional heteroskedasticity properties of the Pound/Dollar exchange rate are investigated.
8.
This paper presents Visper, a novel object-oriented framework that identifies and enhances common services and programming primitives, and implements a generic set of classes applicable to multiple programming models in a distributed environment. Groups of objects, which can be programmed in a uniform and transparent manner, and agent-based distributed system management are also featured in Visper. A prototype system is designed and implemented in Java, with a number of visual utilities that facilitate program development and portability. As a use case, Visper integrates parallel programming in an MPI-like message-passing paradigm at a high level with services such as checkpointing and fault tolerance at a lower level. The paper reports a range of performance evaluations of the prototype and compares it to related work.
9.
For semiparametric models, one of the key issues is to reduce the predictors’ dimension so that the regression functions can be efficiently estimated based on the low-dimensional projections of the original predictors. Many sufficient dimension reduction methods seek such principal projections by applying an eigen-decomposition to some method-specific candidate matrices. In this paper, we propose a sparse eigen-decomposition strategy that shrinks small sample eigenvalues to zero. Unlike existing methods, the new method can simultaneously estimate the basis directions and the structural dimension of the central (mean) subspace in a data-driven manner. The oracle property of our estimation procedure is also established. Comprehensive simulations and a real data application are reported to illustrate the efficacy of the newly proposed method.
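To make the idea of "shrinking small sample eigenvalues to zero" concrete, here is a minimal sketch that uses plain soft-thresholding as a stand-in. The abstract does not state the paper's actual shrinkage rule, so the rule, the function names, and the `threshold` parameter below are all assumptions for illustration only. The count of eigenvalues that survive the shrinkage plays the role of the estimated structural dimension.

```python
def shrink_eigenvalues(eigenvalues, threshold):
    """Soft-threshold nonnegative eigenvalues (assumed rule, not the paper's)."""
    return [max(0.0, lam - threshold) for lam in eigenvalues]


def structural_dimension(shrunk_eigenvalues):
    """Number of eigenvalues left strictly positive after shrinkage."""
    return sum(1 for lam in shrunk_eigenvalues if lam > 0.0)


# Hypothetical sample eigenvalues of a candidate matrix, largest first.
sample_eigenvalues = [3.2, 1.1, 0.05, 0.01]
shrunk = shrink_eigenvalues(sample_eigenvalues, threshold=0.1)
estimated_dimension = structural_dimension(shrunk)
```

With these hypothetical values, the two small eigenvalues are set exactly to zero, so only the eigenvectors paired with the two surviving eigenvalues would be kept as basis directions.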
10.
This paper provides an extensive review of recent developments in multiple criteria linear programming data mining models. These studies, which include classification and regression methods, are introduced in a systematic way. Some applications of these methods to real-world problems are also covered. This paper is a summary and reference of multiple criteria linear programming methods that might be helpful for researchers and for applications in data mining.
11.
Multivariate adaptive regression splines (MARS) provide a flexible statistical modeling method that employs forward and backward search algorithms to identify the combination of basis functions that best fits the data and simultaneously conducts variable selection. In optimization, MARS has been used successfully to estimate the unknown functions in stochastic dynamic programming (SDP), stochastic programming, and a Markov decision process, and MARS could be potentially useful in many real-world optimization problems where objective (or other) functions need to be estimated from data, such as in surrogate optimization. Many optimization methods depend on convexity, but a non-convex MARS approximation is inherently possible because interaction terms are products of univariate terms. In this paper a convex MARS modeling algorithm is described. In order to ensure MARS convexity, two major modifications are made: (1) coefficients are constrained, such that pairs of basis functions are guaranteed to jointly form convex functions, and (2) the form of interaction terms is altered to eliminate the inherent non-convexity. Finally, MARS convexity follows from the fact that a sum of convex functions is convex. Convex-MARS is applied to inventory forecasting SDP problems with four and nine dimensions and to an air quality ground-level ozone problem.
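The claim that products of univariate terms can break convexity, while nonnegative sums of convex terms cannot, can be checked numerically with a midpoint test. The sketch below uses the usual truncated-linear MARS basis function max(0, ±(x − knot)); the specific knots, coefficients, test points, and function names are assumptions for this example and are not taken from the paper.

```python
def hinge(x, knot, direction=1):
    """Truncated linear basis function max(0, direction * (x - knot))."""
    return max(0.0, direction * (x - knot))


def interaction(x1, x2):
    """Product of two univariate hinge terms (a standard MARS interaction)."""
    return hinge(x1, 0.0) * hinge(x2, 0.0)


def additive(x1, x2):
    """Nonnegative combination of convex hinge terms, hence convex."""
    return 2.0 * hinge(x1, 0.0) + 0.5 * hinge(x2, 0.0, direction=-1)


def midpoint_gap(f, p, q):
    """f(midpoint) minus the average of f(p) and f(q); positive means non-convex."""
    mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
    return f(*mid) - 0.5 * (f(*p) + f(*q))


gap_interaction = midpoint_gap(interaction, (1.0, 0.0), (0.0, 1.0))
gap_additive = midpoint_gap(additive, (1.0, 0.0), (0.0, 1.0))
```

The interaction term is zero at both endpoints but positive at their midpoint, which violates convexity; the additive model shows no such violation at these points.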
12.
In this paper, we present a neural network for solving the nonlinear convex programming problem in real time by means of the projection method. The main idea is to convert the convex programming problem into a variational inequality problem. Then a dynamical system and a convex energy function are constructed for the resulting variational inequality problem. It is shown that the proposed neural network is stable in the sense of Lyapunov and can converge to an exact optimal solution of the original problem. Compared with the existing neural networks for solving the nonlinear convex programming problem, the proposed neural network requires no Lipschitz condition, has no adjustable parameter, and has a simple structure. The validity and transient behavior of the proposed neural network are demonstrated by some simulation results.
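For readers unfamiliar with projection-type networks, the following is a generic sketch of a projection dynamical system, dx/dt = P(x − ∇f(x)) − x, integrated with forward Euler steps on a one-dimensional box-constrained problem. It is not the network proposed in the paper: the Euler discretization, the step size, the test problem, and all names are assumptions chosen to keep the example small.

```python
def project_box(x, low, high):
    """Projection of a scalar onto the interval [low, high]."""
    return min(high, max(low, x))


def projection_dynamics(grad, x0, low, high, step=0.1, iters=500):
    """Forward-Euler integration of dx/dt = P(x - grad(x)) - x."""
    x = x0
    for _ in range(iters):
        x += step * (project_box(x - grad(x), low, high) - x)
    return x


# Minimize (x - 2)^2 subject to 0 <= x <= 1; the constrained optimum is x = 1.
x_boundary = projection_dynamics(lambda x: 2.0 * (x - 2.0), 0.0, 0.0, 1.0)

# Minimize (x - 0.5)^2 on the same interval; the optimum is interior, x = 0.5.
x_interior = projection_dynamics(lambda x: 2.0 * (x - 0.5), 0.0, 0.0, 1.0)
```

An equilibrium of these dynamics satisfies x = P(x − ∇f(x)), which is the fixed-point form of the variational inequality that characterizes the constrained minimizer.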
13.
In this paper we propose a long-step logarithmic barrier function method for convex quadratic programming with linear equality constraints. After a reduction of the barrier parameter, a series of long steps along projected Newton directions are taken until the iterate is in the vicinity of the center associated with the current value of the barrier parameter. We prove that the total number of iterations is O(√n L) or O(nL), depending on how the barrier parameter is updated. On leave from Eötvös University, Budapest, and partially supported by OTKA 2116.
14.
Pattern Analysis and Applications - Incomplete data are often neglected when designing machine learning methods. A popular strategy adopted by practitioners to circumvent this consists of taking a...
16.
We propose a new framework for the syntax and semantics of Weak Hereditarily Harrop logic programming with constraints, based on resolution over τ-categories: finite product categories with canonical structure. Constraint information is directly built into the notion of signature via categorical syntax. Many-sorted equational are a special case of the formalism which combines features of uniform logic programming languages (modules and hypothetical implication) with those of constraint logic programming. Using the canonical structure supplied by τ-categories, we define a diagrammatic generalization of formulas, goals, programs and resolution proofs up to equality (rather than just up to isomorphism). We extend the Kowalski-van Emden fixed point interpretation, a cornerstone of declarative semantics, to an operational, non-ground, categorical semantics based on indexing over sorts and programs. We also introduce a topos-theoretic declarative semantics and show soundness and completeness of resolution proofs and of a sequent calculus over the categorical signature. We conclude with a discussion of semantic perspectives on uniform logic programming.
17.
Lawry's label semantics for modeling and computing with linguistic information in natural language provides a clear interpretation of linguistic expressions and thus a transparent model for real‐world applications. Meanwhile, annotated logic programs (ALPs) and their fuzzy extension (AFLPs) have been developed as an extension of classical logic programs, offering a powerful computational framework for handling uncertain and imprecise data within logic programs. This paper proposes annotated linguistic logic programs (ALLPs) that embed Lawry's label semantics into the ALP/AFLP syntax, providing a linguistic logic programming formalism for the development of automated reasoning systems involving soft data as vague and imprecise concepts occurring frequently in natural language. The syntax of ALLPs is introduced, and their declarative semantics is studied. The ALLP SLD‐style proof procedure is then defined and proved to be sound and complete with respect to the declarative semantics of ALLPs. © 2010 Wiley Periodicals, Inc.
19.
Applied Intelligence - Feature selection on a network structure can not only discover interesting variables but also mine out their intricate interactions. Regularization is often employed to...