Similar Documents
20 similar documents found (search time: 15 ms)
1.
The point-feature cartographic label placement problem (PFCLP) is an NP-hard problem arising in the design of maps and other graphic objects. For the sake of better map legibility, it is important to avoid overlaps in the labeling process. This paper examines the PFCLP in the legibility context and proposes a dispersion approach to the problem. The idea is that when all points must be labeled and overlaps are inevitable, the map is more readable if overlapping labels are placed farther from each other. The PFCLP is modeled as a dispersion problem through two mathematical formulations based on binary integer linear programming. Computational tests provided good results on several generated instances.
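A minimal brute-force sketch of the dispersion idea, assuming tiny instances: each point has a few candidate label positions, and the placement maximizing the minimum pairwise label distance is kept. The binary integer linear programming formulations from the paper are not reproduced here.

```python
# Illustrative only: pick one candidate label position per point so that the
# minimum distance between any two chosen labels is as large as possible.
import itertools
import math

def min_pairwise_distance(placement):
    """Smallest distance between any two chosen label positions."""
    return min(math.dist(a, b) for a, b in itertools.combinations(placement, 2))

def best_dispersion(candidates):
    """Brute-force over one candidate position per point (tiny instances only)."""
    best, best_score = None, -1.0
    for placement in itertools.product(*candidates):
        score = min_pairwise_distance(placement)
        if score > best_score:
            best, best_score = placement, score
    return best, best_score

# Two points, each with two candidate label anchor positions (x, y).
candidates = [[(0, 0), (0, 1)], [(0.2, 0.1), (1, 1)]]
print(best_dispersion(candidates))  # keeps the most mutually distant placement
```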

2.
We present a synchronized routing and scheduling problem that arises in the forest industry as a variation of the log-truck scheduling problem. It combines the routing and scheduling of trucks with specific constraints related to the Canadian forestry context, including pick-up and delivery, multiple products, inventory stock, and multiple supply and demand points. We developed a decomposition approach that solves the weekly problem in two phases. In the first phase, we use a MIP solver on a tactical model that determines the destinations of full truckloads from forest areas to woodmills. In the second phase, we route and schedule the daily transportation of logs with one of two methods: a constraint-based local search approach, or a hybrid approach combining a constraint programming model with a constraint-based local search model. These approaches were implemented using COMET2.0. The method was tested on two industrial cases from forest companies in Canada.
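A hedged orchestration sketch of the two-phase decomposition; `solve_tactical_mip` and `route_and_schedule` are hypothetical placeholders for the tactical MIP and the daily CP/local-search components, not COMET code.

```python
# Sketch of the weekly two-phase structure: phase 1 fixes truckload
# destinations for the week, phase 2 routes and schedules each day.
def solve_week(instance, solve_tactical_mip, route_and_schedule, days):
    loads = solve_tactical_mip(instance)  # phase 1: forest-to-mill truckloads
    return {day: route_and_schedule([l for l in loads if l["day"] == day])
            for day in days}              # phase 2: daily routing/scheduling
```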

3.
The discovery of knowledge through data mining provides a valuable asset for addressing decision-making problems. Although a list of features may characterize a problem, a subset of those features often has greater influence on a certain group of events constituting a sub-problem within the original problem. We propose a divide-and-conquer strategy for data mining that uses data-based sensitivity analysis to extract feature relevance, together with expert evaluation, to split the problem of characterizing telemarketing contacts for selling bank deposits. As a result, the call direction (inbound/outbound) was considered the most suitable candidate feature. Re-evaluating the inbound telemarketing sub-problem led to a large increase in targeting performance, confirming the benefits of the approach given the importance of telemarketing for business, in particular bank marketing.
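A hedged sketch of the divide-and-conquer split, assuming a pandas DataFrame with a hypothetical `direction` column; one model is trained per sub-problem. Illustrative only, not the authors' pipeline.

```python
# Split the data on one influential feature (call direction) and fit a
# separate classifier to each resulting sub-problem.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def fit_per_direction(df: pd.DataFrame, feature_cols, target_col="subscribed"):
    """Train one classifier per value of the (hypothetical) 'direction' column."""
    models = {}
    for direction, subset in df.groupby("direction"):  # 'inbound' / 'outbound'
        model = DecisionTreeClassifier(max_depth=4)
        model.fit(subset[feature_cols], subset[target_col])
        models[direction] = model
    return models
```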

4.
Algorithms for feature selection in predictive data mining for classification problems attempt to select those features that are relevant and not redundant for the classification task. A relevant feature is defined as one that is highly correlated with the target function. One problem with this definition is that there is no universally accepted notion of what it means for a feature to be 'highly correlated with the target function' or 'highly correlated with the other features'. This paper proposes a new feature selection algorithm that incorporates domain-specific definitions of high, medium, and low correlations. The proposed algorithm conducts a heuristic search for the features most relevant to the prediction task.
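A minimal sketch of correlation-thresholded feature selection; the `high` and `medium` cut-offs stand in for the paper's domain-specific definitions and are assumed values.

```python
# Keep features sufficiently correlated with the target (relevance) and
# greedily drop features too correlated with one already kept (redundancy).
import pandas as pd

def select_features(df: pd.DataFrame, target: str, high=0.7, medium=0.4):
    corr = df.corr().abs()
    relevant = [c for c in df.columns
                if c != target and corr.loc[c, target] >= medium]
    kept = []
    for c in sorted(relevant, key=lambda c: corr.loc[c, target], reverse=True):
        if all(corr.loc[c, k] < high for k in kept):
            kept.append(c)
    return kept
```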

5.
Given a graph with its vertex set partitioned into groups, nonnegative costs on its edges, and nonnegative prizes on its vertices, the prize-collecting generalized minimum spanning tree problem consists in finding a subtree of this graph that spans exactly one vertex of each group and minimizes the sum of the edge costs of the tree less the prizes of the selected vertices. It is a generalization of the NP-hard generalized minimum spanning tree optimization problem. We propose a GRASP (greedy randomized adaptive search procedure) heuristic for its approximate solution, incorporating path-relinking for search intensification and a restart strategy for search diversification. Hybridizing this GRASP with path-relinking and restarts with a data mining strategy, applied along the GRASP iterations once the elite set has been modified and becomes stable, makes the heuristic more robust. The computational experiments show that the heuristic developed in this work found very good solutions for test problems with up to 439 vertices. All input data for the test instances and detailed numerical results are available from Mendeley Data.
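A generic GRASP-with-path-relinking skeleton; `construct`, `local_search`, `path_relink`, and `cost` are problem-specific placeholders, and the data mining component is omitted.

```python
# Skeleton of GRASP with path-relinking against an elite set of solutions.
import random

def grasp(construct, local_search, path_relink, cost, iters=100, elite_size=10):
    elite, best = [], None
    for _ in range(iters):
        s = local_search(construct(alpha=random.uniform(0.1, 0.9)))
        if elite:
            s = path_relink(s, random.choice(elite))  # intensification
        if best is None or cost(s) < cost(best):
            best = s
        elite = sorted(elite + [s], key=cost)[:elite_size]
    return best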

6.
Clustering problems arise in several areas of science, and approaches and algorithms are as varied as the applications. From a graph theory perspective, a clustering can be generated by partitioning an input graph into a vertex-disjoint union of cliques (clusters) through the addition and deletion of edges. Finding the minimum number of edge additions and deletions required to cluster data that can be represented as a graph is a well-known problem in combinatorial optimization, often referred to as the cluster editing problem. However, many real-world clustering applications are characterized by overlapping, that is, non-disjoint, clusters, to which cluster editing cannot be applied. Literature on relaxations of cluster editing that allow overlapping clusters is scarce. In this work, we propose the overlapping cluster editing problem, a variation of cluster editing whose goal is to partition a graph, also by editing edges, into maximal cliques that are not necessarily disjoint. We also present three slightly different versions of a hybrid heuristic to solve this problem. Each version couples two metaheuristics that together generate a set of clusters with one of three mixed-integer linear programming models, also introduced in this paper, that take these clusters as input. The metaheuristics limit the solution space explored when solving the models, thereby reducing computational time. Test results show that all proposed hybrid heuristic versions generate good-quality overlapping cluster editing solutions. In particular, one version achieved, at a low computational cost, the best results in 51 of 112 randomly generated graphs. Although the other two versions have models that are harder to solve, they obtained reasonable results on medium-sized randomly generated graphs. The hybrid heuristic also achieved good results identifying labeled overlapping clusters in a supervised data set experiment. Furthermore, we show that, under our new problem definition, clustering a vertex in more than one cluster can reduce the edge-editing cost.
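A toy illustration of the edge-editing cost when clusters may overlap: edges missing inside a cluster must be added, and edges covered by no cluster must be deleted. Plain Python sets are used; this is not the paper's MILP models.

```python
# Editing cost of turning each cluster into a clique while deleting edges
# that no cluster covers. Overlapping clusters are allowed.
from itertools import combinations

def editing_cost(edges, clusters):
    edges = {frozenset(e) for e in edges}
    covered = set()
    for cluster in clusters:
        covered |= {frozenset(p) for p in combinations(cluster, 2)}
    additions = len(covered - edges)  # edges to add so every cluster is a clique
    deletions = len(edges - covered)  # edges inside no cluster must be deleted
    return additions + deletions

edges = {(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)}
print(editing_cost(edges, [{1, 2, 3}, {2, 3, 4}]))  # 0: overlap covers all edges
print(editing_cost(edges, [{1, 2, 3}, {4}]))        # 2: disjoint clusters must delete
```

The example shows the claim from the abstract: letting vertices 2 and 3 belong to both clusters removes the need for any edits, whereas the best disjoint partition pays for two deletions.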

7.
This paper proposes the application of association rule mining to improve quizzes and courses. First, the paper shows how to preprocess quiz data and create several data matrices for use in the knowledge discovery process. Next, the proposed algorithm, which uses grammar-guided genetic programming, is described and compared with both classical and recent soft-computing association rule mining algorithms. Then, different objective and subjective rule evaluation measures are used to select the most interesting and useful rules. Experiments were carried out using real data from university students enrolled in a Moodle artificial intelligence practice course on the CLIPS programming language. Some examples of these rules are shown, together with the feedback they provide to instructors making decisions about how to improve quizzes and courses. Finally, starting from the information provided by the rules, the CLIPS quiz and course were updated. These changes were evaluated by comparing the performance achieved by students before and after applying them, using one control group and two different experimental groups.
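A toy, apriori-style counting sketch over invented quiz "transactions" (sets of failed questions), extracting rules A -> B with minimum support and confidence; the paper's grammar-guided genetic programming algorithm is not shown.

```python
# Generic support/confidence rule extraction over small transaction sets.
from itertools import permutations

def rules(transactions, min_support=0.3, min_confidence=0.7):
    n = len(transactions)
    def support(itemset):
        return sum(itemset <= t for t in transactions) / n
    items = {i for t in transactions for i in t}
    out = []
    for a, b in permutations(sorted(items), 2):
        s = support({a, b})
        if s >= min_support and s / support({a}) >= min_confidence:
            out.append((f"{a} -> {b}", s))
    return out

quizzes = [{"q1", "q3"}, {"q1", "q3"}, {"q1"}, {"q2"}]
print(rules(quizzes))  # [('q3 -> q1', 0.5)]: students failing q3 also fail q1
```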

8.
We consider the problem of finding k bipartite subgraphs, called "clusters," in a bipartite graph G = (S ∪ T, E), such that each vertex i of S appears in exactly one of the subgraphs, every vertex j of T appears in each cluster in which at least one of its neighbors appears, and the total number of edges needed to complete each cluster (i.e., to become a biclique) is minimized. This problem has been shown to be strongly NP-hard and is known as the k-clustering minimum biclique completion problem. It has been applied to bundling channels for multicast transmissions: given a set of demands of services from clients, the application consists of finding k multicast sessions that partition the set of demands, where each service must belong to a single multicast session while each client can appear in more than one session. We extend previous work by developing a branch-and-price algorithm that embeds a new metaheuristic, based on variable neighborhood infeasible search, and a branching rule that exploits the problem structure. The metaheuristic can also efficiently solve the pricing subproblem. In addition to the random instances used in the literature, we present structured instances generated using the MovieLens dataset collected by the GroupLens Research Project. Extensive computational results show that our branch-and-price algorithm outperforms the approaches proposed in the literature.
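A small sketch of the completion-cost objective, assuming the bipartite graph is given as an adjacency map from S-vertices to their T-neighbors.

```python
# For each cluster of S-vertices, every T-vertex adjacent to the cluster joins
# it, and we count the edges missing to make the cluster a biclique.
def completion_cost(adj, clusters):
    """adj: dict s -> set of T-neighbors; clusters: list of sets of S-vertices."""
    total = 0
    for cluster in clusters:
        t_side = set().union(*(adj[s] for s in cluster))
        present = sum(len(adj[s] & t_side) for s in cluster)
        total += len(cluster) * len(t_side) - present  # edges to complete
    return total

adj = {1: {"a", "b"}, 2: {"b"}, 3: {"c"}}
print(completion_cost(adj, [{1, 2}, {3}]))  # 1: cluster {1,2} misses edge (2,'a')
```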

9.
Automatic test data generation is a very popular domain in the field of search-based software engineering. Traditionally, the main goal has been to maximize coverage; however, other objectives can be defined, such as the oracle cost, that is, the cost of executing the entire test suite and checking the system behavior. In very large software systems the cost of testing can be an issue, so it makes sense to consider two conflicting objectives: maximizing the coverage and minimizing the oracle cost. That is what we do in this paper. We mainly compare two approaches to the multi-objective test data generation problem: a direct multi-objective approach and a combination of a mono-objective algorithm with multi-objective test case selection. Concretely, we use four state-of-the-art multi-objective algorithms and two mono-objective evolutionary algorithms followed by a multi-objective test case selection based on Pareto efficiency. The experimental analysis compares these techniques on two benchmarks: the first is composed of 800 Java programs created through a program generator, and the second of 13 real programs extracted from the literature. In the direct multi-objective approach, the results indicate that the oracle cost can be properly optimized; however, full branch coverage of the system poses a great challenge. The mono-objective algorithms, although they need a second phase of test case selection to reduce the oracle cost, are very effective in maximizing branch coverage. Copyright © 2011 John Wiley & Sons, Ltd.
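A minimal sketch of the Pareto-efficiency filter behind the second-phase test case selection, over the two objectives (coverage to maximize, oracle cost to minimize); the values are invented.

```python
# Keep only non-dominated (coverage, oracle_cost) pairs.
def pareto_front(suites):
    """suites: list of (coverage, oracle_cost) pairs."""
    front = []
    for cov, cost in suites:
        dominated = any(c >= cov and k <= cost and (c, k) != (cov, cost)
                        for c, k in suites)
        if not dominated:
            front.append((cov, cost))
    return front

print(pareto_front([(0.9, 10), (0.9, 12), (0.7, 5), (0.6, 6)]))
# -> [(0.9, 10), (0.7, 5)]: the other suites are dominated on both objectives
```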

10.
Several classes of scientific and commercial applications require the execution of a large number of independent tasks. One highly successful and low-cost mechanism for acquiring the necessary computing power for these applications is the 'public-resource computing', or 'desktop Grid', paradigm, which exploits the computational power of private computers. So far, this paradigm has not been applied to data mining applications for two main reasons. First, it is not straightforward to decompose a data mining algorithm into truly independent sub-tasks. Second, the large volume of the data involved makes it difficult to handle the communication costs of a parallel paradigm. This paper introduces a general framework for distributed data mining applications called Mining@home. In particular, we focus on one of the main data mining problems: the extraction of closed frequent itemsets from transactional databases. We show that it is possible to decompose this problem into independent tasks, which nevertheless need to share a large volume of data. We therefore introduce a data-intensive computing network, which adopts a P2P topology based on super peers with caching capabilities, to support the dissemination of large amounts of information. Finally, we evaluate the execution of a pattern extraction task on such a network. Copyright © 2009 John Wiley & Sons, Ltd.
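A toy sketch of the task decomposition only: one independent mining job per starting item, so jobs can be dispatched to separate volunteer machines. The P2P super-peer caching network is not modeled here.

```python
# Split frequent-itemset mining into independent jobs by prefix item;
# each job explores only the itemsets extending its own prefix.
def make_jobs(items):
    """One independent job per prefix item."""
    return [{"prefix": (it,), "explore_items": items[i + 1:]}
            for i, it in enumerate(items)]

print(make_jobs(["a", "b", "c"]))
# [{'prefix': ('a',), 'explore_items': ['b', 'c']},
#  {'prefix': ('b',), 'explore_items': ['c']},
#  {'prefix': ('c',), 'explore_items': []}]
```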

11.
Breast cancer is a significant cancer type among women. Early diagnosis and timely medical treatment are key to preventing spread to other tissues and permitting long-term survival of patients. Since existing methods have several serious shortfalls, microwave imaging for the diagnosis of early-stage tumors has attracted interest from different scientific research groups; it exploits the endogenous difference in electrical properties between healthy tissue and malignancies. In this article, an ultra-wideband bow-tie antenna with enhanced bandwidth and a 3D breast model with differing electrical properties (permittivity and conductivity) are created in a simulation tool to solve for the electromagnetic field values. Return loss, VSWR, and radiation pattern characteristics, which are significant antenna parameters, are simulated to assess whether the antenna possesses efficient characteristics. Electric field values over breast tissue with and without a tumor are evaluated. The frequency bandwidth, the dielectric constant of the antenna's substrate, the electric field values, and the tumor information together constitute the dataset. This dataset, obtained from the bow-tie antenna, was used to detect breast cancer with a data mining method, the k-nearest neighbor algorithm.
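A hedged sketch of the final classification step using scikit-learn's k-nearest-neighbor classifier; the feature names and values are invented placeholders for the antenna-derived dataset described above.

```python
# k-NN over simulated antenna features; labels mark tumor presence.
from sklearn.neighbors import KNeighborsClassifier

# Each row: [frequency_bandwidth, substrate_permittivity, electric_field]
X = [[2.1, 4.4, 0.80], [2.0, 4.4, 0.78], [2.3, 4.4, 0.95], [2.2, 4.4, 0.93]]
y = [0, 0, 1, 1]  # 0 = no tumor, 1 = tumor

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[2.25, 4.4, 0.94]]))  # -> [1]: nearest neighbors are tumor cases
```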

12.
Modified sequential k-means clustering concerns a k-means clustering problem in which the clustering machine additionally utilizes output similarity. While conventional clustering methods commonly recognize similar instances at the feature level, modified sequential clustering takes advantage of the response, too. The approach we pursue is to enhance the quality of clustering by using this extra information, which enables the clustering machine to detect more patterns and dependencies that may be relevant. This allows one to determine, for instance, which fashion products exhibit similar behaviour in terms of sales. Conventional clustering methods cannot tackle such cases, because they handle attributes solely at the feature level without considering any response. In this study, we introduce a novel approach based on minimum conditional entropy clustering and show its advantages for data analytics. In particular, we achieve this by modifying the conventional sequential k-means algorithm so that the clustering reflects the response effect in a consistent manner. To verify the feasibility and performance of this approach, we conducted several experiments on real data from the apparel industry.
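A sketch of the modification as an online k-means whose distance blends feature similarity with response (e.g., sales) similarity; the mixing weight `lam` is an assumed parameter, not from the paper.

```python
# Sequential (online) k-means with a response-aware assignment distance.
import math

def assign(x, y, centers, lam=0.5):
    """centers: list of (feature_center, response_center, count)."""
    def dist(c):
        fx, fy, _ = c
        return (1 - lam) * math.dist(x, fx) + lam * abs(y - fy)
    return min(range(len(centers)), key=lambda i: dist(centers[i]))

def update(x, y, centers, i):
    fx, fy, n = centers[i]
    n += 1  # incremental mean: move the winning center toward the new point
    fx = [a + (b - a) / n for a, b in zip(fx, x)]
    centers[i] = (fx, fy + (y - fy) / n, n)

centers = [([0.0, 0.0], 0.0, 1), ([5.0, 5.0], 10.0, 1)]
i = assign([4.5, 5.2], 9.0, centers)
update([4.5, 5.2], 9.0, centers, i)
print(i, centers[i])  # the second center moves toward the new observation
```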

13.
This paper focuses on iterated local search heuristics for the maximum cut-clique (MCC, or clique neighborhood) problem. Given an undirected graph G = (V,E) and a clique C of G, the cut-clique is the set of edges running between C and V\C, establishing the cut (C, V\C). The MCC problem in G is to find a clique with the largest number of edges in its neighborhood, also known as the maximum edge-neighborhood clique. This problem was recently introduced in the literature together with a number of applications, namely in cell biology instances, but it has so far been addressed only by exact methods. In this paper, we introduce the first approximate algorithms for tackling the MCC problem, compare the results with the exact methodologies, and explore a new application in marketing analysis, which provides an alternative perspective for mining market basket problems.
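A small sketch of the MCC objective using networkx; an ILS move would compare this value across candidate cliques. Assumes networkx is available.

```python
# Score a clique C by the number of edges running between C and V \ C.
import networkx as nx

def cut_clique_value(G: nx.Graph, C: set) -> int:
    """Edges with exactly one endpoint inside the clique C."""
    return sum(1 for u, v in G.edges if (u in C) != (v in C))

G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (2, 5)])
print(cut_clique_value(G, {1, 2, 3}))  # edges (3,4) and (2,5) -> 2
```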

14.
15.
Social online learning environments provide new recommendation opportunities to meet users' needs. However, current educational recommender systems do not usually take advantage of these opportunities. To make progress on this issue, we have proposed a knowledge engineering approach based on human-computer interaction (i.e., user-centred design as defined by the standard ISO 9241-210:2010) and artificial intelligence techniques (i.e., data mining) that involves educators in the process of eliciting educationally oriented recommendations. This approach differs from most recommenders in education in focusing on identifying relevant actions to be recommended on e-learning services from a user-centric perspective, thus widening the range of recommendation types. It has been used to identify 32 recommendations covering several types of actions, which focus on promoting the active participation of learners and on strengthening the sharing of experiences among peers through the social services provided by the learning environment. The paper describes where data mining techniques have been applied to complement the user-centred design methods in producing socially oriented recommendations in online learning environments.

16.
In this paper, the waste collection problem (WCP) of a city in the south of Spain is addressed as a multiobjective routing problem with three objectives. From the company's perspective, minimizing the travel cost and the total number of vehicles is desired; from the employees' point of view, a set of balanced routes is also sought. Four variants of a multiobjective hybrid algorithm are proposed, each combining a GRASP (greedy randomized adaptive search procedure) with a VND (variable neighborhood descent). The best GRASP-VND algorithm found is then applied to solve the real-world WCP.
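A generic VND skeleton of the descent component combined with GRASP in the paper; the `neighborhoods` entries are placeholder improvement operators returning the best neighbor of a solution, or `None` when no improvement exists.

```python
# Variable neighborhood descent: restart from the first neighborhood on
# improvement, otherwise move on to the next (usually larger) one.
def vnd(solution, neighborhoods, cost):
    k = 0
    while k < len(neighborhoods):
        improved = neighborhoods[k](solution)
        if improved is not None and cost(improved) < cost(solution):
            solution, k = improved, 0
        else:
            k += 1
    return solution
```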

17.
The discrete unit commitment problem with min-stop ramping constraints optimizes the daily production of thermal power plants, subject to an operational reactivity of thermal units within a 30-minute delay. Previous mixed integer programming (MIP) formulations aimed at an exact optimization approach. This paper derives matheuristics to face the short time limit imposed by the operational constraints. Continuous relaxations guide the search for feasible solutions, exploiting tailored variable fixing strategies, and parallel matheuristics are derived by considering complementary strategies in parallel. Tests were performed on more than 600 real-life instances. Our parallel matheuristic provides high-quality solutions and outperforms the MIP approach within the time limits imposed by the industrial application. This paper illustrates the special interest of matheuristics for highly constrained industrial problems: many tailored neighborhood searches can be derived from an MIP formulation, and their combination in a parallel scheme improves both solution quality and the consistency of the heuristic.
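A hedged sketch of the variable-fixing pattern the abstract describes; `solve_relaxation` and `solve_mip` are hypothetical solver wrappers supplied by the caller, not a real API.

```python
# Solve a continuous relaxation, fix binaries whose relaxed values are near
# 0 or 1, then re-solve the much smaller MIP over the remaining variables.
def fix_and_solve(model, solve_relaxation, solve_mip, tol=0.05):
    relaxed = solve_relaxation(model)          # {var_name: value in [0, 1]}
    fixed = {v: round(x) for v, x in relaxed.items()
             if x <= tol or x >= 1 - tol}      # trust near-integral values
    return solve_mip(model, fixed)             # reduced search space
```

Running several such strategies (e.g., different `tol` values or fixing orders) in parallel and keeping the best feasible solution is, in spirit, the parallel scheme the abstract refers to.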

18.
The assignment and scheduling problem is inherently multiobjective: it generally involves multiple conflicting objectives and large, highly complex search spaces. The problem consists of determining an efficient allocation of a set of limited, shared resources to perform tasks, and an efficient arrangement of those tasks over time, while fulfilling spatiotemporal constraints. The main objective is to minimize the project makespan as well as the total cost. Finding a good approximation set requires trade-offs between diversity of solutions and convergence toward the Pareto-optimal front, a balance that is difficult to achieve for NP-hard problems. In this respect, and in order to explore the search space efficiently, a hybrid bidirectional ant-based approach is proposed in this paper, improving on a bi-colony ant-based approach. Its main characteristic is that it combines a solution construction scheme developed for a more complicated problem with a Pareto-guided local search engine.

19.
The integration of data from various electronic health record (EHR) systems presents a critical conflict in sharing and exchanging patient information across a diverse group of health-oriented organizations. Patient health records in each system are annotated with ontologies based on different health-care standards, creating ontology conflicts at both the schema and data levels. In this study, we introduce the concept of semantic ontology mapping to facilitate interoperability among heterogeneous EHR systems. The approach detects and resolves the data-level conflicts that commonly arise in the ontology mapping process. We have extended the semantic bridge ontology to support ontology mapping at the data level and generated the required mapping rules to reconcile data from different ontological sources into a canonical format. As a result, linked patient data are generated and made available in a semantic query engine to facilitate user queries of patient data across heterogeneous EHR systems.
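A toy sketch of data-level mapping rules reconciling two coding vocabularies into a canonical form; the rule table and codes are invented examples, not the paper's semantic bridge ontology.

```python
# Map (scheme, code) pairs from heterogeneous sources to one canonical term.
RULES = {
    ("ICD-10", "I10"): ("canonical", "hypertension"),
    ("SNOMED", "38341003"): ("canonical", "hypertension"),
}

def to_canonical(record):
    key = (record["scheme"], record["code"])
    record["scheme"], record["code"] = RULES.get(key, key)  # pass through if unmapped
    return record

print(to_canonical({"scheme": "SNOMED", "code": "38341003"}))
# {'scheme': 'canonical', 'code': 'hypertension'}
```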

20.
Iterated local search (ILS) is a powerful framework for developing efficient algorithms for the permutation flow-shop problem (PFSP). These algorithms are relatively simple to implement and use very few parameters, which facilitates the associated fine-tuning process; they therefore constitute an attractive solution for real-life applications. In this paper, we discuss parallelization, parametrization, and randomization issues related to ILS-based algorithms for solving the PFSP. In particular, the following research questions are analyzed: (a) Is it possible to simplify the parameter setting in an ILS framework even further without affecting performance? (b) How do parallelized versions of these algorithms behave as we simultaneously vary the number of different runs and the computation time? (c) For a parallelized version of these algorithms, is it worthwhile to randomize the initial solution so that different starting points are considered? (d) Are these algorithms affected by the use of a "good-quality" pseudorandom number generator? We introduce the new ILS-ESP (efficient, simple, and parallelizable) algorithm, which is specifically designed to take advantage of parallel computing, allowing us to obtain competitive results in "real time" for all tested instances. The ILS-ESP also uses "natural" parameters, which simplifies the calibration process. An extensive set of computational experiments has been carried out to answer the aforementioned research questions.
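A minimal sketch of the setup behind questions (b) and (c): launch several independent ILS runs with different seeds in parallel and keep the best result. `ils_run` is a stand-in for a single ILS-ESP-style run; only the orchestration is shown.

```python
# Run independent, differently seeded ILS instances in parallel processes.
from multiprocessing import Pool
import random

def ils_run(seed):
    random.seed(seed)
    # ... construct / perturb / local-search loop would go here ...
    return random.random()  # stand-in for the makespan found by this run

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        results = pool.map(ils_run, range(8))  # one seed per parallel run
    print(min(results))  # best makespan across independent runs
```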
