The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: a user, an item (e.g., a photo in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications for helping users search for desired information. In this paper, we develop a dual mining framework to explore tagging behavior. This framework is centered around two opposing measures, similarity and diversity, applied to one or more tagging components, and therefore enables a wide range of analysis scenarios, such as characterizing similar users tagging diverse items with similar tags or diverse users tagging similar items with diverse tags. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and that they are NP-Complete in general. We design four sets of efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.
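As a small illustration of the two opposing measures, the sketch below scores a group of tag sets by mean pairwise Jaccard similarity and takes diversity as its complement. The abstract does not say which concrete measures the authors adopt, so the choice of Jaccard and the function names `jaccard`, `group_similarity`, and `group_diversity` are assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two tag sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_similarity(tag_sets: list) -> float:
    """Mean pairwise Jaccard similarity over a group of tag sets."""
    pairs = list(combinations(tag_sets, 2))
    if not pairs:
        # A group with fewer than two members is treated as fully similar.
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def group_diversity(tag_sets: list) -> float:
    """Diversity defined here as the complement of group similarity (an assumption)."""
    return 1.0 - group_similarity(tag_sets)
```

The same two functions could be applied to sets of users or items instead of tags, which is how a single pair of measures can cover several analysis scenarios.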
Buoyancy-driven convection in a square cavity induced by two mutually orthogonal, arbitrarily placed heated thin plates is studied numerically under isothermal and isoflux boundary conditions. The flow is assumed to be two-dimensional. The coupled governing equations were solved by the finite difference method using the Alternating Direction Implicit technique and the Successive Over Relaxation method. The steady state results are depicted in terms of streamline and isotherm plots. It is found that the resulting convection pattern is stronger for the isothermal boundary condition. A better overall heat transfer can be achieved by placing one of the plates far away from the center of the cavity for the isothermal boundary condition and near the center of the cavity for the isoflux boundary condition.
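To make the Successive Over Relaxation step concrete, here is a minimal sketch that applies SOR to a much simpler stand-in problem: Laplace's equation on a square grid with one edge held at 1 and the others at 0. It is not the authors' coupled convection solver; the grid size, relaxation factor `omega`, tolerance, and the function name `sor_laplace` are all assumptions.

```python
def sor_laplace(n=21, omega=1.5, tol=1e-8, max_iter=10_000):
    """SOR for Laplace's equation on an n x n grid.

    Boundary values: 1.0 on the top edge, 0.0 on the other three edges.
    Returns the grid and the number of sweeps performed.
    """
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = 1.0  # top edge held at 1

    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Gauss-Seidel value from the 5-point stencil
                gs = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
                # Over-relax: blend the old value with the Gauss-Seidel update
                new = (1.0 - omega) * u[i][j] + omega * gs
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
        if max_change < tol:
            return u, sweep
    return u, max_iter
```

With `omega = 1` the loop reduces to plain Gauss-Seidel; values between 1 and 2 typically converge in fewer sweeps on this kind of problem.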
For a long time, legal entities have developed and used crime prediction methodologies. The techniques are frequently updated based on crime evaluations and responses from scientific communities. There is a need to develop type-based crime prediction methodologies that can be used to address issues at the subgroup level. Child maltreatment is not adequately addressed because children are voiceless. As a result, this study investigated the possibility of developing a model for predicting child abuse. Various exploratory analysis methods were used to examine the city of Chicago’s child abuse events. The data set was balanced using the Borderline-SMOTE technique, and a stacking classifier was then employed to ensemble multiple algorithms to predict various types of child abuse. The proposed approach predicted crime types with 93% accuracy, precision, recall, and F1-score, and an AUC of 0.989. However, the proposed model’s execution time (476.63) was significantly longer than that of the second-best model, Extra Trees (17.55). We found that machine learning methods effectively evaluate the demographic and spatial-temporal characteristics of the crimes and predict the occurrences of various subtypes of child abuse. The results indicate that the proposed Borderline-SMOTE enabled Stacking Classifier model (BS-SC Model) would be effective in real-time child abuse prediction and prevention.
Sentiment Analysis (SA) is a subfield of Natural Language Processing (NLP) that focuses on identifying and extracting the opinions expressed in text such as reviews, social media, blogs, and news. SA can handle the rapidly growing volume of unstructured text by transforming it into structured data with the help of NLP and open source tools. The current research work designs a novel Modified Red Deer Algorithm (MRDA) Extreme Learning Machine Sparse Autoencoder (ELMSAE) model for SA and classification. The proposed MRDA-ELMSAE technique initially performs preprocessing to transform the data into a compatible format. A TF-IDF vectorizer is then employed for feature extraction, while the ELMSAE model is applied to sentiment classification. Furthermore, optimal parameter tuning of the ELMSAE model is done using the MRDA technique. A wide range of simulation analyses was carried out, and the results of the comparative analysis establish the enhanced efficiency of the MRDA-ELMSAE technique over other recent techniques.
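The TF-IDF feature extraction step can be sketched in a few lines. This is a textbook variant (term frequency times log of inverse document frequency, with whitespace tokenization), not the vectorizer the authors used; library implementations usually add smoothing and normalization, and the function name `tfidf` is an assumption.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

A term that occurs in every document gets weight 0 under this variant, which is why words shared by all reviews contribute nothing to the feature vector while distinguishing words such as "good" or "bad" do.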
The World Wide Web (WWW) comprises a wide range of information, and it mainly operates on the principle of keyword matching, which often reduces the accuracy of information retrieval. Automatic query expansion is one of the primary methods in information retrieval; it addresses the vocabulary mismatch problem that information retrieval systems often face when retrieving an appropriate document using keywords. This paper proposes a novel hybrid COOT-based Cat and Mouse Optimization (CMO) algorithm, named hybrid COOT-CMO, for selecting optimal candidate terms in the automatic query expansion process. To improve the accuracy of the CMO algorithm, its parameters are tuned with the help of the Coot algorithm. The most suitable expanded query is identified from the available expanded query sets, also known as candidate query pools. All feasible combinations in this candidate query pool should be obtained from the top retrieved documents. Benchmark datasets such as the GOV2 Test Collection, the Cranfield Collections, and the NTCIR Test Collection are used to assess the performance of the proposed hybrid COOT-CMO method for automatic query expansion. The proposed method surpasses existing state-of-the-art techniques on performance measures such as F-score, precision, and mean average precision (MAP).
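Since MAP is one of the reported measures, a short sketch of how it is commonly computed may help. It uses the standard definition (mean over queries of the average precision at each relevant hit) and assumes each ranking lists a document at most once; the function names and the input layout are assumptions, not taken from the paper.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: mean of precision@k at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

An expanded query that moves relevant documents toward the top of the ranking raises the per-query average precision, and therefore MAP.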
Aluminum metal matrix composites (AMMCs) show better physical and mechanical properties than aluminum alloys, which makes them a preferred material for a wide range of applications. The addition of reinforcements, however, limits the industrial use of AMMCs by increasing machining complexity. They can nevertheless be machined with high surface integrity by nonconventional approaches such as abrasive water jet machining. Hybrid aluminum alloy composites were reinforced with B4C (5–15 vol%) and solid lubricant hBN (15 vol%) particles and fabricated using a liquid metallurgy route. This research article deals with an experimental investigation of the effect of abrasive water jet machining process parameters, namely mesh size, abrasive flow rate, water pressure, and work traverse speed, on hybrid AA6061-B4C-hBN composites. Water jet pressure and traverse speed proved to be the most significant parameters influencing responses such as kerf taper angle and surface roughness. An increase in reinforcement particles affects both the kerf taper angle and surface roughness. SEM images of the machined surface show that the cutting wear mechanism was largely responsible for material removal.
Artificial intelligence tools such as genetic algorithms, artificial neural networks (ANNs), and fuzzy logic are found to be extremely useful in modeling reliable processes in the field of computer integrated manufacturing (for example, selecting optimal parameters during process planning, and designing and implementing adaptive control systems). When knowledge about the relationships among the various manufacturing parameters is lacking, ANNs are used as process models because they can handle strong nonlinearities, a large number of parameters, and missing information. When the dependencies between parameters become noninvertible, the input and output configurations used in the ANN strongly influence the accuracy. However, running a neural network is found to be time consuming. If genetic algorithm-based ANNs are used to construct models, they can provide more accurate results in less time. This article proposes a genetic algorithm-based ANN model for the turning process in the manufacturing industry. This model is found to be a time-saving model that satisfies all the accuracy requirements.
The wavelet transform (WT) is used to represent all possible types of transients in vibration signals generated by faults in a gear box. It is shown that the transform provides a powerful tool for condition monitoring and fault diagnosis. The vibration signal of a spur bevel gear box in different conditions is used to demonstrate the application of various wavelets in feature extraction. In the present work, discrete Daubechies wavelets (db1–db15) are used for feature extraction, and their relative effectiveness is compared. The major steps in pattern classification are feature extraction and classification. This paper investigates the use of discrete wavelets for feature extraction and a Decision Tree for classification. The J48 Decision Tree algorithm has been used for feature selection as well as for classification. This paper illustrates the power and flexibility of the discrete wavelet transform in decomposing vibration signals for linear and non-linear processing.
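To show what wavelet-based feature extraction can look like, the sketch below implements the simplest member of the Daubechies family, db1 (the Haar wavelet), and uses the energy of the detail coefficients at each level as a feature vector. The abstract does not specify which features the authors derive from the coefficients, so the energy features and the function names `haar_dwt` and `detail_energies` are assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the Haar (db1) DWT; expects an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums give the approximation, differences give the detail
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def detail_energies(signal, levels):
    """Energy of the detail coefficients at each decomposition level."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    return features
```

The signal length should be divisible by `2 ** levels`, otherwise `haar_dwt` raises `ValueError` at the level where the length becomes odd. A feature vector of this kind is what a decision tree classifier such as J48 would take as input.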
As developers face an ever-increasing pressure to engineer secure software, researchers are building an understanding of security-sensitive bugs (i.e. vulnerabilities). Research into mining software repositories has greatly increased our understanding of software quality via empirical study of bugs. Conceptually, however, vulnerabilities differ from bugs: they represent an abuse of functionality as opposed to insufficient functionality commonly associated with traditional, non-security bugs. We performed an in-depth analysis of the Chromium project to empirically examine the relationship between bugs and vulnerabilities. We mined 374,686 bugs and 703 post-release vulnerabilities over five Chromium releases that span six years of development. We used logistic regression analysis, ranking analysis, bug type classifications, developer experience, and vulnerability severity metrics to examine the overarching question: are bugs and vulnerabilities in the same files? While we found statistically significant correlations between pre-release bugs and post-release vulnerabilities, we found the association to be weak. Number of features, source lines of code, and pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of our non-security bug categories. In further analysis, we examined sub-types of bugs, such as stability-related bugs, and the associations did not improve. Even the files with the most severe vulnerabilities (by measure of CVSS or bounty payouts) did not show strong correlations with number of bugs. These results indicate that bugs and vulnerabilities are empirically dissimilar groups, motivating the need for security engineering research to target vulnerabilities specifically.