Face datasets are considered a primary tool for evaluating the efficacy of face recognition methods. Here we show that in
many of the commonly used face datasets, face images can be recognized accurately at a rate significantly higher than random
even when no face, hair or clothes features appear in the image. The experiments were done by cutting a small background area
from each face image, so that each face dataset provided a new image dataset which included only seemingly blank images. Then,
an image classification method was used in order to check the classification accuracy. Experimental results show that the
classification accuracy ranged from 13.5% (color FERET) to 99% (YaleB). These results indicate that the performance of
face recognition methods measured using face image datasets may be biased. Compilable source code used for this experiment
is freely available for download via the Internet.
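The cropping-and-classification protocol described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the 10×10 corner patch, the 1-nearest-neighbour classifier, the synthetic images and the use of scikit-learn are all assumptions made here for the sake of a runnable example.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def crop_background_patch(image, size=10):
    # Take a corner patch that contains no face, hair or clothes.
    return image[:size, :size].reshape(-1)

def background_accuracy(images, labels, size=10):
    # Classify subjects from background patches alone and report
    # cross-validated accuracy.
    X = np.stack([crop_background_patch(img, size) for img in images])
    clf = KNeighborsClassifier(n_neighbors=1)
    return cross_val_score(clf, X, labels, cv=2).mean()

# Synthetic stand-in: each "subject" was photographed against a slightly
# different background, so the corner patch alone carries identity.
rng = np.random.default_rng(0)
images = [rng.normal(loc=subj, scale=0.1, size=(100, 100))
          for subj in (0.0, 1.0) for _ in range(10)]
labels = [0] * 10 + [1] * 10
acc = background_accuracy(images, labels)
```

On such correlated backgrounds the accuracy is far above the 50% chance level, which is exactly the bias the experiment exposes.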
The algorithm selection problem is defined as identifying the best-performing machine learning (ML) algorithm for a given combination of dataset, task, and evaluation measure. The human expertise required to evaluate the increasing number of ML algorithms available has resulted in the need to automate the algorithm selection task. Various approaches have emerged to handle the automatic algorithm selection challenge, including meta-learning. Meta-learning is a popular approach that leverages accumulated experience for future learning and typically involves dataset characterization. Existing meta-learning methods often represent a dataset using predefined features and thus cannot be generalized across different ML tasks, or alternatively, learn a dataset’s representation in a supervised manner and therefore are unable to deal with unsupervised tasks. In this study, we propose a novel learning-based task-agnostic method for producing dataset representations. Then, we introduce TRIO, a meta-learning approach that utilizes the proposed dataset representations to accurately recommend top-performing algorithms for previously unseen datasets. TRIO first learns graphical representations for the datasets, using four tools to learn the latent interactions among dataset instances, and then utilizes a graph convolutional neural network technique to extract embedding representations from the graphs obtained. We extensively evaluate the effectiveness of our approach on 337 datasets and 195 ML algorithms, demonstrating that TRIO significantly outperforms state-of-the-art methods for algorithm selection for both supervised (classification and regression) and unsupervised (clustering) tasks.
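The final recommendation step of such a meta-learner can be illustrated with a deliberately simplified sketch: instead of TRIO's graph convolutional embeddings, assume dataset embeddings are already available and recommend the algorithms that performed best on the nearest known datasets. The embeddings, performance matrix and nearest-neighbour rule below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def recommend(query_emb, dataset_embs, perf, k=2, top_n=1):
    # Rank algorithms by their average score on the k nearest known datasets.
    dists = np.linalg.norm(dataset_embs - query_emb, axis=1)
    neighbours = np.argsort(dists)[:k]
    avg_scores = perf[neighbours].mean(axis=0)
    return list(np.argsort(avg_scores)[::-1][:top_n])

# Invented meta-knowledge: 4 known datasets x 3 candidate algorithms.
dataset_embs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
perf = np.array([[0.9, 0.1, 0.2],
                 [0.8, 0.2, 0.1],
                 [0.1, 0.9, 0.3],
                 [0.2, 0.8, 0.4]])
# A query dataset that embeds close to the first two datasets should be
# recommended the algorithm that did well on them (algorithm 0).
best = recommend(np.array([0.05, 0.0]), dataset_embs, perf)
```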
We present a new concept—Wikiometrics—the derivation of metrics and indicators from Wikipedia. Wikipedia provides an accurate representation of the real world due to its size, structure, editing policy and popularity. We demonstrate an innovative “mining” methodology, where different elements of Wikipedia – content, structure, editorial actions and reader reviews – are used to rank items in a manner which is by no means inferior to rankings produced by experts or other methods. We test our proposed method by applying it to two real-world ranking problems: top world universities and academic journals. Our proposed ranking methods were compared to leading and widely accepted benchmarks and were found to correlate strongly with them, with the added advantage that the underlying data are publicly available.
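Comparing a Wikipedia-derived ranking against an accepted benchmark is naturally done with a rank correlation coefficient. The sketch below implements Spearman's rho from its closed form (assuming no ties); the score values are invented for illustration.

```python
def rank_positions(values):
    # Rank of each item (1 = largest); assumes no ties.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def spearman(x, y):
    # Spearman's rho via the closed form 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    n = len(x)
    rx, ry = rank_positions(x), rank_positions(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Invented scores: a Wikipedia-derived ranking vs. an expert benchmark.
wiki_scores = [120, 95, 60, 30]        # e.g. derived from articles/views
expert_scores = [9.1, 8.4, 7.2, 5.5]   # e.g. benchmark ranking scores
rho = spearman(wiki_scores, expert_scores)
```

Perfectly concordant rankings, as in this toy example, yield rho = 1; the paper's claim of strong correlation corresponds to values near 1 on the real data.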
Ensemble methods combine several individual pattern classifiers in order to achieve better classification. The challenge is to choose the minimal number of classifiers that achieves the best performance. An ensemble that contains too many members might incur large storage requirements and even reduce the classification performance. The goal of ensemble pruning is to identify a subset of ensemble members that performs at least as well as the original ensemble and to discard all other members. In this paper, we introduce the Collective-Agreement-based Pruning (CAP) method. Rather than ranking individual members, CAP ranks subsets by considering the individual predictive ability of each member along with the degree of redundancy among them. Subsets whose members highly agree with the class while having low inter-agreement are preferred.
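One plausible instantiation of such a subset score is the merit function used in correlation-based feature selection, applied to ensemble members: reward mean member-class agreement, penalise mean pairwise member-member agreement. The formula and the agreement values below are assumptions for illustration, not CAP's exact measure.

```python
from itertools import combinations

def subset_merit(members, class_agreement, inter_agreement):
    # CFS-style merit: k*avg_mc / sqrt(k + k*(k-1)*avg_mm), where avg_mc is
    # the mean member-class agreement and avg_mm the mean pairwise
    # member-member agreement of the subset.
    k = len(members)
    avg_mc = sum(class_agreement[m] for m in members) / k
    if k == 1:
        return avg_mc
    pairs = list(combinations(members, 2))
    avg_mm = sum(inter_agreement[frozenset(p)] for p in pairs) / len(pairs)
    return k * avg_mc / ((k + k * (k - 1) * avg_mm) ** 0.5)

# Invented agreement values: members 0 and 1 are redundant, 0 and 2 diverse.
class_agreement = {0: 0.8, 1: 0.8, 2: 0.8}
inter_agreement = {frozenset({0, 1}): 0.9,
                   frozenset({0, 2}): 0.1,
                   frozenset({1, 2}): 0.9}
redundant = subset_merit([0, 1], class_agreement, inter_agreement)
diverse = subset_merit([0, 2], class_agreement, inter_agreement)
```

With equal per-member accuracy, the diverse pair scores higher than the redundant pair, which is precisely the preference the abstract describes.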
Projection matrices from projective spaces have long been used in multiple-view geometry to model the perspective projection created by the pin-hole camera. In this work we introduce higher-dimensional mappings for the representation of various applications in which the world we view is no longer rigid. We also describe the multi-view constraints from these new projection matrices (where k > 3) and methods for extracting the (non-rigid) structure and motion for each application.
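For the classical rigid case, the familiar 3×4 projection matrix maps homogeneous 3D points to image points; the higher-dimensional mappings proposed here generalise that object. A minimal sketch of the classical case (the focal length and point coordinates are arbitrary choices):

```python
import numpy as np

def project(P, X):
    # Apply a 3x4 projection matrix to a 3D point in homogeneous coordinates,
    # then dehomogenise to get 2D image coordinates.
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

f = 2.0  # arbitrary focal length
P = np.array([[f,   0.0, 0.0, 0.0],
              [0.0, f,   0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
# Pin-hole model: (X, Y, Z) maps to (f*X/Z, f*Y/Z).
pt = project(P, np.array([1.0, 2.0, 4.0]))
```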
The solubility of Mg in alumina was measured using a wavelength-dispersive spectrometer mounted on a scanning electron microscope. Careful calibration of the microscope's working conditions was performed in order to optimize the detection limit and accuracy. Measurements were conducted on water-quenched and furnace-cooled samples, without any thermal or chemical etching, to avoid alteration of the bulk concentration. The results indicate the solubility limit of Mg in alumina to be 132±11 ppm at 1600°C.
This article presents a general approach for employing lesion analysis to address the fundamental challenge of localizing functions in a neural system. We describe functional contribution analysis (FCA), which assigns contribution values to the elements of the network such that the ability to predict the network's performance in response to multilesions is maximized. The approach is thoroughly examined on neurocontroller networks of evolved autonomous agents. The FCA portrays a stable set of neuronal contributions and accurate multilesion predictions that are significantly better than those obtained based on the classical single lesion approach. It is also used for a detailed synaptic analysis of the neurocontroller connectivity network, delineating its main functional backbone. The FCA provides a quantitative way of measuring how the network functions are localized and distributed among its elements. Our results question the adequacy of the classical single lesion analysis traditionally used in neuroscience and show that using lesioning experiments to decipher even simple neuronal systems requires a more rigorous multilesion analysis.
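The core idea of predicting multilesion performance from per-element contribution values can be sketched with a purely linear stand-in (the actual FCA optimises a more general predictor; the three-element network and its contribution values below are invented for illustration):

```python
import numpy as np
from itertools import product

# All 2^3 multilesion patterns over a three-element network
# (1 = element intact, 0 = element lesioned).
masks = np.array(list(product([0, 1], repeat=3)), dtype=float)
true_contributions = np.array([0.5, 0.3, 0.2])
performance = masks @ true_contributions  # simulated multilesion outcomes

def fit_contributions(masks, performance):
    # Least-squares contribution values: performance ~= masks @ c, i.e.
    # the values that best predict performance across all lesion patterns.
    c, *_ = np.linalg.lstsq(masks, performance, rcond=None)
    return c

recovered = fit_contributions(masks, performance)
```

When the simulated performance really is additive in the intact elements, the fit recovers the contribution values exactly; real networks deviate from additivity, which is why multilesion data are needed.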
In this work a granular cementitious composite has been developed, its performance tailored to low compressive strength and high deformation and energy dissipation capacity, properties that may be required of the material when it is employed in post-installed screeds for the protection of structures and infrastructures against accidental actions such as impact and blast. The required level of performance can be achieved through a uniform grain size distribution, a paste content as low as the minimum theoretical void ratio, and low paste strength: it is believed that the synergy between these three requirements allows for energy dissipation capacity after paste cracking, due both to rearrangement of the grain meso-structure and, where it occurs, to grain crushing. Following the mix-design concept and the optimization of the material composition, illustrated in the first part of this companion-paper study, the mechanical performance of the composite under static and impact compressive loading has been thoroughly characterized as a function of mix-design variables such as paste volume fraction, water-to-cement ratio and aggregate size. The reliability of the employed material concept is thus thoroughly checked, and the influence, if any, of specimen shape, size and boundary conditions is also investigated.
In business applications such as direct marketing, decision-makers are required to choose the action that maximizes
a utility function. Cost-sensitive learning methods can help them achieve this goal. In this paper, we introduce Pessimistic
Active Learning (PAL). PAL employs a novel pessimistic measure, which relies on confidence intervals and is used to balance
the exploration/exploitation trade-off. In order to acquire an initial sample of labeled data, PAL applies orthogonal arrays
of fractional factorial design. PAL was tested on ten datasets using a decision tree inducer. A comparison of these results
to those of other methods indicates PAL’s superiority.
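One way to realise a confidence-interval-based pessimistic measure, shown purely as an illustration (PAL's exact formulation differs), is to score each action by a lower confidence bound on its estimated success probability multiplied by its utility: a well-explored action is then preferred over a little-explored one with the same point estimate, balancing exploration against exploitation.

```python
import math

def pessimistic_value(successes, trials, utility, z=1.96):
    # Lower bound of a normal-approximation confidence interval on the
    # success probability, multiplied by the action's utility.
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return max(p - half_width, 0.0) * utility

# Same point estimate (0.8 success rate), different amounts of evidence:
explored = pessimistic_value(80, 100, utility=1.0)
sparse = pessimistic_value(4, 5, utility=1.0)
```

The well-explored action gets the higher pessimistic value, so acting on it is the conservative (exploitation-leaning) choice, while the penalised sparse estimate flags where further exploration would pay off.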