Similar Documents (20 results)
1.
Neighborhood-based methods have been proposed to satisfy both performance and accuracy in recommendation systems. It is difficult, however, to satisfy them together because there is a tradeoff between them, especially in a big data environment. In this paper, we present a novel method, called the CE method, which uses the notion of category experts in order to leverage the tradeoff between performance and accuracy. The CE method selects a few users as experts in each category and uses their ratings rather than those of ordinary neighbors. In addition, we suggest the CES and CEP methods, variants of the CE method, to achieve higher accuracy. The CES method considers the similarity between the active user and the category expert in ratings prediction, and the CEP method utilizes the active user's preference (interest) for each category. Finally, we combine all the approaches to create a CESP method, considering similarity and preference simultaneously. Using real-world datasets from MovieLens and Ciao, we show that our proposal successfully leverages the tradeoff between performance and accuracy and outperforms existing neighborhood-based recommendation methods in coverage. More specifically, the CESP method provides 5% higher accuracy than the item-based method while running 9 times faster than the user-based method.
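A minimal sketch of the category-expert idea described above, assuming hypothetical helper structures (a ratings dictionary, a per-category expert list, and a similarity function); it illustrates expert-based prediction with similarity weighting, not the authors' exact CE/CESP formulation.

import numpy as np

def predict_rating(user, item, category, ratings, experts, similarity, top_k=10):
    """Predict a rating from category experts' ratings (CE-style sketch).

    ratings   : dict {(user, item): rating}
    experts   : dict {category: [expert_user, ...]}  # precomputed per category
    similarity: callable(u, v) -> float              # e.g. Pearson on co-rated items
    """
    # Use only experts of the item's category who have actually rated the item.
    candidates = [e for e in experts.get(category, []) if (e, item) in ratings]
    if not candidates:
        return None  # a full system would fall back to a user or global mean
    # CES-style weighting: weight each expert by similarity to the active user.
    weights = np.array([max(similarity(user, e), 0.0) for e in candidates])
    scores = np.array([ratings[(e, item)] for e in candidates])
    if weights.sum() == 0:
        return float(scores[:top_k].mean())
    order = np.argsort(-weights)[:top_k]
    return float(np.average(scores[order], weights=weights[order]))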

2.
Face alignment and reconstruction are classical problems in the computer vision field, and one of the greatest difficulties is the limited number of facial images with landmark points. The 300W-LP dataset is the most commonly used for existing methods of single-view 3D Morphable Model (3DMM)-based reconstruction; however, model performance is limited by the small variety of facial images in this dataset. In this work, a 3D facial image dataset with landmark points generated by the rotate-and-render method is proposed. The key innovation of the proposed method is that rotating faces back and forth in 3D space and then re-rendering them to the 2D plane provides strong self-supervision. Recent advances in 3D face modeling and high-resolution generative adversarial networks (GANs) are leveraged to constitute the building blocks. To obtain more precise facial landmark points, the 3D dense face alignment (3DDFA) model is used to label the generated images and filter the landmark points. Finally, the 3DDFA model is retrained using the proposed dataset, and an improved result is achieved.

3.
Statistical methods, and in particular machine learning, have been increasingly used in the drug development workflow. Among the existing machine learning methods, we have been specifically concerned with genetic programming. We present a genetic programming-based framework for predicting anticancer therapeutic response. We use the NCI-60 microarray dataset and look for a relationship between gene expressions and responses to the oncology drugs Fluorouracil, Fludarabine, Floxuridine and Cytarabine. We aim at identifying, from genomic measurements of biopsies, the likelihood of developing drug resistance. Experimental results, and their comparison with the ones obtained by Linear Regression and Least Square Regression, hint that genetic programming is a promising technique for this kind of application. Moreover, the genetic programming output may potentially highlight relations between genes that could support the identification of biologically meaningful pathways. The structures that appear most frequently in the “best” solutions found by genetic programming are presented.

4.
The Internet has been flooded with spam emails, and during the last decade there has been an increasing demand for reliable anti-spam email filters. The problem of filtering emails can be considered a classification problem in the field of supervised learning. In theory, many mature technologies, for example, support vector machines (SVM), can be used to solve this problem. However, in real enterprise applications, the training data are typically collected via honeypots and are therefore both huge in volume and highly biased towards spam emails. This challenges both the efficiency and the effectiveness of conventional technologies. In this article, we propose an undersampling method to compress and balance the training set used for the conventional SVM classifier with minimal information loss. The key observation is that we can make a trade-off between training set size and information loss by carefully defining a similarity measure between data samples. Our experiments show that the SVM classifier performs better when our compressing and balancing approach is applied.
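A minimal sketch of similarity-based undersampling of the kind described above, assuming TF-IDF features and cosine similarity as illustrative choices; the paper's actual similarity measure and threshold are not specified here.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def compress_class(X, threshold=0.95):
    """Greedily drop samples that are near-duplicates of an already kept sample.

    X: 2-D feature matrix (rows = emails of one class, e.g. TF-IDF vectors).
    Returns the indices of the retained rows.
    """
    kept = []
    for i in range(X.shape[0]):
        if not kept:
            kept.append(i)
            continue
        sims = cosine_similarity(X[i:i + 1], X[kept]).ravel()
        if sims.max() < threshold:      # keep the sample only if it is sufficiently novel
            kept.append(i)
    return np.array(kept)

# Balancing sketch: compress the over-represented spam class more aggressively,
# then train a standard SVM (e.g. sklearn.svm.LinearSVC) on the reduced set.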

5.
Multimedia Tools and Applications - Visual speech recognition is a method that comprehends speech from a speaker's lip movements, where the speech is validated only by the shape and movement of the lips…

6.
To address the limitations of the UF-growth algorithm, which constructs a large number of tree nodes and branches and repeatedly computes the support of candidate items, a compressed UF-tree algorithm is proposed. The compressed UF-tree algorithm changes the tree-construction rule: when an item in a transaction matches the item of a branch node in the tree, the item is merged into that branch; otherwise, a new branch is created from that branch node, and the leaf node stores the current transaction ID. Probability vectors are built for individual items, candidate itemsets are generated by searching the tree branches, and the support of candidate items is computed from the transaction IDs and the probability vectors to mine the frequent itemsets. Experimental comparison and analysis show that the compressed UF-tree algorithm is feasible and more efficient.
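A minimal sketch of the support computation implied above, assuming per-item probability vectors indexed by transaction ID; it shows how candidate support can be computed from probability vectors under the usual independence assumption of uncertain frequent-itemset mining, not the paper's exact compressed UF-tree traversal.

def expected_support(itemset, prob, tids):
    """Expected support of an itemset in an uncertain database.

    prob : dict {item: {tid: probability}}   # probability vector per item
    tids : iterable of candidate transaction IDs collected from a tree branch
    """
    total = 0.0
    for tid in tids:
        p = 1.0
        for item in itemset:
            p *= prob.get(item, {}).get(tid, 0.0)  # independence assumption
            if p == 0.0:
                break
        total += p
    return total

# An itemset is reported as frequent if expected_support(...) >= minsup.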

7.
Data mining techniques are widely used in many fields. One of the applications of data mining in the field of bioinformatics is the classification of tissue samples. In the present work, a wavelet power spectrum based approach is presented for feature selection and successful classification of multi-class datasets. The proposed method was applied to the SRBCT and breast cancer datasets, which are multi-class cancer datasets. The selected features largely coincide with those selected in previous works. The method was able to produce almost 100% accurate classification results. The method is very simple and robust to noise, and no extensive preprocessing is required. The classification was performed with a considerably smaller number of features than those used in the original works. No information is lost to the initial pruning of the data that other methods usually perform with a threshold. The method exploits the inherent nature of the data in performing its various tasks, so it can be used for a wide range of data.
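A minimal sketch, under stated assumptions, of scoring genes by wavelet power: each gene's expression profile across samples is decomposed with PyWavelets and the energy of its detail coefficients is used as a score. The wavelet choice ('db1'), the decomposition level, and the ranking rule are illustrative assumptions, not the exact procedure of the paper.

import numpy as np
import pywt

def wavelet_power_scores(X, wavelet="db1", level=3):
    """Score each feature (gene) by the power in its wavelet detail coefficients.

    X: array of shape (n_samples, n_genes); each column is one gene's profile.
    In practice the samples are grouped by class so that class structure shows
    up as coarse-scale variation in the profile.
    """
    scores = []
    for g in range(X.shape[1]):
        coeffs = pywt.wavedec(X[:, g], wavelet, level=level)
        detail = np.concatenate(coeffs[1:])           # skip the approximation band
        scores.append(float(np.sum(detail ** 2)))     # power = sum of squared coefficients
    return np.array(scores)

# Feature selection sketch: keep the top-k scoring genes and feed them to any classifier.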

8.
Clustering of high-dimensional data is of growing practical importance, yet the Parzen-window estimation method yields good clustering results only on low-dimensional datasets, and its efficiency degrades as the dimensionality increases. This work therefore improves the Parzen window with a weighting scheme: the weighting function is determined through repeated simulation experiments, the high-dimensional data are projected into a low-dimensional space and clustered there, the result is then progressively projected back toward the high-dimensional space, and the resulting matrix is optimized, yielding better clustering results.
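A minimal sketch of standard Parzen-window (kernel) density estimation with a Gaussian kernel, to make the baseline method concrete; the weighting function and the low-dimensional projection described above are paper-specific steps and are not reproduced here.

import numpy as np

def parzen_density(x, samples, h):
    """Parzen-window density estimate at point x with a Gaussian kernel.

    samples: array of shape (n, d); h: window width (bandwidth).
    """
    n, d = samples.shape
    diff = (samples - x) / h
    kernel = np.exp(-0.5 * np.sum(diff ** 2, axis=1)) / ((2 * np.pi) ** (d / 2))
    return kernel.sum() / (n * h ** d)

# Clustering sketch: points can be assigned to clusters by hill-climbing on the
# estimated density; in high dimensions the estimate degrades, which motivates
# the weighting and projection scheme described above.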

9.
Touchless interaction has received considerable attention in recent years for its benefit of removing the barrier of physical contact. Several approaches are available to achieve mid-air interaction. However, most of these techniques cause discomfort when the interaction method is not direct manipulation. In this paper, gestures based on unimanual and bimanual interactions with different tools for exploring CT volume datasets are designed to perform tasks similar to those in realistic applications. A focus + context approach based on GPU volume ray casting with a trapezoid-shaped transfer function is used for visualization, and a level-of-detail technique is adopted to accelerate interactive rendering. Experiments comparing the effectiveness and intuitiveness of our interaction approach with others show that ours performs better, with shorter completion times. Moreover, the bimanual interaction offers further advantages and saves time when performing continuous exploration tasks.
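A minimal sketch of a trapezoid-shaped transfer function of the kind mentioned above: scalar values inside a focus range map to full opacity, with linear ramps on both sides. The specific breakpoints and the focus + context combination are illustrative assumptions.

def trapezoid_opacity(v, lo, rise, fall, hi, max_alpha=1.0):
    """Trapezoid-shaped opacity transfer function for volume ray casting.

    Opacity ramps up on [lo, rise], stays at max_alpha on [rise, fall],
    and ramps down on [fall, hi]; it is 0 outside [lo, hi].
    """
    if v <= lo or v >= hi:
        return 0.0
    if v < rise:
        return max_alpha * (v - lo) / (rise - lo)
    if v <= fall:
        return max_alpha
    return max_alpha * (hi - v) / (hi - fall)

# Focus + context sketch: a narrow trapezoid highlights the focus tissue range,
# while a second, low-opacity trapezoid keeps the surrounding context visible.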

10.
Objective: Perceptual image hashing, also known as image digesting or image fingerprinting, is an effective image authentication technique that has attracted wide attention in recent years. It maps the perceptually robust features of an image into a fixed-length hash sequence for image copyright authentication. However, the field has long lacked a widely applicable dataset: the content-preserving operations used in existing datasets differ considerably from real-world scenarios, so the performance of neural network architectures trained on them drops markedly when faced with complex image editing operations. Method: A new dataset oriented toward practical image content authentication scenarios is constructed for the perceptual image hashing task. First, common real-world content-preserving operations are summarized and categorized, and 48 single and compound content-preserving operations are designed to generate perceptually similar images. Then, following the definition of perceptual image hashing, images that are semantically similar to but perceptually different from the image to be authenticated are selected as perceptually dissimilar images, which increases the discrimination difficulty of the dataset. The final perceptual hashing image dataset contains 116,400 images. Results: Because the content-preserving operations used in the proposed dataset are more complex and the dissimilar images are harder to distinguish, deep neural networks trained on it generalize well; they achieve good authentication performance on other datasets even without retraining or fine-tuning. Networks trained on this dataset also show only small performance differences across datasets, indicating good stability. Conclusion: An image dataset for perceptual hashing is designed, and extensive comparative experiments demonstrate its effectiveness; this work can promote the development of the perceptual image hashing field.
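For concreteness, a minimal average-hash (aHash) sketch of what a perceptual image hash does: robust low-frequency content is reduced to a fixed-length bit string, and perceptually similar images yield small Hamming distances. This is a classical baseline for illustration, not the hashing networks trained on the dataset above.

import numpy as np

def average_hash(gray, size=8):
    """64-bit perceptual hash of a 2-D grayscale image array (aHash baseline)."""
    h, w = gray.shape
    gray = gray[: h - h % size, : w - w % size]                   # crop to a multiple of size
    bh, bw = gray.shape[0] // size, gray.shape[1] // size
    blocks = gray.reshape(size, bh, size, bw).mean(axis=(1, 3))   # 8x8 block means
    return (blocks > blocks.mean()).flatten()                     # threshold at the mean

def hamming(h1, h2):
    """Small distances indicate perceptually similar (content-preserved) images."""
    return int(np.count_nonzero(h1 != h2))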

11.
Due to the improvement in quality of life, the increase in chronic diseases, changing lifestyles, and expanding life expectancy, rapid population aging requires a new business model that promotes happiness and emphasizes a healthy body and mind through an “anytime, anywhere well-being” lifestyle. Recently, lifecare systems using IoT devices are being released as products that influence society as a whole, and their effectiveness is continuously being proven. In addition, based on peer-to-peer (P2P) networking, diverse companies are investing in and researching devices as well as solutions that connect to these devices. Accordingly, in this study, a mining-based lifecare-recommendation method using a peer-to-peer dataset and adaptive decision feedback is proposed. In addition to collecting PHRs, the proposed method measures life-logs such as dietary life, life patterns, sleep patterns, life behaviors, and job career; the index information preprocessed from the P2P dataset; and biometric information using a wearable device. It uses an Open API to collect health-weather and life-weather index data from public data, and it uses a smart-band-type wearable device (a biosensor) to measure heart rate, daily activity, and body temperature. It monitors current status and conditions through the classification of life data, and it mines big data and uses a decision tree to analyze association rules and correlations as well as to discover new knowledge patterns. A lifecare recommendation model that uses adaptive decision feedback has been developed for the peer-to-peer platform. This adaptive decision feedback reflects an individual's importance or sensory level; accordingly, it produces more individualized and flexible results and can be configured to support intelligent lifecare. A mining-based lifecare-recommendation mobile service can also be developed to enhance quality of life, as it provides user-based health management and reduces medical expenses, thereby enhancing service satisfaction and quality in the lifecare field.

12.
Organization and management of massive terrain data with a rectangular binary tree
For the organization and management of massive terrain data, a terrain data organization method combining a rectangular binary tree model with a pointer-free data index is proposed. The rectangular binary tree model preserves data continuity well, and the pointer-free data index, designed from the binary-tree coding, enables fast and accurate location of data blocks. In addition, the visibility test against the view frustum and the data prefetching strategy are simplified to achieve dynamic management of the visible data. Experimental results show that the method achieves real-time visualization of realistic massive terrain data.
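A minimal sketch of a pointer-free binary-tree index of the general kind described above: node codes follow the complete-binary-tree numbering, so a block can be located arithmetically from its code without stored pointers. The exact rectangle-splitting rule of the paper is not reproduced; the axis-alternating split below is an illustrative assumption.

def children(code):
    """Child codes in a pointer-free (array-style) binary tree rooted at code 0."""
    return 2 * code + 1, 2 * code + 2

def parent(code):
    return (code - 1) // 2

def block_rect(code, rect=(0.0, 0.0, 1.0, 1.0)):
    """Recover the terrain block covered by a node directly from its code.

    rect = (x, y, w, h); each level splits the longer side in half (assumed rule).
    """
    path = []
    while code > 0:                       # walk up to the root, recording the turns
        path.append((code - 1) % 2)       # 0 = first child, 1 = second child
        code = parent(code)
    x, y, w, h = rect
    for side in reversed(path):           # replay the splits from the root down
        if w >= h:
            w /= 2.0
            x += side * w
        else:
            h /= 2.0
            y += side * h
    return x, y, w, h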

13.
Applied Intelligence - Coronary artery disease (CAD) is one of the most lethal diseases and a major cause of death around the globe. CAD is among the diseases with a mortality rate of approximately…

14.
In this paper, the classification of two binary bioinformatics datasets, leukemia and colon tumor, is further studied using the recently developed neural-network-based finite impulse response extreme learning machine (FIR-ELM). A time-series analysis of the microarray samples is first performed to determine the filtering properties of the hidden layer of the FIR-ELM neural classifier for feature identification. The linear separability of the data patterns in the microarray datasets is then studied. To improve the robustness of the neural classifier against noise and errors, a frequency-domain gene feature selection algorithm is also proposed. The simulation results show that the FIR-ELM algorithm performs excellently on the classification of bioinformatics data in comparison with many existing classification algorithms.

15.
Understanding pair-wise activities is an essential step towards studying complex group and crowd behaviors in video. However, such research is often hampered by a lack of datasets that concentrate specifically on Atomic Pair Actions. [Here, we distinguish between the atomic motion of individual objects and the atomic motion of pairs of objects. The term action in Atomic Pair Action means an atomic interaction movement of two objects in video; a pair activity, then, is composed of multiple actions by a pair or multiple pairs of interacting objects. Please see Section 1 for details.] In addition, the general dearth in computer vision of a standardized, structured approach for reproducing and analyzing the efficacy of different models limits the ability to compare different approaches. In this paper, we introduce the ISI Atomic Pair Actions dataset, a set of 90 videos that concentrate on the Atomic Pair Actions of objects in video, namely converging, diverging, and moving in parallel. We further incorporate a structured, end-to-end analysis methodology, based on workflows, to easily and automatically allow for standardized testing of state-of-the-art models, as well as inter-operability of varied codebases and incorporation of novel models. We demonstrate the efficacy of our structured framework by testing several models on the new dataset. In addition, we make the full dataset (the videos, along with their associated tracks and ground truth, and the exported workflows) publicly available to the research community for free use and extension at <http://research.sethi.org/ricky/datasets/>.

16.
Pattern Analysis and Applications - We present a new landmark detection problem on the upper body of a clothed person for tailoring purposes. This is a landmark detection problem unknown in the…

17.
System reconstruction refers to the following problem: given a behaviour system, viewed as an overall system, determine which sets of its subsystems are adequate for reconstructing the given system with an acceptable degree of approximation. The reconstruction problem can be solved by finding multiple variable relationships in complex data. To deal with this problem, we apply the idea of qualitative reasoning as discussed in artificial intelligence. The basic idea is to convert continuous data into discrete data (e.g. through clustering) so that qualitative rules can be constructed. In this paper, a method of compressing data (called the Lebesgue discretization) is proposed to establish relationships between the dependent variable and other variables. Qualitative rules can be constructed from this relationship and then stored in a knowledge base. This technique also facilitates automated knowledge acquisition.
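A minimal sketch of the continuous-to-discrete step described above, using equal-frequency (quantile) bins as an illustrative stand-in for the paper's Lebesgue discretization; qualitative rules are then read off as frequent combinations of the discrete levels.

import numpy as np
from collections import Counter

def discretize(x, levels=3):
    """Map a continuous variable to qualitative levels via equal-frequency bins."""
    edges = np.quantile(x, np.linspace(0, 1, levels + 1)[1:-1])
    return np.digitize(x, edges)          # e.g. 0 = low, 1 = medium, 2 = high

def qualitative_rules(X_discrete, y_discrete, min_count=5):
    """Collect frequent (input state -> output level) pairs as candidate rules."""
    counts = Counter(zip(map(tuple, X_discrete), y_discrete))
    return {state: level for (state, level), c in counts.items() if c >= min_count}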

18.
Arbitrary-shape clustering capable of handling mixed attributes
Clustering is a very active research branch of data mining, and clustering of arbitrary shapes remains an open problem. An inter-cluster dissimilarity measure that incorporates the value-frequency information of categorical attributes and a definition of object-to-cluster similarity are proposed; on this basis, a clustering algorithm that can handle arbitrary cluster shapes and mixed-attribute datasets is presented. The proposed algorithm is evaluated on synthetic and real datasets and compared with related algorithms, and the experimental results show that it is effective and feasible.
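A minimal sketch of a frequency-aware mixed-attribute dissimilarity in the spirit of the measure described above; the weighting between the numeric and categorical parts and the exact use of value frequencies are illustrative assumptions, not the paper's definition.

import numpy as np

def mixed_dissimilarity(a, b, num_idx, cat_idx, cat_freq, w=1.0):
    """Dissimilarity between objects a and b with numeric and categorical attributes.

    cat_freq: dict {attr_index: {value: relative frequency in the dataset}}.
    A match on a rare categorical value counts as stronger evidence of similarity.
    """
    num = np.linalg.norm(np.asarray([a[i] for i in num_idx], float)
                         - np.asarray([b[i] for i in num_idx], float))
    cat = 0.0
    for i in cat_idx:
        if a[i] != b[i]:
            cat += 1.0                              # full mismatch penalty
        else:
            cat += cat_freq[i].get(a[i], 0.0)       # small penalty if the shared value is common
    return num + w * cat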

19.
Remote-sensing hyperspectral data are high-dimensional data with spatial clustering characteristics. The PT method is improved to fit the indexing mechanism of iDistance, and the two different space-partitioning strategies are fused to produce an index structure suited to hyperspectral data. The index is a high-dimensional metric-space index that uses two-level space partitioning; when processing spectral similarity queries, it can filter data by both distance and spatial orientation at the same time. Experiments show that the index effectively reduces I/O and the number of distance computations, has high pruning efficiency, and is well suited to similarity queries over hyperspectral data.
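A minimal sketch of the standard iDistance mapping referred to above: each point is keyed by its distance to the nearest reference point plus a partition offset, so similarity queries reduce to one-dimensional range scans (typically on a B+-tree). The improved PT partitioning of the paper is not reproduced here.

import numpy as np

def idistance_key(p, refs, c=10_000.0):
    """Map a high-dimensional point to a 1-D iDistance key.

    refs: array (m, d) of reference (partition) points; c: a constant larger than
    any within-partition distance, so partitions do not overlap on the key line.
    """
    d = np.linalg.norm(refs - p, axis=1)   # distances to all reference points
    i = int(np.argmin(d))                  # nearest partition
    return i * c + float(d[i])

# Range query sketch: a query (q, r) only needs, for each partition i, the key
# interval [i*c + max(dist(q, refs[i]) - r, 0), i*c + dist(q, refs[i]) + r].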

20.

Today, the importance of digital images as a medium for social communication is growing rapidly. Sometimes, an image needs to be authenticated by verifying its source camera model or device. Recently, deep networks have become very successful at visual pattern recognition. With this motivation, several investigators have explored the possibility of using convolutional neural networks (CNNs) for camera source identification. In this paper, we use selective preprocessing, instead of an indiscriminate one, so as not to hinder the CNN's strong ability to learn useful features for this kind of forensic task. To generate a consistent and balanced dataset, we limit the maximum number of original images to 200 per camera model, and we discard vertically taken images. Using a relatively simple deep network structure, the proposed method achieved a better prediction accuracy (95.0%) than GoogleNet and other existing methods. Also, challenging camera models such as the Sony DSC H50 and W170 can be classified with prediction accuracies as high as 87.9% and 83.1%, respectively.
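A minimal sketch of a relatively simple CNN for camera-model identification of the kind described above, written in PyTorch; the layer sizes, patch size, and number of classes are illustrative assumptions, not the paper's exact architecture, and the selective preprocessing step is not shown.

import torch
import torch.nn as nn

class CameraNet(nn.Module):
    """Small CNN that classifies image patches by source camera model."""

    def __init__(self, num_models=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_models)

    def forward(self, x):                  # x: (batch, 3, H, W) image patches
        h = self.features(x).flatten(1)
        return self.classifier(h)

# Usage sketch: logits = CameraNet()(torch.randn(4, 3, 64, 64))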

