The World Wide Web is a continuously growing giant, and Web content will surely increase tremendously within the next few years. Hence, there is a great need for algorithms that can accurately classify Web pages. Automatic Web page classification differs significantly from traditional text classification because of the additional information provided by the HTML structure. Recently, several techniques combining artificial intelligence and statistical approaches have arisen. However, finding an optimal classification technique for Web pages is not a simple matter. This paper introduces a novel strategy for vertical Web page classification called Classification using Multi-layered Domain Ontology (CMDO). It employs several Web mining techniques and depends mainly on a proposed multi-layered domain ontology. To improve classification accuracy, CMDO includes a distiller to reject pages related to other domains. CMDO also employs a novel classification technique called Graph Based Classification (GBC). The proposed GBC has pioneering features that other techniques lack, such as outlier rejection and pruning. Experimental results show that CMDO outperforms recent techniques, achieving better precision, recall, and classification accuracy.
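The abstract does not describe CMDO's internals, so the following is only a minimal illustrative sketch of one way a page could be scored against a layered domain ontology and rejected by a distiller-like filter. The `ontology_layers` structure, the depth-based weighting, and `REJECT_THRESHOLD` are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch: score a Web page's text against a multi-layered
# domain ontology and reject off-domain pages (a stand-in for a distiller
# step). The layer structure, weights and threshold are assumptions.
from collections import Counter

ontology_layers = [
    {"finance", "bank", "loan"},          # layer 0: core domain concepts
    {"interest", "credit", "mortgage"},   # layer 1: narrower concepts
    {"rate", "account", "payment"},       # layer 2: peripheral terms
]

REJECT_THRESHOLD = 0.05  # assumed cut-off for rejecting off-domain pages

def domain_score(page_text: str) -> float:
    """Weight matches in shallow ontology layers more than deep ones."""
    tokens = Counter(page_text.lower().split())
    total = sum(tokens.values()) or 1
    score = 0.0
    for depth, layer in enumerate(ontology_layers):
        weight = 1.0 / (depth + 1)
        score += weight * sum(tokens[t] for t in layer)
    return score / total

def distill(page_text: str) -> bool:
    """Return True if the page looks on-domain, False to reject it."""
    return domain_score(page_text) >= REJECT_THRESHOLD

print(distill("The bank offers a fixed interest rate on every loan account."))
```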
Geologists interpret seismic data to understand subsurface properties and subsequently to locate underground hydrocarbon resources. Channels are among the most important geological features interpreters analyze to locate petroleum reservoirs. However, manual channel picking is both time consuming and tedious. Moreover, like any other process dependent on human intervention, manual channel picking is error prone and inconsistent. To address these issues, automatic channel detection is both necessary and important for efficient and accurate seismic interpretation. Modern systems make use of real-time image processing techniques for many such tasks; automatic channel detection combines digital image processing methods to identify the streak-like features in seismic images, called channels, that are important to oil companies. In this paper, we propose an innovative automatic channel detection algorithm based on machine learning techniques. The new algorithm identifies channels in seismic data/images fully automatically and greatly increases the efficiency and accuracy of the interpretation process. The algorithm uses a deep neural network to train a classifier on both channel and non-channel patches. We provide a field data example to demonstrate the performance of the new algorithm. The training phase gave a maximum accuracy of 84.6% for the classifier, and it performed even better in the testing phase, giving a maximum accuracy of 90%.
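The abstract does not specify the network architecture, so the snippet below is only a minimal sketch of a patch-based binary classifier of the kind described (channel vs. non-channel patches), written with TensorFlow/Keras. The patch size, layer widths, and training settings are assumptions, and the random arrays merely stand in for labelled seismic patches.

```python
# Minimal sketch of a channel / non-channel patch classifier.
# Patch size (64x64), layer widths and epochs are assumptions, not the
# architecture used in the paper.
import numpy as np
import tensorflow as tf

PATCH = 64  # assumed patch size in pixels

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(PATCH, PATCH, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # 1 = channel, 0 = non-channel
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder data standing in for labelled seismic patches.
x_train = np.random.rand(128, PATCH, PATCH, 1).astype("float32")
y_train = np.random.randint(0, 2, size=(128, 1))
model.fit(x_train, y_train, epochs=2, batch_size=16, verbose=0)
```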
The demand for better social media applications has increased tremendously. This increase has necessitated digital systems with larger storage capacity and more processing power. However, an increase in multimedia content size reduces overall processing performance, because storing and retrieving large files affects the execution time. Therefore, it is extremely important to reduce the multimedia content size, which can be achieved by image and video compression. There are two types of image or video compression: lossy and lossless. In lossless compression, the decompressed image is an exact copy of the original image, while in lossy compression, the original and the decompressed image differ from each other. Lossless compression is needed when every pixel matters, as in automatic image processing applications. On the other hand, lossy compression is used in applications based on human visual system perception, where not every single pixel is important; rather, the overall image quality matters. Many video compression algorithms have been proposed; however, the balance between compression rate and video quality still needs further investigation, and the algorithm developed in this research focuses on this balance. The proposed algorithm applies a distinct compression stage to each type of information: eliminating redundant and semi-redundant frames, eliminating information by manipulating XORed consecutive frames, and reducing the discrete cosine transform coefficients according to the desired accuracy and compression ratio. A neural network is then used to further reduce the frame size. The proposed method is a lossy compression type, but it can approach the near-lossless type in terms of image quality and compression ratio with comparable execution time.
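The abstract describes the stages only at a high level, so the following is a minimal sketch of two of them: XORing consecutive frames so that unchanged pixels become zero, and zeroing small DCT coefficients to trade accuracy for compression. The 8-bit grayscale frame format and the coefficient-keep ratio are assumptions, and this is not the paper's actual pipeline.

```python
# Illustrative sketch of two stages mentioned in the abstract:
# (1) XOR consecutive frames so unchanged pixels become zero, and
# (2) drop small DCT coefficients to trade accuracy for compression.
# Frame format (8-bit grayscale) and the keep ratio are assumptions.
import numpy as np
from scipy.fft import dctn, idctn

def xor_frames(prev: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Zero-valued pixels in the result mark regions unchanged between frames."""
    return np.bitwise_xor(prev, curr)

def reduce_dct(frame: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Keep only the largest-magnitude DCT coefficients and reconstruct."""
    coeffs = dctn(frame.astype(float), norm="ortho")
    threshold = np.quantile(np.abs(coeffs), 1.0 - keep_ratio)
    coeffs[np.abs(coeffs) < threshold] = 0.0  # discard small coefficients
    return idctn(coeffs, norm="ortho")

prev = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 10:20] += 5           # small change confined to one block
diff = xor_frames(prev, curr)     # mostly zeros, so cheap to encode
approx = reduce_dct(curr, 0.1)    # lossy reconstruction of the frame
```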
Integral performance indices, as quantitative measures of system performance, are commonly used to evaluate designed control systems. In this paper, it is pointed out that, due to the existence of non-exponential modes in the step response of a fractional-order control system having zero steady-state error, the integral performance indices of such a system may be infinite. Based on this observation, some simple conditions are derived to guarantee the finiteness of different integral performance indices in a class of fractional-order control systems. Finally, some numerical examples are presented to show the applicability of the analytical results of the paper.
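The abstract does not list which indices are considered; as a reminder of the quantities involved, the block below states the standard integral performance indices for an error signal e(t), together with the kind of slow, non-exponential error decay that can make them infinite. The power-law tail is an illustrative example, not a result quoted from the paper.

```latex
% Standard integral performance indices for the error signal e(t); which of
% these the paper analyzes is not stated in the abstract.
\begin{align*}
  \mathrm{IAE}  &= \int_{0}^{\infty} |e(t)|\,dt, &
  \mathrm{ISE}  &= \int_{0}^{\infty} e^{2}(t)\,dt, \\
  \mathrm{ITAE} &= \int_{0}^{\infty} t\,|e(t)|\,dt, &
  \mathrm{ITSE} &= \int_{0}^{\infty} t\,e^{2}(t)\,dt.
\end{align*}
% In a fractional-order system the error may decay only algebraically,
% e.g. e(t) \sim c\,t^{-\alpha} as t \to \infty with 0 < \alpha < 1, in which
% case the integrals above can diverge even though e(t) \to 0.
```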
Data available for many software engineering applications contains variability, and it is not possible to say in advance which variables help in the prediction process. Most of the work on software defect prediction focuses on selecting the best prediction techniques, and for this purpose deep learning and ensemble models have shown promising results. In contrast, very little research deals with cleaning the training data and selecting the best parameter values from the data. The data available for training the models sometimes have high variability, and this variability may decrease model accuracy. To deal with this problem, we used the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to select the best variables for training the model. A simple ANN model with one input layer, one output layer, and two hidden layers was used for training instead of a very deep and complex model. AIC and BIC values are calculated for each candidate model, and the combination with the minimum AIC and BIC values is selected as the best model. First, the variables were narrowed down to a smaller number using correlation values. Then, subsets for all possible variable combinations were formed. Finally, an artificial neural network (ANN) model was trained for each subset, and the best model was selected on the basis of the smallest AIC and BIC values. It was found that the combination of only two variables, ns and entropy, is best for software defect prediction, as it gives the minimum AIC and BIC values, while nm and npt is the worst combination, giving the maximum AIC and BIC values.
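The abstract describes the procedure (form all variable subsets, train an ANN on each, and pick the subset with the smallest AIC and BIC) without giving formulas or code, so the following is only a minimal sketch of that loop. The synthetic data, the `MLPRegressor` size, and the Gaussian-likelihood approximation used for AIC/BIC are assumptions.

```python
# Hypothetical sketch of the subset-selection loop described in the abstract:
# train a small ANN on every candidate subset of predictors and rank the
# subsets by AIC / BIC. The data, network size and AIC form are assumptions.
from itertools import combinations
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
features = {"ns": rng.random(200), "entropy": rng.random(200),
            "nm": rng.random(200), "npt": rng.random(200)}
y = 3 * features["ns"] + 2 * features["entropy"] + 0.1 * rng.standard_normal(200)

def aic_bic(residuals, n_params, n):
    rss = float(np.sum(residuals ** 2))
    log_l = -0.5 * n * np.log(rss / n)  # Gaussian log-likelihood up to a constant
    return 2 * n_params - 2 * log_l, n_params * np.log(n) - 2 * log_l

best = None
for r in (1, 2, 3, 4):
    for subset in combinations(features, r):
        X = np.column_stack([features[f] for f in subset])
        model = MLPRegressor(hidden_layer_sizes=(8, 8), max_iter=2000,
                             random_state=0).fit(X, y)
        n_params = sum(w.size for w in model.coefs_) + sum(b.size for b in model.intercepts_)
        aic, bic = aic_bic(y - model.predict(X), n_params, len(y))
        if best is None or aic < best[0]:
            best = (aic, bic, subset)

print("best subset by AIC:", best[2])
```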
Parallel machines are extensively used to increase computational speed in solving different scientific problems. Various topologies with different properties have been proposed so far, and each one is suitable for specific applications. Pyramid interconnection networks offer a potentially powerful architecture for many applications such as image processing, visualization, and data mining. The major advantage of pyramids, which is important for image processing systems, is hierarchically abstracting and transferring data toward the apex node, much like the human vision system, which recognizes an object from an image. There are rapidly growing applications in which multidimensional datasets should be processed simultaneously. Such a system needs a symmetric and expandable interconnection network that processes data from different directions and forwards them toward the apex. In this paper, a new type of pyramid interconnection network called the Non-Flat Surface Level (NFSL) pyramid is proposed. NFSL pyramid interconnection networks are constructed from L-level A-lateral-base pyramids, named basic-pyramids, so that the apex node is surrounded by the level-one surfaces of the NFSL, which are the levels of nodes nearest to the apex in the basic pyramids. Two topologies, called NFSL-T and NFSL-Q, originating from trilateral-base and quadrilateral-base basic-pyramids, are studied to exemplify the proposed structure. To evaluate the proposed architecture, the most important properties of the networks are determined and compared with those of the standard pyramid network and its variants.
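The abstract does not fully specify the NFSL construction, so the sketch below only computes the level-by-level node counts of the standard quadrilateral-base pyramid network, the baseline the NFSL topologies are compared against; it is a reference calculation, not the proposed topology.

```python
# Sketch: node counts per level of a standard quadrilateral-base pyramid
# network (the comparison baseline). Level k holds 4**k nodes, i.e. a
# 2**k x 2**k mesh; the NFSL construction itself is not reproduced here.
def quad_pyramid_levels(levels: int) -> list[int]:
    """Number of nodes in levels 0 (apex) through `levels` of the pyramid."""
    return [4 ** k for k in range(levels + 1)]

def quad_pyramid_total(levels: int) -> int:
    """Total node count, equal to (4**(L+1) - 1) / 3 for an L-level pyramid."""
    return sum(quad_pyramid_levels(levels))

for L in range(1, 5):
    print(L, quad_pyramid_levels(L), quad_pyramid_total(L))
```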
The world of information technology is more than ever being flooded with huge amounts of data, nearly 2.5 quintillion bytes every day. This large stream of data is called big data, and the amount is increasing each day. This research uses a technique called sampling, which selects a representative subset of the data points, manipulates and analyzes this subset to identify patterns and trends in the larger dataset being examined, and finally creates models. Because sampling uses only a small proportion of the original data for analysis and model training, it is relatively fast while maintaining data integrity and achieving accurate results. Two deep neural networks, AlexNet and DenseNet, were used in this research to test two sampling techniques, namely sampling with replacement and reservoir sampling. The dataset used for this research was divided into three classes: acceptable, flagged as easy, and flagged as hard. The base models were trained on the whole dataset, whereas the other models were trained on 50% of the original dataset, giving four combinations of model and sampling technique. The F-measure for the base AlexNet model was 0.807, while that for the base DenseNet model was 0.808. Combination 1, the AlexNet model with sampling with replacement, achieved an average F-measure of 0.8852, and combination 2, the DenseNet model with sampling with replacement, achieved an average F-measure of 0.8017. Combination 3, the AlexNet model with reservoir sampling, had an average F-measure of 0.8545, and combination 4, the DenseNet model with reservoir sampling, had an average F-measure of 0.8111. Overall, we conclude that both models trained on a sampled dataset gave equal or better results compared to the base models, which used the whole dataset.
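The abstract names the two sampling schemes but does not show them; below is a minimal sketch of both (sampling with replacement and reservoir sampling, Algorithm R) applied to a placeholder list. The toy data and seed are assumptions, while the 50% sample size follows the setup described.

```python
# Sketch of the two sampling schemes compared in the abstract. Both are
# standard algorithms; the 50% sample size mirrors the setup described.
import random

def sample_with_replacement(items, k, seed=0):
    """Draw k items uniformly at random, allowing duplicates."""
    rng = random.Random(seed)
    return [rng.choice(items) for _ in range(k)]

def reservoir_sample(stream, k, seed=0):
    """Algorithm R: one pass over a stream, keeping a uniform sample of size k."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)          # inclusive upper bound
            if j < k:
                reservoir[j] = item        # replace with probability k / (i + 1)
    return reservoir

data = list(range(1000))
half = len(data) // 2                      # 50% of the dataset, as in the paper
print(len(sample_with_replacement(data, half)), len(reservoir_sample(data, half)))
```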
This paper deals with defining the concept of the agent-based time delay margin and computing its value in multi-agent systems controlled by event-triggered controllers. The agent-based time delay margin, which specifies the time delay tolerance of each agent for ensuring consensus in event-triggered controlled multi-agent systems, can be considered complementary to the concept of the (network) time delay margin, which has previously been introduced in the literature. In this paper, an event-triggered control method for achieving consensus in multi-agent systems with time delay is considered, and it is shown that Zeno behavior is excluded under this method. Then, in a multi-agent system controlled by the considered event-triggered method, the concept of the agent-based time delay margin in the presence of a fixed network delay is defined. Moreover, an algorithm for computing the value of the time delay margin for each agent is proposed. Numerical simulation results are also provided to verify the theoretical results.
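The abstract does not reproduce the control law or triggering rule; as a generic point of reference (not necessarily the formulation used in the paper), a typical event-triggered consensus protocol for single-integrator agents with a fixed network delay is sketched below, where the triggering instants of agent i determine when its broadcast state is refreshed.

```latex
% Generic event-triggered consensus protocol with a fixed delay \tau;
% a textbook-style formulation, not necessarily the paper's.
\begin{align*}
  \dot{x}_i(t) &= u_i(t), \qquad
  u_i(t) = -\sum_{j \in \mathcal{N}_i} a_{ij}\bigl(\hat{x}_i(t-\tau) - \hat{x}_j(t-\tau)\bigr), \\
  \hat{x}_i(t) &= x_i(t_k^i), \quad t \in [t_k^i, t_{k+1}^i), \qquad
  e_i(t) = \hat{x}_i(t) - x_i(t),
\end{align*}
% Agent i's next event is typically triggered when the measurement error
% e_i(t) grows to violate a threshold condition such as
% \|e_i(t)\| \le \sigma_i \bigl\| \sum_{j \in \mathcal{N}_i}
%   a_{ij}\bigl(\hat{x}_i(t) - \hat{x}_j(t)\bigr) \bigr\|.
% The agent-based time delay margin is then the largest extra delay agent i
% can tolerate, on top of the fixed network delay, without losing consensus.
```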