An optimized sequential pattern matching methodology for sequence classification |
| |
Authors: | Themis P Exarchos, Markos G Tsipouras, Costas Papaloukas, Dimitrios I Fotiadis |
| |
Affiliation: | (1) Department of Medical Physics, Medical School, University of Ioannina, 45110 Ioannina, Greece; (2) Department of Computer Science, Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, PO Box 1186, 45110 Ioannina, Greece; (3) Department of Biological Applications and Technology, University of Ioannina, 45110 Ioannina, Greece |
| |
Abstract: | In this paper we present a novel methodology for sequence classification, based on sequential pattern mining and optimization
algorithms. The proposed methodology automatically generates a sequence classification model, based on a two-stage process.
In the first stage, a sequential pattern mining algorithm is applied to a set of sequences and the sequential patterns are
extracted. Then, the score of every pattern with respect to each sequence is calculated using a scoring function and the score
of each class under consideration is estimated by summing the specific pattern scores. Each class score is then updated by
multiplying it by a weight, and the output of the first stage is the classification confusion matrix of the sequences. In the
second stage, an optimization technique aims to find a set of weights that minimizes an objective function defined using the
classification confusion matrix. The set of extracted sequential patterns and the optimal class weights comprise the sequence
classification model. Extensive evaluation of the methodology was carried out in the protein classification domain, by varying
the number of training and test sequences, the number of patterns and the number of classes. The methodology is compared with
other similar sequence classification approaches. The proposed methodology exhibits several advantages, such as automated
weight assignment to classes using optimization techniques and knowledge discovery in the domain of application.
|
| |
Keywords: | Sequential pattern mining; Sequential pattern matching; Sequence classification; Optimization |
This article is indexed in SpringerLink and other databases. |
|
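
The two-stage process described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the scoring function, the objective function, or the optimization technique, so everything concrete here is an assumption made for illustration: the scoring function is a 0/1 subsequence match, the objective is the number of off-diagonal entries in the confusion matrix, and the optimizer is a small grid search. The names `is_subsequence`, `class_scores`, `confusion_matrix`, `objective`, and `fit_weights`, as well as the toy patterns and sequences, are hypothetical and do not come from the paper.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def class_scores(sequence, patterns, weights):
    """Stage 1: sum the pattern scores per class, then multiply by the class weight.

    Assumed scoring function: 1 if the pattern matches the sequence, else 0.
    """
    scores = {cls: 0.0 for cls in weights}
    for pattern, cls in patterns:
        scores[cls] += 1.0 if is_subsequence(pattern, sequence) else 0.0
    return {cls: weights[cls] * score for cls, score in scores.items()}


def confusion_matrix(data, patterns, weights):
    """Classify every sequence by its highest weighted class score and tally the results."""
    classes = sorted(weights)
    matrix = {true: {pred: 0 for pred in classes} for true in classes}
    for sequence, true_cls in data:
        scores = class_scores(sequence, patterns, weights)
        predicted = max(classes, key=lambda cls: scores[cls])
        matrix[true_cls][predicted] += 1
    return matrix


def objective(matrix):
    """Assumed objective: number of misclassified sequences (off-diagonal entries)."""
    return sum(n for true, row in matrix.items() for pred, n in row.items() if true != pred)


def fit_weights(data, patterns, classes, grid=(0.5, 1.0, 2.0)):
    """Stage 2: search for class weights that minimize the objective (grid search stand-in)."""
    best_weights, best_value = None, None
    for combo in product(grid, repeat=len(classes)):
        weights = dict(zip(classes, combo))
        value = objective(confusion_matrix(data, patterns, weights))
        if best_value is None or value < best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


# Hypothetical toy input: (pattern, class) pairs standing in for mined patterns,
# and (sequence, true class) pairs standing in for the training sequences.
patterns = [("AC", "x"), ("GT", "y"), ("CT", "y")]
data = [("AACC", "x"), ("ACGT", "x"), ("GGTT", "y"), ("GCTT", "y")]

equal_weights = {"x": 1.0, "y": 1.0}
errors_before = objective(confusion_matrix(data, patterns, equal_weights))
best_weights, errors_after = fit_weights(data, patterns, ["x", "y"])
print(errors_before, errors_after, best_weights)
```

In this toy setting, class `y` has more patterns than class `x`, so unweighted score sums favor `y`; the weight search compensates for that imbalance, which is the role the abstract assigns to the optimization stage. A real implementation would replace the grid search with whichever optimization technique the paper uses.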