KEYWORD EXTRACTION STRATEGY FOR ITEM BANKS TEXT CATEGORIZATION |
| |
Authors: | Atorn Nuntiyagul, Kanlaya Naruedomkul, Nick Cercone, Damras Wongsawang |
| |
Affiliation: | Institute for Innovation and Development of Learning Process, Mahidol University, Thailand; Department of Mathematics, Faculty of Science, Mahidol University, Thailand; Faculty of Computer Science, Dalhousie University, Canada; Department of Computer Science, Faculty of Science, Mahidol University, Thailand |
| |
Abstract: | We propose a feature selection approach, Patterned Keyword in Phrase (PKIP), for text categorization of item banks. An item bank is a collection of textual question items, each a short sentence. Such a sentence does not contain enough relevant words to be categorized directly by traditional approaches such as "bag-of-words." PKIP was therefore designed to categorize such question items using only the available keywords and their patterns. PKIP identifies the appropriate keywords by computing the weight of every word. In this paper, two keyword selection strategies are suggested to ensure the categorization accuracy of PKIP. PKIP was implemented and tested with an item bank of Thai high primary mathematics questions. The test results show that PKIP is able to categorize the question items correctly and that the two keyword selection strategies can extract highly informative keywords. |
| |
Keywords: | item bank; feature selection; patterned keywords in phrase; text categorization; keyword extraction |