[Abstract]

Text Categorization with Considering Temporal Patterns of Term Usages

Hidenao Abe and Shusaku Tsumoto
Department of Medical Informatics, Shimane University, School of Medicine, Japan



In document categorization methods that use similarity measures based on word vectors, it is important to determine the key words that characterize each document. However, conventional methods select key words based on their frequency and/or a particular importance index such as tf-idf. In this paper, we propose a method that characterizes each document by using temporal clusters of technical term usages. The method obtains document clusters based on the similarity between documents that are characterized by the temporal patterns of an importance index, so that temporal differences in term usage are taken into account. In the experiment, we compare document categorization results based on document clustering using the two types of feature sets on two sets of bibliographical documents. Based on the experimental results, we discuss the usefulness of the temporal patterns of term usages for characterizing documents.
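As a rough illustration of the idea of characterizing documents by temporal patterns of an importance index, the following sketch computes a per-year tf-idf value for each term, assigns each term a coarse temporal pattern, and represents a document by how many of its terms fall into each pattern. It is only a sketch under stated assumptions: the tf-idf variant, the toy corpus, and all function names are hypothetical, and the sign of a least-squares slope is used here as a simple stand-in for clustering the temporal patterns.

```python
import math
from collections import Counter, defaultdict


def yearly_tfidf(docs):
    """docs: list of (year, list_of_terms). Returns {term: {year: value}}.

    Assumed tf-idf variant: tf * log(1 + N / df), computed per year.
    """
    by_year = defaultdict(list)
    for year, terms in docs:
        by_year[year].append(terms)
    series = defaultdict(dict)
    for year, year_docs in by_year.items():
        n_docs = len(year_docs)
        tf = Counter(t for d in year_docs for t in d)
        df = Counter(t for d in year_docs for t in set(d))
        for term, freq in tf.items():
            series[term][year] = freq * math.log(1.0 + n_docs / df[term])
    return series


def slope(values):
    """Least-squares slope of values against their positions 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den if den else 0.0


def pattern_labels(series, years):
    """Label each term by the trend of its index over the given years.

    Stand-in for temporal clustering: only the sign of the slope is used.
    """
    labels = {}
    for term, by_year in series.items():
        s = slope([by_year.get(y, 0.0) for y in years])
        labels[term] = "rising" if s > 0 else "falling" if s < 0 else "flat"
    return labels


def document_features(terms, labels):
    """Feature vector: term counts per pattern (rising, falling, flat)."""
    counts = Counter(labels[t] for t in terms if t in labels)
    return [counts["rising"], counts["falling"], counts["flat"]]


# Toy corpus (hypothetical data, for illustration only).
docs = [
    (2001, ["kernel", "rule"]),
    (2001, ["rule", "rule"]),
    (2002, ["kernel", "rule"]),
    (2002, ["kernel", "kernel"]),
]
labels = pattern_labels(yearly_tfidf(docs), [2001, 2002])
features = document_features(["kernel", "kernel", "rule"], labels)
```

Feature vectors of this kind could then be compared with any standard similarity measure to cluster documents; a frequency-based or plain tf-idf word vector would serve as the conventional baseline.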