[Abstract]

Clustering Documents Using Structural Similarity Based on Case Sets ---Applied for Technological Problems from Patents---

Hitomi Yanaka and Yukio Ohsawa (Department of Systems Innovation, Faculty of Engineering, University of Tokyo, Japan)



Descriptions of technological problems in patent documents are important for understanding the motivation behind an invented technology, and understanding that motivation helps in analyzing trends among the technologies contained in a set of patent documents. Here, we approach document classification through analogies between the structures of problem descriptions. The purpose of this study is to develop a method for patent classification that uses hierarchical clustering based on the structural similarity of the problems to be solved by the patented inventions. First, we present an approach for extracting predicate-argument structures from the contents of patents. Second, we propose a similarity function that measures the structural similarity between case sets. The results of a questionnaire survey showed that the structural similarity between patent documents can be calculated using predicate-argument structures. Furthermore, the survey indicated that comprehension of document structures can be improved by reading documents reconstructed from their predicate-argument structures.
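
As a rough illustration of the pipeline outlined above, the following minimal sketch clusters documents by the overlap of their case sets. It is not the similarity function proposed in this study: the abstract does not specify that function, so Jaccard overlap is used purely as a stand-in. The toy data, the representation of a case set as (predicate, case role) pairs, the average-linkage rule, and the names `docs`, `jaccard`, `average_linkage`, and `cluster` are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): each patent is reduced to a
# "case set", assumed here to be a set of (predicate, case_role) pairs
# taken from its predicate-argument structures.
docs = {
    "P1": {("reduce", "object"), ("reduce", "agent"), ("prevent", "object")},
    "P2": {("reduce", "object"), ("prevent", "object")},
    "P3": {("improve", "object"), ("improve", "instrument")},
}


def jaccard(a, b):
    """Stand-in similarity: shared pairs divided by all distinct pairs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2):
    """Mean pairwise similarity between two clusters of document ids."""
    total = sum(jaccard(docs[x], docs[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def cluster(threshold):
    """Agglomerative clustering: repeatedly merge the most similar pair of
    clusters until the best remaining pair falls below the threshold."""
    clusters = [frozenset([d]) for d in sorted(docs)]
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        if average_linkage(c1, c2) < threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


if __name__ == "__main__":
    for group in cluster(threshold=0.5):
        print(sorted(group))
```

In this toy setting, documents whose problem descriptions share predicates and case roles are merged first, and the threshold decides where the hierarchy is cut. A structural similarity function of the kind proposed here would replace `jaccard` without changing the clustering loop.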