Generating Links for Patent Documents: An Automatic Approach Using Computational Intelligence
Abstract
Patents are organized into classification systems that help patent offices and users search for and retrieve these documents. Patent systems, and the information the documents contain, serve a wide variety of users. However, patents are complex legal documents with a large amount of technical and descriptive detail, which makes their content difficult to identify and analyze. A system that automatically links terms found in patents to entries in specific knowledge bases would provide quick access to the underlying concepts. This work presents results from a project whose objective is the automatic generation of links in patent documents. The experiments were conducted on four subgroups from the United States Patent and Trademark Office (USPTO), which uses the Cooperative Patent Classification (CPC) system. Because the patent documents did not have keywords, meaningful terms were selected with the χ² (chi-square) algorithm, applied to the full content of each patent document. Keywords with more than one meaning were disambiguated using a specific algorithm, producing a file with the information used in the experiments. The links were generated from Wikipedia articles and the USPTO patent database. The patent database serves as a second possible link destination, both to cover cases in which Wikipedia has no article on a given term and to provide an alternative source that may help readers understand the documents. Automatically created links are expected to make it easier to access concepts related to the terms in the documents and to understand the information disclosed by the inventors.
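As an illustration of the χ² term selection mentioned above, the following is a minimal sketch and not the implementation used in this work. It assumes the common document-frequency formulation of χ² over a 2×2 contingency table (term present or absent, document inside or outside a target class). The function names `chi_square` and `top_terms`, the toy documents, and the class labels are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of a term for a class from a 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or the class carries no signal
    return n * (a * d - c * b) ** 2 / denom


def top_terms(docs, labels, target, k=3):
    """Rank terms by chi-square against one target class (illustrative only).

    docs: list of token lists; labels: one class label per document.
    """
    in_class = sum(1 for y in labels if y == target)
    out_class = len(labels) - in_class
    df_in, df_out = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count document frequency, so each term counts once per document.
        (df_in if y == target else df_out).update(set(tokens))
    scores = {}
    for term in set(df_in) | set(df_out):
        a, b = df_in[term], df_out[term]
        scores[term] = chi_square(a, b, in_class - a, out_class - b)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Toy corpus with hypothetical class labels.
    docs = [["laser", "beam"], ["laser", "lens"],
            ["engine", "valve"], ["engine", "beam"]]
    labels = ["G02", "G02", "F01", "F01"]
    print(top_terms(docs, labels, "G02", k=2))
```

Note that χ² is symmetric: a term that is strongly absent from the target class scores as high as one that is strongly present, so a practical selector would likely also check the direction of the association before treating a term as a keyword.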