
In the last few decades, the number of patents registered has increased very rapidly. This book discusses some of the challenges in searching for patent information and some of the most advanced searching mechanisms available today. The methods of patent searching, the classifications of patents, the automatic tools associated with patent searching, current research in patent searching, and subject-specific patent searching are discussed.
The book, a collection of papers, is divided into five parts. The first part is an introduction. Chapter 1 introduces patent searching with respect to types of patent searches, such as searching for evidence of use, novelty, infringement, and validity. It also introduces classification-based, full-text, and citation searches. The chapter ends by discussing special patent searches, including those for chemical structures, biotechnology, and engineering drawings. Chapter 2 discusses various aspects of information retrieval (IR) systems, such as choice based on user evaluation, system characteristics (indexing, query, and result presentation), and models.
The second part addresses evaluation. Chapter 3 describes system-based evaluations for general IR systems and evaluation measures based on precision and recall ratios, which are calculated from the number of relevant documents retrieved from test data. It also describes evaluation models for user effort. Chapter 4 is devoted entirely to the CLEF-IP track, which investigates IR techniques in the patent domain. The detailed description includes a summary and evaluation of the submissions for the track. Chapter 5 describes chemistry-related patent retrieval and presents the results of the Text Retrieval Conference Chemistry (TREC-CHEM) track in detail. Chapter 6 discusses the evaluation of real patent retrievals and the potentially misleading evaluations that can result from laboratory tests.
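The precision and recall ratios mentioned for Chapter 3 can be illustrated with a minimal sketch. It uses the standard set-based definitions (precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that were retrieved) and is not taken from the book. The function name and the patent identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs returned by the search system
    relevant:  document IDs judged relevant in the test data
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    # Relevant documents that the system actually returned.
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: four patents retrieved, three judged relevant.
retrieved = ["P1", "P2", "P3", "P4"]
relevant = ["P2", "P4", "P7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved patents are relevant, and two of the three relevant patents were found. In patent searching, a missed relevant document is usually the costlier error, which is why the book gives a full part to high-recall problems.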
The third part discusses high-recall problems and solutions. Chapter 7 develops a measure called “retrievability.” A document with low retrievability is likely to be very difficult to find. The chapter presents details of the experiments done on retrievability. Chapter 8 discusses the effectiveness of the result sets of retrievals. The effectiveness measure depends on recall and precision in the sample documents. The sampling approaches and their limitations are discussed in detail.
Chapter 9 presents logical modeling of a patent search. It describes the representation of patent documents using probabilistic object relational content modeling (PORCM) and a method of searching for such documents. Chapter 10 addresses a method for improving patent claim searches by decomposition, describing ways of decomposing independent claims and dependent claims; it ends by evaluating the method. Chapter 11 presents a graphical user interface prototype implementation for patent searching and discusses various features of the interface, including support from multiple retrieval facilities.
The fourth part of the book is on classification. Chapter 12 describes various aspects of automated patent classification, including objectives, classification-induced issues (distribution- and language-based), technologies, evaluation, accuracy, and scalability. Chapter 13 proposes a natural language technique to categorize patent documents. The technique centers on the concept of “aboutness” dependency, in which “aboutness” is what the document is about. The experimental results are also discussed. Chapter 14 presents the results of some experiments on classification; the experiments examine classification hierarchies, the weighting of classes, and primary classification codes.
The fifth part focuses on the semantic search for patents. Chapter 15 describes a tool called GATE Mimir, which implements annotations in the documents such as section annotations, reference annotations, and measurement annotations. Chapter 16 describes a new search tool for searching scientific documents; the tool utilizes natural language processing techniques. Chapter 17 discusses chemical structure searching. The discussion features representation, specific chemical molecule searching, similarity searching, and so on. Chapter 18 examines a patent taxonomy integration and interaction framework that uses the Wikipedia science ontology and enriches it. Chapter 19 proposes automatic translation of scholarly terms into patent terms and discusses experimental results for the method. The last chapter discusses possible future directions in patent searching.
Each paper in this collection can be read independently. The book flows smoothly from the introduction of patent searching through the problems of patent searching to proposals for solving them. A basic understanding of intellectual property and searching is enough to follow most parts of the book. The book serves as an introduction and a reference for anybody interested in patents; it can also be used as a textbook for patent information retrieval. Researchers in information retrieval will find the papers and bibliographies useful.