Symbol spotting is the problem of recognizing graphic symbols, such as pictographs, logos, and other kinds of specialized graphical notation, that may be used in engineering drawings, architectural drawings, blueprints, and cartography. “Spotting” indicates that the problem is restricted to locating such symbols, without analyzing the entire context in which they appear. As one aspect of the general problem of graphics recognition, it receives attention in computer vision and pattern recognition research, and in specialized conferences such as the Workshops on Graphics Recognition. Symbol spotting is important to information retrieval (IR), document analysis, and modern cartography.
In this monograph, the authors describe a framework and approaches to various facets of the symbol spotting problem, in the context of its application to categorizing documents. The book begins with an introductory chapter that defines the nature and scope of the subject, followed by a survey of current techniques. Chapter 3 deals with spotting techniques that use photometric descriptors and recognition techniques derived from object recognition; these techniques are applied to classifying documents by recognizing the logos on them. Chapter 4 describes the use of vector shape signatures to identify elements present in engineering and architectural drawings. Chapter 5 discusses an approach based on string matching, which is designed to be less sensitive to artifacts and distortions introduced during digitization and vectorization. A relational indexing mechanism and symbol similarity queries are covered in chapter 6. Metrics and methodologies for performance analysis are addressed in chapter 7, which adapts common IR metrics, such as precision and recall, to graphical information and symbol spotting, and then describes methodologies for evaluation. A concluding chapter provides a summary of the book’s concepts and techniques, along with an analysis of their limitations. An appendix describes the image databases used in the work.
Though symbol spotting may seem like a narrowly defined problem, it can be expected to play an increasingly important role in IR and processing, as one of the building blocks for future applications in document management, IR, and document analysis. This monograph is a clearly written and accessible survey of the field and a framework for it, and it should prove useful to researchers and students in computer vision, pattern recognition, and related fields. It would also be instructive for advanced practitioners in these areas and in the broader application domains of document management and IR. Readers with a basic knowledge of computer vision will get the most from this book, although such knowledge is not essential. References are provided at the end of each chapter for researchers who need or want to refer to the primary literature. In short, this book is a concise, readable, and informative resource on its specialized problem domain and related areas.