It is well known that algebraic or graphical constructs can be used to represent objects and relationships in automatic document processing. For example, given a collection of n documents identified by a set of m index terms, the assignment of terms to documents can be represented by a matrix of dimension n by m, where each row stands for a document and each column stands for the assignment of one particular term to the various documents of the collection. In particular, the ij-th matrix element then represents the weight, or importance, of term j assigned to document i, a weight of 0 being used to represent the complete absence of a term from a given document [1, 2].
Given such an n by m document-term matrix, one can then build an n by n document-document relationship matrix by pairwise comparison of the similarity between distinct rows of the original matrix. Similarly, one can obtain an m by m term-term relationship matrix by pairwise comparison of distinct columns of the original matrix. The document-document similarity matrix can, in turn, be used to obtain document clusterings; the term-term matrix is used to generate term similarity, or thesaurus, classes.
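The construction just described can be illustrated with a minimal sketch. The weights below are invented for illustration, and cosine similarity is only an assumed choice of comparison measure; neither comes from the paper under review. The helper names `cosine`, `pairwise`, and `transpose` are likewise hypothetical.

```python
import math

# Hypothetical 3-document, 4-term weight matrix (illustrative values only).
# Rows are documents, columns are terms; 0 marks a term absent from a document.
doc_term = [
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
]

def cosine(u, v):
    """Cosine similarity of two weight vectors (an assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pairwise(vectors):
    """Square matrix of similarities between every pair of vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

def transpose(matrix):
    """Swap rows and columns, so that each term becomes a row."""
    return [list(column) for column in zip(*matrix)]

doc_doc = pairwise(doc_term)               # n by n document-document matrix
term_term = pairwise(transpose(doc_term))  # m by m term-term matrix
```

For simplicity the sketch also fills the diagonal, which compares each row with itself; only the off-diagonal entries correspond to the comparison of distinct rows or columns. Documents sharing no terms, such as the first and third here, receive a similarity of 0.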
Obviously, a matrix can be transformed into a graph so that graph manipulations can be used to replace the matrix operations described earlier. The paper under review attempts to go further by considering a large variety of additional algebraic transformations (for example, matrix transpositions, inversions, multiplications, permutations, and so on) and suggests that different algebraic forms represent different aspects or qualities of the information. All of this may well be true, but this reviewer is disappointed not to find in this study any particular example or methodology. Hence, this is mostly an “idea” paper which makes a (possibly reasonable) suggestion that is never pursued or examined in any depth.