Computing Reviews
Unsupervised learning of morphology
Hammarström H., Borin L.  Computational Linguistics 37 (2): 309-350, 2011. Type: Article
Date Reviewed: Jan 16 2012

Work on the induction of morphological information from texts is surveyed in this paper. This overview considers only systems that accept as input raw text--that is, unannotated natural language text--and produce as output a description of the morphological structure of the language using as little supervision as possible.

For the purposes of the paper, the authors define a hierarchy of morphological analysis that has as its base a “justification” (a linguistically informed motivation for the morphological description of the language) and, at the top, a list of the affixes of the language. The actual segmentation of words into stems and affixes sits in the middle of this hierarchy.

Following a general introduction to the subject, the paper proceeds to a historical survey and motivation for unsupervised learning of morphology (ULM), starting with the work of Zellig Harris. Next comes a section titled “Trends and Techniques in ULM,” which contains a table that forms a road map (described as brief, even though it covers more than two dense pages) to many of the early studies of ULM. Surveys of four principal approaches follow: border and frequency, where segmentation borders are deduced on the basis of substrings that occur with a variety of adjacent substrings; group and abstract, where words are first grouped according to some metric such as edit distance; features and classes, where a word is viewed as a set of features, for example, n-grams; and phonological categories and separation, where the phonemes of a word may be classed into categories such as vowels and consonants. The authors point out that, regrettably, there has been little cross-fertilization between these approaches.
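
To make the border-and-frequency idea concrete, here is a minimal Python sketch. It is an illustration only, not a reconstruction of any system covered in the survey. It counts how many distinct characters follow each word prefix in a small word list and proposes a border wherever that count reaches a threshold. The function names, the "#" end-of-word marker, the toy word list, and the threshold value are all assumptions made for this example.

```python
from collections import defaultdict


def successor_variety(words):
    """Map each prefix to the set of characters that follow it in the word list."""
    followers = defaultdict(set)
    for word in words:
        marked = word + "#"  # "#" is an assumed end-of-word marker
        for i in range(1, len(marked)):
            followers[marked[:i]].add(marked[i])
    return followers


def propose_borders(word, followers, threshold=2):
    """Return split positions whose prefix has at least `threshold` distinct followers."""
    return [
        i
        for i in range(1, len(word))
        if len(followers.get(word[:i], ())) >= threshold
    ]


# Toy word list, invented for illustration.
words = ["walk", "walks", "walked", "walking",
         "talk", "talks", "talked", "talking"]
followers = successor_variety(words)
borders = propose_borders("walking", followers, threshold=3)
print(borders)  # [4], i.e. "walk" + "ing"
```

In this toy list, the prefix "walk" is followed by four different symbols (the end marker, "s", "e", and "i"), while every other prefix of "walking" has a single follower, so the only proposed border falls after "walk". Real text would presumably need a more careful criterion than a fixed threshold.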

The penultimate section discusses, among other topics, the language dependence of ULM and ULM’s relation to semantics, and addresses this question: Is ULM of any use? A brief subsection on future directions suggests areas where high-accuracy systems might emerge.

The authors conclude that ULM has made progress, but that there is a long way to go. The paper contains an extensive bibliography with over 250 entries. For anyone interested in finding out more about ULM, this paper is an excellent starting place.

Reviewer:  J. P. E. Hodgson Review #: CR139780 (1206-0619)
Web-Based Services (H.3.5 ... )
Learning (I.2.6)
Linguistics (J.5 ... )
Natural Language Processing (I.2.7)
Online Information Services (H.3.5)
 
Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2013 ThinkLoud, Inc.