Computing Reviews
Enriching documents with examples: a corpus mining approach
Kim J., Lee S., Hwang S., Kim S. ACM Transactions on Information Systems 31(1): 1-27, 2013. Type: Article
Date Reviewed: Aug 13 2013

For a compiler fan, it’s great seeing a system that parses code into abstract syntax trees (ASTs) to find, relate, and then generate semantically relevant and illustrative code samples. This paper describes a new data mining approach, called eXoaDocs, which automatically generates and relates these code samples to application programming interface (API) program descriptions, resulting in enriched example-based programming documents.
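
To make the AST idea concrete, here is a minimal sketch in Python of locating calls to a given API in a piece of source code. It is not the authors' implementation (their system targets Java); the function name `find_api_calls` and the `SAMPLE` snippet are hypothetical, and only Python's standard `ast` module is assumed.

```python
import ast

def find_api_calls(source: str, api_name: str) -> list[int]:
    """Return the sorted line numbers where a call to `api_name` appears.

    Hypothetical helper for illustration only; it matches on the bare
    function or method name and ignores which module or class owns it.
    """
    tree = ast.parse(source)
    lines = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain calls look like `foo()`, attribute calls like `obj.foo()`.
            if isinstance(func, ast.Name):
                name = func.id
            else:
                name = getattr(func, "attr", None)
            if name == api_name:
                lines.add(node.lineno)
    return sorted(lines)

# Made-up snippet standing in for code mined from a corpus.
SAMPLE = """\
import json
data = json.loads('{"a": 1}')
print(data)
text = json.dumps(data)
"""

calls = find_api_calls(SAMPLE, "loads")
```

A real system would presumably go further, for example by keeping only the statements that the located call depends on, but that step is beyond this sketch.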

By automatically creating semantically relevant code samples, the authors’ system not only omits irrelevant code, but also organizes the samples by criteria such as representativeness, frequency, conciseness, and correctness. Their browser also supports popularity ranking to help end users find the best code examples.
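
As a rough illustration of ordering candidate samples by several criteria at once, the following Python sketch combines frequency, conciseness, and a correctness flag into a single score. The `Candidate` fields, the weights, and the scoring formula are all assumptions made for this example; the paper's actual ranking and clustering algorithms are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    code: str
    frequency: int   # hypothetical: how often a similar usage was seen in the corpus
    compiles: bool   # hypothetical stand-in for a correctness check

def score(c: Candidate, max_freq: int) -> float:
    """Weighted score in [0, 1]; the 0.6/0.4 weights are arbitrary assumptions."""
    if not c.compiles:
        return 0.0  # incorrect samples are ranked last
    freq = c.frequency / max_freq if max_freq else 0.0
    n_lines = max(1, len(c.code.strip().splitlines()))
    conciseness = 1.0 / n_lines  # shorter samples score higher
    return 0.6 * freq + 0.4 * conciseness

def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return the candidates sorted from highest to lowest score."""
    max_freq = max((c.frequency for c in candidates), default=0)
    return sorted(candidates, key=lambda c: score(c, max_freq), reverse=True)

# Made-up candidates for illustration.
short = Candidate("reader.readLine();", frequency=10, compiles=True)
long_ = Candidate("a();\nb();\nc();\nd();", frequency=5, compiles=True)
broken = Candidate("reader.readLine(", frequency=20, compiles=False)

ranked = rank([long_, broken, short])
```

A weighted sum is only one possible design; it is used here because it keeps the trade-off between criteria explicit and easy to adjust.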

This extensive paper provides detailed descriptions of the authors’ algorithms for organizing code samples, contrasting clustering, ranking, and hybrid approaches. Although other successful documentation approaches rely on manually developed, high-quality examples, an automated approach would be valuable when dealing with massive amounts of code. eXoaDocs is compared to other code search engines and documentation approaches. As a test, it was run on the extensive Java Development Kit (JDK) 5 source. Illustrative code documentation samples were generated for 75 percent of the code (27,000 methods). In contrast, the traditional Java documents (JavaDocs) toolset generated illustrative samples for only 2 percent of the same code.

To validate their approach, the authors conducted a user study in which numerous students were given sample problems to program. Those who had access to the eXoaDocs semantic examples had measurable productivity gains.

The authors also nicely identify areas where the validity of their approach could be threatened, but it looks like this approach could play a role in future code documentation and browsing tools.

Reviewer:  Scott Moody Review #: CR141459 (1310-0928)
Data Mining (H.2.8 ... )
 
 
Search Process (H.3.3 ... )
