Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
FPGA acceleration of RankBoost in Web search engines
Xu N., Cai X., Gao R., Zhang L., Hsu F. ACM Transactions on Reconfigurable Technology and Systems1 (4):1-19,2009.Type:Article
Date Reviewed: Jun 1 2009

Web search engines have changed our lives. The possibility of introducing a search term and having millions of results in less than a second puts an amazing amount of knowledge in our hands. One of the main problems in facing this incredible amount of data is classifying it according to a criterion. This criterion is usually called the relevance of the results, and its visible part is the ordered results returned by the Web search engine.

Obviously, the algorithms used to classify all this information are very complex and require enormous computational power. All of these algorithms, including the MSN one presented in this work and the ultra-secret Google one, are machine learning algorithms that use a large-scale training set. As the authors point out, processing this data can take days, so they decided to create hardware accelerators to speed up the processing.

The paper is divided into four sections. In Sections 1 and 2, the authors describe the RankBoost algorithm for Web search relevance, which will be accelerated later using custom hardware. The other sections give an overview of two accelerator boards and the hardware designed to run in the field-programmable gate array (FPGA).

Sections 1 and 2 introduce and describe the RankBoost algorithm. The presentation is based on pseudocode and a set of references to other theoretical work. For those of us who are not mathematicians or experts in artificial intelligence (AI), it is quite hard to follow. The explanations tend to be detailed, but they lack a lot of information that is not at all trivial. Considering that the journal this paper is published in focuses on reconfigurable hardware, it would have been better to have a lighter introduction emphasizing the part to be implemented in the hardware, rather than a very detailed one that is incomplete and difficult to understand. As far as I am concerned, this is the only negative point of the paper.

The next sections are clear. They present the implementation of the M3.int algorithm, which is one of the steps of the RankBoost Web search relevance algorithm’s WeakLearn procedure. After a detailed explanation of the algorithm, the authors present pseudocode that adds and accumulates, in order to compute an integral of the histogram. Since this part of the RankBoost algorithm uses 99 percent of central processing unit (CPU) time, they implement it in the FPGA.

Section 4 shows the internal architecture of the M3.int accelerator on the STAR-III board, developed by the authors. The M3.int accelerator is very simple; it is just a lot of floating point adders, working in parallel, that input data from a double data rate (DDR) memory through a peripheral component interconnect (PCI) local bus interface, and perform several additions to calculate the integral histogram. Obviously, the DDR size and speed and the PCI interface are bottlenecks of the system, so the authors develop a new board named FAR2. The FAR2 board is similar, but it includes 16 GB of DDR2 memory instead of STAR-III’s 2 GB DDR, and it implements a PCIe interface up to 4 GB per second, to send and receive data from the host at high rates. The FPGA hardware is the same, except for one modification to the DDR memory interface to support DDR2 memory and add the PCIe controller.

The extensive results section presents many experiments and a performance model to estimate the speed of the accelerator. The results show that the training speed of the MSN search engine is accelerated by two or three orders of magnitude, which is an excellent result. The STAR-III board obtains 170.6 times acceleration and the FAR2 board achieves an acceleration rate of 1,800 times.

The main contribution of the paper is the application of the FPGA technology to a new field. The FPGA acceleration of Web services is a very interesting trend and should become the starting point of a new field of investigation for the community. From a technological point of view, the work is not very exciting. Since it is an example of excellent engineering work, the internal architecture of the system is very simple and does not present any advances in FPGA design.

Reviewer:  Javier Castillo Review #: CR136887 (1001-0043)
Bookmark and Share
  Reviewer Selected
 
 
Algorithms Implemented In Hardware (B.7.1 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Algorithms Implemented In Hardware": Date
The performance of multilective VLSI algorithms
Savage J. Journal of Computer and System Sciences 29(2): 243-273, 1984. Type: Article
Dec 1 1985
Proving systolic systems correct
Hennessy M. ACM Transactions on Programming Languages and Systems 8(3): 344-387, 1986. Type: Article
Jul 1 1987
Algorithms for iterative array multiplication
Nakamura S. IEEE Transactions on Computers 35(9): 713-719, 1986. Type: Article
Jul 1 1987
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy