Computing Reviews
Distrust seed set propagation algorithm to detect web spam
Goh K., Patchmuthu R., Singh A. Journal of Intelligent Information Systems 49(2): 213-235, 2017. Type: Article
Date Reviewed: Dec 28 2017

The Internet has become an integral part of the infrastructure of modern society, and the web now hosts over one billion websites. To locate webpages closely related to their interests, people commonly rely on search engines. While web search engines are very helpful for users, web spammers try to manipulate search engine ranking algorithms in order to raise the position of their pages in search results. Web spam wastes not only users' time, but also search engine resources. In the worst case, it can lead users to malicious content that installs malware on the victim’s machine.

Web spam detection methods have been under development for about two decades. Spam detection algorithms can be categorized as link-based or content-based. Under the assumption that all pages on a spam host are spam, the technique presented in this paper operates at the host level. Starting from a distrust seed (or a set of distrust seeds), the proposed algorithm propagates distrust through the host network and iteratively updates the normalized distrust score of each node. The distrust scores are initialized to zero, except for those of the distrust seeds. The authors report experiments on two datasets: WEBSPAM-UK 2006 and WEBSPAM-UK 2007. Applying distrust seed propagation to three existing web spam detection algorithms, they claim to identify 17.73 percent and 8.59 percent more spam hosts on the two test datasets, respectively, than without it. These improvements are notable.
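The general idea of seed-based distrust propagation can be illustrated with a minimal sketch. This is not the authors' algorithm, whose exact update rule is not reproduced in this review; it is a generic, Anti-TrustRank-style scheme under stated assumptions: distrust flows backward along links (a host that links to a distrusted host inherits part of that distrust), a hypothetical `decay` factor damps each hop, and scores are normalized to sum to one only at the end. The function name, parameters, and toy host graph are all illustrative.

```python
def propagate_distrust(links, seeds, decay=0.85, iterations=20):
    """Sketch of seed-based distrust propagation on a host graph.

    links: dict mapping each host to the list of hosts it links to.
    seeds: set of hosts known to be spam (the distrust seeds).
    decay and iterations are illustrative assumptions, not values from the paper.
    """
    if not seeds:
        raise ValueError("at least one distrust seed is required")

    # Collect every host that appears as a source, a target, or a seed.
    hosts = set(links) | set(seeds)
    for targets in links.values():
        hosts.update(targets)

    # For each host, record which hosts link to it.
    inlinks = {h: [] for h in hosts}
    for src, targets in links.items():
        for t in targets:
            inlinks[t].append(src)

    # Scores start at zero everywhere except the seeds.
    seed_score = {h: (1.0 / len(seeds) if h in seeds else 0.0) for h in hosts}
    scores = dict(seed_score)

    for _ in range(iterations):
        updated = {}
        for h in hosts:
            # h inherits distrust from each host it links to; a target's
            # distrust is split evenly among the hosts that link to it.
            inherited = sum(scores[t] / len(inlinks[t]) for t in links.get(h, []))
            updated[h] = (1 - decay) * seed_score[h] + decay * inherited
        scores = updated

    # Normalize once at the end so the scores sum to one (an assumption).
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}


# Toy host graph: "b" links to "a", which links to a known spam host.
toy_links = {"a": ["spam"], "b": ["a"], "c": [], "spam": []}
scores = propagate_distrust(toy_links, {"spam"})
for host in sorted(scores, key=scores.get, reverse=True):
    print(host, round(scores[host], 4))
```

In this toy graph, distrust weakens with each hop away from the seed, and a host with no path to the seed keeps a score of zero. A detector built on such scores would still need a threshold or a downstream classifier, which this sketch leaves out.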

This paper presents a specific technique, distrust seed propagation, for web spam detection. It is worthwhile reading, especially for practitioners who work in the field. The battle between web spam and its detection will never stop, just like the endless development of ever more sophisticated spears and shields. Beyond web spam, there are email spam, phishing, and other online attacks carried out over the Internet. Surveys consistently indicate that Internet spam remains a major problem. Antispam algorithms are very much needed, not only for efficiency and productivity, but also for Internet security and privacy.

Reviewer:  Chenyi Hu Review #: CR145738 (1802-0099)
Information Search And Retrieval (H.3.3)
Search Process (H.3.3 ...)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®