The authors of this paper open with an attention-grabbing claim: “much of the health-related information and advice available online is inaccurate and/or misleading.” The paper states: “Scores of medical institution web sites are for organizations that do not exist and more than 90 percent of online pharmacy web sites are fraudulent.” The authors point to a World Health Organization (WHO) report, which claims that “approximately half the drugs sold on the web are counterfeit, resulting in thousands of deaths.” No doubt the shocked reader will want to learn more about this serious problem.
The paper starts by defining the main types of fake web sites: spoofs, which mimic legitimate web sites to steal identity information; concocted sites, which imitate legitimate commercial entities with the aim of committing failure-to-ship fraud; and web spam sites, which manipulate links or web content to boost their rankings for commercial gain. To address the lack of initiatives to protect users from these hazards, the authors present recursive trust labeling (RTL), an adaptive learning algorithm. This algorithm detects fake medical web sites using a content and graph classifier that exploits the typical linkage tendencies seen on medical portals. Experimental results reveal that the algorithm outperforms other adaptive learning methods and meta-learning strategies.
The paper provides an interesting overview of existing “content-based methods for fake web site detection,” presents single-class propagation algorithms, and discusses the “unsupervised and dual-class propagation methods employed in this study.” The authors also discuss RTL, the “content classifier’s fraud cue extractor,” and the content-based site classifier. The test bed is described in great detail before the paper turns to the evaluation. The overall accuracy of RTL exceeds that of ten other content methods by ten percent, and bests the graph algorithms in detecting fake web sites by almost 40 percent.
The paper is very interesting. I recommend it not only to academics, medical students, and researchers working in the field, but especially to patients, whom it may help to avoid the most dangerous mistakes.