Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Rethinking regex engines to address ReDoS
Davis J.  ESEC/FSE 2019 (Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia, Aug 26-30, 2019)1256-1258.2019.Type:Proceedings
Date Reviewed: Dec 31 2020

Regular expressions (regexes) are widely used across computer science applications and software components. However, not all programming languages are immune to regular expression denial-of-service (ReDoS) attacks. These algorithmic complexity attacks target the core of the regex engines by chewing up the resources with regex evaluations that take super-linear time to operate, thereby denying invaluable resources to legitimate clients. The paper discusses the addition of a caching mechanism to Spencer’s regex engine implementation that practically reduces most super-linear regex evaluations to linear evaluations. This drastically reduces the attack surface area for ReDoS and also improves performance for most evaluations. Software designers delivering modules that rely heavily on regex evaluations should find these conclusions interesting.

Popular programming languages like Python, JavaScript, and Ruby use regex engines that implement Spencer’s algorithm. The worst-case time complexity for Spencer’s is super-linear for 19 to 38 percent of regular expressions. Nevertheless, 95 percent of these super-linear regexes can be evaluated linearly. On the one hand, language developers prefer Spencer’s for extensibility; software engineers, on the other hand, rely on heavyweight regexes. Hence there is a need to make these regex engines safe from ReDoS.

The paper describes the method and prototype results for adding a cache “in a simple backtracking regex engine implementation published by Cox.” While Spencer’s method blows up with super-linear regexes, the cache-based implementation performs linearly in the worst case. Though a significant number of real regexes depict worst-case super-linear behavior, it is entirely possible to support these at linear time with a modest space cost.

Reviewer:  Subash Tirupachur Comerica Review #: CR147150 (2105-0117)
Bookmark and Share
 
Security and Protection (D.4.6 )
 
 
General (D.2.0 )
 
Would you recommend this review?
yes
no
Other reviews under "Security and Protection": Date
Practical UNIX security
Garfinkel S., Spafford G., O’Reilly & Associates, Inc., Sebastopol, CA, 1991. Type: Book (9780937175729)
Jun 1 1992
Trusted products evaluation
Chokhani S. Communications of the ACM 35(7): 64-76, 1992. Type: Article
Oct 1 1993
An experience using two covert channel analysis techniques on a real system design
Haigh J., Kemmerer R., McHugh J., Young W. IEEE Transactions on Software Engineering SE-13(2): 157-168, 1987. Type: Article
Nov 1 1987
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy