Deep learning systems, driven by the success of deep neural networks (DNNs), are used in a wide variety of application domains, including safety-critical areas such as medical diagnostics and autonomous vehicles. However, DNN testing lags behind the testing of more conventional safety-critical systems.
This paper presents and evaluates a set of testing criteria that measure how adequately a test set exercises a deep learning system. Experimental results indicate that the criteria can distinguish the effectiveness of different adversarial techniques and also have the potential to enable analysis of the internal states of DNNs.
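To give a concrete flavour of what a coverage-style adequacy criterion for a DNN can look like, the sketch below computes simple neuron coverage: the fraction of neurons whose activation exceeds a threshold on at least one test input. This is a minimal illustration assuming a built Keras model; the layer selection and threshold are assumptions for exposition, not the specific criteria proposed in the paper.

```python
# Illustrative sketch of a neuron-coverage-style adequacy metric.
# Assumptions: a built tf.keras model, and a fixed activation threshold;
# the paper's own criteria are defined differently.
import numpy as np
import tensorflow as tf

def neuron_coverage(model, test_inputs, threshold=0.5):
    """Fraction of probed neurons activated above `threshold` by any input."""
    # Probe the outputs of the Dense and Conv2D layers.
    probe_layers = [l for l in model.layers
                    if isinstance(l, (tf.keras.layers.Dense,
                                      tf.keras.layers.Conv2D))]
    probe = tf.keras.Model(inputs=model.inputs,
                           outputs=[l.output for l in probe_layers])
    activations = probe(test_inputs)  # one activation tensor per probed layer
    if not isinstance(activations, (list, tuple)):
        activations = [activations]
    covered, total = 0, 0
    for act in activations:
        act = act.numpy()
        # Flatten spatial dimensions so each unit counts as one neuron.
        act = act.reshape(act.shape[0], -1)
        # A neuron is covered if it fires above threshold on any test input.
        covered += int(np.sum(act.max(axis=0) > threshold))
        total += act.shape[1]
    return covered / total
```

Under these assumptions, `neuron_coverage(model, x_test)` yields a single score in [0, 1] for a test set; a test set that drives more of the network's internal states scores higher, which is the sense in which such criteria measure testing adequacy.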
This paper is likely to interest readers who experiment with DNNs, as well as those developing engineering processes for safety- or business-critical systems that incorporate deep learning components. The work provides a promising basis on which to develop automated testing for such systems.