This paper presents a method designed to improve the execution time of an algorithm which has a decision table as its input and an optimal decision tree as its output. The author describes an algorithm based on dynamic programming to construct an optimal decision tree which requires O(Nlog 3log N) comparison operations. It is the author’s objective to present a method of viewing the optimal decision tree problem which will reduce the execution time of the algorithm by 80 percent.
The algorithm improvement is accomplished by introducing the notion of an optimal variable. An optimal variable is a variable which is at the root of an optimal decision tree. In relation to optimal variables, quasi-decisive (q-d) variables, variables which are decisive except for a single branch of their decision tree, are defined. The optimal variable theorem states that q-d variables are optimal under the strong equivalence assumption made for q-d variables. A proof by induction of the optimal variable theorem is presented. The optimal variable theorem is then used to reduce the constant size of the execution time of the optimal decision tree construction algorithm.
The author also discusses related topics, such as the extension of the optimal variable theorem to m-ary decision trees and the definition of an algorithm for the optimization of quasi-decisive logic chains, which requires O(N log N) operations.
The author achieves his objectives in the paper by providing a clear presentation, several examples, and a detailed explanation of how the optimal variable theorem is defined, presented, and applied. However, because of the technical nature of the material, a thorough understanding of the data structures, mathematics, and analysis of algorithm principles involved is required in order to read this paper. The paper is well referenced; however, a prior familiarity with most of the references is not required of the reader.