Gregg et al. present results indicating that it might be more efficient to implement a Java Virtual Machine (JVM) by translating the bytecode from a stack-based instruction set to a register-based one. To test their ideas, they develop a translator within the CVM Java implementation, and count executed instructions and bytecode loads for the Standard Performance Evaluation Corporation (SPEC) benchmarks. Their results show that the translated programs execute 34.88 percent fewer instructions, though the number of bytecode loads increases by 44.81 percent.
The paper includes a good discussion of virtual machine (VM) interpreters that covers some fine points of coding in C. The authors assert that VM interpreters spend more time on instruction dispatch than on the intended computation, and they explain how to implement threaded dispatch using the GNU Compiler Collection (GCC) to improve efficiency.
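To make the dispatch distinction concrete, here is a minimal sketch of my own, not code from the paper. It assumes a hypothetical three-opcode toy machine (OP_PUSH, OP_ADD, OP_HALT) and relies on GCC's labels-as-values extension (`&&label` and `goto *`), so it is not standard C. The names run_switch, run_threaded, and sample_prog are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; these are not the JVM's or CVM's instructions. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Illustrative program: push 2, push 3, add, halt. */
const int sample_prog[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT };

/* Switch dispatch: every instruction returns to the top of the loop and
   is decoded by one shared switch statement. */
int run_switch(const int *code)
{
    int stack[16];
    size_t sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = *code++;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1; /* unknown opcode */
        }
    }
}

/* Threaded dispatch: each handler ends by jumping directly to the handler
   of the next instruction through a table of label addresses (GCC
   extension). Opcodes are assumed valid; nothing checks them here. */
int run_threaded(const int *code)
{
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    int stack[16];
    size_t sp = 0;

#define NEXT() goto *handlers[*code++]
    NEXT();
do_push:
    assert(sp < 16);
    stack[sp++] = *code++;
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Both functions compute the same result for sample_prog; the difference lies only in how control reaches the next handler. Whether the threaded version is actually faster depends on the compiler and the hardware, which is precisely the kind of measurement one would like to see.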
The authors discuss how their register machine operates and how the translation from stack to register instructions is performed. This translation involves optimization based on static analysis of the control flow to convert some sequences of stack operations into shorter sequences of register instructions.
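The flavor of such a translation can be suggested with a small sketch. This is my own simplification under stated assumptions, not the authors' algorithm: it handles only a single straight-line block, uses a hypothetical toy instruction set (S_LOAD, S_ADD, S_STORE on the stack side; R_MOVE, R_ADD on the register side), and assumes that no local is overwritten while its value is still on the operand stack. All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction sets, for illustration only. */
enum stack_op { S_LOAD, S_ADD, S_STORE };
enum reg_op { R_MOVE, R_ADD };

struct stack_insn { enum stack_op op; int local; };      /* local unused by S_ADD */
struct reg_insn { enum reg_op op; int dst, src1, src2; }; /* src2 unused by R_MOVE */

#define MAX_DEPTH 16

/* Illustrative stack code for c = a + b, with locals a=0, b=1, c=2. */
const struct stack_insn sample[] = {
    { S_LOAD, 0 }, { S_LOAD, 1 }, { S_ADD, 0 }, { S_STORE, 2 }
};

/* Translate one straight-line block of stack code into register code.
   Local i maps to register i; temporaries are numbered from num_locals up.
   Returns the number of register instructions written to out, which must
   have room for at least n entries. Assumes well-formed input. */
size_t translate(const struct stack_insn *in, size_t n, int num_locals,
                 struct reg_insn *out)
{
    int sym[MAX_DEPTH]; /* register currently holding each stack slot */
    size_t sp = 0, count = 0;
    int next_temp = num_locals;

    for (size_t i = 0; i < n; i++) {
        switch (in[i].op) {
        case S_LOAD:
            /* Record which register holds the value; emit nothing. */
            assert(sp < MAX_DEPTH);
            sym[sp++] = in[i].local;
            break;
        case S_ADD: {
            assert(sp >= 2);
            int b = sym[--sp];
            int a = sym[--sp];
            int t = next_temp++;
            out[count++] = (struct reg_insn){ R_ADD, t, a, b };
            sym[sp++] = t;
            break;
        }
        case S_STORE: {
            assert(sp >= 1);
            int src = sym[--sp];
            if (count > 0 && src >= num_locals && out[count - 1].dst == src) {
                /* The value was just computed into a temporary:
                   write it straight to the local instead. */
                out[count - 1].dst = in[i].local;
            } else {
                out[count++] = (struct reg_insn){ R_MOVE, in[i].local, src, 0 };
            }
            break;
        }
        }
    }
    return count;
}
```

For the sample block, the four stack instructions collapse into the single register instruction R_ADD 2, 0, 1. A real translator must also deal with control flow, stack-manipulation instructions, and values that are live across basic blocks, none of which this sketch attempts.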
The paper is well written and contains some useful ideas. It would be interesting to compare execution times for the alternatives considered. For example, one wonders how much faster threaded dispatch is than a switch statement. Likewise, it would be nice to know how the execution times for the register-based CVM compare with those of the original. The authors present an interesting case, but do not provide enough evidence to convince me that using registers is superior. The issue is complex and, most likely, good arguments and evidence could be presented for both views.