This paper describes an alternative to the standard approach for speech-augmented user interfaces, which relies on a predefined set of commands. With this system, the user speaks words while performing an action in a Java application, and these words are then associated with the action. After some training, the voice command alone can trigger the action.
A small experiment conducted by the authors indicates that speech recognition performance degrades severely when the number of actions is increased from five to 25. This is surprising, since a similar paradigm with user-specified words is implemented, for example, in voice dialing on mobile phones, where a larger set of names is discriminated well (but without a mapping to words as described in the paper).
The basic idea is therefore not new, but the exploitation of the Java Abstract Window Toolkit (AWT) is a convincing application. The recognition results, however, are comparatively low. Regrettably, the authors did not test whether the use of speech actually improves the handling of a program.