Computing Reviews
Dynamic coalescing for 16-bit instructions
Krishnaswamy A., Gupta R. ACM Transactions on Embedded Computing Systems 4(1): 3-37, 2005. Type: Article
Date Reviewed: Aug 15 2006

It seems that several important processor designs have been done without consideration for the poor compiler, which is in charge of exploiting the capabilities of the processor to reach its peak performance. I’m thinking, for example, of the Intel IA-64 processor. This paper addresses this problem, discussing both hardware processor design and compiling techniques. Here, the processors in question are the ARM family, one of the most ubiquitous microprocessor families, used in embedded products like the iPod, the PlayStation, mobile phones, camcorders, pocket personal computers (PCs), and so on.

If one remembers that “more than 98 percent of all microprocessors are used in embedded products,” obviously it is extremely important to improve their performance as much as possible. The constraints, in comparison to processors used in computers, are mostly related to energy and memory savings. However, these savings should not be attained at the expense of speed.

The ARM family uses a 32-bit instruction set, but in order to save memory and energy, it also offers a 16-bit instruction set, aptly named “Thumb.” As the authors demonstrate, using Thumb code results in a code size reduction of about 30 percent, but also in a three-fold increase in the number of instructions to execute. Thus, the code is slower, and the energy savings are much lower than expected.

In order to correct this, the authors have designed an enhancement to the Thumb instruction set called augmenting extensions (AX). AX instructions are handled in the decode stage of the processor, and thus they don’t use a cycle in the pipeline. Each AX instruction is coalesced with the Thumb instruction that follows it, yielding a single ARM instruction. This has the advantage of reducing the number of Thumb instructions to be generated by the compiler and executed by the processor (an ARM instruction does more work than a Thumb one). Thus, there are gains in speed, energy savings, and memory usage.
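To make the coalescing idea concrete, here is a minimal toy sketch in Python. It is not the authors' hardware design: the `Instr` type, the `decode` function, and the mnemonics in the example stream (including the `setshift` prefix) are hypothetical placeholders chosen for illustration. It only models the behavior described above, where an AX instruction is absorbed in decode and merged with the Thumb instruction that follows it, so that fewer operations enter the pipeline than were fetched.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instr:
    kind: str  # "thumb" or "ax" (hypothetical labels for this sketch)
    text: str  # placeholder mnemonic, not real AX encoding


def decode(stream: List[Instr]) -> List[str]:
    """Return the operations that would enter the pipeline after decode.

    An "ax" instruction is held in the decode stage and merged into the
    next "thumb" instruction, so it never occupies a pipeline slot itself.
    """
    issued: List[str] = []
    pending_ax: Optional[Instr] = None
    for ins in stream:
        if ins.kind == "ax":
            if pending_ax is not None:
                # Assumption of this sketch: one AX prefix per Thumb instruction.
                raise ValueError("two AX instructions in a row")
            pending_ax = ins
        elif pending_ax is not None:
            # Coalesce: one richer operation replaces the AX/Thumb pair.
            issued.append(f"{ins.text} [augmented by {pending_ax.text}]")
            pending_ax = None
        else:
            issued.append(ins.text)
    if pending_ax is not None:
        raise ValueError("AX instruction with no following Thumb instruction")
    return issued


# Hypothetical instruction stream: four fetched, three issued.
example = [
    Instr("thumb", "mov r1, r2"),
    Instr("ax", "setshift lsl #2"),
    Instr("thumb", "add r0, r1"),
    Instr("thumb", "sub r3, r0"),
]
print(len(example), "fetched,", len(decode(example)), "issued")
```

In this toy model the saving comes entirely from the AX instruction never being issued on its own; how much real code benefits depends on how often the compiler can find such pairs, which is what the paper evaluates.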

The bulk of the paper is devoted to explaining the needed modifications to the hardware, as well as the compiling techniques needed to generate the code. For example, in some cases, the instructions in the two branches of an if-then-else construct must be generated in pairs, one for the true part and one for the false part, which is uncommon.

Despite a few typographical errors, the paper is well written and pleasant to read. The presented results are convincing. Whether the ideas will actually be used remains to be seen.

Reviewer:  O. Lecarme Review #: CR133183
Processor Architectures (C.1)
Compilers (D.3.4 ...)