The authors discuss an algorithm for high-radix division and provide a hardware implementation.
Division by digit recurrence produces one digit of the quotient per iteration. Thus, for a given precision, a higher radix requires fewer iterations and potentially executes faster. However, as the radix increases, digit selection becomes more complicated, which lengthens the cycle time and can cancel out the gain from the smaller number of iterations. One way to achieve speedup with larger radixes is to simplify the digit selection function by prescaling the divisor.
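The trade-off between radix and iteration count can be made concrete with a small sketch. The code below is an illustration only and is not the authors' algorithm: it assumes a power-of-two radix, a non-redundant digit set 0..r-1, and exact digit selection by comparison, whereas practical high-radix dividers use redundant digit sets and approximate selection. The function names `iterations_needed` and `digit_recurrence_divide` are hypothetical.

```python
from fractions import Fraction


def iterations_needed(quotient_bits: int, radix: int) -> int:
    """Iterations needed to produce quotient_bits bits, assuming a
    power-of-two radix that yields log2(radix) bits per iteration."""
    if radix < 2 or radix & (radix - 1):
        raise ValueError("radix must be a power of two >= 2")
    bits_per_digit = radix.bit_length() - 1  # log2(radix)
    # Integer ceiling division, avoiding floating point.
    return -(-quotient_bits // bits_per_digit)


def digit_recurrence_divide(x: Fraction, d: Fraction, radix: int, n_digits: int):
    """Basic recurrence w[j+1] = radix * w[j] - q[j+1] * d, with w[0] = x.

    Assumes 0 <= x < d, so every digit falls in 0..radix-1.
    Returns the list of quotient digits and the final residual.
    """
    if not (0 <= x < d):
        raise ValueError("this sketch assumes 0 <= x < d")
    w = x
    digits = []
    for _ in range(n_digits):
        w = radix * w
        # Exact digit selection: in hardware, this step grows in cost
        # with the radix, which is what prescaling aims to simplify.
        q = int(w // d)
        w -= q * d
        digits.append(q)
    return digits, w


def digits_to_value(digits, radix: int) -> Fraction:
    """Value of the fraction 0.q1 q2 q3 ... in the given radix."""
    return sum(Fraction(q, radix ** (j + 1)) for j, q in enumerate(digits))


if __name__ == "__main__":
    for r in (2, 4, 16, 512):
        print(f"radix {r}: {iterations_needed(54, r)} iterations for 54 bits")
    digits, rem = digit_recurrence_divide(Fraction(1), Fraction(3), 4, 4)
    print(digits, digits_to_value(digits, 4), rem)
```

Under these assumptions a 54-bit quotient takes 54 iterations at radix 2 but only 6 at radix 512, which is why the cost of digit selection per iteration becomes the deciding factor.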
The authors present an algorithm and an implementation that increase the effective radix of the very high radix approach presented in their earlier work [1]. They accomplish this by obtaining a few additional bits of the quotient per iteration without increasing either the complexity of the module used to obtain the scaling factor or the iteration delay. For some values of the effective radix, their approach significantly reduces the area of the module that computes the prescaling factor, compared to their original scheme. Estimates of the execution time and area are given for 54-bit and 114-bit quotients.
These results are interesting and practical.